real numbers in Intkey

Kevin Thiele K.Thiele at CBIT.UQ.EDU.AU
Fri Sep 3 01:52:45 CEST 2004


- From: Eric Zurcher

> The proposal you described earlier sounds interesting in that it 
> apparently attempts to allow for different levels of measurement 
> precision by the key designer vs. the key user. I can see the value of 
> that: the designer might have encoded a range of (for example) 
> 3.35-3.85 mm for a taxon; but a user attempting to identify a specimen 
> from the large end of the range, yet able to measure only to the 
> nearest 0.5 mm, might enter a value of "4", which would falsely 
> exclude the taxon.
>
> But the problem I see with this is: how can you reliably infer this 
> precision? In my hypothetical example, the key user can measure to the 
> nearest 0.5 mm. If she enters a value of, say, "4.5", how can Lucid 
> "know" that this reflects a range of 4.25-4.75, rather than a range of 
> 4.45-4.55? There is also a problem with non-fractional values: Does 
> "200" imply a range of 150-250, 195-205, or 199.5-200.5?
>
> I don't see any reliable way to get around this problem aside from 
> burdening the user with the need to indicate the estimated error 
> associated with their measurements.

Hi Eric,

I suppose we are making what seems to me a standard decimal assumption: 
viz., that if a value is entered with no decimal places (e.g. 4) then we 
assume a precision of +/- 0.5 (that is, 3.5-4.5); if a value is entered 
with one decimal place (e.g. 4.0) we assume +/- 0.05 (that is, 
3.95-4.05).
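
Something like the following (a rough Python sketch with hypothetical 
names, not the actual Lucid code) captures that assumption:

    def implied_half_width(text):
        # Half-width implied by the number of decimal places written:
        # "4" -> 0.5, "4.0" -> 0.05, "3.99" -> 0.005.
        decimals = len(text.split(".")[1]) if "." in text else 0
        return 0.5 / (10 ** decimals)

    def implied_interval(text):
        # Interval a written value is taken to cover, e.g. "4" -> (3.5, 4.5).
        value = float(text)
        half = implied_half_width(text)
        return (value - half, value + half)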

You're quite right that we can't account for cases where someone wishes 
to nominate a different precision - e.g. if someone enters 4.5 and wants 
to state a precision of +/- 0.25, as in your example. Requiring that 
users nominate their precision explicitly would be ideal, but would be a 
pain all round.

The solution we've implemented is imperfect with respect to such 
user-defined precisions, but it still seems better to me than the key 
chucking out a taxon coded as 4-5 because a user enters 3.99.
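
In rough outline (a sketch only, not our exact rule, and reusing 
implied_interval from above): widen the coded endpoints by their own 
implied precision and test for overlap with the user's implied interval:

    def matches(user_entry, coded_lo, coded_hi):
        # Overlap test: the user's implied interval against the coded
        # range, with coded endpoints widened by their implied precision.
        u_lo, u_hi = implied_interval(user_entry)
        c_lo = implied_interval(coded_lo)[0]
        c_hi = implied_interval(coded_hi)[1]
        return u_hi >= c_lo and u_lo <= c_hi

    assert matches("3.99", "4", "5")      # 3.985-3.995 vs 3.5-5.5: retained
    assert not matches("2.9", "4", "5")   # 2.85-2.95 vs 3.5-5.5: excluded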

The last issue you raise is interesting. In Lucid (as, I think, in 
DELTA) we require that all numeric data for a given numeric character be 
recorded using the same units - we don't allow one taxon to be recorded 
as having leaves 1-2 mm long (using mm as the unit) and another as 
having leaves 1-2 m long (using m as the unit). Hence, if leaf length is 
recorded in mm and taxon x is 1-2 and taxon y is 100-200, we would treat 
taxon x as including 0.5-2.5 (if the user entered an integer value) and 
taxon y as including 99.5-200.5. But of course, one could expect a 
larger error for the larger measurement. This would be better handled 
using a percentage error directive such as you have, but that is a 
separate issue requiring a separate solution (unless, getting back to 
the point above, we allowed x to be recorded as 1-2 mm and y as 1-2 m, 
in which case our method would also allow for scalability of the error - 
but this would be clumsy).
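
For illustration only (this is not DELTA's actual directive syntax, just 
the idea), a percentage error makes the tolerance scale with the size of 
the measurement:

    def matches_percent(user_value, coded_lo, coded_hi, error_pct=5.0):
        # Widen the coded range by a percentage of each endpoint, so that
        # larger measurements tolerate proportionally larger errors.
        lo = coded_lo * (1 - error_pct / 100.0)
        hi = coded_hi * (1 + error_pct / 100.0)
        return lo <= user_value <= hi

    assert matches_percent(97.0, 100, 200)      # 5% widens 100-200 to 95-210
    assert not matches_percent(90.0, 100, 200)  # 90 falls outside 95-210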

Cheers - Kevin


