real numbers in Intkey
Kevin Thiele
K.Thiele at CBIT.UQ.EDU.AU
Fri Sep 3 01:52:45 CEST 2004
- From: Eric Zurcher
> The proposal you described earlier sounds interesting in that it
> apparently attempts to allow for different levels of measurement
> precision by the key designer vs. the key user. I can see the value of
> that: the designer might have encoded a range of (for example)
> 3.35-3.85 mm for a taxon; but a user attempting to identify a specimen
> from the large end of the range, yet able to measure only to the
> nearest 0.5 mm, might enter a value of "4", which would falsely
> exclude the taxon.
>
> But the problem I see with this is: how can you reliably infer this
> precision? In my hypothetical example, the key user can measure to the
> nearest 0.5 mm. If she enters a value of, say, "4.5", how can Lucid
> "know" that this reflects a range of 4.25-4.75, rather than a range of
> 4.45-4.55? There is also a problem with non-fractional values: Does
> "200" infer a range of 150-250, 195-205, or 199.5-200.5?
>
> I don't see any reliable way to get around this problem aside from
> burdening the user with the need to indicate the estimated error
> associated with their measurements.
Hi Eric,
I suppose we are making what seems to me a standard decimal
assumption: viz., that if a value is entered with no decimal places
(e.g. 4) then we assume a precision of +/- 0.5 (that is, 3.5-4.5); if a
value is entered with one decimal place (e.g. 4.0) we assume +/- 0.05
(that is, 3.95-4.05).
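To make this concrete, here is a minimal sketch (in Python; illustrative
only, not Lucid's actual code) of that decimal assumption: the implied
precision is half a unit of the last decimal place the user typed.

    def implied_range(entered: str) -> tuple[float, float]:
        """Infer a (low, high) range from a value as the user typed it."""
        value = float(entered)
        # Count decimal places in the string as entered, not in the float.
        decimals = len(entered.split(".")[1]) if "." in entered else 0
        half_unit = 0.5 * 10 ** -decimals  # half the last digit entered
        return value - half_unit, value + half_unit

    print(implied_range("4"))    # (3.5, 4.5)
    print(implied_range("4.0"))  # approximately (3.95, 4.05)
    print(implied_range("200"))  # (199.5, 200.5)

Note that under this convention "200" implies 199.5-200.5, the last of
the three readings in your example.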
You're quite right that we can't account for cases where someone wishes
to nominate another precision - e.g. if someone enters 4.5 and wants to
state a precision of +/- 0.25, as in your example. Requiring that a user
nominate their precision precisely would be ideal, but it would be a
pain all round.
The solution we've implemented is imperfect with respect to such
user-defined precisions, but it still seems better to me than the key
chucking out a taxon coded as 4-5 because a user enters 3.99.
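To show what I mean (a sketch only, with a hypothetical overlaps helper;
not our implementation), matching by range overlap rather than by exact
value keeps the taxon in the near-miss case:

    def overlaps(a_low: float, a_high: float,
                 b_low: float, b_high: float) -> bool:
        """True if [a_low, a_high] and [b_low, b_high] intersect."""
        return a_low <= b_high and b_low <= a_high

    # Taxon coded 4-5; a user measuring to the nearest unit enters "4",
    # which the decimal assumption widens to 3.5-4.5, so the taxon is kept:
    print(overlaps(3.5, 4.5, 4.0, 5.0))    # True
    # With no widening at all, a bare 3.99 would exclude the taxon:
    print(overlaps(3.99, 3.99, 4.0, 5.0))  # False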
The last issue you raise is interesting. In Lucid (as I think in DELTA)
we require that all numeric data for a given numeric character be
recorded using the same units - we don't allow one taxon to be recorded
as having leaves 1-2 mm long (using mm as the unit of measurement) and
another as having leaves 1-2 m long (using m as the unit). Hence, if
leaf length is recorded in mm and taxon x is 1-2 and taxon y is 100-200,
we would treat 1-2 as including 0.5-2.5 (if the user entered an integer
value) and taxon y as 99.5-200.5. But of course, one would expect a
larger error for the taxon with the larger measurements. This would be
better handled using a percentage error directive such as you have. But
this is a separate issue and requires a separate solution (unless,
getting back to the point above, we allowed x to be recorded as 1-2 mm
and y as 1-2 m, in which case our method would also allow the error to
scale; but that would be clumsy).
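For illustration (again a sketch only; any real percentage-error
directive may differ in name and behaviour), a percentage-based
tolerance makes the absolute error scale with the size of the
measurement:

    def percentage_range(value: float, percent: float) -> tuple[float, float]:
        """Widen a value by a fixed percentage of its magnitude."""
        delta = value * percent / 100.0
        return value - delta, value + delta

    # A 5% tolerance gives +/- 0.075 on a 1.5 mm leaf of taxon x ...
    print(percentage_range(1.5, 5.0))    # (1.425, 1.575)
    # ... but +/- 7.5 on a 150 mm leaf of taxon y, matching the larger
    # expected error for larger measurements:
    print(percentage_range(150.0, 5.0))  # (142.5, 157.5)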
Cheers - Kevin