[delta-l] DELTA standard - dashes
sgbotsford at gmail.com
Sun Jan 29 18:53:44 CET 2012
ALL parsing tokens should accept reasonable alternative encodings -- if it
visibly looks alike, it should be treated alike.
So, for example, hyphen, en-dash and em-dash should be equivalent; back tick,
vertical tick and single quotes should be equivalent; and so should double
quotes, guillemets and angle quotes.
If $ is used as a marker, then other currencies should be accepted too.
In terms of programming the easiest way would be to have a 'table of
equivalencies' and so the first pass through substitutes an arbitrary token
for these whenever they occur. This allows easy customization depending on
what keyboard is local.
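
A minimal sketch of that first pass, in Python. The table contents and the
names EQUIVALENTS and normalize are illustrative assumptions, not anything
taken from the DELTA standard or an existing parser; a real table would be
edited to suit the local keyboard and encoding.

```python
# Hypothetical table of equivalencies: each look-alike character maps to
# the single canonical token the parser proper will see.
EQUIVALENTS = {
    "\u2010": "-",   # hyphen
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "`": "'",        # back tick
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
    "\u00ab": '"',   # left guillemet
    "\u00bb": '"',   # right guillemet
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(EQUIVALENTS)


def normalize(text: str) -> str:
    """First pass: replace every look-alike with its canonical token."""
    return text.translate(_TABLE)
```

With this in place, normalize("5\u20137") gives "5-7", so the rest of the
parser only ever has to recognise the plain hyphen-minus.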
Along with this, however, you need some error checking so that the
same character cannot be used for two different tokens. E.g. if
range operator = dash => -, em-dash, en-dash, U1234, etc
choice of options = pipe => |, solidus, double-dagger, en-dash
then SOMETHING had better fuss, because en-dash is claimed by both tokens.
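
One way that check might look, as a rough Python sketch. The TOKENS layout
and the name find_conflicts are assumptions made for illustration; the
character sets mirror the example above, with the "U1234, etc" placeholder
left out.

```python
from collections import defaultdict

# Hypothetical configuration: token name -> characters accepted for it.
TOKENS = {
    "range": {"-", "\u2014", "\u2013"},          # hyphen-minus, em dash, en dash
    "choice": {"|", "/", "\u2021", "\u2013"},    # pipe, solidus, double dagger, en dash
}


def find_conflicts(tokens):
    """Return {character: [token names]} for characters claimed more than once."""
    owners = defaultdict(list)
    for name in sorted(tokens):          # sorted so the report order is stable
        for ch in tokens[name]:
            owners[ch].append(name)
    return {ch: names for ch, names in owners.items() if len(names) > 1}


def check_tokens(tokens):
    """Refuse to start parsing if any character is ambiguous."""
    conflicts = find_conflicts(tokens)
    if conflicts:
        details = ", ".join(
            f"U+{ord(ch):04X} used by {' and '.join(names)}"
            for ch, names in sorted(conflicts.items())
        )
        raise ValueError("ambiguous token characters: " + details)
```

Calling check_tokens(TOKENS) on the table above raises a ValueError naming
U+2013 (the en dash), which is the fuss wanted here. Running the check once
when the table is loaded keeps the per-file parsing code simple.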
Sherwood of Sherwood's Forests
Sherwood's Forests -- http://Sherwoods-Forests.com
50042 Range Rd 31
Warburg, Alberta T0C 2T0
On Sun, Jan 29, 2012 at 10:18 AM, Thomas Kluyver <takowl at gmail.com> wrote:
> I found when copying and pasting examples from the DELTA standard that the
> examples of the 'to' separator all use the N dash (–, unicode U+2013),
> while the files I have use the hyphen-minus (-, U+002D, the standard dash
> on computer keyboards).
> I expect (and hope) that this is simply a mistake in the spec: the N dash
> is not an ASCII character, so it would be tricky to parse it reliably.
> However, for files encoded with windows-1252 (which is standard for more
> modern DELTA files), it is possible to store an N-dash.
> Can anyone confirm that code parsing DELTA files should only allow
> hyphen-minus for this separator? And if so, could the spec be updated to
> use hyphen-minus in examples?
> delta-l mailing list
> delta-l at science.uu.nl