Merging DELTA data sets

Mike Dallwitz miked at ento.csiro.au
Fri Dec 16 08:42:55 CET 1994


- From: Phillip Hoover

> A student has produced two different DELTA data sets for two 
> different genera but now wishes to combine them to use with Intkey or 
> for the production of printed keys. What is the most effective way to 
> do this? In the past I would get them to create another subdirectory 
> and enter new data files. Normally students are only dealing with 6-10 
> taxa and a dozen characters so it is a fairly easy job.

The new DELTA system will incorporate a simple method of merging data 
sets. At present, it can only be done by the following procedure. Note 
that there is little point in combining data sets unless they share at 
least some characters.

1. Reorder both character lists, and the items etc., so that the shared 
characters are at the starts of the lists.

2. To the end of the second list, add dummy characters equal in number 
to the non-shared characters in the first list. It is easiest to make 
these unordered multistate (UM) characters, of the form
    #n. / 1. / 2. /
It may be possible to make your text editor generate the required 
sequence of numbers automatically (TED, supplied with the DELTA 
programs, can do this via the T command). Don't forget to increase the 
NUMBER OF CHARACTERS in the SPECS.

3. Reorder the second list, and the corresponding items etc., so that 
the dummy characters are immediately after the shared characters; i.e., 
so that they are aligned with the non-shared characters of the first 
list.

4. From the second list, delete the shared and dummy characters. Append 
the remaining characters to the first list. (Check that the sequence of 
character numbers in the combined list is correct.)

5. Form a combined SPECS from the SPECS of the first list, and the 
reordered SPECS.NEW of the second.

6. Form a combined ITEMS from the items of the first list, and the 
reordered ITEMS.NEW of the second. If the two sets of items do not 
describe the same entities (e.g. each set of items describes the species 
of a different genus), the second items file (less the directives at its 
start) can simply be appended to the first. If the two sets of items 
describe the same entities, each item from the second list must be 
inserted (less its item name) after the corresponding item in the first. 
(A TED box program, ADDLIST, is available for this purpose.)

7. TIDY the combined items. If the two sets of items described the same 
entities, it will be necessary to use an ACCEPT DUPLICATE VALUES 
directive (presumably the shared characters were coded the same in both 
sets of items).
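
The net effect of steps 1-4 on the character numbers of the second list 
can be sketched as follows. This is only an illustration in Python, not 
part of the DELTA programs; the function name and its arguments are 
hypothetical, and it assumes step 1 has already been done, so that the 
shared characters are numbered 1 to n_shared in both lists.

```python
def renumber_second_list(n_shared, n_first, n_second):
    """Map each character number of the second list to its number in the
    combined list (hypothetical helper, not a DELTA program).

    Assumes characters 1..n_shared are the shared ones in both lists.
    """
    if not 0 <= n_shared <= min(n_first, n_second):
        raise ValueError("n_shared cannot exceed the length of either list")
    # Steps 2-3 insert one dummy character per non-shared character of the
    # first list, so the non-shared characters of the second list move up
    # by that amount.
    offset = n_first - n_shared
    return {
        old: old if old <= n_shared else old + offset
        for old in range(1, n_second + 1)
    }


# Sizes taken from the example in this message: 2 shared characters,
# 4 characters in the first list, 3 in the second.
mapping = renumber_second_list(n_shared=2, n_first=4, n_second=3)
print(mapping)
```

With these sizes, character 3 of the second list becomes character 5 of 
the combined list, and the combined list has n_first + n_second - n_shared 
characters.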

Example

The comments <1> and <2> are included to identify the data set from 
which the information came. Assume that the character lists have already 
been ordered (step 1) so that the shared characters are at the start of 
each, and let these shared characters be characters 1 and 2.

CHARS 1
#1. a <1>/ 1. p/ 2. q/
#2. b <1>/ 1. r/ 2. s/
#3. c <1>/ 1. t/ 2. u/
#4. d <1>/ 1. v/ 2. w/
ITEMS 1
#A/ 1<1>,1 2<1>,1 3<1>,1 4<1>,1
#B/ 1<1>,1 2<1>,2 3<1>,1 4<1>,2

CHARS 2
#1. a <2>/ 1. p/ 2. q/
#2. b <2>/ 1. r/ 2. s/
#3. e <2>/ 1. x/ 2. y/
ITEMS 2
#A/ 1<2>,1 2<2>,1 3<2>,1
#B/ 1<2>,1 2<2>,2 3<2>,1

Step 2.

CHARS 2
#1. a <2>/ 1. p/ 2. q/
#2. b <2>/ 1. r/ 2. s/
#3. e <2>/ 1. x/ 2. y/
#4. / 1. / 2. /
#5. / 1. / 2. /
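
If your editor cannot generate the numbered dummy characters, a few 
lines of script will do. The following Python sketch is an assumption 
about how one might do it outside TED; the function name is hypothetical, 
and it only produces the two-state dummy form shown above.

```python
def dummy_characters(first_number, count):
    """Return `count` dummy two-state characters, numbered consecutively
    from `first_number`, in the form '#n. / 1. / 2. /'."""
    return [
        f"#{n}. / 1. / 2. /"
        for n in range(first_number, first_number + count)
    ]


# The second list in the example has 3 characters, and the first list has
# 2 non-shared characters, so dummies 4 and 5 are needed.
for line in dummy_characters(first_number=4, count=2):
    print(line)
```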

Step 3.

CHARS 2
#1. a <2>/ 1. p/ 2. q/
#2. b <2>/ 1. r/ 2. s/
#3. / 1. / 2. /
#4. / 1. / 2. /
#5. e <2>/ 1. x/ 2. y/
ITEMS 2
#A/ 1<2>,1 2<2>,1 5<2>,1
#B/ 1<2>,1 2<2>,2 5<2>,1

Step 4.

CHARS combined
#1. a <1>/ 1. p/ 2. q/
#2. b <1>/ 1. r/ 2. s/
#3. c <1>/ 1. t/ 2. u/
#4. d <1>/ 1. v/ 2. w/
#5. e <2>/ 1. x/ 2. y/

Step 6.

ITEMS combined
#A/ 1<1>,1 2<1>,1 3<1>,1 4<1>,1
1<2>,1 2<2>,1 5<2>,1
#B/ 1<1>,1 2<1>,2 3<1>,1 4<1>,2
1<2>,1 2<2>,2 5<2>,1

Step 7.

ITEMS combined
#A/ 1<2>,1 2<2>,1 3<1>,1 4<1>,1 5<2>,1
#B/ 1<2>,1 2<2>,2 3<1>,1 4<1>,2 5<2>,1
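
The change from step 6 to step 7 can be imitated for this small example 
with the Python sketch below. It is not how TIDY is implemented, and it 
understands only the simplified attributes used here (number, optional 
comment, comma, single value). The function names are hypothetical. 
Keeping the later of two identical codings is an assumption of the 
sketch, chosen because it agrees with the output shown above.

```python
import re

# number, optional <comment>, comma, value (simplified notation only)
ATTRIBUTE = re.compile(r"(\d+)(?:<([^>]*)>)?,(\S+)")


def parse_attributes(text):
    """Split an item body into (character number, comment, value) tuples."""
    return [
        (int(m.group(1)), m.group(2), m.group(3))
        for m in ATTRIBUTE.finditer(text)
    ]


def merge_attributes(first, second):
    """Combine two attribute lists for the same item, ordered by
    character number. Duplicate codings are accepted only if the
    values agree; the later occurrence is the one kept."""
    merged = {}
    for number, comment, value in parse_attributes(first) + parse_attributes(second):
        if number in merged and merged[number][1] != value:
            raise ValueError(f"character {number} is coded differently")
        merged[number] = (comment, value)
    parts = []
    for number in sorted(merged):
        comment, value = merged[number]
        tag = f"<{comment}>" if comment is not None else ""
        parts.append(f"{number}{tag},{value}")
    return " ".join(parts)


item_a = merge_attributes("1<1>,1 2<1>,1 3<1>,1 4<1>,1", "1<2>,1 2<2>,1 5<2>,1")
print("#A/ " + item_a)
```

If a shared character had been coded differently in the two data sets, 
the sketch raises an error rather than choosing one of the values.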

Mike Dallwitz
CSIRO Division of Entomology, GPO Box 1700, Canberra ACT 2601, Australia
Internet md at ento.csiro.au  Phone +61 6 246 4075  Fax +61 6 246 4000


