JOINKEYS is the DFSORT feature that joins two files based on matching key values. You code two JOINKEYS statements—one for each input file—each with FIELDS= defining the join key (position, length, format, order). Records from the two files that have the same key value are combined into one output record; you use REFORMAT FIELDS= to specify which fields from each file go into that output. By default, only records that have a match in both files are kept (inner join). You can keep unmatched records from one or both files by using JOIN UNPAIRED,F1 (left outer), JOIN UNPAIRED,F2 (right outer), or JOIN UNPAIRED,F1,F2 (full outer). This page covers the two-file join model, JOINKEYS and REFORMAT syntax, inner vs outer joins, and how the join keys and DD names work.
JOINKEYS reads two input files (often called F1 and F2), matches records that have the same value in the join key you define for each file, and produces a single output stream of joined records. It is like a SQL JOIN: you have a key (e.g. customer ID) in both files; records with the same key are paired and combined into one record. The two files can have different record lengths and different layouts; the keys can be in different positions in each file. You need two JOINKEYS statements (one per file), a REFORMAT statement to define the joined record layout, and usually SORT FIELDS=COPY or a SORT to order the result.
You code JOINKEYS twice—once for the first file and once for the second. Each statement identifies the file (by DD name) and the join key:
1234JOINKEYS F1=JOIN1,FIELDS=(1,5,CH,A) JOINKEYS F2=JOIN2,FIELDS=(1,5,CH,A) REFORMAT FIELDS=(F1:1,80,F2:11,70) SORT FIELDS=COPY
F1=JOIN1 (or FILE=F1 with default SORTJNF1): the first file is allocated to DD name JOIN1. FIELDS=(1,5,CH,A) means the join key for this file is bytes 1–5, character, ascending. F2=JOIN2: the second file is on DD JOIN2, and its join key is also bytes 1–5, character, ascending. Records from JOIN1 and JOIN2 that have the same value in bytes 1–5 are paired. REFORMAT FIELDS=(F1:1,80,F2:11,70) builds the output record: first 80 bytes from file 1, then 70 bytes from file 2 starting at position 11 (so the output is 80 + 70 = 150 bytes). SORT FIELDS=COPY keeps the joined records in the order produced by the join (no re-sort).
Default DD names if you do not specify F1= or F2= are SORTJNF1 and SORTJNF2. So you can use JOINKEYS FILE=F1,FIELDS=(...) and JOINKEYS FILE=F2,FIELDS=(...) and allocate SORTJNF1 and SORTJNF2 in JCL.
The FIELDS= on each JOINKEYS define the key for that file. The keys do not have to be in the same position in both files—only the values must match. So F1 could have the key at 1–10 and F2 at 20–29; you would code FIELDS=(1,10,CH,A) for F1 and FIELDS=(20,10,CH,A) for F2. The format (CH, PD, etc.) and order (A or D) must be such that matching values compare equal. Typically both are the same format and ascending.
REFORMAT FIELDS= lists the pieces that make up the joined record. Each item is either F1:position,length (bytes from file 1) or F2:position,length (bytes from file 2). They are concatenated in the order listed. So FIELDS=(F1:1,20,F2:1,30) produces a 50-byte record: first 20 bytes of the F1 record, then first 30 bytes of the F2 record. For unmatched records in an outer join, the "missing" side is usually filled with blanks or zeros depending on the product. See your DFSORT documentation for exact REFORMAT options (constants, etc.).
If you do not code JOIN UNPAIRED, only records that have a matching key in both files are written. That is an inner join. Records in F1 with no match in F2 are dropped; records in F2 with no match in F1 are dropped.
JOIN UNPAIRED,F1 means: include all records from file 1 (F1). When an F1 record has a match in F2, output the joined record. When an F1 record has no match in F2, still output a record—with the F1 data and empty or default values for the F2 part. So you get every F1 record; F2 data is added when a match exists. Records that exist only in F2 (no match in F1) are not included.
JOIN UNPAIRED,F2 means: include all records from file 2 (F2). Every F2 record appears in the output; F1 data is added when there is a match. F1-only records are not included.
JOIN UNPAIRED,F1,F2 means: include all records from both files. Matched pairs produce one joined record; F1-only records produce a record with F1 data and empty F2; F2-only records produce a record with empty F1 and F2 data.
If both input files are already sorted by the join key, you can specify SORTED (and often NOSEQCK) on the JOINKEYS statements so DFSORT does not re-sort them, which can improve performance. The key order must match (e.g. both ascending by the same key).
You have two stacks of cards: one has names and addresses, the other has names and phone numbers. The join key is the name. JOINKEYS lines up the two stacks by name and, for each name that appears in both stacks, makes one new card with the address and the phone number. That's the inner join. If you say "and keep every card from the first stack even if there's no phone number" (JOIN UNPAIRED,F1), you get a card for every person in the first stack—with a blank for phone when there's no match. So JOINKEYS = match two stacks by a key and combine them into one stack.
1. How many JOINKEYS statements do you need to join two files?
2. By default (no JOIN UNPAIRED), what kind of join does JOINKEYS perform?
3. What does REFORMAT FIELDS= do in a JOINKEYS application?
4. JOIN UNPAIRED,F1 means what?
5. What DD names does JOINKEYS use for the two input files by default?