What is JOINKEYS in DFSORT?

JOINKEYS is the DFSORT feature that joins records from two files based on matching key fields. You specify one JOINKEYS for each file (with FIELDS= for the join key), then REFORMAT FIELDS= to define the output layout. By default only matching records are kept (inner join); JOIN UNPAIRED F1, F2, or both gives left, right, or full outer join.

How do I do an inner join with JOINKEYS?

Code two JOINKEYS statements (one per file) with FIELDS= defining the join key for each file, then REFORMAT FIELDS= to build the output record. Do not code JOIN UNPAIRED. Only records with matching keys in both files are written. Follow with SORT FIELDS=COPY or SORT FIELDS= as needed.

What is REFORMAT in JOINKEYS?

REFORMAT FIELDS= specifies which fields from the first file (F1:position,length) and the second file (F2:position,length) are placed in the joined output record. The order and length of the output record are determined by the REFORMAT list. REFORMAT is required when using JOINKEYS to define the joined record layout.

What is a left outer join in JOINKEYS?

Code JOIN UNPAIRED,F1. All records from file 1 (F1) are included; records from file 2 (F2) are included only when they match a key in F1. So F1 records without a match in F2 appear in the output with empty or default F2 fields; F2-only records are not included.

Can the two files have different record lengths or key positions?

Yes. Each JOINKEYS specifies its own FIELDS=(position,length,format,order) for that file. The keys must be comparable (same logical value) but can be in different positions in each file. REFORMAT then picks any positions from each file for the output.

JOINKEYS - DFSORT Join Two Files by Key Inner Left Right Outer

JOINKEYS

JOINKEYS is the DFSORT feature that joins two files based on matching key values. You code two JOINKEYS statements—one for each input file—each with FIELDS= defining the join key (position, length, format, order). Records from the two files that have the same key value are combined into one output record; you use REFORMAT FIELDS= to specify which fields from each file go into that output. By default, only records that have a match in both files are kept (inner join). You can keep unmatched records from one or both files by using JOIN UNPAIRED,F1 (left outer), JOIN UNPAIRED,F2 (right outer), or JOIN UNPAIRED,F1,F2 (full outer). This page covers the two-file join model, JOINKEYS and REFORMAT syntax, inner vs outer joins, and how the join keys and DD names work.

Control Statements

What JOINKEYS Does

JOINKEYS reads two input files (often called F1 and F2), matches records that have the same value in the join key you define for each file, and produces a single output stream of joined records. It is like a SQL JOIN: you have a key (e.g. customer ID) in both files; records with the same key are paired and combined into one record. The two files can have different record lengths and different layouts; the keys can be in different positions in each file. You need two JOINKEYS statements (one per file), a REFORMAT statement to define the joined record layout, and usually SORT FIELDS=COPY or a SORT to order the result.

JOINKEYS Syntax: Two Files, Two Statements

You code JOINKEYS twice—once for the first file and once for the second. Each statement identifies the file (by DD name) and the join key:

text

1
2
3
4
  JOINKEYS F1=JOIN1,FIELDS=(1,5,CH,A)
  JOINKEYS F2=JOIN2,FIELDS=(1,5,CH,A)
  REFORMAT FIELDS=(F1:1,80,F2:11,70)
  SORT FIELDS=COPY

F1=JOIN1 (or FILE=F1 with default SORTJNF1): the first file is allocated to DD name JOIN1. FIELDS=(1,5,CH,A) means the join key for this file is bytes 1–5, character, ascending. F2=JOIN2: the second file is on DD JOIN2, and its join key is also bytes 1–5, character, ascending. Records from JOIN1 and JOIN2 that have the same value in bytes 1–5 are paired. REFORMAT FIELDS=(F1:1,80,F2:11,70) builds the output record: first 80 bytes from file 1, then 70 bytes from file 2 starting at position 11 (so the output is 80 + 70 = 150 bytes). SORT FIELDS=COPY keeps the joined records in the order produced by the join (no re-sort).

Default DD names if you do not specify F1= or F2= are SORTJNF1 and SORTJNF2. So you can use JOINKEYS FILE=F1,FIELDS=(...) and JOINKEYS FILE=F2,FIELDS=(...) and allocate SORTJNF1 and SORTJNF2 in JCL.

Join Key: Position and Length

The FIELDS= on each JOINKEYS define the key for that file. The keys do not have to be in the same position in both files—only the values must match. So F1 could have the key at 1–10 and F2 at 20–29; you would code FIELDS=(1,10,CH,A) for F1 and FIELDS=(20,10,CH,A) for F2. The format (CH, PD, etc.) and order (A or D) must be such that matching values compare equal. Typically both are the same format and ascending.

REFORMAT FIELDS=

REFORMAT FIELDS= lists the pieces that make up the joined record. Each item is either F1:position,length (bytes from file 1) or F2:position,length (bytes from file 2). They are concatenated in the order listed. So FIELDS=(F1:1,20,F2:1,30) produces a 50-byte record: first 20 bytes of the F1 record, then first 30 bytes of the F2 record. For unmatched records in an outer join, the "missing" side is usually filled with blanks or zeros depending on the product. See your DFSORT documentation for exact REFORMAT options (constants, etc.).

Inner Join (Default)

If you do not code JOIN UNPAIRED, only records that have a matching key in both files are written. That is an inner join. Records in F1 with no match in F2 are dropped; records in F2 with no match in F1 are dropped.

Left Outer Join: JOIN UNPAIRED,F1

JOIN UNPAIRED,F1 means: include all records from file 1 (F1). When an F1 record has a match in F2, output the joined record. When an F1 record has no match in F2, still output a record—with the F1 data and empty or default values for the F2 part. So you get every F1 record; F2 data is added when a match exists. Records that exist only in F2 (no match in F1) are not included.

Right Outer Join: JOIN UNPAIRED,F2

JOIN UNPAIRED,F2 means: include all records from file 2 (F2). Every F2 record appears in the output; F1 data is added when there is a match. F1-only records are not included.

Full Outer Join: JOIN UNPAIRED,F1,F2

JOIN UNPAIRED,F1,F2 means: include all records from both files. Matched pairs produce one joined record; F1-only records produce a record with F1 data and empty F2; F2-only records produce a record with empty F1 and F2 data.

SORTED and NOSEQCK

If both input files are already sorted by the join key, you can specify SORTED (and often NOSEQCK) on the JOINKEYS statements so DFSORT does not re-sort them, which can improve performance. The key order must match (e.g. both ascending by the same key).

Explain It Like I'm Five

You have two stacks of cards: one has names and addresses, the other has names and phone numbers. The join key is the name. JOINKEYS lines up the two stacks by name and, for each name that appears in both stacks, makes one new card with the address and the phone number. That's the inner join. If you say "and keep every card from the first stack even if there's no phone number" (JOIN UNPAIRED,F1), you get a card for every person in the first stack—with a blank for phone when there's no match. So JOINKEYS = match two stacks by a key and combine them into one stack.

Exercises

Write JOINKEYS for two files on DD JOIN1 and JOIN2, both with a 10-byte character key at position 1. Assume inner join and REFORMAT that outputs F1 bytes 1–50 and F2 bytes 1–40.
What is the difference between inner join and JOIN UNPAIRED,F1? When would you use each?
If F1 has key at 1–5 and F2 has key at 10–14, how do you code the two JOINKEYS FIELDS=?
Why is REFORMAT required in a JOINKEYS application?

Quiz

Test Your Knowledge

1. How many JOINKEYS statements do you need to join two files?

One
Two—one for each file
Three
Only when using REFORMAT

2. By default (no JOIN UNPAIRED), what kind of join does JOINKEYS perform?

Full outer join
Left outer join
Inner join—only records with matching keys in both files
Right outer join

3. What does REFORMAT FIELDS= do in a JOINKEYS application?

Sorts the joined data
Specifies which fields from F1 and F2 go into the joined output record
Filters records
Defines the join key

4. JOIN UNPAIRED,F1 means what?

Drop all F1 records
Include all records from file 1 even if they have no match in file 2 (left outer join)
Include only F1
Full outer join

5. What DD names does JOINKEYS use for the two input files by default?

SORTIN and SORTOUT
SORTJNF1 and SORTJNF2
F1 and F2
JOIN1 and JOIN2