An inner join in DFSORT JOINKEYS means you keep only records that have a matching key in both input files. Records that exist in one file but have no matching key in the other file are dropped. This is the default behavior when you use JOINKEYS and do not code JOIN UNPAIRED. You need two JOINKEYS statements (one per file) with the join key for each, and a REFORMAT statement to define the joined output layout. The two files can have different record lengths and key positions; duplicate keys in either file produce multiple output rows (one per combination). This page is a deep dive on inner joins: when to use them, control statement order, key position and format, REFORMAT layout, JCL DD names, and one-to-one vs one-to-many behavior.
With an inner join, the output contains one row for each matched pair of records. A “pair” is one record from file 1 (F1) and one record from file 2 (F2) that have the same value in the join key. If a key appears only in F1 or only in F2, no output row is produced for that key. So the result is a subset of both files: only the keys that exist in both files, with the combined data you specify in REFORMAT.
To get an inner join, you do not code JOIN UNPAIRED. As soon as you add JOIN UNPAIRED,F1 or JOIN UNPAIRED,F2 (or both), you are doing an outer join and unmatched records from one or both files are included. So for a pure inner join, the control cards are: two JOINKEYS, REFORMAT, and optionally SORT FIELDS=COPY or SORT FIELDS=—no JOIN statement.
| Join type | What to code | Result |
|---|---|---|
| Inner (default) | No JOIN UNPAIRED | Only records with matching key in both files |
| Left outer | JOIN UNPAIRED,F1 | All F1 records; F2 data when match exists |
| Right outer | JOIN UNPAIRED,F2 | All F2 records; F1 data when match exists |
| Full outer | JOIN UNPAIRED,F1,F2 | All records from both files |
The usual order is: first JOINKEYS for file 1 (F1), then JOINKEYS for file 2 (F2), then REFORMAT FIELDS=, then SORT FIELDS=COPY (or SORT FIELDS= if you want to re-sort the joined data). The two input files are read from the DD names you specify—default SORTJNF1 and SORTJNF2—or from F1= and F2= if you override.
1234JOINKEYS F1=SORTJNF1,FIELDS=(1,10,CH,A) JOINKEYS F2=SORTJNF2,FIELDS=(1,10,CH,A) REFORMAT FIELDS=(F1:1,80,F2:11,70) SORT FIELDS=COPY
File 1 (SORTJNF1) and file 2 (SORTJNF2) are joined on the first 10 bytes (character, ascending). Only records with the same 10-byte key in both files produce output. REFORMAT builds the output record: 80 bytes from file 1 (positions 1–80), then 70 bytes from file 2 (positions 11–80 of F2). Total output length is 80 + 70 = 150 bytes. SORT FIELDS=COPY keeps the order produced by the join without re-sorting.
The join key does not have to be in the same position in both files. You specify the position and length for each file in its own JOINKEYS. For example, file 1 might have customer ID at 1–8 and file 2 at 15–22. Then you code FIELDS=(1,8,CH,A) for F1 and FIELDS=(15,8,CH,A) for F2. The values must match; the positions can differ. The format (CH, PD, etc.) and sort order (A or D) should be such that equal values compare equal—typically both keys use the same format and ascending order.
1234JOINKEYS F1=MASTER,FIELDS=(1,8,CH,A) JOINKEYS F2=TRAN,FIELDS=(20,8,CH,A) REFORMAT FIELDS=(F1:1,100,F2:1,50) SORT FIELDS=COPY
The join key for the first file (MASTER) is bytes 1–8; for the second file (TRAN) it is bytes 20–27. Records with the same key value (e.g. same customer ID) are paired. Output is 100 bytes from F1 and 50 bytes from F2 (150 bytes total). Only records that match on that key appear in the output.
REFORMAT FIELDS= is required. It lists which parts of each file go into the output record. Each item is F1:position,length or F2:position,length. They are concatenated in order. So FIELDS=(F1:1,50,F2:1,40) produces a 90-byte record: first 50 bytes of the F1 record, then first 40 bytes of the F2 record. You can skip the key in the output (e.g. F2:11,70 to avoid duplicating the key from F2) or include it from one or both sides. The output record length is the sum of all the lengths in the REFORMAT list.
If each key appears at most once in each file, you get at most one output record per key (one-to-one). If file 1 has two records with key "A001" and file 2 has one record with key "A001", you get two output records: the single F2 record is paired with each of the two F1 records. If both files have two records with "A001", you get four output records (every combination). So the number of output rows per key is the product of the number of F1 records with that key and the number of F2 records with that key. Plan your REFORMAT and downstream processing for possible row expansion when duplicates exist.
By default, the first join file is read from SORTJNF1 and the second from SORTJNF2. In JCL you provide two DD statements pointing to your input datasets. You do not use SORTIN for JOINKEYS—SORTIN is for single-file sort/copy. Example JCL:
1234567//STEP1 EXEC PGM=SORT //SORTJNF1 DD DSN=MY.MASTER,DISP=SHR //SORTJNF2 DD DSN=MY.TRAN,DISP=SHR //SORTOUT DD DSN=MY.JOINED,DISP=(NEW,CATLG),... //SYSIN DD * ... JOINKEYS and REFORMAT ... /*
If you use F1=MASTER and F2=TRAN in the JOINKEYS statements, then you would use DD names MASTER and TRAN instead of SORTJNF1 and SORTJNF2. SORTOUT is the joined output.
If both input files are already sorted by the join key in the same order (e.g. ascending), you can specify SORTED on the JOINKEYS statements so DFSORT does not re-sort them, which can improve performance. You may also specify NOSEQCK to skip sequence checking. The key order (A or D) and key definition must match the physical order of the files. See your DFSORT manual for exact syntax (e.g. JOINKEYS F1=...,FIELDS=(...),SORTED).
Use an inner join when you want only records that exist in both files. Typical cases: match a transaction file to a master file and keep only transactions that have a valid master; match an order file to a product file and keep only orders for products that exist; combine two extracts that share a common key and you care only about keys present in both. If you need to keep records that have no match (e.g. “all customers and their orders, including customers with no orders”), use a left outer join (JOIN UNPAIRED,F1) instead.
You have two lists: one list of kids and their favorite color, and one list of kids and their favorite snack. You want one list that has kid, color, and snack—but only for kids who appear on both lists. So you look at the first list and for each name you look on the second list. If the name is on both lists, you write one line with the name, color, and snack. If a name is only on the first list or only on the second list, you skip it. That’s the inner join: only the names that are on both lists get a line in the result.
1. How do you request an inner join with JOINKEYS?
2. What happens to a record in file 1 that has no matching key in file 2?
3. If file 1 has two records with key "A001" and file 2 has one record with key "A001", how many output records do you get for that key with an inner join?
4. What DD names are used for the two join inputs if you do not specify F1= or F2=?
5. Why is REFORMAT required for an inner join with JOINKEYS?