A one-to-one join in DFSORT JOINKEYS means that for each value of the join key, there is at most one record in the first file (F1) and at most one record in the second file (F2). So each key produces at most one matched pair and thus at most one output row (for an inner join), or exactly one output row per F1 record in a left join and one per F2 in a right join. DFSORT does not have a special “one-to-one” control statement—the behavior comes from your data having unique keys in each file. This page explains what one-to-one means, how to ensure it (including deduplication before the join), and how it differs from one-to-many and many-to-many.
In relational terms, the join key in file 1 is unique (primary key or unique key), and the join key in file 2 is unique. So there is a one-to-one relationship between the two files on that key. When you join them, each key that appears in both files produces exactly one row (one F1 record × one F2 record). Each key that appears only in one file (for outer joins) produces one row per unpaired record. So the output row count is predictable: for an inner join, it equals the number of keys that exist in both files; for a left join, it equals the number of records in F1; for a right join, the number in F2; for a full outer join, the number of distinct keys that appear in either file (with one row per key when keys are unique in each file).
| Type | F1 records per key | F2 records per key | Output rows per key (matched) |
|---|---|---|---|
| One-to-one | At most one per key | At most one per key | At most one row per key (matched) |
| One-to-many | One per key | Many per key | Many rows per key (one F1 × many F2) |
| Many-to-one | Many per key | One per key | Many rows per key (many F1 × one F2) |
| Many-to-many | Many per key | Many per key | Product of counts per key |
DFSORT JOINKEYS does not enforce uniqueness. If either file has duplicate keys, you will get multiple output rows per key (one per combination). To get one-to-one results, ensure each file has at most one record per join key before the join. A common approach is to run a separate SORT step on each file (or only on the file that has duplicates) with the same key as the join key and SUM FIELDS=NONE. That keeps one record per key (the first after sort, unless you use other options). Write the result to a temporary or permanent dataset and use it as SORTJNF1 or SORTJNF2.
12345678910111213141516171819* Step 1: Dedupe FILE2 on key 1-10, keep one per key //STEP1 EXEC PGM=SORT //SORTIN DD DSN=MY.FILE2,DISP=SHR //SORTOUT DD DSN=MY.FILE2.UNIQ,DISP=(NEW,PASS),... //SYSIN DD * SORT FIELDS=(1,10,CH,A) SUM FIELDS=NONE /* * Step 2: Join FILE1 (unique on 1-10) with deduped FILE2 //STEP2 EXEC PGM=SORT //SORTJNF1 DD DSN=MY.FILE1,DISP=SHR //SORTJNF2 DD DSN=MY.FILE2.UNIQ,DISP=SHR //SORTOUT DD DSN=MY.JOINED,DISP=(NEW,CATLG),... //SYSIN DD * JOINKEYS F1=SORTJNF1,FIELDS=(1,10,CH,A) JOINKEYS F2=SORTJNF2,FIELDS=(1,10,CH,A) REFORMAT FIELDS=(F1:1,80,F2:11,70) SORT FIELDS=COPY /*
After step 1, MY.FILE2.UNIQ has at most one record per key. If FILE1 is already unique on 1–10, the join in step 2 is one-to-one: at most one output row per key.
If your master file and lookup file are already unique on the join key (e.g. customer ID in both), you can run JOINKEYS directly. No dedupe step is needed. The result will be one-to-one: one row per key for inner join, one row per F1 for left join, and so on.
With an inner join (no JOIN statement) and one-to-one data, the output has exactly one row for each key that exists in both files. So the output record count equals the number of keys that appear in both F1 and F2. If F1 has 1000 unique keys and F2 has 800 unique keys and 700 keys are common, you get 700 output rows.
With a left join (JOIN UNPAIRED,F1) and one-to-one data, every F1 record produces exactly one output row. So the output record count equals the number of records in F1. Matched keys get F1+F2 data; unmatched F1 keys get F1 data and blank F2. There is no row expansion because each F1 record matches at most one F2 record.
With a right join (JOIN UNPAIRED,F2), output row count equals the number of records in F2 when data is one-to-one. With a full outer join (JOIN UNPAIRED,F1,F2), you get one row per F1 record plus one row per F2 record that has no match in F1—but because matched pairs produce one row each (not two), the total is: (number of F1 records) + (number of F2-only keys × records per key). With strict one-to-one (one record per key in each file), that simplifies to one row per distinct key that appears in either file (matched keys once, F1-only keys once, F2-only keys once).
If F1 has two records with key "X" and F2 has one record with key "X", you get two output rows for "X" (one-to-many). If both have two records each with key "X", you get four output rows (many-to-many). So the output can grow. Downstream programs that assume one row per key may break (duplicate keys, wrong totals). Always ensure or verify key uniqueness if you expect one-to-one behavior.
You have a list of kids (one line per kid) and a list of their favorite colors (one line per kid). You want to match them by name. If each list has each name only once, then for each name you get one line: the kid and their color. That’s one-to-one. If the color list had the same kid twice (two favorite colors?), you’d get two lines for that kid. So one-to-one means “each name appears only once on each list,” and then you get one line per name in the result.
1. What defines a one-to-one join in DFSORT JOINKEYS?
2. If file 1 has one record with key "A001" and file 2 has one record with key "A001", how many output rows do you get for that key with an inner join?
3. How can you guarantee one-to-one behavior if your input files might have duplicate keys?
4. In a one-to-one left join, how many output rows do you get?
5. When is a one-to-one join the right choice?