A one-to-many join in DFSORT JOINKEYS means that for at least one of the two files, there are multiple records per join key. Typically one file (e.g. master) has one record per key and the other (e.g. transactions) has many records per key. The join then produces one output row per pair: so one master record and five transaction records with the same key produce five output rows, each with the same master data and a different transaction. DFSORT does not have a special “one-to-many” statement—you use the same JOINKEYS and REFORMAT; the multiplicity comes from the data. This page explains one-to-many (and many-to-one) behavior, row count, typical use cases, and how to plan REFORMAT and downstream processing when the output expands.
For each value of the join key, if file 1 has A records and file 2 has B records with that key, the join produces A × B output rows for that key (for inner join). So when A = 1 and B = 5, you get 5 rows (one-to-many). When A = 3 and B = 2, you get 6 rows (many-to-many for that key). The “one” and “many” refer to how many records per key are in each file: one side has one, the other has many. The output repeats the “one” side’s data on every row that pairs with a “many” side record.
| Scenario | F1 records per key | F2 records per key | Output rows per key |
|---|---|---|---|
| One F1, one F2 per key | 1 | 1 | 1 row per key |
| One F1, five F2 per key | 1 | 5 | 5 rows per key (one-to-many) |
| Three F1, two F2 per key | 3 | 2 | 6 rows per key (many-to-many) |
Classic use: customer master (one record per customer ID) and orders (many records per customer ID). You want one row per order with customer name and address on each row. So F1 = master (one per key), F2 = orders (many per key). JOINKEYS on customer ID with REFORMAT that includes master fields and order fields. Result: one output row per order, with master data repeated for each order of that customer.
12345JOINKEYS F1=MASTER,FIELDS=(1,8,CH,A) JOINKEYS F2=ORDERS,FIELDS=(15,8,CH,A) REFORMAT FIELDS=(F1:1,80,F2:1,100) JOIN UNPAIRED,F1 SORT FIELDS=COPY
Master has one record per customer (key at 1–8). Orders have customer ID at 15–22, with many orders per customer. Every master record appears (left join); for each master record, there are as many output rows as there are order records with that customer ID. Each row is 80 + 100 = 180 bytes: customer data from F1 and order data from F2. Customers with no orders get one row with F2 portion blank.
For an inner join, total output rows = sum over all keys of (F1 count for key × F2 count for key). So if you have 100 customers and 500 orders (each order has one customer ID), and every order has a valid customer, you get 500 rows—one per order. For a left join, you get one row per F1 record that has no match (F2 blank) plus the matched pairs: so (number of F1 records with no F2 match) + (sum over keys of F1 count × F2 count). With one master per key and many orders per key, that is: unpaired masters + total orders. So output can be larger than either input file.
If the “many” is in F1 and the “one” is in F2 (e.g. orders with product ID in F1 and product master in F2), you still get one row per pair: many rows per key, with the F2 (master) data repeated on each row. So many-to-one is the same mechanism; only which file is “one” and which is “many” differs. REFORMAT still defines F1: and F2: segments; the output layout is the same.
When one or both files have many records per key, the join must read and match all of them. So sort work and memory usage depend on the total number of records, and the output can be large. Ensure SORTOUT and any downstream datasets are sized for the expanded row count. Using SORTED and NOSEQCK on JOINKEYS (when inputs are already sorted by the join key) can reduce work.
In one-to-one, each key has at most one record in each file, so at most one output row per key. In one-to-many, at least one file has multiple records per key, so you get multiple output rows per key. If you expected one-to-one but one file has duplicates, you will see more rows than expected and possibly duplicate “master” data on multiple rows. If that is not desired, dedupe the “many” side (or the side that should be unique) before the join, or accept the expansion and handle it downstream.
JCL is the same as for any JOINKEYS: SORTJNF1 and SORTJNF2 (or F1= and F2=), SORTOUT. Only the data and the resulting row count differ.
You have one list of kids (one line per kid) and a list of every time they raised their hand (many lines per kid). You want to make a new list where each “hand raise” has the kid’s name on it. So for each hand raise you look up the kid and write one line: the kid’s name and that hand raise. The same kid’s name appears on many lines—once for each time they raised their hand. That’s one-to-many: one kid (master) matched to many hand raises (detail), so many lines in the result.
1. In a one-to-many join, what happens when file 1 has one record with key "X" and file 2 has five records with key "X"?
2. Which file is "one" and which is "many" in a one-to-many join?
3. In a left join with one-to-many (F1 one, F2 many), how many output rows do you get for a key that exists in both files?
4. What is a typical use case for a one-to-many join?
5. Can one-to-many produce more output rows than the total number of records in both input files?