What is a one-to-many join in DFSORT JOINKEYS?

A one-to-many join is when one file has at most one record per join key and the other file has multiple records per key. The join produces one output row per pair: so one F1 record with key X and five F2 records with key X produce five output rows. Each row has the same F1 data and a different F2 record. DFSORT does not have a special one-to-many statement; the behavior is determined by the data.

How do I do a one-to-many join with JOINKEYS?

You use the same JOINKEYS syntax as for any join: two JOINKEYS (F1 and F2 with FIELDS= for the join key), REFORMAT FIELDS=, and optionally JOIN UNPAIRED for outer joins. The one-to-many result happens automatically when one file has unique keys and the other has duplicate keys. No extra options are needed.

How many output rows do I get per key in a one-to-many join?

For a key that appears in both files: output rows = (number of F1 records with that key) × (number of F2 records with that key). So if F1 has one record and F2 has five with key X, you get 5 rows. If F1 has two and F2 has three, you get 6 rows (many-to-many for that key). So one-to-many means one side has one record per key and the other has many, giving "many" rows per key.

What is the difference between one-to-many and many-to-many?

One-to-many: one file has one record per key, the other has many per key; so output rows per key = the many count. Many-to-many: both files can have multiple records per key; output rows per key = (F1 count for that key) × (F2 count for that key). So many-to-many can produce many more rows (e.g. 3 × 4 = 12 for one key).

When should I use a one-to-many join?

Use it when you need one row per detail record with master data repeated—e.g. one row per order with customer name and address on each row, or one row per transaction with product description. The master file (one per key) is joined to the detail file (many per key). Left join keeps every master record and repeats it for each matching detail; inner join keeps only keys that have at least one detail.

DFSORT One-to-Many Joins - Multiple Rows Per Key JOINKEYS

One-to-Many Joins

A one-to-many join in DFSORT JOINKEYS means that for at least one of the two files, there are multiple records per join key. Typically one file (e.g. master) has one record per key and the other (e.g. transactions) has many records per key. The join then produces one output row per pair: so one master record and five transaction records with the same key produce five output rows, each with the same master data and a different transaction. DFSORT does not have a special “one-to-many” statement—you use the same JOINKEYS and REFORMAT; the multiplicity comes from the data. This page explains one-to-many (and many-to-one) behavior, row count, typical use cases, and how to plan REFORMAT and downstream processing when the output expands.

JOINKEYS / one-to-many

What One-to-Many Produces

For each value of the join key, if file 1 has A records and file 2 has B records with that key, the join produces A × B output rows for that key (for inner join). So when A = 1 and B = 5, you get 5 rows (one-to-many). When A = 3 and B = 2, you get 6 rows (many-to-many for that key). The “one” and “many” refer to how many records per key are in each file: one side has one, the other has many. The output repeats the “one” side’s data on every row that pairs with a “many” side record.

Rows per key (inner join): F1 count × F2 count
Scenario	F1 records per key	F2 records per key	Output rows per key
One F1, one F2 per key	1	1	1 row per key
One F1, five F2 per key	1	5	5 rows per key (one-to-many)
Three F1, two F2 per key	3	2	6 rows per key (many-to-many)

Master–Detail Example

Classic use: customer master (one record per customer ID) and orders (many records per customer ID). You want one row per order with customer name and address on each row. So F1 = master (one per key), F2 = orders (many per key). JOINKEYS on customer ID with REFORMAT that includes master fields and order fields. Result: one output row per order, with master data repeated for each order of that customer.

Example: customer master and orders

text

1
2
3
4
5
  JOINKEYS F1=MASTER,FIELDS=(1,8,CH,A)
  JOINKEYS F2=ORDERS,FIELDS=(15,8,CH,A)
  REFORMAT FIELDS=(F1:1,80,F2:1,100)
  JOIN UNPAIRED,F1
  SORT FIELDS=COPY

Master has one record per customer (key at 1–8). Orders have customer ID at 15–22, with many orders per customer. Every master record appears (left join); for each master record, there are as many output rows as there are order records with that customer ID. Each row is 80 + 100 = 180 bytes: customer data from F1 and order data from F2. Customers with no orders get one row with F2 portion blank.

Row Count: Planning Output Size

For an inner join, total output rows = sum over all keys of (F1 count for key × F2 count for key). So if you have 100 customers and 500 orders (each order has one customer ID), and every order has a valid customer, you get 500 rows—one per order. For a left join, you get one row per F1 record that has no match (F2 blank) plus the matched pairs: so (number of F1 records with no F2 match) + (sum over keys of F1 count × F2 count). With one master per key and many orders per key, that is: unpaired masters + total orders. So output can be larger than either input file.

Many-to-One

If the “many” is in F1 and the “one” is in F2 (e.g. orders with product ID in F1 and product master in F2), you still get one row per pair: many rows per key, with the F2 (master) data repeated on each row. So many-to-one is the same mechanism; only which file is “one” and which is “many” differs. REFORMAT still defines F1: and F2: segments; the output layout is the same.

Duplicate Keys and Performance

When one or both files have many records per key, the join must read and match all of them. So sort work and memory usage depend on the total number of records, and the output can be large. Ensure SORTOUT and any downstream datasets are sized for the expanded row count. Using SORTED and NOSEQCK on JOINKEYS (when inputs are already sorted by the join key) can reduce work.

When to Use One-to-Many

Reporting: one row per transaction or order with master data (customer, product, account) on each row.
Extracts: flat file for loading or export where each detail row must carry its master attributes.
Any requirement where “one master, many details” must become “many rows, each with master + one detail.”

One-to-Many vs One-to-One

In one-to-one, each key has at most one record in each file, so at most one output row per key. In one-to-many, at least one file has multiple records per key, so you get multiple output rows per key. If you expected one-to-one but one file has duplicates, you will see more rows than expected and possibly duplicate “master” data on multiple rows. If that is not desired, dedupe the “many” side (or the side that should be unique) before the join, or accept the expansion and handle it downstream.

JCL: No Change

JCL is the same as for any JOINKEYS: SORTJNF1 and SORTJNF2 (or F1= and F2=), SORTOUT. Only the data and the resulting row count differ.

Explain It Like I'm Five

You have one list of kids (one line per kid) and a list of every time they raised their hand (many lines per kid). You want to make a new list where each “hand raise” has the kid’s name on it. So for each hand raise you look up the kid and write one line: the kid’s name and that hand raise. The same kid’s name appears on many lines—once for each time they raised their hand. That’s one-to-many: one kid (master) matched to many hand raises (detail), so many lines in the result.

Exercises

Customer master has 200 records (one per customer). Orders has 1500 records (many per customer). How many output rows do you get with an inner join? With a left join (assuming every order has a valid customer)?
If product master (F1) has one record per product and sales (F2) has 10 records per product on average, and there are 100 products, roughly how many output rows do you get from an inner join?
Write REFORMAT for a one-to-many join: F1 is 60 bytes (master), F2 is 80 bytes (detail). Output should be master then detail. What is the output record length?
When would output row count exceed the number of records in file 1? When would it exceed the number in file 2?

Quiz

Test Your Knowledge

1. In a one-to-many join, what happens when file 1 has one record with key "X" and file 2 has five records with key "X"?

You get one output row (DFSORT keeps only the first match)
You get five output rows—the one F1 record is paired with each of the five F2 records, so five rows
You get an error because of duplicate keys
You get one row with F2 data concatenated

2. Which file is "one" and which is "many" in a one-to-many join?

F1 is always "one" and F2 is always "many"
It depends on the data: the file that has one record per key is "one," the file that has multiple records per key is "many." You can have one-to-many (e.g. F1 one, F2 many) or many-to-one (F1 many, F2 one)
The smaller file is "one"
You must specify with a JOIN option

3. In a left join with one-to-many (F1 one, F2 many), how many output rows do you get for a key that exists in both files?

One (left join keeps only one per key)
As many as there are F2 records with that key—each F1 record is written once per matching F2 record; unpaired F1 still produce one row each
The sum of F1 and F2 records for that key
At most one

4. What is a typical use case for a one-to-many join?

Only for debugging
Master–detail: e.g. one customer master (F1) joined to many orders (F2) on customer ID, so you get one row per order with customer data repeated on each row
Only when both files are small
Only with inner join

5. Can one-to-many produce more output rows than the total number of records in both input files?

No; output is always less than or equal to the sum of both files
Yes; for example if F1 has 100 records (one per key) and F2 has 500 records (many per key), you can get 500 or more output rows because each F1 row can match multiple F2 rows
Only with full outer join
Only when using REFORMAT