MainframeMaster

One-to-Many Joins

A one-to-many join in DFSORT JOINKEYS means that for at least one of the two files, there are multiple records per join key. Typically one file (e.g. master) has one record per key and the other (e.g. transactions) has many records per key. The join then produces one output row per pair: so one master record and five transaction records with the same key produce five output rows, each with the same master data and a different transaction. DFSORT does not have a special “one-to-many” statement—you use the same JOINKEYS and REFORMAT; the multiplicity comes from the data. This page explains one-to-many (and many-to-one) behavior, row count, typical use cases, and how to plan REFORMAT and downstream processing when the output expands.

JOINKEYS / one-to-many
Progress0 of 0 lessons

What One-to-Many Produces

For each value of the join key, if file 1 has A records and file 2 has B records with that key, the join produces A × B output rows for that key (for inner join). So when A = 1 and B = 5, you get 5 rows (one-to-many). When A = 3 and B = 2, you get 6 rows (many-to-many for that key). The “one” and “many” refer to how many records per key are in each file: one side has one, the other has many. The output repeats the “one” side’s data on every row that pairs with a “many” side record.

Rows per key (inner join): F1 count × F2 count
ScenarioF1 records per keyF2 records per keyOutput rows per key
One F1, one F2 per key111 row per key
One F1, five F2 per key155 rows per key (one-to-many)
Three F1, two F2 per key326 rows per key (many-to-many)

Master–Detail Example

Classic use: customer master (one record per customer ID) and orders (many records per customer ID). You want one row per order with customer name and address on each row. So F1 = master (one per key), F2 = orders (many per key). JOINKEYS on customer ID with REFORMAT that includes master fields and order fields. Result: one output row per order, with master data repeated for each order of that customer.

Example: customer master and orders

text
1
2
3
4
5
JOINKEYS F1=MASTER,FIELDS=(1,8,CH,A) JOINKEYS F2=ORDERS,FIELDS=(15,8,CH,A) REFORMAT FIELDS=(F1:1,80,F2:1,100) JOIN UNPAIRED,F1 SORT FIELDS=COPY

Master has one record per customer (key at 1–8). Orders have customer ID at 15–22, with many orders per customer. Every master record appears (left join); for each master record, there are as many output rows as there are order records with that customer ID. Each row is 80 + 100 = 180 bytes: customer data from F1 and order data from F2. Customers with no orders get one row with F2 portion blank.

Row Count: Planning Output Size

For an inner join, total output rows = sum over all keys of (F1 count for key × F2 count for key). So if you have 100 customers and 500 orders (each order has one customer ID), and every order has a valid customer, you get 500 rows—one per order. For a left join, you get one row per F1 record that has no match (F2 blank) plus the matched pairs: so (number of F1 records with no F2 match) + (sum over keys of F1 count × F2 count). With one master per key and many orders per key, that is: unpaired masters + total orders. So output can be larger than either input file.

Many-to-One

If the “many” is in F1 and the “one” is in F2 (e.g. orders with product ID in F1 and product master in F2), you still get one row per pair: many rows per key, with the F2 (master) data repeated on each row. So many-to-one is the same mechanism; only which file is “one” and which is “many” differs. REFORMAT still defines F1: and F2: segments; the output layout is the same.

Duplicate Keys and Performance

When one or both files have many records per key, the join must read and match all of them. So sort work and memory usage depend on the total number of records, and the output can be large. Ensure SORTOUT and any downstream datasets are sized for the expanded row count. Using SORTED and NOSEQCK on JOINKEYS (when inputs are already sorted by the join key) can reduce work.

When to Use One-to-Many

  • Reporting: one row per transaction or order with master data (customer, product, account) on each row.
  • Extracts: flat file for loading or export where each detail row must carry its master attributes.
  • Any requirement where “one master, many details” must become “many rows, each with master + one detail.”

One-to-Many vs One-to-One

In one-to-one, each key has at most one record in each file, so at most one output row per key. In one-to-many, at least one file has multiple records per key, so you get multiple output rows per key. If you expected one-to-one but one file has duplicates, you will see more rows than expected and possibly duplicate “master” data on multiple rows. If that is not desired, dedupe the “many” side (or the side that should be unique) before the join, or accept the expansion and handle it downstream.

JCL: No Change

JCL is the same as for any JOINKEYS: SORTJNF1 and SORTJNF2 (or F1= and F2=), SORTOUT. Only the data and the resulting row count differ.

Explain It Like I'm Five

You have one list of kids (one line per kid) and a list of every time they raised their hand (many lines per kid). You want to make a new list where each “hand raise” has the kid’s name on it. So for each hand raise you look up the kid and write one line: the kid’s name and that hand raise. The same kid’s name appears on many lines—once for each time they raised their hand. That’s one-to-many: one kid (master) matched to many hand raises (detail), so many lines in the result.

Exercises

  1. Customer master has 200 records (one per customer). Orders has 1500 records (many per customer). How many output rows do you get with an inner join? With a left join (assuming every order has a valid customer)?
  2. If product master (F1) has one record per product and sales (F2) has 10 records per product on average, and there are 100 products, roughly how many output rows do you get from an inner join?
  3. Write REFORMAT for a one-to-many join: F1 is 60 bytes (master), F2 is 80 bytes (detail). Output should be master then detail. What is the output record length?
  4. When would output row count exceed the number of records in file 1? When would it exceed the number in file 2?

Quiz

Test Your Knowledge

1. In a one-to-many join, what happens when file 1 has one record with key "X" and file 2 has five records with key "X"?

  • You get one output row (DFSORT keeps only the first match)
  • You get five output rows—the one F1 record is paired with each of the five F2 records, so five rows
  • You get an error because of duplicate keys
  • You get one row with F2 data concatenated

2. Which file is "one" and which is "many" in a one-to-many join?

  • F1 is always "one" and F2 is always "many"
  • It depends on the data: the file that has one record per key is "one," the file that has multiple records per key is "many." You can have one-to-many (e.g. F1 one, F2 many) or many-to-one (F1 many, F2 one)
  • The smaller file is "one"
  • You must specify with a JOIN option

3. In a left join with one-to-many (F1 one, F2 many), how many output rows do you get for a key that exists in both files?

  • One (left join keeps only one per key)
  • As many as there are F2 records with that key—each F1 record is written once per matching F2 record; unpaired F1 still produce one row each
  • The sum of F1 and F2 records for that key
  • At most one

4. What is a typical use case for a one-to-many join?

  • Only for debugging
  • Master–detail: e.g. one customer master (F1) joined to many orders (F2) on customer ID, so you get one row per order with customer data repeated on each row
  • Only when both files are small
  • Only with inner join

5. Can one-to-many produce more output rows than the total number of records in both input files?

  • No; output is always less than or equal to the sum of both files
  • Yes; for example if F1 has 100 records (one per key) and F2 has 500 records (many per key), you can get 500 or more output rows because each F1 row can match multiple F2 rows
  • Only with full outer join
  • Only when using REFORMAT