MainframeMaster

Left Joins

A left join (or left outer join) in DFSORT JOINKEYS means you keep every record from the first file (F1), and add data from the second file (F2) when a matching key exists. If an F1 record has no matching key in F2, it still appears in the output; the part of the record that would have come from F2 is filled with blanks (for character fields) or low values (for numeric fields). You request a left join by coding JOIN UNPAIRED,F1 after your two JOINKEYS statements and REFORMAT. This page covers when to use a left join, control statement order, how unmatched F1 records are formatted, duplicate keys, and how it differs from inner and full outer joins.

JOINKEYS / JOIN UNPAIRED,F1
Progress0 of 0 lessons

What a Left Join Produces

With a left join, the output has one row for every record in file 1. For each F1 record: if there is at least one record in F2 with the same join key, the output contains one row per match (F1 data plus that F2 record). If there is no F2 record with that key, the output contains one row with the F1 data and the F2 portion blank or zero-filled. So the number of output rows is at least the number of F1 records; it can be more when there are multiple F2 matches per key (one-to-many from F2 side).

JOIN UNPAIRED,F1

To get a left join, you code JOIN UNPAIRED,F1. “Unpaired” means “without a match in the other file.” So JOIN UNPAIRED,F1 means “include unpaired F1 records in the output”—i.e. do not drop F1 records that have no match in F2. That is exactly the left outer join: the left (first) file is preserved in full.

Join type and JOIN syntax
Join typeWhat to codeResult
Inner (default)No JOIN statementOnly records with key in both F1 and F2
Left outerJOIN UNPAIRED,F1All F1 records; F2 data when match exists, else blank
Right outerJOIN UNPAIRED,F2All F2 records; F1 data when match exists
Full outerJOIN UNPAIRED,F1,F2All records from both files

Control Statement Order and Example

The order is: JOINKEYS for F1, JOINKEYS for F2, REFORMAT FIELDS=, then JOIN UNPAIRED,F1, then SORT FIELDS=COPY or SORT FIELDS=. The JOIN statement must come after REFORMAT.

Example: left join, key in first 10 bytes

text
1
2
3
4
5
JOINKEYS F1=SORTJNF1,FIELDS=(1,10,CH,A) JOINKEYS F2=SORTJNF2,FIELDS=(1,10,CH,A) REFORMAT FIELDS=(F1:1,80,F2:11,70) JOIN UNPAIRED,F1 SORT FIELDS=COPY

File 1 (SORTJNF1) and file 2 (SORTJNF2) are joined on the first 10 bytes. Every F1 record is written. When there is a match, the output has 80 bytes from F1 and 70 bytes from F2 (positions 11–80 of F2). When there is no match, the 80 bytes from F1 are real data and the 70 bytes from F2 are filled with blanks. Total record length is 150 bytes for every output record.

How Unmatched F1 Records Are Filled

For an F1 record with no matching key in F2, DFSORT still builds the full REFORMAT record. The F1:position,length segments contain the actual F1 data. The F2:position,length segments are filled with: blanks for character (CH) format, and low values (typically zeros) for numeric formats such as PD or BI. So your output record length is constant; you can tell “no match” by checking whether the F2 portion is blank or zero, or by using a REFORMAT indicator (see below).

Using a REFORMAT Indicator to Detect Unpaired F1

If you need to know in the output whether a row was matched or not, you can add a one-byte indicator in REFORMAT. Use ? (question mark) in the REFORMAT FIELDS= list: DFSORT will put a value (e.g. '1' for unpaired F1, '2' for unpaired F2, ' ' for paired) in that byte. Then downstream you can test that byte. For example REFORMAT FIELDS=(F1:1,80,F2:11,70,?) adds one byte at the end; when the record is from an unpaired F1, that byte is set so you can INCLUDE or OMIT on it in OUTFIL if you need separate files for matched vs unmatched.

Example: REFORMAT with indicator

text
1
2
REFORMAT FIELDS=(F1:1,80,F2:11,70,?) JOIN UNPAIRED,F1

The last position (151 if F1 is 80 and F2 is 70) holds the pairing indicator. You can then use OUTFIL INCLUDE=(151,1,CH,EQ,C'1') for unpaired F1-only records, or the opposite for matched records, if you need to split output.

Join Key in Different Positions

The join key can be in different positions in each file. Specify the key for F1 in the first JOINKEYS (e.g. FIELDS=(1,8,CH,A)) and for F2 in the second (e.g. FIELDS=(20,8,CH,A)). Values must match; positions can differ. REFORMAT can pick any ranges from F1 and F2 for the output; for unpaired F1, the F2 range is still blank-filled.

Example: keys at different positions

text
1
2
3
4
5
JOINKEYS F1=MASTER,FIELDS=(1,8,CH,A) JOINKEYS F2=TRAN,FIELDS=(20,8,CH,A) REFORMAT FIELDS=(F1:1,100,F2:1,50) JOIN UNPAIRED,F1 SORT FIELDS=COPY

Master file key at 1–8, transaction file key at 20–27. Every master record appears; when a transaction matches, its first 50 bytes are appended; when no transaction matches, those 50 bytes are blanks.

One-to-Many: Multiple F2 Rows per Key

If file 2 has multiple records with the same key (e.g. several orders for one customer), each F1 record with that key is paired with each of those F2 records. So one F1 record can produce several output rows. For an F1 record with no match, you still get exactly one row (with blank F2). So left join preserves F1 row count only when there is at most one F2 match per key; otherwise the output expands.

JCL: DD Names

By default the first file is SORTJNF1 and the second SORTJNF2. You do not use SORTIN for JOINKEYS. Example:

text
1
2
3
4
5
6
7
//STEP1 EXEC PGM=SORT //SORTJNF1 DD DSN=MY.MASTER,DISP=SHR //SORTJNF2 DD DSN=MY.TRAN,DISP=SHR //SORTOUT DD DSN=MY.LEFTJOIN,DISP=(NEW,CATLG),... //SYSIN DD * ... JOINKEYS, REFORMAT, JOIN UNPAIRED,F1, SORT FIELDS=COPY ... /*

Left Join vs Inner Join

With an inner join (no JOIN statement), only records that have a match in both files are written; F1-only and F2-only records are dropped. With a left join (JOIN UNPAIRED,F1), all F1 records are written; F2-only records are still dropped. So use left join when you need a complete list from the “left” (first) file and optional data from the second—for example “all customers with their order total, including customers with zero orders.”

When to Use a Left Join

  • You need every record from the first (master) file and optional data from the second (e.g. transactions, details).
  • You want to find F1 records that have no match in F2 (e.g. customers with no orders) by checking the blank-filled F2 portion or a REFORMAT indicator.
  • You are building a report or extract that must list all entities from one file, with related data from another when it exists.

Explain It Like I'm Five

You have a list of all the kids in your class (file 1) and a list of kids who got a sticker today (file 2). You want one list that has every kid and “yes” or “no” for the sticker. So you go through the first list—every kid—and for each one you look on the sticker list. If they got a sticker, you write their name and “yes.” If they are not on the sticker list, you still write their name and “no” (or leave the sticker part blank). Nobody from the first list is left out. That’s the left join: the first list is complete; the second list only fills in when there’s a match.

Exercises

  1. Write control statements for a left join of FILE1 and FILE2 (6-byte character key at position 1 in both), with REFORMAT outputting 60 bytes from F1 and 40 bytes from F2. Include JOIN UNPAIRED,F1.
  2. File 1 has 100 records; file 2 has 80. Every key in F2 exists in F1, but 30 keys in F1 do not exist in F2. Approximately how many output records do you get with a left join? (Assume one record per key in each file.)
  3. How would you identify “F1 records with no match in F2” in the output without using a REFORMAT indicator? How would you do it with a REFORMAT indicator?
  4. What is the difference between JOIN UNPAIRED,F1 and not coding any JOIN statement?

Quiz

Test Your Knowledge

1. How do you request a left outer join in DFSORT JOINKEYS?

  • Code JOIN LEFT
  • Code JOIN UNPAIRED,F1 after the two JOINKEYS and REFORMAT; this keeps all F1 records and adds F2 data when a match exists
  • Code JOIN INNER,F1
  • Use INCLUDE with F1 only

2. What happens to an F1 record that has no matching key in F2?

  • It is dropped
  • It is written with the F2 portion of the REFORMAT layout filled with blanks (or low values); the F1 portion contains the actual data
  • It causes an error
  • It is written to a separate file only

3. If file 1 has three records with keys A001, A002, A003 and file 2 has only A001 and A003, how many output records do you get with a left join?

  • Two (only the matches)
  • Three—one for each F1 record; A001 and A003 have F2 data, A002 has blank F2
  • Five
  • One

4. Where does the JOIN statement appear in the control statement order?

  • Before JOINKEYS
  • After both JOINKEYS and REFORMAT; JOIN UNPAIRED,F1 comes after REFORMAT FIELDS=
  • Inside REFORMAT
  • Only with OUTFIL

5. Can you get both a left join and right join in one pass?

  • No, you must run two jobs
  • Yes—code JOIN UNPAIRED,F1,F2 for a full outer join, which includes all F1 and all F2 records; for only left or only right you use JOIN UNPAIRED,F1 or JOIN UNPAIRED,F2
  • Only with ICETOOL
  • Yes, by using two REFORMAT statements