MainframeMaster

Full Outer Joins

A full outer join in DFSORT JOINKEYS means you keep every record from both the first file (F1) and the second file (F2). Matched pairs appear with both F1 and F2 data as defined by REFORMAT. Records that exist only in F1 appear with the F2 portion filled with blanks or low values. Records that exist only in F2 appear with the F1 portion filled with blanks or low values. You request a full outer join by coding JOIN UNPAIRED,F1,F2 after your two JOINKEYS statements and REFORMAT. This page covers when to use a full outer join, control statement order, how to tell matched vs unpaired rows, and comparison with inner, left, and right joins.

JOINKEYS / JOIN UNPAIRED,F1,F2
Progress0 of 0 lessons

What a Full Outer Join Produces

The output contains: (1) one row for each matched pair of records (same key in F1 and F2), with both F1 and F2 data; (2) one row for each F1 record whose key does not appear in F2, with F1 data and F2 portion blank-filled; (3) one row for each F2 record whose key does not appear in F1, with F2 data and F1 portion blank-filled. So no record from either file is dropped. The total row count can be larger than the sum of the two file counts when there are duplicate keys (one F1 record matching multiple F2 records produces multiple rows).

JOIN UNPAIRED,F1,F2

To get a full outer join, you code JOIN UNPAIRED,F1,F2. Including both F1 and F2 means “include unpaired records from the first file and include unpaired records from the second file.” So you get the effect of a left join and a right join in one pass: the complete union of both files with match data when present.

Join type and JOIN syntax
Join typeWhat to codeResult
Inner (default)No JOIN statementOnly records with key in both F1 and F2
Left outerJOIN UNPAIRED,F1All F1; F2 when match, else blank
Right outerJOIN UNPAIRED,F2All F2; F1 when match, else blank
Full outerJOIN UNPAIRED,F1,F2All F1 and all F2; blank where no match

Control Statement Order and Example

The order is: JOINKEYS for F1, JOINKEYS for F2, REFORMAT FIELDS=, then JOIN UNPAIRED,F1,F2, then SORT FIELDS=COPY or SORT FIELDS=.

Example: full outer join

text
1
2
3
4
5
JOINKEYS F1=SORTJNF1,FIELDS=(1,10,CH,A) JOINKEYS F2=SORTJNF2,FIELDS=(1,10,CH,A) REFORMAT FIELDS=(F1:1,80,F2:11,70) JOIN UNPAIRED,F1,F2 SORT FIELDS=COPY

Every F1 record and every F2 record contributes to the output. Matched pairs: 80 bytes from F1 and 70 from F2. Unpaired F1: 80 bytes from F1, 70 bytes blank. Unpaired F2: 80 bytes blank, 70 bytes from F2. Total record length is 150 bytes for every row.

Identifying Matched vs Unpaired Rows

You can detect whether a row came from a matched pair or from an unpaired F1 or F2 by: (1) checking if the F1 or F2 portion is blank (or zero for numeric fields); (2) using a REFORMAT indicator. For the indicator, add ? in REFORMAT FIELDS=; DFSORT will set that byte to a value indicating paired, unpaired F1, or unpaired F2 (see your DFSORT manual for the exact values). Then use OUTFIL INCLUDE/OMIT to split output—e.g. one file for matched, one for F1-only, one for F2-only.

Example: splitting output by match type

text
1
2
3
4
5
REFORMAT FIELDS=(F1:1,80,F2:11,70,?) JOIN UNPAIRED,F1,F2 OUTFIL FNAMES=MATCHED,INCLUDE=(151,1,CH,EQ,C' ') OUTFIL FNAMES=F1ONLY,INCLUDE=(151,1,CH,EQ,C'1') OUTFIL FNAMES=F2ONLY,INCLUDE=(151,1,CH,EQ,C'2')

Position 151 (if 80+70+1) holds the indicator. Exact indicator values (blank, '1', '2') may vary by DFSORT version; consult your installation’s documentation. The idea is to use that byte to route rows to different output files.

How Unpaired Records Are Filled

For unpaired F1 records, the F2 portion of REFORMAT is filled with blanks (CH) or low values (numeric). For unpaired F2 records, the F1 portion is filled with blanks or low values. So the record layout is always the same; only the “other side” is empty.

Duplicate Keys and Row Count

If F1 has two records with key "X" and F2 has three records with key "X", you get 2 × 3 = 6 matched rows for "X". If a key appears only in F1 (e.g. one record), you get one row (F1 data, blank F2). If a key appears only in F2 (e.g. two records), you get two rows (F2 data, blank F1). So the total number of output rows is the sum of: all matched combinations, plus one per unpaired F1 record, plus one per unpaired F2 record.

When to Use a Full Outer Join

  • Reconciliation: you need to see all keys from both systems—matched, in file 1 only, and in file 2 only.
  • Exception or audit reporting: list all masters and all transactions in one dataset and flag which have no match.
  • Data quality: find keys that exist in one file but not the other (orphans, missing references).
  • Any requirement where no record from either file may be dropped.

Full Outer vs Left and Right

Left join keeps all F1 and drops F2-only records. Right join keeps all F2 and drops F1-only records. Full outer keeps both: it is the union of “all F1 rows (with F2 when matched)” and “all F2 rows (with F1 when matched),” with duplicate matches (same key in both) represented as matched rows, not as separate unpaired rows.

JCL: DD Names

Same as other JOINKEYS: SORTJNF1 and SORTJNF2 (or F1= and F2=), SORTOUT for the joined output.

text
1
2
3
4
5
6
7
//STEP1 EXEC PGM=SORT //SORTJNF1 DD DSN=MY.FILE1,DISP=SHR //SORTJNF2 DD DSN=MY.FILE2,DISP=SHR //SORTOUT DD DSN=MY.FULLJOIN,DISP=(NEW,CATLG),... //SYSIN DD * ... JOINKEYS, REFORMAT, JOIN UNPAIRED,F1,F2, SORT FIELDS=COPY ... /*

Explain It Like I'm Five

You have two lists: kids in the class (file 1) and kids who got a sticker (file 2). You want one big list that has everyone—every kid in the class and every kid who got a sticker. If a kid is on both lists, you write their name and “in class” and “got sticker.” If a kid is only on the class list, you write their name and “in class” and leave the sticker part blank. If a kid is only on the sticker list (maybe a visitor), you write their name and leave the class part blank and “got sticker.” Nobody is left out from either list. That’s the full outer join.

Exercises

  1. Write control statements for a full outer join of two files with 8-byte character key at position 1, REFORMAT outputting 60 bytes from F1 and 40 bytes from F2. Include JOIN UNPAIRED,F1,F2.
  2. File 1 has 100 records; file 2 has 80. Fifty keys are in both files, 50 only in F1, 30 only in F2. Assuming one record per key in each file, how many output rows do you get?
  3. How would you use OUTFIL to write matched rows to one file and unpaired (F1-only and F2-only) to another? (Assume you have a REFORMAT indicator at a known position.)
  4. When would you choose a full outer join over a left join plus a separate right join run?

Quiz

Test Your Knowledge

1. How do you request a full outer join in DFSORT JOINKEYS?

  • Code JOIN FULL
  • Code JOIN UNPAIRED,F1,F2 after the two JOINKEYS and REFORMAT; this keeps all records from both files
  • Code JOIN UNPAIRED,F1 and JOIN UNPAIRED,F2 on separate lines
  • Use two passes: left join then right join

2. What appears in the output for a key that exists only in F1?

  • Nothing; it is dropped
  • One row with F1 data and the F2 portion of the record filled with blanks (or low values)
  • One row with F2 data only
  • Multiple rows

3. What appears in the output for a key that exists only in F2?

  • Nothing; it is dropped
  • One row with F2 data and the F1 portion of the record filled with blanks (or low values)
  • One row with F1 data only
  • Multiple rows

4. Can the same key produce both a "matched" row and an "unpaired" row in a full outer join?

  • Yes; you get one row per F1 record and one per F2 record, so a key in both files can appear in both matched and unpaired rows
  • No; each record is either matched (one row with F1+F2) or unpaired (one row with one side blank). A key in both files produces only matched row(s)
  • Only when there are duplicates
  • Only with REFORMAT indicator

5. When is a full outer join useful?

  • Only for debugging
  • When you need a complete picture: every record from both files, with the other file’s data when a match exists—e.g. reconciliation, exception reporting, or “all masters and all transactions” in one dataset
  • Only when both files are small
  • Only with ICETOOL