What is an inner join in DFSORT JOINKEYS?

An inner join keeps only records that have a matching key in both files. You get one output record per matched pair (or per combination when there are multiple records with the same key in either file). Do not code JOIN UNPAIRED—that is the default. Records in file 1 with no match in file 2, and records in file 2 with no match in file 1, are dropped.

How do I do an inner join with two files in DFSORT?

Code two JOINKEYS statements (one per file) with FIELDS= defining the join key for each file, then REFORMAT FIELDS= to build the output record. Do not code JOIN UNPAIRED. Allocate the two files to SORTJNF1 and SORTJNF2 (or use F1= and F2=). Follow with SORT FIELDS=COPY or SORT FIELDS= as needed. Only matching records are written.

What happens to unmatched records in an inner join?

Unmatched records are dropped. A record in file 1 with no record in file 2 with the same key does not appear in the output. A record in file 2 with no match in file 1 is also dropped. So the output contains only pairs that matched on the join key.

Can the join key be in different positions in the two files?

Yes. Each JOINKEYS specifies FIELDS=(position, length, format, order) for that file. The key in file 1 can be at 1–10 and the key in file 2 at 20–29. The values must match; the positions can differ. REFORMAT can then pick any fields from each file for the output, including or excluding the key.

If both files have duplicate keys, how many output records result?

You get a row for each combination of matching records. If file 1 has 3 records with key "X" and file 2 has 2 records with key "X", you get 3 × 2 = 6 output records for key "X". So inner join can produce more records than either file when there are duplicates on the key.

Inner Joins

An inner join in DFSORT JOINKEYS means you keep only records that have a matching key in both input files. Records that exist in one file but have no matching key in the other file are dropped. This is the default behavior when you use JOINKEYS and do not code JOIN UNPAIRED. You need two JOINKEYS statements (one per file) with the join key for each, and a REFORMAT statement to define the joined output layout. The two files can have different record lengths and key positions; duplicate keys in either file produce multiple output rows (one per combination). This page is a deep dive on inner joins: when to use them, control statement order, key position and format, REFORMAT layout, JCL DD names, and one-to-one vs one-to-many behavior.

JOINKEYS

Progress0 of 0 lessons

What an Inner Join Produces

With an inner join, the output contains one row for each matched pair of records. A “pair” is one record from file 1 (F1) and one record from file 2 (F2) that have the same value in the join key. If a key appears only in F1 or only in F2, no output row is produced for that key. So the result is a subset of both files: only the keys that exist in both files, with the combined data you specify in REFORMAT.

Default: No JOIN UNPAIRED

To get an inner join, you do not code JOIN UNPAIRED. As soon as you add JOIN UNPAIRED,F1 or JOIN UNPAIRED,F2 (or both), you are doing an outer join and unmatched records from one or both files are included. So for a pure inner join, the control cards are: two JOINKEYS, REFORMAT, and optionally SORT FIELDS=COPY or SORT FIELDS=—no JOIN statement.

Join type vs JOIN UNPAIRED
Join type	What to code	Result
Inner (default)	No JOIN UNPAIRED	Only records with matching key in both files
Left outer	JOIN UNPAIRED,F1	All F1 records; F2 data when match exists
Right outer	JOIN UNPAIRED,F2	All F2 records; F1 data when match exists
Full outer	JOIN UNPAIRED,F1,F2	All records from both files

Control Statement Order and Example

The usual order is: first JOINKEYS for file 1 (F1), then JOINKEYS for file 2 (F2), then REFORMAT FIELDS=, then SORT FIELDS=COPY (or SORT FIELDS= if you want to re-sort the joined data). The two input files are read from the DD names you specify—default SORTJNF1 and SORTJNF2—or from F1= and F2= if you override.

Example: inner join with keys at position 1

text

1
2
3
4
  JOINKEYS F1=SORTJNF1,FIELDS=(1,10,CH,A)
  JOINKEYS F2=SORTJNF2,FIELDS=(1,10,CH,A)
  REFORMAT FIELDS=(F1:1,80,F2:11,70)
  SORT FIELDS=COPY

File 1 (SORTJNF1) and file 2 (SORTJNF2) are joined on the first 10 bytes (character, ascending). Only records with the same 10-byte key in both files produce output. REFORMAT builds the output record: 80 bytes from file 1 (positions 1–80), then 70 bytes from file 2 (positions 11–80 of F2). Total output length is 80 + 70 = 150 bytes. SORT FIELDS=COPY keeps the order produced by the join without re-sorting.

Join Key in Different Positions

The join key does not have to be in the same position in both files. You specify the position and length for each file in its own JOINKEYS. For example, file 1 might have customer ID at 1–8 and file 2 at 15–22. Then you code FIELDS=(1,8,CH,A) for F1 and FIELDS=(15,8,CH,A) for F2. The values must match; the positions can differ. The format (CH, PD, etc.) and sort order (A or D) should be such that equal values compare equal—typically both keys use the same format and ascending order.

Example: key at different positions

text

1
2
3
4
  JOINKEYS F1=MASTER,FIELDS=(1,8,CH,A)
  JOINKEYS F2=TRAN,FIELDS=(20,8,CH,A)
  REFORMAT FIELDS=(F1:1,100,F2:1,50)
  SORT FIELDS=COPY

The join key for the first file (MASTER) is bytes 1–8; for the second file (TRAN) it is bytes 20–27. Records with the same key value (e.g. same customer ID) are paired. Output is 100 bytes from F1 and 50 bytes from F2 (150 bytes total). Only records that match on that key appear in the output.

REFORMAT: Building the Joined Record

REFORMAT FIELDS= is required. It lists which parts of each file go into the output record. Each item is F1:position,length or F2:position,length. They are concatenated in order. So FIELDS=(F1:1,50,F2:1,40) produces a 90-byte record: first 50 bytes of the F1 record, then first 40 bytes of the F2 record. You can skip the key in the output (e.g. F2:11,70 to avoid duplicating the key from F2) or include it from one or both sides. The output record length is the sum of all the lengths in the REFORMAT list.

One-to-One vs One-to-Many

If each key appears at most once in each file, you get at most one output record per key (one-to-one). If file 1 has two records with key "A001" and file 2 has one record with key "A001", you get two output records: the single F2 record is paired with each of the two F1 records. If both files have two records with "A001", you get four output records (every combination). So the number of output rows per key is the product of the number of F1 records with that key and the number of F2 records with that key. Plan your REFORMAT and downstream processing for possible row expansion when duplicates exist.

JCL: Allocating the Two Files

By default, the first join file is read from SORTJNF1 and the second from SORTJNF2. In JCL you provide two DD statements pointing to your input datasets. You do not use SORTIN for JOINKEYS—SORTIN is for single-file sort/copy. Example JCL:

text

1
2
3
4
5
6
7
  //STEP1    EXEC PGM=SORT
  //SORTJNF1  DD DSN=MY.MASTER,DISP=SHR
  //SORTJNF2  DD DSN=MY.TRAN,DISP=SHR
  //SORTOUT   DD DSN=MY.JOINED,DISP=(NEW,CATLG),...
  //SYSIN     DD *
  ... JOINKEYS and REFORMAT ...
  /*

If you use F1=MASTER and F2=TRAN in the JOINKEYS statements, then you would use DD names MASTER and TRAN instead of SORTJNF1 and SORTJNF2. SORTOUT is the joined output.

Pre-Sorted Input: SORTED and NOSEQCK

If both input files are already sorted by the join key in the same order (e.g. ascending), you can specify SORTED on the JOINKEYS statements so DFSORT does not re-sort them, which can improve performance. You may also specify NOSEQCK to skip sequence checking. The key order (A or D) and key definition must match the physical order of the files. See your DFSORT manual for exact syntax (e.g. JOINKEYS F1=...,FIELDS=(...),SORTED).

When to Use an Inner Join

Use an inner join when you want only records that exist in both files. Typical cases: match a transaction file to a master file and keep only transactions that have a valid master; match an order file to a product file and keep only orders for products that exist; combine two extracts that share a common key and you care only about keys present in both. If you need to keep records that have no match (e.g. “all customers and their orders, including customers with no orders”), use a left outer join (JOIN UNPAIRED,F1) instead.

Explain It Like I'm Five

You have two lists: one list of kids and their favorite color, and one list of kids and their favorite snack. You want one list that has kid, color, and snack—but only for kids who appear on both lists. So you look at the first list and for each name you look on the second list. If the name is on both lists, you write one line with the name, color, and snack. If a name is only on the first list or only on the second list, you skip it. That’s the inner join: only the names that are on both lists get a line in the result.

Exercises

Write JOINKEYS control statements for an inner join of two files (DD names FILE1 and FILE2), both with a 6-byte character key at position 1. REFORMAT should output 60 bytes from file 1 and 40 bytes from file 2 (starting at position 1 of each).
If file 1 has 100 records and file 2 has 80 records, and every key in file 1 exists in file 2 but 20 keys in file 2 do not exist in file 1, approximately how many output records do you get from an inner join? (Assume at most one record per key in each file.)
What is the difference between inner join and left outer join when a record in file 1 has no match in file 2?
Why do you not code JOIN UNPAIRED for an inner join?

Quiz

Test Your Knowledge

1. How do you request an inner join with JOINKEYS?

Code JOIN INNER
Do not code JOIN UNPAIRED—the default is inner join; only records with matching keys in both files are written
Code JOIN UNPAIRED,F1,F2
Use INCLUDE only

2. What happens to a record in file 1 that has no matching key in file 2?

It is written with blank F2 data
It is dropped—inner join keeps only records that have a match in both files
It is written to a separate file
It causes an error

3. If file 1 has two records with key "A001" and file 2 has one record with key "A001", how many output records do you get for that key with an inner join?

One
Two—each F1 record is matched with the F2 record, so you get two joined records (one-to-many)
Three
None

4. What DD names are used for the two join inputs if you do not specify F1= or F2=?

SORTIN and SORTOUT
SORTJNF1 and SORTJNF2
JOIN1 and JOIN2
F1 and F2

5. Why is REFORMAT required for an inner join with JOINKEYS?

It is optional
REFORMAT FIELDS= defines which fields from F1 and F2 go into the joined output record; without it DFSORT does not know the output layout
Only for outer joins
To sort the result

Inner Joins

What an Inner Join Produces

Default: No JOIN UNPAIRED

Control Statement Order and Example

Example: inner join with keys at position 1

Join Key in Different Positions

Example: key at different positions

REFORMAT: Building the Joined Record

One-to-One vs One-to-Many

JCL: Allocating the Two Files

Pre-Sorted Input: SORTED and NOSEQCK

When to Use an Inner Join

Explain It Like I'm Five

Exercises

Quiz

Test Your Knowledge

Related Concepts

JOINKEYS

REFORMAT statement

Left joins

Related Pages