How do I compare two files with DFSORT?

Use JOINKEYS to compare two files by key. Sort (or ensure) both files are in key order. Run JOINKEYS with F1 and F2 pointing to the two files and the same key for each. Use UNPAIRED,F,F to get all records (matched and unmatched) or UNPAIRED,ONLY to get only adds and deletes. Use REFORMAT and OUTFIL to split output into records-only-in-file1, only-in-file2, and matched.

What is UNPAIRED,ONLY in JOINKEYS?

UNPAIRED,ONLY means only records that do not have a matching key in the other file are written to the join output. So you get records in F1 not in F2 (deletes) and records in F2 not in F1 (adds). No matched records are included. It is used for difference or delta reporting between two key-ordered files.

Do both files need to be sorted for JOINKEYS comparison?

Yes. JOINKEYS assumes both inputs are in ascending order by the join key. It performs a merge-style join. If either file is not in key order, matches will be incorrect or missed. Sort both files by the same key before the JOINKEYS step, or feed already-ordered datasets.

How do I get three separate files: only in file1, only in file2, and in both?

Run JOINKEYS with UNPAIRED,F,F so all records are present. In REFORMAT include a one-byte indicator that you set (via BUILD or constants) to indicate source: e.g. 1=only F1, 2=only F2, 3=both. Then use OUTFIL with INCLUDE=(indicator,1,CH,EQ,C'1'), FNAMES=ONLYF1, and similar for 2 and 3 to write three DDs. Or use SPLITBY on that indicator.

Can DFSORT compare full record content, not just keys?

JOINKEYS matches on key and can REFORMAT to include full records from both files in one output record. To find records where content differs, you need a second step: a program or ICETOOL/exit that compares the two copies byte-by-byte and keeps or flags differences. DFSORT itself does not have a single "compare bytes" operator; you build it from JOINKEYS plus a comparator.

File Comparison

Comparing two datasets is a common mainframe task: you need to know what was added, deleted, or changed between two versions of a file (e.g. before and after a load, or two extracts from different systems). DFSORT does not have a single "compare" verb, but you can achieve full file comparison using JOINKEYS. JOINKEYS joins two files on a key; by using UNPAIRED options and REFORMAT you can produce separate outputs for records only in the first file, only in the second file, and in both. This page explains how to compare two files by key (adds and deletes), how to use UNPAIRED,ONLY and UNPAIRED,F,F, how to split the join output into three files, and how to extend the approach to find content changes (not just key presence) by comparing full records or selected fields after the join.

Real World Use Cases

Progress0 of 0 lessons

Why Compare Two Files?

In batch processing you often have two versions of the same logical file: yesterday's extract and today's, or the output of a program run before and after a change. You need to know: which keys were added (in the new file but not the old), which were deleted (in the old but not the new), and optionally which records have the same key but different content (changes). This supports audit, reconciliation, and change-driven processing. DFSORT's JOINKEYS facility is designed for joining two files on a key; that same mechanism gives you a powerful way to compare two files by key and to feed a downstream step that compares content.

File comparison methods with DFSORT
What you want	DFSORT approach	Purpose
Key-based (adds/deletes)	JOINKEYS UNPAIRED	Find records in one file but not the other by join key
Key + content (changes)	JOINKEYS + REFORMAT + comparator	Match on key, then compare full record or fields to find changed records
Same key, different data	JOINKEYS REFORMAT	Output both versions side by side for downstream comparison

Key-Based Comparison: JOINKEYS

JOINKEYS reads two files (F1 and F2), both assumed to be in ascending order by a join key. It matches records with the same key and can output: (1) only matched pairs, (2) only unmatched records (records in F1 with no match in F2, and records in F2 with no match in F1), or (3) all records (matched and unmatched). For file comparison you typically use the unmatched or all options so you can identify adds and deletes.

JOINKEYS UNPAIRED options

The UNPAIRED control on the JOINKEYS statement determines which records appear in the join output. The main values for comparison are:

UNPAIRED,F,F — Include all records: every F1 record and every F2 record. Matched keys appear as paired records (depending on REFORMAT); unmatched F1 records are included; unmatched F2 records are included. So you get a single stream that you can later split by a source indicator.
UNPAIRED,ONLY — Include only unmatched records. Records that exist in F1 but not in F2 (deletes when F1=old, F2=new) and records that exist in F2 but not in F1 (adds). No matched records are written. This is the usual choice when you only care about the delta (adds and deletes).
UNPAIRED,F — Unmatched records from F1 only (records in F1 that have no match in F2).
UNPAIRED,NONE (default) — Only matched pairs. Unmatched records from either file are dropped. Not typically used for comparison; used for inner join.

For a classic "before vs after" comparison: treat F1 as the old file and F2 as the new file. Then records only in F1 are deletes (removed in the new), and records only in F2 are adds (new in the new). Use UNPAIRED,ONLY to get just those two groups, or UNPAIRED,F,F and then split by an indicator to get three files: only-F1, only-F2, and matched.

Ensuring Both Files Are in Key Order

JOINKEYS assumes both inputs are sorted in ascending order by the join key. If either file is not in that order, the join will miss matches or produce wrong unmatched output. So before the JOINKEYS step you must either: (1) sort both files by the comparison key and pass the sorted datasets as F1 and F2, or (2) use a single DFSORT step that does SORT FIELDS= on both (e.g. via two SORTIN-like inputs and a merge) so that the outputs are in key order. In practice, many shops run a separate SORT for each file and then run JOINKEYS with the two sorted results. The JOINKEYS step does not sort; it only joins.

JOINKEYS Control Statements for Comparison

You need: (1) JOINKEYS F1=... with the key for the first file, (2) JOINKEYS F2=... with the key for the second file (same key length and type as F1), (3) optionally REFORMAT to define the output record, and (4) optionally SORT or COPY and OUTFIL to format or split the join output. Example: compare two files on a 20-byte key at position 1, and output only unmatched records (adds and deletes).

text

1
2
3
4
5
6
  JOINKEYS F1=SORTIN,FIELDS=(1,20,CH)
  JOINKEYS F2=SORTIN2,FIELDS=(1,20,CH)
  REFORMAT FIELDS=(F1:1,80,F2:1,80)
  JOIN UNPAIRED,ONLY
  SORT FIELDS=COPY
  OUTFIL FNAMES=DIFF

Here SORTIN is the first file (e.g. old), SORTIN2 is the second (e.g. new). Both must be sorted by 1,20,CH. REFORMAT FIELDS=(F1:1,80,F2:1,80) writes an 160-byte record: first 80 bytes from F1, next 80 from F2. For UNPAIRED,ONLY, when a record is only in F1 the F2 part is blank (or low values); when only in F2 the F1 part is blank. So you can tell adds vs deletes by checking which half has data. In practice you often add a one-byte indicator in REFORMAT (e.g. from a constant or from the join) and then use OUTFIL INCLUDE/OMIT or SPLITBY to write separate add and delete files.

Producing Three Outputs: Only in File1, Only in File2, In Both

To get three separate datasets—records only in the first file, only in the second, and in both—use UNPAIRED,F,F so the join output contains every record. In REFORMAT you must include a field that indicates which case each record is: only F1, only F2, or both. DFSORT does not automatically add such a field; you can use a user exit (MODS) to set it, or in some setups you can infer it from the presence of data in F1 vs F2 portions (e.g. if F2 portion is blank then record was only in F1). A common approach is to use REFORMAT to build a record that includes a one-byte code: e.g. 1=only F1, 2=only F2, 3=both. That requires the exit or post-processing. Alternatively, use two JOINKEYS steps: one with UNPAIRED,F to get only-F1 records, one with UNPAIRED with F2 to get only-F2 records, and one with UNPAIRED,NONE to get matched; then you have three outputs from three steps. Many shops use one JOINKEYS with UNPAIRED,F,F and a REFORMAT that includes F1 and F2 full records, then a second step (or ICETOOL) that checks whether the F1 part is blank (then it's an add) or F2 part is blank (then it's a delete) and routes to three files.

Example: REFORMAT with source indicator via constants

If your REFORMAT can distinguish records by checking blank/low-value in F1 or F2, you can use OUTFIL with INCLUDE. For example REFORMAT FIELDS=(F1:1,80,F2:1,80). For unmatched F1 records, F2 bytes are typically blank. For unmatched F2 records, F1 bytes are typically blank. So you could use OUTFIL with INCLUDE=(81,80,CH,EQ,C' ') for "only in F1" (F2 part all blanks) and OMIT with the same for "only in F2"—but that is fragile if data contains blanks. The robust way is to use a MODS exit to set a byte to 1/2/3 and then SPLITBY or INCLUDE/OMIT on that byte. For illustration, a simple split that works when unused half is blank:

text

1
2
3
4
5
6
7
8
  JOINKEYS F1=SORTIN,FIELDS=(1,20,CH)
  JOINKEYS F2=SORTIN2,FIELDS=(1,20,CH)
  REFORMAT FIELDS=(F1:1,80,F2:1,80)
  JOIN UNPAIRED,F,F
  SORT FIELDS=COPY
  OUTFIL FNAMES=ONLYF1,INCLUDE=(81,80,CH,EQ,C'                ')
  OUTFIL FNAMES=ONLYF2,INCLUDE=(1,80,CH,EQ,C'                ')
  OUTFIL FNAMES=BOTH,OMIT=(81,80,CH,EQ,C' '),OMIT=(1,80,CH,EQ,C' ')

The exact OMIT/INCLUDE conditions depend on your record layout and how DFSORT fills the "missing" side for unmatched records (often blanks). Check your DFSORT documentation for REFORMAT and UNPAIRED behavior. When in doubt, use a user exit to set an explicit indicator and split on that.

Comparing Full Record Content (Changes)

To find not just adds and deletes but also "changed" records (same key, different content), you need to compare the full record or selected fields after the join. Step 1: JOINKEYS with UNPAIRED,NONE (or UNPAIRED,F,F and then keep only matched) so you have matched pairs. Step 2: REFORMAT so the output record contains the key and the full record (or fields to compare) from both F1 and F2—for example F1:1,80 and F2:1,80. Step 3: Use a second DFSORT step or a small program to compare the two halves byte-by-byte. If they differ, write the record to a "changes" file. DFSORT does not have a built-in "compare two fields" operator in INCLUDE/OMIT; you need a user exit (MODS) or a separate program/ICETOOL step to do the byte comparison and keep only differing records. So the pattern is: JOINKEYS (match on key) → REFORMAT (both versions in one record) → exit or next step (compare bytes, output only differences).

JCL for File Comparison

You need two input DDs for the two files (e.g. SORTIN and SORTIN2, or F1 and F2 as required by your DFSORT version) and one or more output DDs. Example JCL skeleton:

text

1
2
3
4
5
6
7
8
9
  //STEP     EXEC PGM=SORT
  //SORTIN   DD DSN=OLD.FILE,DISP=SHR
  //SORTIN2  DD DSN=NEW.FILE,DISP=SHR
  //SORTOUT  DD DSN=COMPARE.OUTPUT,DISP=(NEW,CATLG),...
  //SYSIN    DD *
    JOINKEYS F1=SORTIN,FIELDS=(1,20,CH)
    JOINKEYS F2=SORTIN2,FIELDS=(1,20,CH)
    ...
  /*

Ensure OLD.FILE and NEW.FILE are both sorted by (1,20,CH) before this step. The join output goes to SORTOUT (or to OUTFIL DDs if you use multiple OUTFILs with FNAMES=).

Best Practices

Always sort both inputs by the same join key before JOINKEYS; otherwise results are wrong.
Use UNPAIRED,ONLY when you only need adds and deletes; it keeps the output small.
For three-way split (only F1, only F2, both), use a reliable indicator (e.g. MODS exit) rather than relying on blank detection unless your data guarantees it.
For content comparison (changes), combine JOINKEYS + REFORMAT with a comparator (exit or next step); document the comparison rule (e.g. which bytes are compared).
Validate record counts: e.g. only-F1 + only-F2 + 2*both should match total input records when using UNPAIRED,F,F, depending on how you count.

Explain It Like I'm Five

Imagine you have two lists of toy names: yesterday's list and today's. You want to know which toys are new (on today's list but not yesterday's), which are gone (on yesterday's but not today's), and which are on both lists. JOINKEYS is like a helper who lines up both lists by name and tells you: "this name only on the first list," "this name only on the second list," and "this name on both." So you get three piles: new toys, gone toys, and same toys. If you also want to know "same name but the color changed," you need another step that looks at both lists side by side and says when the rest of the line is different.

Exercises

Write JOINKEYS control statements to compare two files on a 10-byte key at position 1 and output only unmatched records (adds and deletes). Assume F1=SORTIN, F2=SORTIN2.
What is the difference between UNPAIRED,F,F and UNPAIRED,ONLY? When would you use each?
Both input files have 80-byte records. You want one output file with records that exist in both files but with different content (same key, different bytes). Describe the two-step approach (JOINKEYS + what next?).
Why can you not use JOINKEYS correctly if one of the input files is not sorted by the join key?

Quiz

Test Your Knowledge

1. What is the main DFSORT feature used to compare two files by key?

INCLUDE/OMIT only
JOINKEYS: it joins two files on a key and can output matched, unmatched from F1, or unmatched from F2 so you can see adds, deletes, and matches
SUM only
OUTFIL only

2. How do you get three outputs: records only in file 1, only in file 2, and in both?

Three separate SORT steps
One JOINKEYS step with REFORMAT and OUTFIL SPLITBY or multiple OUTFIL with INCLUDE on a match indicator field that you put in REFORMAT (e.g. 1,1 from F1 and a constant for F2; then INCLUDE/OMIT on that indicator)
MERGE only
SUM only

3. What does UNPAIRED,ONLY do in JOINKEYS?

Keeps only matched records
Outputs only records that do not have a match in the other file: records in F1 not in F2, and records in F2 not in F1 (adds and deletes)
Keeps the first record of each key
Deduplicates

4. Why must both input files be in key order for JOINKEYS comparison?

They do not need to be
JOINKEYS assumes both files are sorted (or at least ordered) by the join key; it reads both streams in key order and matches by advancing through both. If either file is not in key order, matches are missed or wrong
Only SORTIN must be sorted
Only for MERGE

5. How can you compare full records (not just keys) to find changed content?

JOINKEYS cannot do that
Use JOINKEYS to match on key, REFORMAT to include the full records (or the fields to compare) from both F1 and F2 in one output record, then use a later step or OUTREC/IFTHEN to compare bytes and flag or output only records where the content differs
Only with SUM
Only with INCLUDE

File Comparison

Why Compare Two Files?

Key-Based Comparison: JOINKEYS

JOINKEYS UNPAIRED options

Ensuring Both Files Are in Key Order

JOINKEYS Control Statements for Comparison

Producing Three Outputs: Only in File1, Only in File2, In Both

Example: REFORMAT with source indicator via constants

Comparing Full Record Content (Changes)

JCL for File Comparison

Best Practices

Explain It Like I'm Five

Exercises

Quiz

Test Your Knowledge

Related Concepts

JOINKEYS and inner joins

Handling unmatched records

Reformat statement

Detecting changes between records

Related Pages