MainframeMaster

Detecting Changes Between Records

Detecting changes between records in DFSORT can mean two things. First: within a single file, detecting when a control field (e.g. department, account ID) changes from one record to the next—a control break. That lets you output headers, trailers, or only the first or last record of each group. Second: comparing two files to find which records appear only in the first file, only in the second, or in both. DFSORT supports the first with OUTFIL SECTIONS= and HEADER/TRAILER; it supports the second with JOINKEYS and UNPAIRED. This tutorial covers both and shows how to use them for reporting and file comparison.

Advanced Tricks
Progress0 of 0 lessons

Two Kinds of Change Detection

In one file, "change" usually means the value of a key or control field (e.g. department code, account number) is different from the previous record. That is a control break. You sort by that field so all records with the same value are adjacent; then you can output something at the start or end of each group (header/trailer) or keep only the first or last record per group. In two files, "change" means which records are in file A but not B (deletes), in B but not A (adds), or in both (matches). DFSORT handles the first with SECTIONS= and the second with JOINKEYS UNPAIRED.

Control Breaks: SORT and OUTFIL SECTIONS

To detect when a control field changes, sort by that field first. Use SORT FIELDS=(start,length,format,direction) with the control field. Then in OUTFIL specify SECTIONS=(start,length,format) with the same field. SECTIONS= tells DFSORT what defines a "section" (group). As output is written, whenever that field value changes, a new section begins. HEADER1=, HEADER2=, or HEADER3= are printed at the start of each section; TRAILER1=, TRAILER2=, or TRAILER3= are printed at the end of each section. So you get automatic change detection: every time the key value changes, you get a header (and when the group ends, a trailer). You can put fixed text, the key value, or counts and totals in the header and trailer.

Example: Section Header and Trailer at Each Department Change

Input: fixed-length records with department in 1–4 and amount in 20–27. Sort by department. At each department change, print a header line "DEPT xxx" and at the end of each department print a trailer with count and total.

text
1
2
3
4
5
SORT FIELDS=(1,4,CH,A) OUTFIL FNAMES=OUT, SECTIONS=(1,4,CH, HEADER1=(1,20,C'--- Dept: ',5,4,CH), TRAILER1=(1,20,C'--- Count: ',21,8,COUNT=(10,8,ZD),30,20,TOTAL=(20,8,PD,M11)))

SECTIONS=(1,4,CH) uses the department field (1–4) as the section key. When the value in 1–4 changes, a new section starts. HEADER1= prints at the start of each section: the literal "--- Dept: " and the department value. TRAILER1= prints at the end of each section: "--- Count: ", the record count for that section (COUNT=), and the total of the amount field (TOTAL= with edit mask M11). So you see a clear "change" (new department) at each header and a summary at each trailer.

First or Last Record of Each Group

Sometimes you want only the first record of each group (e.g. first order per customer) or only the last. With SECTIONS=, HEADER1= is printed when a new section starts; at that moment the "current" record is the first record of the section. So you can set HEADER1=(BUILD=(1,80)) or similar to output the first record in the header. If you specify NODETAIL, the detail records are suppressed and only the header (and trailer) lines are written—so you get one line per section, and that line can be the first record of the section. For the last record of each group, use TRAILER1= with BUILD to output the current record (the last of the section). NODETAIL then gives you only those trailer lines, i.e. the last record of each group. The exact BUILD layout depends on your record length and positions.

Comparing Two Files: JOINKEYS and UNPAIRED

To detect changes between two files (what was added, what was deleted), use JOINKEYS. Define F1= as the first file (e.g. yesterday's extract) and F2= as the second file (e.g. today's extract), both with the same join key (e.g. account number). JOINKEYS F1,F2 with the same key fields. Then use JOIN UNPAIRED,F1 to keep only records that are in F1 but not in F2 (e.g. "deleted" or "removed"). JOIN UNPAIRED,F2 keeps only records in F2 but not in F1 (e.g. "added"). JOIN UNPAIRED,F1,F2 keeps both unmatched sets. REFORMAT FIELDS= can include F1 and F2 data plus a match indicator. After the join, SORT FIELDS=COPY and OUTFIL with INCLUDE/OMIT on the match indicator let you write "in A only," "in B only," and "in both" to separate files.

Example: Records in File A Not in File B

Two files with 10-byte key in position 1. Find records that are in the first file (SORTIN) but not in the second (SORTIN2). JOINKEYS with F1 and F2; JOIN UNPAIRED,F1.

text
1
2
3
4
5
JOINKEYS F1=SORTIN,FIELDS=(1,10,CH) JOINKEYS F2=SORTIN2,FIELDS=(1,10,CH) REFORMAT FIELDS=(F1:1,10) SORT FIELDS=COPY JOIN UNPAIRED,F1

F1 and F2 are joined on 1,10,CH. REFORMAT FIELDS=(F1:1,10) outputs only the key from F1 (you can add more fields). JOIN UNPAIRED,F1 keeps only records that appeared in F1 but had no match in F2. So the output is "records in A but not in B"—the "deleted" or "removed" set. To get "in B but not A," use JOIN UNPAIRED,F2 and REFORMAT FIELDS=(F2:1,10) (or the full F2 record).

Strategy Summary

Change detection approaches in DFSORT
GoalApproach
Detect when key changes (control break)Sort by key. OUTFIL SECTIONS=(key) with HEADER/TRAILER to react at each new group.
First or last record per groupSort by key. SECTIONS= with HEADER1= (first) or TRAILER1= (last) and NODETAIL; or use exit/second pass.
Records in A but not B (or B but not A)JOINKEYS F1=A, F2=B; JOIN UNPAIRED,F1 or UNPAIRED,F2; OUTFIL to separate files.

Explain It Like I'm Five

Imagine a list of toys: first all the red ones, then all the blue ones, then all the green ones. "Detecting a change" means noticing when the color switches from red to blue or blue to green. DFSORT does that by putting the list in order (sort by color), then every time the color changes it can write a little sign that says "new color!" (that's the header) and when a color group ends it can write "that was 5 red toys" (that's the trailer). For two lists (yesterday's toys and today's), "change" means: which toys are new today (only in today's list) and which toys are missing (only in yesterday's list). DFSORT can compare the two lists by matching names and then keeping only the ones that don't have a match—so you get "new" and "missing" lists.

Exercises

  1. Input has department in 1–4. You want a header line "DEPT xxx" at the start of each department and a trailer "END DEPT xxx" at the end. Write the SORT and OUTFIL SECTIONS= with HEADER1 and TRAILER1.
  2. You have two files with key in 1–8. You want one output with records only in file A and another with records only in file B. What JOINKEYS and JOIN options do you use?
  3. How would you output only the last record of each department (assuming records are already sorted by department)?

Quiz

Test Your Knowledge

1. What does "detecting changes between records" mean in DFSORT?

  • Only sorting
  • Identifying when a key or control field changes from one record to the next (control break), or identifying which records differ between two files
  • Copying only
  • Summing only

2. How do you detect a "control break" (change in a key field) in a single sorted file?

  • INCLUDE only
  • Sort by the control field, then use OUTFIL SECTIONS=(control_field) with HEADER1/HEADER2/HEADER3 or TRAILER1/2/3 to output lines when the section (group) changes; you can also use NODETAIL and BUILD to output only first or last record per group with IFTHEN
  • SUM only
  • MERGE only

3. How do you find records that are in file A but not in file B (or vice versa)?

  • INCLUDE only
  • Use JOINKEYS to join the two files on the key, with UNPAIRED,F1 or UNPAIRED,F2 or UNPAIRED,F1,F2; REFORMAT and SORT FIELDS=COPY; then OUTFIL to route matched vs unmatched
  • SUM FIELDS only
  • Single SORT only

4. What is OUTFIL SECTIONS= used for in change detection?

  • Only report headers
  • SECTIONS= defines the control-break key; when the key value changes, a new "section" starts, so DFSORT can print HEADER at section start and TRAILER at section end—that is change detection at each new key value
  • Only for totals
  • Only for INCLUDE

5. You want to output only the first record of each group (e.g. first record per department). How can DFSORT help?

  • SUM only
  • Sort by the group key, then use OUTFIL SECTIONS= with HEADER1= that contains the first record of the section (e.g. BUILD or COPY), and NODETAIL so only the header (first record) is written; or use a user exit or multiple passes to keep only first of group
  • OMIT only
  • MERGE only