Detecting changes between records in DFSORT can mean two things. First: within a single file, detecting when a control field (e.g. department, account ID) changes from one record to the next—a control break. That lets you output headers, trailers, or only the first or last record of each group. Second: comparing two files to find which records appear only in the first file, only in the second, or in both. DFSORT supports the first with OUTFIL SECTIONS= and HEADER/TRAILER; it supports the second with JOINKEYS and UNPAIRED. This tutorial covers both and shows how to use them for reporting and file comparison.
In one file, "change" usually means the value of a key or control field (e.g. department code, account number) is different from the previous record. That is a control break. You sort by that field so all records with the same value are adjacent; then you can output something at the start or end of each group (header/trailer) or keep only the first or last record per group. In two files, "change" means which records are in file A but not B (deletes), in B but not A (adds), or in both (matches). DFSORT handles the first with SECTIONS= and the second with JOINKEYS UNPAIRED.
To detect when a control field changes, sort by that field first. Use SORT FIELDS=(start,length,format,direction) with the control field. Then in OUTFIL specify SECTIONS=(start,length,format) with the same field. SECTIONS= tells DFSORT what defines a "section" (group). As output is written, whenever that field value changes, a new section begins. HEADER1=, HEADER2=, or HEADER3= are printed at the start of each section; TRAILER1=, TRAILER2=, or TRAILER3= are printed at the end of each section. So you get automatic change detection: every time the key value changes, you get a header (and when the group ends, a trailer). You can put fixed text, the key value, or counts and totals in the header and trailer.
Input: fixed-length records with department in 1–4 and amount in 20–27. Sort by department. At each department change, print a header line "DEPT xxx" and at the end of each department print a trailer with count and total.
12345SORT FIELDS=(1,4,CH,A) OUTFIL FNAMES=OUT, SECTIONS=(1,4,CH, HEADER1=(1,20,C'--- Dept: ',5,4,CH), TRAILER1=(1,20,C'--- Count: ',21,8,COUNT=(10,8,ZD),30,20,TOTAL=(20,8,PD,M11)))
SECTIONS=(1,4,CH) uses the department field (1–4) as the section key. When the value in 1–4 changes, a new section starts. HEADER1= prints at the start of each section: the literal "--- Dept: " and the department value. TRAILER1= prints at the end of each section: "--- Count: ", the record count for that section (COUNT=), and the total of the amount field (TOTAL= with edit mask M11). So you see a clear "change" (new department) at each header and a summary at each trailer.
Sometimes you want only the first record of each group (e.g. first order per customer) or only the last. With SECTIONS=, HEADER1= is printed when a new section starts; at that moment the "current" record is the first record of the section. So you can set HEADER1=(BUILD=(1,80)) or similar to output the first record in the header. If you specify NODETAIL, the detail records are suppressed and only the header (and trailer) lines are written—so you get one line per section, and that line can be the first record of the section. For the last record of each group, use TRAILER1= with BUILD to output the current record (the last of the section). NODETAIL then gives you only those trailer lines, i.e. the last record of each group. The exact BUILD layout depends on your record length and positions.
To detect changes between two files (what was added, what was deleted), use JOINKEYS. Define F1= as the first file (e.g. yesterday's extract) and F2= as the second file (e.g. today's extract), both with the same join key (e.g. account number). JOINKEYS F1,F2 with the same key fields. Then use JOIN UNPAIRED,F1 to keep only records that are in F1 but not in F2 (e.g. "deleted" or "removed"). JOIN UNPAIRED,F2 keeps only records in F2 but not in F1 (e.g. "added"). JOIN UNPAIRED,F1,F2 keeps both unmatched sets. REFORMAT FIELDS= can include F1 and F2 data plus a match indicator. After the join, SORT FIELDS=COPY and OUTFIL with INCLUDE/OMIT on the match indicator let you write "in A only," "in B only," and "in both" to separate files.
Two files with 10-byte key in position 1. Find records that are in the first file (SORTIN) but not in the second (SORTIN2). JOINKEYS with F1 and F2; JOIN UNPAIRED,F1.
12345JOINKEYS F1=SORTIN,FIELDS=(1,10,CH) JOINKEYS F2=SORTIN2,FIELDS=(1,10,CH) REFORMAT FIELDS=(F1:1,10) SORT FIELDS=COPY JOIN UNPAIRED,F1
F1 and F2 are joined on 1,10,CH. REFORMAT FIELDS=(F1:1,10) outputs only the key from F1 (you can add more fields). JOIN UNPAIRED,F1 keeps only records that appeared in F1 but had no match in F2. So the output is "records in A but not in B"—the "deleted" or "removed" set. To get "in B but not A," use JOIN UNPAIRED,F2 and REFORMAT FIELDS=(F2:1,10) (or the full F2 record).
| Goal | Approach |
|---|---|
| Detect when key changes (control break) | Sort by key. OUTFIL SECTIONS=(key) with HEADER/TRAILER to react at each new group. |
| First or last record per group | Sort by key. SECTIONS= with HEADER1= (first) or TRAILER1= (last) and NODETAIL; or use exit/second pass. |
| Records in A but not B (or B but not A) | JOINKEYS F1=A, F2=B; JOIN UNPAIRED,F1 or UNPAIRED,F2; OUTFIL to separate files. |
Imagine a list of toys: first all the red ones, then all the blue ones, then all the green ones. "Detecting a change" means noticing when the color switches from red to blue or blue to green. DFSORT does that by putting the list in order (sort by color), then every time the color changes it can write a little sign that says "new color!" (that's the header) and when a color group ends it can write "that was 5 red toys" (that's the trailer). For two lists (yesterday's toys and today's), "change" means: which toys are new today (only in today's list) and which toys are missing (only in yesterday's list). DFSORT can compare the two lists by matching names and then keeping only the ones that don't have a match—so you get "new" and "missing" lists.
1. What does "detecting changes between records" mean in DFSORT?
2. How do you detect a "control break" (change in a key field) in a single sorted file?
3. How do you find records that are in file A but not in file B (or vice versa)?
4. What is OUTFIL SECTIONS= used for in change detection?
5. You want to output only the first record of each group (e.g. first record per department). How can DFSORT help?