DFSORT processes your data in a fixed order: first it reads and optionally reformats or filters records, then it sorts or merges them, then it reformats and writes the result. Understanding these phases helps you know when each control statement runs and how to design your job correctly.
DFSORT runs your job in three main stages: input, sort (or merge), and output. Each stage has a specific role. Control statements are tied to one of these stages, so knowing the phase order tells you when your INREC, INCLUDE, OMIT, OUTREC, and OUTFIL logic runs.
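In the input phase, DFSORT reads records from the input dataset(s): DD name SORTIN for a sort or copy, or SORTIN01, SORTIN02, and so on for a merge. For each record read, DFSORT applies any INCLUDE or OMIT filter first and then any INREC reformatting, so INCLUDE and OMIT conditions refer to positions in the original input record.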
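As a minimal sketch of input-phase statements, assume 80-byte records with a one-byte record type in position 1 and a sort key in positions 10–19 (this layout is an assumption made for illustration). Both INCLUDE and INREC below take effect before any record reaches the sort:

```
* Keep only records whose first byte is 'A'.
  INCLUDE COND=(1,1,CH,EQ,C'A')
* Shorten each kept record to its first 40 bytes before the sort.
  INREC FIELDS=(1,40)
* The key is still at positions 10-19 in the shortened record.
  SORT FIELDS=(10,10,CH,A)
```

Dropping unwanted records and unneeded bytes this early means the sort has less data to move.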
After the input phase, DFSORT has a set of records in memory (or in work datasets) that are ready to be sorted or merged. So the input phase answers: "What records do we have, and in what layout?"
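In the sort phase, DFSORT takes the records that survived the input phase and arranges them in order. How it does that depends on whether you coded SORT or MERGE: SORT orders the records by the keys in SORT FIELDS, while MERGE combines two or more inputs that are already in key order into a single ordered stream.
If you use SUM, records with equal keys are collapsed as part of, or right after, the sort/merge phase, when records with the same key are adjacent and can be combined. SUM FIELDS adds the named numeric fields into the one record that is kept; SUM FIELDS=NONE keeps one record per key without adding anything. So SUM is logically part of "what we do with the ordered stream" before we write it out.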
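In the output phase, DFSORT takes the ordered (and possibly summed) records and writes them to the output dataset(s). For each record to be written, OUTREC (if coded) reformats the record, and the result goes to SORTOUT or, if you coded OUTFIL, to one or more OUTFIL datasets, each of which can filter and reformat records on its own.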
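The output-phase choices can be sketched with OUTFIL. The DD names OUTA and OUTREST are hypothetical and would need matching DD statements in the JCL; the record layout (type in byte 1, key in 10–19) is assumed for illustration:

```
  SORT FIELDS=(10,10,CH,A)
* After the sort, write type 'A' records to OUTA in a short layout.
  OUTFIL FNAMES=OUTA,
         INCLUDE=(1,1,CH,EQ,C'A'),
         BUILD=(10,10,X,40,10)
* Write every record not selected by another OUTFIL to OUTREST.
  OUTFIL FNAMES=OUTREST,SAVE
```

Because OUTFIL works on the stream after it has been ordered, both output datasets come out in key order even though each holds a different subset.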
So the output phase answers: "What do we write, and where?"
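Knowing that INREC runs before the sort and OUTREC runs after the sort helps you avoid mistakes. For example, if INREC changes the record layout, the positions in SORT FIELDS and OUTREC must refer to the reformatted record, not to the original input.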
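A short sketch of how field positions shift, assuming an 80-byte input whose sort key sits in positions 10–19 (the layout is illustrative only):

```
* INREC runs before the sort and keeps input bytes 10-19 and 40-49.
  INREC FIELDS=(10,10,40,10)
* The key was at 10-19 in the input; after INREC it is at 1-10.
  SORT FIELDS=(1,10,CH,A)
```

Coding SORT FIELDS=(10,10,CH,A) here would sort on what is now the last byte of the key plus the first nine bytes of the second field, which is rarely what was intended.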
Suppose you have a fixed-length 80-byte input. You want to: keep only records where position 1 is 'A'; sort by positions 10–19 (character); write only positions 10–19 and 40–49 to the output with a space between them. You could use:
  INCLUDE COND=(1,1,CH,EQ,C'A')
  SORT FIELDS=(10,10,CH,A)
  OUTREC FIELDS=(10,10,X,40,10)
Processing order: (1) Input phase: read each record; INCLUDE keeps only records with byte 1 = 'A'; there is no INREC, so each kept record stays 80 bytes. (2) Sort phase: sort the kept records by positions 10–19, ascending. (3) Output phase: for each sorted record, OUTREC builds bytes 10–19, one blank (the X), and bytes 40–49 (21 bytes total), and the result is written to SORTOUT. So the phases clearly separate "filter and keep layout" → "order" → "final layout."
When you use the SUM control statement, DFSORT collapses records that have the same key into one record. Conceptually this happens after the sort phase has put records in key order: records with the same key are adjacent, so DFSORT can keep one of them and either add up the numeric fields named in SUM FIELDS or, with SUM FIELDS=NONE, simply drop the duplicates. So the flow is: input phase (read, INCLUDE/OMIT, INREC) → sort phase (order by key) → SUM (collapse and add) → output phase (OUTREC, write). SUM does not run in the input phase; it needs the sorted order to know which records to combine.
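A minimal sketch of SUM on top of a sort, assuming a 10-byte key in positions 10–19 and an 8-byte zoned-decimal amount in positions 40–47 (both are assumptions made for the example):

```
* Order the records by the key so that equal keys are adjacent.
  SORT FIELDS=(10,10,CH,A)
* For each set of equal keys, keep one record and add up the amounts.
  SUM FIELDS=(40,8,ZD)
```

Replacing the SUM statement with SUM FIELDS=NONE would keep one record per key without totaling anything.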
For a MERGE job, the input phase still applies: you can have INREC and INCLUDE/OMIT on the merge inputs. The "sort" phase is replaced by a merge phase: two or more pre-sorted streams are merged into one ordered stream. The output phase is the same: OUTREC and OUTFIL apply when writing. So the three-phase model still holds; only the middle phase is "merge" instead of "sort."
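A merge step might look like the sketch below. The dataset names are hypothetical, the space and unit values are placeholders, and both inputs are assumed to be already sorted ascending on positions 10–19:

```
//MERGE1   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Each pre-sorted input gets its own SORTINnn DD statement.
//SORTIN01 DD DSN=HYPO.MERGE.IN1,DISP=SHR
//SORTIN02 DD DSN=HYPO.MERGE.IN2,DISP=SHR
//SORTOUT  DD DSN=HYPO.MERGE.OUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN    DD *
* Combine the inputs into one stream ordered by the same key.
  MERGE FIELDS=(10,10,CH,A)
/*
```

If an input is not actually in key order, the merge cannot produce correctly ordered output, so a SORT step is the safer choice when the order is in doubt.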
Imagine you have a pile of raw cards (input), and you want to end up with one neat pile in order (output). Step 1 (input): You look at each card. You throw away the cards you don't need (INCLUDE/OMIT), and for the cards you keep you might fix the card or make a shorter copy (INREC). The cards you keep go into a "to be sorted" pile. Step 2 (sort): You take that pile and put the cards in order by the rule you chose (e.g. by name). Now you have one pile in the right order. Step 3 (output): Before you write the answer on a new sheet, you might copy only some parts of each card or add numbers (OUTREC). Then you write that to the output. So: filter the cards and get them ready → put them in order → format and write. That order never changes.
1. In which phase does DFSORT apply INREC?
2. When is OUTREC applied?
3. What happens in the sort phase?
4. When are INCLUDE and OMIT applied?
5. How many main phases does DFSORT use for a typical SORT job?