DFSORT expects specific DD names in your JCL. Each name has a fixed meaning: one is for input data, one for output data, one for control statements, one for messages, and optionally others for work or extra outputs. This page lists every DD statement DFSORT uses, whether it is required or optional, and what happens when you omit it.
For a SORT step (single input, sort or copy), four DDs are required: SYSOUT, SORTIN, SORTOUT, and SYSIN. For a MERGE step, you do not use SORTIN; instead you use SORTIN01, SORTIN02, and so on (one per input stream). SORTOUT, SYSIN, and SYSOUT are still required. SORTWK01, SORTWK02, … are optional work datasets; if you omit them, DFSORT may dynamically allocate work. If you use OUTFIL FNAMES= to write to additional datasets, you must define those DD names in JCL as well.
SYSOUT is the DD name where DFSORT writes its messages. This includes informational messages (e.g. ICE054I with the input and output record counts), warnings, and error or abend text. It does not contain your data; it contains the text that tells you how the step ran.
You almost always code SYSOUT DD SYSOUT=*. The asterisk routes the messages to the output class named in the MSGCLASS parameter of the JOB statement. The output is written to the JES spool, where you can view it with the rest of the job output. If you omit the SYSOUT DD, DFSORT may end the step with an error, or you will have no way to see why a step failed or how many records were processed. So in practice SYSOUT is required.
Effect of omitting it: Message output has nowhere to go. The step may fail with a DD-related error, or messages may be lost. Always include SYSOUT for production and debugging.
SORTIN is the DD name for the input dataset when you are doing a SORT (or OPTION COPY with a single input). DFSORT reads all input records from the dataset allocated to SORTIN. The dataset can be fixed-length (FB) or variable-length (VB); DFSORT uses the dataset's DCB (record format and length) when reading.
You allocate SORTIN like any other input: DSN=your.input.dataset,DISP=SHR (or OLD if you need exclusive access). The dataset must exist before the step runs. You do not specify SORTIN when doing a MERGE; for MERGE you use SORTIN01, SORTIN02, etc. So SORTIN is required for a SORT step and must not be used for a MERGE step.
Effect of omitting it: For a SORT step, DFSORT expects to read from SORTIN. If the DD is missing, the step will fail (e.g. missing DD or open failure). For MERGE, omitting SORTIN is correct—you are supposed to use SORTIN01, SORTIN02, … instead.
For a MERGE operation, you have two or more input streams, each already sorted by the same key. DFSORT does not use SORTIN in that case. Instead it uses SORTIN01, SORTIN02, and optionally SORTIN03 through SORTIN16 (depending on the product level). Each DD corresponds to one input stream. The control statements include MERGE FIELDS=..., and you need one SORTINnn DD for every input stream you intend to merge.
SORTIN01 is the first input, SORTIN02 the second, and so on. All must be pre-sorted by the same key (and same format) as specified in MERGE FIELDS. DFSORT merges them in order into one sorted stream and writes to SORTOUT. You cannot mix SORTIN with SORTIN01 in the same step: use either the single-input SORT convention (SORTIN) or the multi-input MERGE convention (SORTIN01, SORTIN02, …).
Effect of omitting one: If you intend to merge three inputs but allocate only SORTIN01 and SORTIN02, the records from the third dataset never reach SORTOUT; depending on the product level, a gap in the SORTINnn numbering can also cause the step to fail. Any extra SORTINnn DD you allocate is treated as another input stream, so allocate only the datasets you want merged.
SORTOUT is the DD name for the primary output dataset. After the sort or merge (and after any OUTREC or OUTFIL processing), DFSORT writes the result to the dataset allocated to SORTOUT. There is exactly one primary output stream; that stream goes to SORTOUT.
For a new dataset you allocate SORTOUT with DISP=(NEW,CATLG,DELETE) (or similar), SPACE=, and DCB= to define record format and length. The LRECL and RECFM must match what DFSORT will write (same as input if you do not change length in OUTREC; otherwise the length and format you build in OUTREC). If you use OUTFIL with FNAMES=, you are adding extra outputs; SORTOUT still receives the main sorted/merged stream unless you use options that redirect it.
Effect of omitting it: With no OUTFIL outputs defined, DFSORT has nowhere to write the result, and the step fails (e.g. missing DD or allocation failure). Treat SORTOUT as required for any step that does not route all of its output through OUTFIL.
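The rule that the SORTOUT LRECL must match what OUTREC builds can be illustrated with a short sketch. It assumes an 80-byte fixed-length input; the step name, dataset names, and field positions are hypothetical. BLKSIZE is left out on the assumption that the system chooses a suitable block size.

```jcl
//CUTSTEP  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//* OUTREC builds 20 + 10 = 30 bytes per record, so LRECL=30
//SORTOUT  DD DSN=MY.OUTPUT.SHORT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=30)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
* Keep bytes 1-20 and bytes 41-50 of each input record
  OUTREC BUILD=(1,20,41,10)
/*
```

If the BUILD list changes, the LRECL on SORTOUT has to change with it; otherwise the DCB no longer describes the records DFSORT writes.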
SYSIN is the DD name for the control statement input. The dataset (or in-stream data) allocated to SYSIN contains the instructions that tell DFSORT what to do: SORT FIELDS or MERGE FIELDS, INCLUDE, OMIT, INREC, OUTREC, OUTFIL, SUM, OPTION, and so on. DFSORT reads and parses SYSIN at the beginning of the step, before it reads any data from SORTIN or SORTINnn.
You can use in-stream data (DD * or DD DATA) so the control statements appear directly in the JCL, or you can point SYSIN to a cataloged dataset (e.g. a PDS member). Either way, the content must be valid DFSORT control statements. SYSIN is required; without it, DFSORT has no instructions.
Effect of omitting it: No control statements are available, so DFSORT does not know whether to sort, merge, or copy. Unless the statements are supplied another way (for example through a DFSPARM DD), the step ends with an error message instead of processing any data.
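As a sketch of the second option described above, the step below reads its control statements from a library member instead of in-stream data. The library name MY.CNTL.LIB and the member name SORTCTL are hypothetical; the member is assumed to hold the same SORT FIELDS statement that would otherwise follow DD *.

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//* Control statements come from member SORTCTL
//SYSIN    DD DSN=MY.CNTL.LIB(SORTCTL),DISP=SHR
```

Keeping the statements in a member lets several jobs share one definition, at the cost of the JCL no longer showing what the step does.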
SORTWK01, SORTWK02, … (typically SORTWK01 through SORTWK32, depending on product) are optional DD names for work (scratch) datasets. When a sort needs more space than fits in memory, DFSORT uses these datasets to hold intermediate data; a merge or copy does not need work datasets. If you do not allocate them, DFSORT can often dynamically allocate its own work datasets (when allowed by options such as DYNALLOC and by your installation). So they are optional in the sense that the step can succeed without them, but if dynamic allocation is disabled or fails, you must provide SORTWKnn.
When to use them: Use explicit SORTWKnn when your shop requires work datasets on specific volumes or with specific space, or when dynamic allocation is not desired. You allocate them as temporary (e.g. DISP=(NEW,DELETE,DELETE)) or as reusable work datasets. How many of the allocated work datasets DFSORT actually uses depends on the data volume and product behavior.
Effect of omitting them: If DFSORT can dynamically allocate work, the step runs without them. If it cannot (e.g. option or environment forbids it), the step may fail with a message about insufficient work space or missing work DD.
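A minimal sketch of explicit work datasets, assuming the unit name SYSDA is defined at your installation. The space values are placeholders for illustration, not sizing advice, and the dataset names are hypothetical.

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//* Three temporary work datasets; with no DSN or DISP coded they
//* default to DISP=(NEW,DELETE) and are deleted when the step ends
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(50,10))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(50,10))
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(50,10))
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

To request dynamic allocation instead, you would drop the three SORTWKnn DDs and add a statement such as OPTION DYNALLOC=(SYSDA,3) to SYSIN, provided your installation permits it.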
When you use the OUTFIL control statement with FNAMES=, you tell DFSORT to write an additional output stream to a different dataset. The value of FNAMES= is a DD name (e.g. OUTFIL FNAMES=REPORT). You must define that DD name in your JCL and allocate a dataset for it (e.g. //REPORT DD ...). So the required DDs are still SYSOUT, SORTIN or SORTIN01/02/..., SORTOUT, and SYSIN; any name you use in OUTFIL FNAMES= becomes an additional required DD for that job.
If you have OUTFIL FNAMES=(OUT1,OUT2), you need two DDs in JCL: one named OUT1 and one named OUT2. If you omit one, DFSORT will fail when it tries to open that output.
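The pairing between FNAMES= values and DD names can be sketched as follows. The dataset names and the selection test on column 1 are hypothetical, and SORTOUT is kept as in the other examples on this page. OUT1 and OUT2 carry no DCB on the assumption that DFSORT derives their attributes from the input.

```jcl
//SPLITSTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//* One DD for each name used in OUTFIL FNAMES=
//OUT1     DD DSN=MY.OUT1,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2))
//OUT2     DD DSN=MY.OUT2,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2))
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
* Records with 'A' in column 1 go to OUT1; the rest go to OUT2
  OUTFIL FNAMES=OUT1,INCLUDE=(1,1,CH,EQ,C'A')
  OUTFIL FNAMES=OUT2,SAVE
/*
```

Renaming OUT1 in the OUTFIL statement without renaming the matching DD would leave DFSORT with an output it cannot open.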
The following table summarizes each DD name, when it is used, and whether it is required.
| DD Name | Purpose | Required? |
|---|---|---|
| SYSOUT | Messages, statistics, errors | Yes |
| SORTIN | Input for SORT step | Yes (SORT) |
| SORTIN01, SORTIN02, … | Inputs for MERGE step | Yes (MERGE) |
| SORTOUT | Primary output | Yes |
| SYSIN | Control statements | Yes |
| SORTWK01, SORTWK02, … | Work datasets | Optional |
| Names in OUTFIL FNAMES= | Extra output datasets | Yes if used in SYSIN |
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
Here all four required DDs are present: SYSOUT for messages, SORTIN for input, SORTOUT for output, SYSIN for the single control statement. No SORTWKnn are used; DFSORT can use dynamic allocation for work if needed.
//MERGESTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.SORTED1,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,20,CH,A)
/*
For MERGE, SORTIN is not used. SORTIN01 and SORTIN02 are the two input streams. SORTOUT and SYSIN are still required. The MERGE FIELDS control statement tells DFSORT how the inputs are sorted and how to merge them.
Think of DFSORT as a worker who needs four labeled boxes. One box is "SYSOUT"—the worker puts a note in there saying "I'm done, here's how many cards I sorted." One box is "SORTIN"—that's where the messy pile of cards (your input) sits. One box is "SORTOUT"—that's where the worker puts the neat, sorted pile. And one box is "SYSIN"—that's where you put the instruction slip that says "sort by the first 20 letters." If any of those four boxes is missing, the worker doesn't know what to do or where to put things. The "required DD statements" are just the list of those box names (SYSOUT, SORTIN, SORTOUT, SYSIN) that you must give DFSORT in your JCL. For merging two piles, you give two input boxes (SORTIN01 and SORTIN02) instead of one SORTIN.
1. Which DD is required for both SORT and MERGE steps?
2. What is the purpose of the SYSOUT DD?
3. When are SORTIN01, SORTIN02, ... used instead of SORTIN?
4. Are SORTWK01, SORTWK02, ... required?
5. What happens if you omit the SYSIN DD?