MainframeMaster

Required DD Statements

DFSORT expects specific DD names in your JCL. Each name has a fixed meaning: one is for input data, one for output data, one for control statements, one for messages, and optionally others for work or extra outputs. This page lists every DD statement DFSORT uses, whether it is required or optional, and what happens when you omit it.

Overview: Required vs Optional

For a SORT step (single input, sort or copy), four DDs are required: SYSOUT, SORTIN, SORTOUT, and SYSIN. For a MERGE step, you do not use SORTIN; instead you use SORTIN01, SORTIN02, and so on (one per input stream). SORTOUT, SYSIN, and SYSOUT are still required. SORTWK01, SORTWK02, … are optional work datasets; if you omit them, DFSORT may dynamically allocate work. If you use OUTFIL FNAMES= to write to additional datasets, you must define those DD names in JCL as well.

SYSOUT — Message and Diagnostic Output

SYSOUT is the DD name where DFSORT writes its messages. This includes informational messages (e.g. ICE054I with the input and output record counts), warnings, and error or abend text. It does not contain your data; it contains the text that tells you how the step ran.

You almost always code //SYSOUT DD SYSOUT=*. The asterisk tells JES to use the output class named in the MSGCLASS parameter of the JOB statement. The output is written to the JES spool so you can view it with the rest of the job output. If you omit the SYSOUT DD, DFSORT may fail, and even if it does not, you have no way to see why a step failed or how many records were processed. So in practice SYSOUT is required.

Effect of omitting it: Message output has nowhere to go. The step may fail with a DD-related error, or messages may be lost. Always include SYSOUT for production and debugging.

SORTIN — Single Input (SORT Step)

SORTIN is the DD name for the input dataset when you are doing a SORT (or OPTION COPY with a single input). DFSORT reads all input records from the dataset allocated to SORTIN. The dataset can be fixed-length (FB) or variable-length (VB); DFSORT uses the dataset's DCB (record format and length) when reading.

You allocate SORTIN like any other input: DSN=your.input.dataset,DISP=SHR (or OLD if you need exclusive access). The dataset must exist before the step runs. You do not specify SORTIN when doing a MERGE; for MERGE you use SORTIN01, SORTIN02, etc. So SORTIN is required for a SORT step and must not be used for a MERGE step.

Effect of omitting it: For a SORT step, DFSORT expects to read from SORTIN. If the DD is missing, the step will fail (e.g. missing DD or open failure). For MERGE, omitting SORTIN is correct—you are supposed to use SORTIN01, SORTIN02, … instead.

SORTIN01, SORTIN02, … — Multiple Inputs (MERGE Step)

For a MERGE operation, you have two or more input streams, each already sorted by the same key. DFSORT does not use SORTIN in that case. Instead it uses SORTIN01, SORTIN02, and optionally further SORTINnn DDs; the upper limit depends on the product level (older releases stop at SORTIN16, while current z/OS DFSORT documentation allows nn up to 99). Each DD corresponds to one input stream. The control statements will include MERGE FIELDS=..., and you must provide one SORTINnn DD for each input stream you intend to merge.

SORTIN01 is the first input, SORTIN02 the second, and so on. All must be pre-sorted by the same key (and same format) as specified in MERGE FIELDS. DFSORT merges them in order into one sorted stream and writes to SORTOUT. You cannot mix SORTIN with SORTIN01 in the same step: use either the single-input SORT convention (SORTIN) or the multi-input MERGE convention (SORTIN01, SORTIN02, …).

Effect of omitting one: If your MERGE control statement expects three inputs but you only allocate SORTIN01 and SORTIN02, the step will fail when DFSORT tries to use the third input. If you allocate more DDs than the MERGE expects, the extra DDs are typically ignored.

SORTOUT — Primary Output Dataset

SORTOUT is the DD name for the primary output dataset. After the sort or merge (and after any OUTREC or OUTFIL processing), DFSORT writes the result to the dataset allocated to SORTOUT. There is exactly one primary output stream; that stream goes to SORTOUT.

For a new dataset you allocate SORTOUT with DISP=(NEW,CATLG,DELETE) (or similar), SPACE=, and DCB= to define record format and length. The LRECL and RECFM must match what DFSORT will write (same as input if you do not change length in OUTREC; otherwise the length and format you build in OUTREC). If you use OUTFIL with FNAMES=, you are adding extra outputs; SORTOUT still receives the main sorted/merged stream unless you use options that redirect it.
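As a sketch of the length rule above, assume an 80-byte FB input from which OUTREC keeps only the first 40 bytes. The step name, dataset names, and space values are placeholders chosen for illustration; the point is only that LRECL on SORTOUT follows the record that OUTREC builds, not the input record.

```jcl
//SHORTEN  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Input is assumed to be RECFM=FB,LRECL=80
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//* Output records are 40 bytes because OUTREC builds 40-byte records
//SORTOUT  DD DSN=MY.OUTPUT.SHORT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=40)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  OUTREC BUILD=(1,40)
/*
```

BLKSIZE is left off here so the system can pick a block size; code it explicitly if your shop requires one.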

Effect of omitting it: With no OUTFIL statement in SYSIN, DFSORT has nowhere to write the result, and the step will fail (e.g. missing DD or allocation failure). Treat SORTOUT as required for a plain SORT, MERGE, or COPY step.

SYSIN — Control Statements

SYSIN is the DD name for the control statement input. The dataset (or in-stream data) allocated to SYSIN contains the instructions that tell DFSORT what to do: SORT FIELDS or MERGE FIELDS, INCLUDE, OMIT, INREC, OUTREC, OUTFIL, SUM, OPTION, and so on. DFSORT reads and parses SYSIN at the beginning of the step, before it reads any data from SORTIN or SORTINnn.

You can use in-stream data (DD * or DD DATA) so the control statements appear directly in the JCL, or you can point SYSIN to a cataloged dataset (e.g. a PDS member). Either way, the content must be valid DFSORT control statements. SYSIN is required; without it, DFSORT has no instructions.
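The dataset form of SYSIN might look like the sketch below. The library name MY.CNTL.LIB and the member name SORTBYID are hypothetical; the member is assumed to hold the same kind of statements you would otherwise code in-stream (for example SORT FIELDS=(1,20,CH,A)).

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//* Control statements are read from a PDS member, not in-stream data
//SYSIN    DD DSN=MY.CNTL.LIB(SORTBYID),DISP=SHR
```

Keeping the statements in a member lets several jobs share one definition of the sort, at the cost of not seeing the statements directly in the JCL.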

Effect of omitting it: No control statements are available, so DFSORT has no SORT, MERGE, or COPY request to act on. It does not fall back to a default sort. Expect the step to end with an error message in SYSOUT and a nonzero return code.

SORTWK01, SORTWK02, … — Work Datasets (Optional)

SORTWK01, SORTWK02, … (typically SORTWK01 through SORTWK32, depending on product) are optional DD names for work (scratch) datasets. When the sort or merge needs more space than can fit in memory, DFSORT uses these datasets to hold intermediate data. If you do not allocate them, DFSORT can often dynamically allocate its own work datasets (when allowed by options such as DYNALLOC and by your installation). So they are optional in the sense that the step can succeed without them—but if dynamic allocation is disabled or fails, you must provide SORTWKnn.

When to use them: Use explicit SORTWKnn when your shop requires work datasets on specific volumes or with specific space or when dynamic allocation is not desired. You allocate them as temporary (e.g. DISP=(NEW,DELETE,DELETE)) or as reusable work datasets. The exact number (01, 02, …) and how many DFSORT uses depends on the data volume and product behavior.

Effect of omitting them: If DFSORT can dynamically allocate work, the step runs without them. If it cannot (e.g. option or environment forbids it), the step may fail with a message about insufficient work space or missing work DD.
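A step with explicit work datasets could be coded roughly as follows. The unit name SYSDA and the space values are assumptions; unit names and work space sizing are installation-specific, so check your shop's standards.

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//* Explicit temporary work datasets: no DSN, so they are deleted
//* when the step ends
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

If you would rather let DFSORT allocate the work space, drop the SORTWKnn DDs and, where your installation allows it, request dynamic allocation in SYSIN with a statement such as OPTION DYNALLOC=(SYSDA,2).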

OUTFIL and Additional Output DD Names

When you use the OUTFIL control statement with FNAMES=, you tell DFSORT to write an additional output stream to a different dataset. The value of FNAMES= is a DD name (e.g. OUTFIL FNAMES=REPORT). You must define that DD name in your JCL and allocate a dataset for it (e.g. //REPORT DD ...). So the required DDs are still SYSOUT, SORTIN or SORTIN01/02/..., SORTOUT, and SYSIN; any name you use in OUTFIL FNAMES= becomes an additional required DD for that job.

If you have OUTFIL FNAMES=(OUT1,OUT2), you need two DDs in JCL: one named OUT1 and one named OUT2. If you omit one, DFSORT will fail when it tries to open that output.
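The two-name case can be sketched like this. Dataset names and space values are placeholders. SORTOUT is kept so the step matches the required-DD list on this page; each name in FNAMES= has its own DD.

```jcl
//SPLIT    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//* One DD for every name listed in OUTFIL FNAMES=
//OUT1     DD DSN=MY.OUTPUT.COPY1,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//OUT2     DD DSN=MY.OUTPUT.COPY2,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  OUTFIL FNAMES=(OUT1,OUT2)
/*
```

As written, OUT1 and OUT2 receive the same sorted records. In practice you would usually code separate OUTFIL statements with INCLUDE= or BUILD= so that each DD gets a different subset or layout.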

Quick Reference Table

The following table summarizes each DD name, when it is used, and whether it is required.

| DD Name | Purpose | Required? |
|---|---|---|
| SYSOUT | Messages, statistics, errors | Yes |
| SORTIN | Input for SORT step | Yes (SORT) |
| SORTIN01, SORTIN02, … | Inputs for MERGE step | Yes (MERGE) |
| SORTOUT | Primary output | Yes |
| SYSIN | Control statements | Yes |
| SORTWK01, SORTWK02, … | Work datasets | Optional |
| Names in OUTFIL FNAMES= | Extra output datasets | Yes if used in SYSIN |

Example: SORT Step with All Required DDs

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

Here all four required DDs are present: SYSOUT for messages, SORTIN for input, SORTOUT for output, SYSIN for the single control statement. No SORTWKnn are used; DFSORT can use dynamic allocation for work if needed.

Example: MERGE Step with SORTIN01 and SORTIN02

```jcl
//MERGESTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.SORTED1,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,20,CH,A)
/*
```

For MERGE, SORTIN is not used. SORTIN01 and SORTIN02 are the two input streams. SORTOUT and SYSIN are still required. The MERGE FIELDS control statement tells DFSORT how the inputs are sorted and how to merge them.

Explain It Like I'm Five

Think of DFSORT as a worker who needs four labeled boxes. One box is "SYSOUT"—the worker puts a note in there saying "I'm done, here's how many cards I sorted." One box is "SORTIN"—that's where the messy pile of cards (your input) sits. One box is "SORTOUT"—that's where the worker puts the neat, sorted pile. And one box is "SYSIN"—that's where you put the instruction slip that says "sort by the first 20 letters." If any of those four boxes is missing, the worker doesn't know what to do or where to put things. The "required DD statements" are just the list of those box names (SYSOUT, SORTIN, SORTOUT, SYSIN) that you must give DFSORT in your JCL. For merging two piles, you give two input boxes (SORTIN01 and SORTIN02) instead of one SORTIN.

Exercises

  1. Write the four required DD names for a SORT step and state what each is used for.
  2. You want to MERGE three pre-sorted files. Which DD names do you need for input? Is SORTIN used?
  3. Why is SYSOUT important even though it does not hold your data?
  4. If your SYSIN contains OUTFIL FNAMES=REPORT, what must you add to your JCL?

Quiz

Test Your Knowledge

1. Which DDs are required for both SORT and MERGE steps?

  • SORTIN only
  • SORTOUT, SYSIN, and SYSOUT
  • SORTWK01 only
  • SORTIN01 only

2. What is the purpose of the SYSOUT DD?

  • It holds the sorted output data
  • It receives DFSORT messages and statistics
  • It supplies control statements
  • It is the input dataset

3. When are SORTIN01, SORTIN02, ... used instead of SORTIN?

  • When doing a SORT
  • When doing a MERGE of multiple pre-sorted inputs
  • When using OPTION COPY
  • When SORTIN is full

4. Are SORTWK01, SORTWK02, ... required?

  • Yes, always
  • No; DFSORT can dynamically allocate work datasets
  • Only for MERGE
  • Only when using INCLUDE

5. What happens if you omit the SYSIN DD?

  • DFSORT uses default sort by position 1
  • The step will fail; DFSORT needs control statements
  • SORTOUT is used as SYSIN
  • DFSORT prompts for input