What DD statements are required for DFSORT?

For a SORT step: SYSOUT, SORTIN, SORTOUT, and SYSIN are required. For a MERGE step: SYSOUT, SORTIN01 (and SORTIN02, ... for each input), SORTOUT, and SYSIN are required. SORTWKnn work datasets are optional.

What is the difference between SORTIN and SORTIN01?

SORTIN is used for a single-input SORT step. SORTIN01, SORTIN02, etc. are used for MERGE steps where you have two or more pre-sorted input streams. You cannot mix SORTIN with SORTIN01 in the same step; use one convention based on whether you are doing SORT or MERGE.

Is SORTOUT required in DFSORT?

Yes. SORTOUT is the primary output dataset. DFSORT always writes the sorted or merged result to the dataset allocated to SORTOUT (unless you use special options that redirect output). OUTFIL can add additional output datasets with their own DD names.

What are SORTWK01, SORTWK02 used for?

SORTWK01, SORTWK02, ... are optional work (scratch) datasets used when the sort or merge needs more space than memory. DFSORT uses them to hold intermediate data. If you do not allocate them, DFSORT may dynamically allocate work datasets instead, depending on options and installation setup.

Can DFSORT have more than one output dataset?

Yes. The primary output is always SORTOUT. You can use the OUTFIL control statement with FNAMES= to write additional outputs to other DD names (e.g. OUTFIL FNAMES=REPORT). Each of those DD names must be defined in your JCL.

Required DD Statements

DFSORT expects specific DD names in your JCL. Each name has a fixed meaning: one is for input data, one for output data, one for control statements, one for messages, and optionally others for work or extra outputs. This page lists every DD statement DFSORT uses, whether it is required or optional, and what happens when you use it.

Overview: Required vs Optional

For a SORT step (single input, sort or copy), four DDs are required: SYSOUT, SORTIN, SORTOUT, and SYSIN. For a MERGE step, you do not use SORTIN; instead you use SORTIN01, SORTIN02, and so on (one per input stream). SORTOUT, SYSIN, and SYSOUT are still required. SORTWK01, SORTWK02, … are optional work datasets; if you omit them, DFSORT may dynamically allocate work. If you use OUTFIL FNAMES= to write to additional datasets, you must define those DD names in JCL as well.

SYSOUT — Message and Diagnostic Output

SYSOUT is the DD name where DFSORT writes its messages. This includes informational messages (e.g. ICE054I, which reports the input and output record counts), warnings, and error or abend text. It does not contain your data; it contains the text that tells you how the step ran.

You almost always code SYSOUT DD SYSOUT=*. The asterisk means "use the output class named by MSGCLASS on the JOB statement". The output is written to the JES spool so you can view it with the rest of the job output. If you omit the SYSOUT DD, the step may fail, and in any case you cannot see why a step failed or how many records were processed. So in practice SYSOUT is required.

Effect of omitting it: Message output has nowhere to go. The step may fail with a DD-related error, or messages may be lost. Always include SYSOUT for production and debugging.

SORTIN — Single Input (SORT Step)

SORTIN is the DD name for the input dataset when you are doing a SORT (or OPTION COPY with a single input). DFSORT reads all input records from the dataset allocated to SORTIN. The dataset can be fixed-length (FB) or variable-length (VB); DFSORT uses the dataset's DCB (record format and length) when reading.

You allocate SORTIN like any other input: DSN=your.input.dataset,DISP=SHR (or OLD if you need exclusive access). The dataset must exist before the step runs. You do not specify SORTIN when doing a MERGE; for MERGE you use SORTIN01, SORTIN02, etc. So SORTIN is required for a SORT step and must not be used for a MERGE step.

Effect of omitting it: For a SORT step, DFSORT expects to read from SORTIN. If the DD is missing, the step will fail (e.g. missing DD or open failure). For MERGE, omitting SORTIN is correct—you are supposed to use SORTIN01, SORTIN02, … instead.
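
As a minimal sketch of a single-input step, the JCL below copies one dataset to another with OPTION COPY, reading from SORTIN. The step name and the dataset names MY.INPUT and MY.COPY are hypothetical, and the space and record-length values are illustrative assumptions only.

```jcl
//COPYSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* SORTIN is the single input for a SORT or COPY step
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.COPY,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  OPTION COPY
/*
```

DISP=SHR on SORTIN lets other jobs read the input at the same time; use DISP=OLD only if the step needs exclusive access.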

SORTIN01, SORTIN02, … — Multiple Inputs (MERGE Step)

For a MERGE operation, you have two or more input streams, each already sorted by the same key. DFSORT does not use SORTIN in that case. Instead it uses SORTIN01, SORTIN02, and optionally SORTIN03 through SORTIN16 (depending on the product level). Each DD corresponds to one input stream. The control statements will include MERGE FIELDS=... and the number of inputs must match the number of SORTINnn DDs you provide.

SORTIN01 is the first input, SORTIN02 the second, and so on. All must be pre-sorted by the same key (and same format) as specified in MERGE FIELDS. DFSORT merges them in order into one sorted stream and writes to SORTOUT. You cannot mix SORTIN with SORTIN01 in the same step: use either the single-input SORT convention (SORTIN) or the multi-input MERGE convention (SORTIN01, SORTIN02, …).

Effect of omitting one: If your MERGE control statement expects three inputs but you only allocate SORTIN01 and SORTIN02, the step will fail when DFSORT tries to use the third input. If you allocate more DDs than the MERGE expects, the extra DDs are typically ignored.

SORTOUT — Primary Output Dataset

SORTOUT is the DD name for the primary output dataset. After the sort or merge (and after any OUTREC or OUTFIL processing), DFSORT writes the result to the dataset allocated to SORTOUT. There is exactly one primary output stream; that stream goes to SORTOUT.

For a new dataset you allocate SORTOUT with DISP=(NEW,CATLG,DELETE) (or similar), SPACE=, and DCB= to define record format and length. The LRECL and RECFM must match what DFSORT will write (same as input if you do not change length in OUTREC; otherwise the length and format you build in OUTREC). If you use OUTFIL with FNAMES=, you are adding extra outputs; SORTOUT still receives the main sorted/merged stream unless you use options that redirect it.

Effect of omitting it: DFSORT has nowhere to write the result. The step will fail (e.g. missing DD or allocation failure). SORTOUT is always required.
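
To illustrate how the SORTOUT record length follows from OUTREC, here is a sketch that assumes an 80-byte fixed-length input and keeps only two fields. The dataset names, field positions, and space values are hypothetical. OUTREC builds 20 + 10 = 30 bytes, so SORTOUT is allocated with LRECL=30.

```jcl
//REFORMAT EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//* OUTREC below builds 20 + 10 = 30 bytes, so LRECL=30
//SORTOUT  DD DSN=MY.SHORT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=30)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  OUTREC FIELDS=(1,20,41,10)
/*
```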

SYSIN — Control Statements

SYSIN is the DD name for the control statement input. The dataset (or in-stream data) allocated to SYSIN contains the instructions that tell DFSORT what to do: SORT FIELDS or MERGE FIELDS, INCLUDE, OMIT, INREC, OUTREC, OUTFIL, SUM, OPTION, and so on. DFSORT reads and parses SYSIN at the beginning of the step, before it reads any data from SORTIN or SORTINnn.

You can use in-stream data (DD * or DD DATA) so the control statements appear directly in the JCL, or you can point SYSIN to a cataloged dataset (e.g. a PDS member). Either way, the content must be valid DFSORT control statements. SYSIN is required; without it, DFSORT has no instructions.

Effect of omitting it: No control statements are available. The step will fail or behave unpredictably. DFSORT may abend or issue a message indicating missing or invalid SYSIN.
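
The in-stream form (DD *) appears in the examples later on this page. As a sketch of the other form, the step below reads its control statements from a library member. The library name MY.CNTL.LIB and the member name SORTCARD are hypothetical; the member is assumed to hold the same statement text you would otherwise code after DD *.

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//* Control statements come from a library member, not in-stream
//SYSIN    DD DSN=MY.CNTL.LIB(SORTCARD),DISP=SHR
```

Keeping control statements in a member lets several jobs share one set of statements, at the cost of not seeing them directly in the JCL.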

SORTWK01, SORTWK02, … — Work Datasets (Optional)

SORTWK01, SORTWK02, … (typically SORTWK01 through SORTWK32, depending on product) are optional DD names for work (scratch) datasets. When the sort or merge needs more space than can fit in memory, DFSORT uses these datasets to hold intermediate data. If you do not allocate them, DFSORT can often dynamically allocate its own work datasets (when allowed by options such as DYNALLOC and by your installation). So they are optional in the sense that the step can succeed without them—but if dynamic allocation is disabled or fails, you must provide SORTWKnn.

When to use them: Use explicit SORTWKnn when your shop requires work datasets on specific volumes or with specific space, or when dynamic allocation is not desired. You allocate them as temporary (e.g. DISP=(NEW,DELETE,DELETE)) or as reusable work datasets. How many of the allocated work datasets DFSORT actually uses depends on the data volume and product behavior.

Effect of omitting them: If DFSORT can dynamically allocate work, the step runs without them. If it cannot (e.g. option or environment forbids it), the step may fail with a message about insufficient work space or missing work DD.
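
A sketch of a step with explicit work datasets might look like the following. The unit name SYSDA, the dataset names, and all space values are assumptions chosen for illustration; use whatever your installation defines and whatever your data volume requires.

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.BIG.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.BIG.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(500,50)),DCB=(RECFM=FB,LRECL=80)
//* Explicit temporary work datasets: no DSN, gone after the step
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

With no DSN and no DISP, each SORTWKnn DD defaults to a new temporary dataset that is deleted when the step ends.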

OUTFIL and Additional Output DD Names

When you use the OUTFIL control statement with FNAMES=, you tell DFSORT to write an additional output stream to a different dataset. The value of FNAMES= is a DD name (e.g. OUTFIL FNAMES=REPORT). You must define that DD name in your JCL and allocate a dataset for it (e.g. //REPORT DD ...). So the required DDs are still SYSOUT, SORTIN or SORTIN01/02/..., SORTOUT, and SYSIN; any name you use in OUTFIL FNAMES= becomes an additional required DD for that job.

If you have OUTFIL FNAMES=(OUT1,OUT2), you need two DDs in JCL: one named OUT1 and one named OUT2. If you omit one, DFSORT will fail when it tries to open that output.
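
The two-output case can be sketched as follows. The step name and all dataset names are hypothetical, and the space and DCB values are illustrative. Each name listed in FNAMES= has a matching DD, and SORTOUT is still coded for the primary output, as described above.

```jcl
//TWOOUT   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//* One DD for each name listed in OUTFIL FNAMES=
//OUT1     DD DSN=MY.OUT1,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//OUT2     DD DSN=MY.OUT2,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  OUTFIL FNAMES=(OUT1,OUT2)
/*
```

Because both names appear on one OUTFIL statement, OUT1 and OUT2 receive the same records. To send different records to each, you would code separate OUTFIL statements, each with its own FNAMES= and selection parameters.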

Quick Reference Table

The following table summarizes each DD name, when it is used, and whether it is required.

DD Name	Purpose	Required?
SYSOUT	Messages, statistics, errors	Yes
SORTIN	Input for SORT step	Yes (SORT)
SORTIN01, SORTIN02, …	Inputs for MERGE step	Yes (MERGE)
SORTOUT	Primary output	Yes
SYSIN	Control statements	Yes
SORTWK01, SORTWK02, …	Work datasets	Optional
Names in OUTFIL FNAMES=	Extra output datasets	Yes if used in SYSIN

Example: SORT Step with All Required DDs

//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*

Here all four required DDs are present: SYSOUT for messages, SORTIN for input, SORTOUT for output, SYSIN for the single control statement. No SORTWKnn are used; DFSORT can use dynamic allocation for work if needed.

Example: MERGE Step with SORTIN01 and SORTIN02

//MERGESTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.SORTED1,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,20,CH,A)
/*

For MERGE, SORTIN is not used. SORTIN01 and SORTIN02 are the two input streams. SORTOUT and SYSIN are still required. The MERGE FIELDS control statement tells DFSORT how the inputs are sorted and how to merge them.

Explain It Like I'm Five

Think of DFSORT as a worker who needs four labeled boxes. One box is "SYSOUT"—the worker puts a note in there saying "I'm done, here's how many cards I sorted." One box is "SORTIN"—that's where the messy pile of cards (your input) sits. One box is "SORTOUT"—that's where the worker puts the neat, sorted pile. And one box is "SYSIN"—that's where you put the instruction slip that says "sort by the first 20 letters." If any of those four boxes is missing, the worker doesn't know what to do or where to put things. The "required DD statements" are just the list of those box names (SYSOUT, SORTIN, SORTOUT, SYSIN) that you must give DFSORT in your JCL. For merging two piles, you give two input boxes (SORTIN01 and SORTIN02) instead of one SORTIN.

Exercises

Write the four required DD names for a SORT step and state what each is used for.
You want to MERGE three pre-sorted files. Which DD names do you need for input? Is SORTIN used?
Why is SYSOUT important even though it does not hold your data?
If your SYSIN contains OUTFIL FNAMES=REPORT, what must you add to your JCL?

Quiz

Test Your Knowledge

1. Which DDs are required for both SORT and MERGE steps?

SORTIN only
SORTOUT, SYSIN, and SYSOUT
SORTWK01 only
SORTIN01 only

2. What is the purpose of the SYSOUT DD?

It holds the sorted output data
It receives DFSORT messages and statistics
It supplies control statements
It is the input dataset

3. When are SORTIN01, SORTIN02, ... used instead of SORTIN?

When doing a SORT
When doing a MERGE of multiple pre-sorted inputs
When using OPTION COPY
When SORTIN is full

4. Are SORTWK01, SORTWK02, ... required?

Yes, always
No; DFSORT can dynamically allocate work datasets
Only for MERGE
Only when using INCLUDE

5. What happens if you omit the SYSIN DD?

DFSORT uses default sort by position 1
The step will fail; DFSORT needs control statements
SORTOUT is used as SYSIN
DFSORT prompts for input