What is SORTIN in DFSORT?

SORTIN is the DD name for the input dataset in a DFSORT SORT step. Your JCL allocates a dataset to SORTIN (e.g. DSN=MY.INPUT,DISP=SHR). DFSORT reads all records from that dataset, optionally applies INREC and INCLUDE/OMIT, then sorts or copies and writes to SORTOUT.

What is SORTOUT in DFSORT?

SORTOUT is the DD name for the primary output dataset. DFSORT writes the sorted, merged, or copied records (after any OUTREC processing) to the dataset allocated to SORTOUT. You must allocate it with sufficient space and the correct DCB (record format and length) to match what DFSORT will write.

Can SORTIN and SORTOUT be the same file?

No. SORTIN is read from and SORTOUT is written to, so they must be different datasets. Pointing both DD names at the same dataset risks overwriting records before they have been read, which can lose data.

How do I know what LRECL to use for SORTOUT?

The SORTOUT LRECL must match the length of the records that DFSORT writes. If you do not use OUTREC (or OUTFIL reformatting), the output record length is the same as the input (after INREC if used). If you use OUTREC to build a different layout, the output length is the sum of the lengths of the fields you specify in OUTREC (or the built record length).

When does DFSORT use SORTIN01 and SORTIN02 instead of SORTIN?

When you run a MERGE (not a SORT). MERGE combines two or more pre-sorted inputs. Each input is allocated to SORTIN01, SORTIN02, SORTIN03, etc. SORTIN is used only for SORT and COPY steps.

SORTIN / SORTOUT Explained

SORTIN and SORTOUT are the two DD names that define where your data comes from and where it goes in a DFSORT step. SORTIN is the input dataset; SORTOUT is the output dataset. This page explains what each one is, how to allocate them, how record length and format relate to INREC and OUTREC, and how MERGE changes the picture (SORTIN01, SORTIN02, … instead of SORTIN).

What Is SORTIN?

SORTIN is the DD name that points to the input dataset when you run a SORT step (or a single-input copy with OPTION COPY). DFSORT reads every record from the dataset allocated to SORTIN. The order in which records appear in SORTIN does not matter for a SORT—DFSORT will reorder them according to SORT FIELDS. For OPTION COPY, records are written to SORTOUT in the same order they were read (after INCLUDE/OMIT and INREC if specified).
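
As a rough sketch of the copy case, the control statements below copy only the records that pass an INCLUDE test and keep them in SORTIN order. The 2-byte code at position 1 and the value 'NY' are assumptions made for illustration, not part of any real layout.

```jcl
//SYSIN    DD *
* Copy instead of sort: output order is the same as SORTIN order
  OPTION COPY
* Hypothetical filter: keep records whose first two bytes are 'NY'
  INCLUDE COND=(1,2,CH,EQ,C'NY')
/*
```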

The dataset you allocate to SORTIN must already exist before the step runs. You typically use DISP=SHR so your job can read it while other jobs may also access it, or DISP=OLD if you need exclusive access. The record format (RECFM) and record length (LRECL) are taken from the dataset label (or from the DCB you specify if you override). DFSORT uses that information to read each record correctly. For fixed-length 80-byte records, the input dataset has RECFM=FB and LRECL=80; for variable-length records, the dataset has RECFM=VB and each record begins with a 4-byte RDW followed by the data. DFSORT handles both.

Important: SORTIN is used only for a single-input SORT (or copy). When you do a MERGE, you do not use SORTIN; you use SORTIN01, SORTIN02, and so on. So "SORTIN" means "the one input for a SORT step."

What Is SORTOUT?

SORTOUT is the DD name that points to the primary output dataset. After DFSORT has read the input, applied INREC and INCLUDE/OMIT, sorted or merged the records (optionally summarizing them with SUM), and applied OUTREC, it writes the resulting records to the dataset allocated to SORTOUT. There is exactly one primary output stream, and that stream goes to SORTOUT.

Unlike SORTIN, the SORTOUT dataset is usually created by the step. You allocate it with DISP=(NEW,CATLG,DELETE) (or similar), SPACE= to reserve space, and DCB= to define the record format and length. The DCB must match what DFSORT will write. If you do not use OUTREC, the output records have the same layout (and length) as the records that went into the sort phase—which may be the same as input or the result of INREC. If you use OUTREC, the output record layout and length are determined by the OUTREC specification. So you must set SORTOUT's LRECL (and RECFM) to match the actual output record length. If you get it wrong, you can get truncated or padded records, or allocation/abend issues.

SORTIN and SORTOUT must be different datasets. DFSORT reads from one and writes to the other; using the same dataset for both would be invalid.

Relationship Between SORTIN and SORTOUT

Data flows from SORTIN to SORTOUT. In between, DFSORT may change the record (INREC before sort, OUTREC after sort). So the record length and format on SORTIN can differ from the record length and format on SORTOUT. You are responsible for making sure the SORTOUT DCB matches the final output.

No INREC, no OUTREC: Output records are identical to input records. SORTOUT should have the same RECFM and LRECL as the input dataset (or as SORTIN's DCB).
INREC only: The record that gets sorted is the reformatted record (e.g. shorter). The output of the sort phase is that reformatted record. If you do not use OUTREC, that is what is written to SORTOUT. So SORTOUT's LRECL must match the length of the record produced by INREC, not the original input length.
OUTREC only: The sort phase sees the full input record (or the INREC result). After the sort, OUTREC builds a new record (possibly different length). SORTOUT's LRECL must match the length of the record built by OUTREC.
INREC and OUTREC: The sort phase sees the INREC record; the output phase builds the OUTREC record. SORTOUT's LRECL must match the OUTREC output length.
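
The INREC-only case can be sketched as follows, assuming an 80-byte fixed-length input of which only the first 40 bytes are needed. The dataset name and the sort key are illustrative; BLKSIZE is left out so the system can choose one.

```jcl
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=40)
//SYSIN    DD *
* INREC shortens each record to 40 bytes before the sort phase
  INREC FIELDS=(1,40)
* Key positions refer to the reformatted 40-byte record
  SORT FIELDS=(1,10,CH,A)
/*
```

Because there is no OUTREC, the 40-byte record produced by INREC is what reaches SORTOUT, so LRECL=40 rather than 80.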

If you are unsure of the output length, you can compute it from your OUTREC FIELDS (or INREC if no OUTREC): add the lengths of each field you output. For example, OUTREC FIELDS=(1,20,40,30) outputs 20 + 30 = 50 bytes, so SORTOUT should have LRECL=50 (and RECFM=FB for fixed).
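
A minimal sketch of that OUTREC example with a matching SORTOUT allocation might look like this. The dataset name and the sort key are assumptions; BLKSIZE is left out so the system can choose one.

```jcl
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=50)
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
* Bytes 1-20 plus bytes 40-69 of each record: 20 + 30 = 50 bytes
  OUTREC FIELDS=(1,20,40,30)
/*
```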

SORTIN and SORTOUT: Allocation Examples

SORTIN (input)

Input usually already exists, so you reference it by name and use SHR (or OLD):

//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR

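If the input is the output of a previous step in the same job, you can refer to the passed dataset by name with a disposition such as DISP=(SHR,PASS), or use a backward reference of the form DSN=*.stepname.ddname (e.g. DSN=*.STEP1.OUTPUT, where OUTPUT is the DD name used in STEP1).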
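
One way this could look is sketched below. MYPROG, the DD name OUTPUT, and the temporary dataset &&EXTRACT are hypothetical names chosen for the example. The sort step here uses DISP=(OLD,DELETE) so the temporary dataset is removed once it has been read.

```jcl
//* STEP1: a hypothetical program writes a temporary dataset and passes it on
//STEP1    EXEC PGM=MYPROG
//OUTPUT   DD DSN=&&EXTRACT,DISP=(NEW,PASS),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//* STEP2: the sort reads the passed dataset through a backward reference
//STEP2    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=*.STEP1.OUTPUT,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```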

SORTOUT (output)

Output is usually new; you create it with SPACE and DCB:

//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)

Here RECFM=FB and LRECL=80 match an 80-byte fixed-length output. If your OUTREC produces 100-byte records, use LRECL=100 and choose a BLKSIZE that is a multiple of 100 (e.g. 27900), or omit BLKSIZE and let the system pick one. For RECFM=FB the block size must be an exact multiple of LRECL. SPACE=(CYL,(5,2)) allocates 5 cylinders primary and 2 secondary; adjust for your expected record count and record length.
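
Put together, a complete single-input sort step could look like the sketch below. The step name, the SYSOUT class, and the 10-byte character key at position 1 are assumptions for illustration; the DD statements are the ones shown above.

```jcl
//SORTSTEP EXEC PGM=SORT
//* SYSOUT receives DFSORT messages
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
* Sort ascending on a 10-byte character key starting at position 1
  SORT FIELDS=(1,10,CH,A)
/*
```

No INREC or OUTREC is used, so SORTOUT keeps the input's RECFM=FB and LRECL=80.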

MERGE: SORTIN01, SORTIN02, … and SORTOUT

For a MERGE step, there is no single SORTIN. Instead you have SORTIN01, SORTIN02, and optionally more (SORTIN03, …). Each is a separate input stream; each dataset must be pre-sorted by the same key (and format) as specified in MERGE FIELDS. DFSORT merges these streams into one and writes the result to SORTOUT. So for MERGE, the relationship is: multiple inputs (SORTIN01, SORTIN02, …) → one output (SORTOUT). SORTOUT is still the primary output; allocation rules are the same (DCB must match the merged output record layout).

//SORTIN01 DD DSN=MY.SORTED.PART1,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED.PART2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)

The merged output has the same record layout as each input (unless you use OUTREC). So if each input has LRECL=80, the merged output has LRECL=80 and SORTOUT should be allocated with LRECL=80.
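
For context, the same DD statements inside a complete merge step might look like this. The step name and the 10-byte key are assumptions; both inputs are assumed to be already sorted ascending on that key.

```jcl
//MERGSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.SORTED.PART1,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED.PART2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
* The key must be the one both inputs were originally sorted on
  MERGE FIELDS=(1,10,CH,A)
/*
```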

Variable-Length Records (VB)

If your input is variable-length (RECFM=VB), each record includes a 4-byte RDW (Record Descriptor Word) at the front. The LRECL you specify for the dataset is the maximum record length (including the RDW). DFSORT reads and writes VB records correctly; it maintains the RDW. So for SORTIN with a VB dataset, you do not need to strip the RDW; DFSORT handles it. For SORTOUT, if the output is variable-length, allocate with RECFM=VB and the appropriate LRECL (max length including RDW). If you convert the records to fixed length, the output dataset would be RECFM=FB with the fixed LRECL. In DFSORT that conversion is normally done with OUTFIL's VTOF or CONVERT option; OUTREC on its own keeps the RDW, and the records stay variable-length.
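
A small sketch of a VB-to-VB sort, assuming LRECL=256 and a 10-byte key in the first data bytes. The dataset names are illustrative, and BLKSIZE is left out so the system can choose one.

```jcl
//SORTIN   DD DSN=MY.VB.INPUT,DISP=SHR
//SORTOUT  DD DSN=MY.VB.SORTED,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=VB,LRECL=256)
//SYSIN    DD *
* Positions 1-4 hold the RDW, so the first data byte is position 5
  SORT FIELDS=(5,10,CH,A)
/*
```

The point to notice is the key position: with VB records the RDW counts as part of the record, so field positions in control statements start at 5 for the first byte of data.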

Common Mistakes

SORTOUT LRECL wrong: If you use OUTREC to build a 60-byte record but leave SORTOUT as LRECL=80, the records may be padded out to 80 bytes, or the dataset may not match what downstream programs expect. Set SORTOUT LRECL to the actual output length.
Using SORTIN for MERGE: For MERGE you must use SORTIN01, SORTIN02, etc. If you allocate only SORTIN, DFSORT will not use it for MERGE and will fail or behave incorrectly.
SORTIN and SORTOUT same dataset: Never point both to the same DSN. They must be different datasets.
Insufficient SORTOUT space: If SPACE= is too small, the step can abend with an out-of-space condition. Estimate the number of output records and allocate enough primary (and secondary) space.
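
As a worked example of the space estimate, the sketch below sizes SORTOUT for one million 80-byte records. The record count and the 3390 disk geometry are assumptions; substitute your own figures.

```jcl
//* Assumptions: 1,000,000 output records, LRECL=80, BLKSIZE=27920,
//* 3390 geometry (two 27920-byte blocks per track, 15 tracks/cyl).
//*   Records per block    = 27920 / 80 = 349
//*   Records per track    = 349 * 2    = 698
//*   Records per cylinder = 698 * 15   = 10470
//*   Cylinders needed     = 1000000 / 10470 = 95.5, round up to 96
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(100,20),RLSE),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
```

The primary amount is rounded up a little from 96 cylinders, the secondary amount gives room to grow, and RLSE releases whatever is left unused when the dataset is closed.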

Explain It Like I'm Five

SORTIN is the inbox where the messy pile of papers (your data) sits. DFSORT takes the papers from that inbox one by one. SORTOUT is the outbox where DFSORT puts the papers after it has sorted them (or copied them, or merged them with papers from other inboxes). The inbox and outbox must be different boxes—you can't put the sorted papers back into the same box you took them from while you're still reading from it. And the outbox has to be the right size: if each sorted "paper" is 50 lines long, the outbox must be made to hold 50-line papers, not 80-line. That's what LRECL on SORTOUT is: making sure the outbox fits the papers DFSORT will put there.

Exercises

Your input is 80-byte fixed. You use OUTREC FIELDS=(1,30,50,20). What LRECL should SORTOUT have?
Why can SORTIN and SORTOUT not be the same dataset?
You are merging three pre-sorted files. Which DD names do you use for the three inputs? Which for the single output?
If the input is RECFM=VB, LRECL=256, and you do not use INREC or OUTREC, what RECFM and LRECL should SORTOUT have?

Quiz

Test Your Knowledge

1. What is SORTIN used for?

Control statements
The primary output dataset
The input dataset for a SORT step
Work space for the sort

2. Can SORTIN and SORTOUT point to the same dataset?

Yes, always
No; they must be different datasets
Only for OPTION COPY
Only for MERGE

3. If you use OUTREC to shorten records, what must match on SORTOUT?

The input LRECL
The output record length produced by OUTREC
SYSIN length
SYSOUT block size

4. For a MERGE step, which DD name(s) hold the input?

SORTIN only
SORTOUT
SORTIN01, SORTIN02, ...
SYSIN

5. What does DISP=SHR on SORTIN mean?

Delete the dataset after the step
Share the dataset for read access with other jobs
Create a new dataset
Write-only access