On z/OS, DFSORT is run as a batch job step. You tell the system which program to run and which datasets to use by writing JCL (Job Control Language). This page explains how to invoke DFSORT in JCL: the EXEC statement, the DD statements you need, and how control statements are supplied via SYSIN.
The EXEC statement in JCL specifies which program to execute in that step. For DFSORT, you use one of two program names: SORT or ICEMAN.
Both names run the same program. There is no difference in behavior. Your site may standardize on one name (e.g. always PGM=SORT) for consistency. If your installation uses a third-party sort (e.g. Syncsort), the same JCL might invoke that product instead, depending on the order of load libraries (JOBLIB, STEPLIB); the messages in SYSOUT (e.g. ICE for DFSORT, WER for Syncsort) tell you which product actually ran.
A typical DFSORT step might look like this. The step name (e.g. SORTSTEP) is up to you; the important part is PGM=SORT or PGM=ICEMAN.
//SORTSTEP EXEC PGM=SORT
If you use a cataloged procedure (PROC) that invokes DFSORT, the procedure will contain the EXEC and possibly default DD statements; your JCL might only override DSN or parameters. The program name is still SORT or ICEMAN inside that procedure.
DD statements (Data Definition) tell the program which datasets to use. DFSORT expects specific DD names. If a required DD is missing or misspelled, the step will fail, often with an abend or an ICE message indicating the missing DD.
For a SORT step (one input, one output, sort or copy), the required DDs are:
SORTIN: the input dataset.
SORTOUT: the output dataset.
SYSIN: the DFSORT control statements.
SYSOUT: the DFSORT messages. It is usually coded as SYSOUT=* so output goes to the job log (JES spool). Without SYSOUT, you cannot see ICE messages or abend information.
For a MERGE step, you use SORTIN01, SORTIN02, … instead of SORTIN (one DD per input stream). SORTOUT, SYSIN, and SYSOUT are still required. Optional DD names include SORTWK01, SORTWK02, … for work datasets when you want to control sortwork allocation explicitly; otherwise DFSORT can dynamically allocate work datasets.
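As a rough illustration of the MERGE naming described above, a step might look like the sketch below. The dataset names MY.INPUT.FILE1, MY.INPUT.FILE2, and MY.MERGED.DATA are placeholders, not names from a real system. The example assumes both inputs are 80-byte fixed-length files that are already sorted on bytes 1 to 20, since MERGE expects presorted inputs.

```jcl
//MERGESTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* ONE SORTINNN DD PER INPUT; BOTH ASSUMED ALREADY SORTED ON 1-20
//SORTIN01 DD DSN=MY.INPUT.FILE1,DISP=SHR
//SORTIN02 DD DSN=MY.INPUT.FILE2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,20,CH,A)
/*
```

No SORTWK DDs appear here because a merge of presorted inputs does not need work datasets.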
Below is a minimal JCL job that runs DFSORT. It sorts a fixed-length 80-byte input by the first 20 bytes (character, ascending) and writes the result to a new output dataset. Each part is explained in the following sections.
//SORTJOB  JOB (ACCT),'DFSORT EXAMPLE',CLASS=A,MSGCLASS=X
//*
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
JOB statement: Identifies the job (job name, accounting information, class, message class). Exact syntax is site-dependent.
EXEC PGM=SORT: Runs the DFSORT program.
SYSOUT DD SYSOUT=*: Sends DFSORT messages to the job log.
SORTIN: Input dataset; DISP=SHR means use an existing dataset with shared read access.
SORTOUT: New dataset; DISP=(NEW,CATLG,DELETE) creates it, catalogs it if the step ends normally, and deletes it if the step ends abnormally; SPACE and DCB define allocation and record format.
SYSIN DD *: In-stream data; the lines that follow are the content of SYSIN, up to the /* delimiter. Here that content is one control statement: sort by bytes 1–20, character, ascending.
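The same DD layout also serves a plain copy, which the required-DD list mentions alongside sorting. The sketch below is one way to code it, assuming the same placeholder input dataset and a hypothetical output name MY.COPIED.DATA; only the control statement differs from the sort example.

```jcl
//COPYSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.COPIED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  OPTION COPY
/*
```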
DFSORT gets its control statements from the dataset (or stream) allocated to the SYSIN DD. You have two main options: in-stream data (DD *) or a reference to a dataset that holds the statements.
DD * means "the following lines in the JCL are the data." So you put your control statements right after //SYSIN DD * and end them with /* on a line by itself. Everything between the //SYSIN DD * statement and the /* line is read as SYSIN. This is convenient for small, one-off jobs or examples. Advantages: no separate dataset to create; everything is in one JCL member. Disadvantages: the JCL can get long, and it is harder to reuse the same control statements in multiple jobs.
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  INCLUDE COND=(5,2,CH,EQ,C'NY')
  OUTREC FIELDS=(1,30,35,20)
/*
Instead of DD *, you can point SYSIN to a cataloged dataset (e.g. a PDS member) that contains the control statements. The dataset must have fixed-length records (RECFM=F or FB, normally LRECL=80); DFSORT reads the records and parses them as control statements. This is preferred when the same control statements are used in many jobs or when the statement set is large. Example:
//SYSIN DD DSN=MY.PROCS(SORT1),DISP=SHR
Here MY.PROCS(SORT1) is a member in a partitioned dataset. That member contains the SORT FIELDS, INCLUDE, OUTREC, etc. One member can be shared by multiple jobs.
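Put together, a step that reads its control statements from that member could look like the following sketch. It reuses the placeholder dataset names from the earlier example, and the member contents shown in the comment lines are an assumption for illustration only.

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//* MEMBER SORT1 IS ASSUMED TO CONTAIN, FOR EXAMPLE:
//*   SORT FIELDS=(1,20,CH,A)
//SYSIN    DD DSN=MY.PROCS(SORT1),DISP=SHR
```

Because the step contains no in-stream data, the control statements can change without editing the JCL itself.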
SORTIN must reference a dataset that exists and contains the records you want to sort (or copy). Common practices:
The record format (RECFM) and record length (LRECL) of the input dataset are read from the dataset label (DCB). You do not have to specify them on SORTIN unless you are overriding the label. DFSORT uses the actual record length and format when reading.
SORTOUT is where DFSORT writes the result. It is usually a new dataset, so you must specify how to create it and how much space to allocate.
If the output dataset already exists and you want to replace it, you might code DISP=OLD on SORTOUT to overwrite it, or first delete the dataset in a prior step (for example, an IEFBR14 step with DISP=(MOD,DELETE,DELETE)); overwriting existing datasets is site-dependent and often avoided in favor of creating new names (e.g. by generation or date).
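One common way to handle the delete-in-a-prior-step approach is sketched below. The step and DD names DELSTEP and DELDD are arbitrary, and UNIT=SYSDA is an assumption that may differ at your site. UNIT and SPACE are coded so that, if the dataset does not exist, MOD can allocate it briefly before DELETE removes it.

```jcl
//* DELETE THE OLD OUTPUT IF IT EXISTS (IEFBR14 ITSELF DOES NOTHING)
//DELSTEP  EXEC PGM=IEFBR14
//DELDD    DD DSN=MY.SORTED.DATA,DISP=(MOD,DELETE,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(1,1))
//*
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```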
SYSOUT DD SYSOUT=* routes DFSORT messages to the job's message output class (the class named by MSGCLASS on the JOB statement). You will see messages such as ICE000I (the heading for the listing of your control statements), ICE054I (the counts of records read and written), and others. If the step fails, the DFSORT error message, and often the reason for an abend, appears in SYSOUT. Always include SYSOUT so you can diagnose failures and confirm record counts.
A single job can have multiple steps. You might run a program that creates an unsorted file, then run DFSORT to sort it, then run another program to process the sorted file. Each step is separate: the DFSORT step only needs its own EXEC and DD statements. Input and output dataset names can be passed by symbolic parameters if you use a cataloged procedure.
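A three-step job of that shape might be sketched as follows. The program names MYEXTR and MYRPT, their DD names OUTFILE and INFILE, and UNIT=SYSDA are hypothetical and stand in for whatever your own programs and site use. Temporary datasets (&&UNSORTED, &&SORTED) are passed from step to step with DISP=(NEW,PASS).

```jcl
//MULTI    JOB (ACCT),'MULTI STEP',CLASS=A,MSGCLASS=X
//* STEP 1: HYPOTHETICAL PROGRAM THAT WRITES UNSORTED RECORDS
//EXTRACT  EXEC PGM=MYEXTR
//OUTFILE  DD DSN=&&UNSORTED,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(5,2)),
//            DCB=(RECFM=FB,LRECL=80)
//* STEP 2: DFSORT SORTS THE PASSED TEMPORARY DATASET
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&UNSORTED,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=&&SORTED,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(5,2))
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
//* STEP 3: HYPOTHETICAL PROGRAM THAT READS THE SORTED FILE
//REPORT   EXEC PGM=MYRPT
//INFILE   DD DSN=&&SORTED,DISP=(OLD,DELETE)
//SYSPRINT DD SYSOUT=*
```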
Many shops use a cataloged procedure for DFSORT (e.g. a PROC that includes EXEC PGM=SORT and default DDs). Your JCL then invokes that procedure and overrides SORTIN, SORTOUT, and SYSIN (and optionally SYSOUT) with the actual dataset names. The idea is the same: DFSORT runs in a step with the required DDs; the procedure just standardizes the JCL.
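To make the idea concrete, here is a minimal sketch that uses an in-stream procedure in place of a cataloged one so the example is self-contained. The procedure name SORTPROC and the symbolic parameters INDSN and OUTDSN are invented for illustration; a real site procedure will have its own names and defaults.

```jcl
//* PROCEDURE DEFINITION (WOULD NORMALLY LIVE IN A PROCEDURE LIBRARY)
//SORTPROC PROC INDSN=,OUTDSN=
//SORT     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&INDSN,DISP=SHR
//SORTOUT  DD DSN=&OUTDSN,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//         PEND
//*
//* CALLING JCL: SUPPLY DATASET NAMES AND ADD SYSIN TO PROC STEP SORT
//STEP1    EXEC SORTPROC,INDSN='MY.INPUT.DATA',OUTDSN='MY.SORTED.DATA'
//SORT.SYSIN DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

The SORT.SYSIN form names the procedure step (SORT) that the added DD belongs to, so the control statements stay with the calling job while the procedure stays generic.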
Imagine you have a big pile of cards (your data) and you want to put them in order. The JCL is like a set of instructions for a helper. You say: "Run the sorting program" (EXEC PGM=SORT). You point to the pile of cards: "The cards are here" (SORTIN). You point to an empty box: "Put the sorted cards here" (SORTOUT). You give a small note that says "Sort by the first 20 letters on each card" (SYSIN). And you say "Tell me when you're done and if anything went wrong" (SYSOUT). The helper (DFSORT) reads your note, takes the cards from the pile, sorts them, puts them in the box, and writes you a message. Running DFSORT in JCL is exactly that: telling the system which program to run and where the input, output, and instructions are.
1. Which program name is commonly used to run DFSORT in JCL?
2. Where do DFSORT control statements (e.g. SORT FIELDS) go?
3. What does DISP=SHR on SORTIN mean?
4. Why is SYSOUT needed in a DFSORT step?
5. What is in-stream data (DD *) used for in DFSORT JCL?