MainframeMaster

Running DFSORT in JCL

On z/OS, DFSORT is run as a batch job step. You tell the system which program to run and which datasets to use by writing JCL (Job Control Language). This page explains how to invoke DFSORT in JCL: the EXEC statement, the DD statements you need, and how control statements are supplied via SYSIN.

The EXEC Statement: Which Program Runs

The EXEC statement in JCL specifies which program to execute in that step. For DFSORT, you use one of two program names:

  • PGM=SORT — The most common way to invoke the IBM sort program. "SORT" is a standard alias that points to the same load module as ICEMAN.
  • PGM=ICEMAN — The other standard name for the same DFSORT program. "ICEMAN" is the historical name used in IBM documentation; many shops use "SORT" in JCL for brevity.

Both names run the same program. There is no difference in behavior. Your site may standardize on one name (e.g. always PGM=SORT) for consistency. If your installation uses a third-party sort (e.g. Syncsort), the same JCL might invoke that product instead, depending on the order of load libraries (JOBLIB, STEPLIB); the messages in SYSOUT (e.g. ICE for DFSORT, WER for Syncsort) tell you which product actually ran.

Example EXEC Statement

A typical DFSORT step might look like this. The step name (e.g. SORTSTEP) is up to you; the important part is PGM=SORT or PGM=ICEMAN.

```jcl
//SORTSTEP EXEC PGM=SORT
```

If you use a cataloged procedure (PROC) that invokes DFSORT, the procedure will contain the EXEC and possibly default DD statements; your JCL might only override DSN or parameters. The program name is still SORT or ICEMAN inside that procedure.

Required DD Statements

DD (Data Definition) statements tell the program which datasets to use. DFSORT expects specific DD names. If a required DD is missing or misspelled, the step fails: DFSORT typically writes an error message (an ICE message whose identifier ends in A) naming the problem and ends with a nonzero return code, usually 16.

For a SORT step (one input, one output, sort or copy), the DDs you normally code are:

  • SYSOUT: Where DFSORT writes messages, diagnostics, and summary statistics. Usually coded as SYSOUT=* so the messages go to the JES spool with the rest of the job output. Without SYSOUT, you lose the ICE messages you need to diagnose a failure.
  • SORTIN: The input dataset. DFSORT reads records from the dataset allocated to SORTIN. It must be present for a sort or copy unless an E15 user exit supplies all the input records.
  • SORTOUT: The output dataset. DFSORT writes the sorted (or copied) records here. Allocate it with sufficient space. You can code the DCB attributes (RECFM, LRECL, BLKSIZE) explicitly; any you leave out DFSORT can usually derive from the input and from your reformatting statements.
  • SYSIN: The source of control statements. DFSORT reads SORT, MERGE, OPTION, INCLUDE, OMIT, INREC, OUTREC, OUTFIL, SUM, and other statements from SYSIN. Without control statements, DFSORT has no instructions, and SYSIN is the usual place to supply them in a batch step.

For a MERGE step, you use SORTIN01, SORTIN02, … instead of SORTIN (one DD per input stream). SORTOUT, SYSIN, and SYSOUT are still required. Optional DD names include SORTWK01, SORTWK02, … for work datasets when you want to control sortwork allocation explicitly; otherwise DFSORT can dynamically allocate work.
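As a rough sketch of the MERGE layout described above, the step below merges two inputs on bytes 1 to 20. The dataset names are placeholders, and the example assumes both inputs are already sorted on the merge key and share the same fixed 80-byte record layout.

```jcl
//MERGSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* One SORTINnn DD per input; each input must already be in key order
//SORTIN01 DD DSN=MY.SORTED.FILE1,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED.FILE2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,20,CH,A)
/*
```

No SORTWK DDs appear here because a merge reads the inputs in order and does not need sort work space.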

Minimal Working Example

Below is a minimal JCL job that runs DFSORT. It sorts a fixed-length 80-byte input by the first 20 bytes (character, ascending) and writes the result to a new output dataset. Each part is explained in the following sections.

```jcl
//SORTJOB  JOB (ACCT),'DFSORT EXAMPLE',CLASS=A,MSGCLASS=X
//*
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

  • JOB statement: Identifies the job (job name, accounting information, programmer name, class, message class). Exact syntax is site-dependent.
  • EXEC PGM=SORT: Runs the DFSORT program.
  • SYSOUT DD SYSOUT=*: Sends DFSORT messages to the job output.
  • SORTIN: Input dataset; DISP=SHR means use an existing dataset with shared access.
  • SORTOUT: New dataset; DISP=(NEW,CATLG,DELETE) creates it, catalogs it on normal completion, and deletes it if the step abends. SPACE and DCB define allocation and record format.
  • SYSIN DD *: In-stream data; the lines that follow, up to /*, are the content of SYSIN. Here that content is one control statement: sort by bytes 1–20, character, ascending.
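The same step can copy records instead of sorting them by changing only the SYSIN content. The sketch below assumes, purely for illustration, that bytes 5 and 6 of each record hold a two-character state code; OPTION COPY asks DFSORT to copy rather than sort, and INCLUDE keeps only the matching records.

```jcl
//SYSIN    DD *
  OPTION COPY
  INCLUDE COND=(5,2,CH,EQ,C'NY')
/*
```

The EXEC, SYSOUT, SORTIN, and SORTOUT statements stay exactly as in the sort job.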

SYSIN: How Control Statements Are Supplied

DFSORT gets its control statements from the dataset (or stream) allocated to the SYSIN DD. You have two main options:

In-stream data (SYSIN DD *)

DD * means "the following lines in the JCL are the data." You put your control statements right after //SYSIN DD * and end them with /* on a line by itself; everything between those two lines is read as SYSIN. Control statements must not start in column 1: leave at least one blank before the operation (SORT, INCLUDE, and so on) and stay within columns 2 through 71. This is convenient for small, one-off jobs or examples. Advantages: no separate dataset to create; everything is in one JCL member. Disadvantages: the JCL can get long, and it is harder to reuse the same control statements in multiple jobs.

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  INCLUDE COND=(5,2,CH,EQ,C'NY')
  OUTREC FIELDS=(1,30,35,20)
/*
```
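In-stream control statements can also carry comments and span more than one line. As a small sketch: a line with an asterisk in column 1 is treated as a comment, and a statement line that ends with a comma continues on the next line. The second sort key (bytes 21 to 25, descending) is a made-up field used only to show the continuation.

```jcl
//SYSIN    DD *
* Sort on bytes 1-20 ascending, then bytes 21-25 descending
  SORT FIELDS=(1,20,CH,A,
               21,5,CH,D)
/*
```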

SYSIN as a dataset

Instead of DD *, you can point SYSIN to a cataloged dataset (e.g. a PDS member) that contains the control statements. The dataset should hold fixed-length 80-byte records (RECFM=F or FB, LRECL=80), the same card-image layout as in-stream data; DFSORT reads the records and parses them as control statements. This is preferred when the same control statements are used in many jobs or when the statement set is large. Example:

```jcl
//SYSIN DD DSN=MY.PROCS(SORT1),DISP=SHR
```

Here MY.PROCS(SORT1) is a member in a partitioned dataset. That member contains the SORT FIELDS, INCLUDE, OUTREC, etc. One member can be shared by multiple jobs.

Allocating the Input Dataset (SORTIN)

SORTIN must reference a dataset that exists and contains the records you want to sort (or copy). Common practices:

  • DISP=SHR: Use when the dataset already exists and you only need to read it. SHR allows other jobs to read (or in some cases update) the dataset at the same time. This is the usual choice for input.
  • DISP=OLD: Exclusive access. No other job can use the dataset while your step runs. Use when you need exclusive control or when your site requires it for certain datasets.
  • DSN: The dataset name. In batch JCL the name is used exactly as written, so code it fully qualified (e.g. MY.INPUT.DATA); unlike TSO, no user prefix is added. A temporary dataset passed from an earlier step uses a name of the form &&name.

The record format (RECFM) and record length (LRECL) of an existing input dataset are taken from the dataset label (for disk datasets, the DSCB in the VTOC). You do not have to specify them on SORTIN unless you are overriding the label. DFSORT uses the actual record length and format when reading.

Allocating the Output Dataset (SORTOUT)

SORTOUT is where DFSORT writes the result. It is usually a new dataset, so you must specify how to create it and how much space to allocate.

  • DISP=(NEW,CATLG,DELETE): NEW creates the dataset; CATLG catalogs it on normal completion so it can be found by name later; DELETE removes it if the step abends (so you do not leave an incomplete dataset behind). Some shops use (NEW,CATLG,KEEP) to keep the dataset even on abend.
  • SPACE=(CYL,(5,2)): Allocate 5 cylinders primary and 2 cylinders for each secondary extent. When the primary is full, the system allocates secondary extents. Exact numbers depend on record count and record length; under-allocation can cause out-of-space abends such as B37, D37, or E37.
  • DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920): Defines the output record format. FB = fixed blocked; LRECL=80 = 80 bytes per record; BLKSIZE=27920 is a common block size for 80-byte records (349 records per block). The output DCB should match what DFSORT will write: the same LRECL as the input if you are not reformatting, or the new length if you use OUTREC to change it.
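To illustrate how the output LRECL follows from reformatting, the sketch below reuses this page's OUTREC statement, which keeps bytes 1 to 30 and bytes 35 to 54 of each 80-byte input record, so each output record is 30 + 20 = 50 bytes. The dataset names are placeholders, and BLKSIZE is deliberately left off on the assumption that your system will choose a suitable block size.

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//* LRECL=50 matches the 30 + 20 bytes built by OUTREC below
//SORTOUT  DD DSN=MY.SHORT.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=50)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  OUTREC FIELDS=(1,30,35,20)
/*
```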

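If the output dataset already exists and you want to replace it, you can code DISP=OLD on SORTOUT to overwrite it in place, or delete it in a prior step and create it again with DISP=(NEW,CATLG,DELETE). Do not code DISP=(MOD,DELETE,DELETE) on SORTOUT itself: that disposition deletes the dataset when the step ends, which is why it belongs on a separate delete step. Overwriting existing datasets is site-dependent and often avoided in favor of creating new names (e.g. by generation or date).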
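One common way to delete the old output in a prior step is an IEFBR14 step placed ahead of the sort. The sketch below is an assumption-laden example: UNIT=SYSDA is a typical but site-specific unit name, and the DD name OLDOUT is arbitrary. MOD lets the step succeed whether or not the dataset already exists; the DELETE dispositions then remove it when the step ends.

```jcl
//DELSTEP  EXEC PGM=IEFBR14
//* Deletes MY.SORTED.DATA if present; allocates and deletes it if not
//OLDOUT   DD DSN=MY.SORTED.DATA,DISP=(MOD,DELETE,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,0)
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```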

SYSOUT and Messages

SYSOUT DD SYSOUT=* routes DFSORT messages to the JES spool in the output class named by MSGCLASS on the JOB statement. Typical messages include ICE000I (which heads the listing of your control statements), ICE054I (records in and out), and ICE052I (end of DFSORT). When DFSORT detects an error, it writes an ICE message whose identifier ends in A to SYSOUT and normally ends the step with return code 16; system abend codes appear in the JES job log. Always include SYSOUT so you can diagnose failures and confirm record counts.

Multiple Steps and Procedures

A single job can have multiple steps. You might run a program that creates an unsorted file, then run DFSORT to sort it, then run another program to process the sorted file. Each step is separate: the DFSORT step only needs its own EXEC and DD statements. Input and output dataset names can be passed by symbolic parameters if you use a cataloged procedure.
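A three-step flow of this kind might be sketched as follows. MYEXTR and MYRPT are hypothetical application programs, their DD names (OUTFILE, INFILE) are invented for the example, and UNIT=SYSDA is a site-specific assumption. The unsorted data is passed between steps as a temporary dataset (&&UNSORTED) and deleted once DFSORT has read it.

```jcl
//EXTRACT  EXEC PGM=MYEXTR
//* Hypothetical program writes the unsorted records to a temp dataset
//OUTFILE  DD DSN=&&UNSORTED,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(5,2)),
//            DCB=(RECFM=FB,LRECL=80)
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&UNSORTED,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
//REPORT   EXEC PGM=MYRPT
//* Hypothetical program reads the sorted file
//INFILE   DD DSN=MY.SORTED.DATA,DISP=SHR
```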

Many shops use a cataloged procedure for DFSORT (e.g. a PROC that includes EXEC PGM=SORT and default DDs). Your JCL then invokes that procedure and overrides SORTIN, SORTOUT, and SYSIN (and optionally SYSOUT) with the actual dataset names. The idea is the same: DFSORT runs in a step with the required DDs; the procedure just standardizes the JCL.
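The procedure idea can be tried without a procedure library by using an in-stream procedure. In the sketch below, the procedure name SORTPRC, the step name SORT inside it, and the symbolic parameters INDSN and OUTDSN are all invented for illustration; a real site procedure will have its own names and defaults. SYSIN is added from the calling JCL with the stepname.ddname form.

```jcl
//SORTJOB  JOB (ACCT),'DFSORT PROC',CLASS=A,MSGCLASS=X
//SORTPRC  PROC INDSN=,OUTDSN=
//SORT     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&INDSN,DISP=SHR
//SORTOUT  DD DSN=&OUTDSN,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80)
//         PEND
//*
//RUNSORT  EXEC SORTPRC,INDSN=MY.INPUT.DATA,OUTDSN=MY.SORTED.DATA
//SORT.SYSIN DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

A cataloged procedure works the same way, except that the statements between PROC and PEND live in a member of a procedure library rather than in the job itself.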

Explain It Like I'm Five

Imagine you have a big pile of cards (your data) and you want to put them in order. The JCL is like a set of instructions for a helper. You say: "Run the sorting program" (EXEC PGM=SORT). You point to the pile of cards: "The cards are here" (SORTIN). You point to an empty box: "Put the sorted cards here" (SORTOUT). You give a small note that says "Sort by the first 20 letters on each card" (SYSIN). And you say "Tell me when you're done and if anything went wrong" (SYSOUT). The helper (DFSORT) reads your note, takes the cards from the pile, sorts them, puts them in the box, and writes you a message. Running DFSORT in JCL is exactly that: telling the system which program to run and where the input, output, and instructions are.

Exercises

  1. Write a one-step JCL that runs DFSORT with in-stream SYSIN to sort by positions 10–15 (character, ascending). Use placeholder dataset names for SORTIN and SORTOUT.
  2. What happens if you misspell SORTIN as SORTINN in your JCL? Why is SYSOUT important when debugging?
  3. Explain the difference between supplying control statements with DD * versus a dataset name for SYSIN. When might you prefer each?
  4. If your output has LRECL=100 (e.g. after OUTREC), what must you specify for SORTOUT DCB?

Quiz

Test Your Knowledge

1. Which program name is commonly used to run DFSORT in JCL?

  • PGM=DFSORT
  • PGM=SORT or PGM=ICEMAN
  • PGM=SYSIN
  • PGM=SORTOUT

2. Where do DFSORT control statements (e.g. SORT FIELDS) go?

  • In the JOB statement
  • In the dataset pointed to by the SYSIN DD
  • In SORTIN
  • In SYSOUT

3. What does DISP=SHR on SORTIN mean?

  • Share the dataset with other jobs in exclusive mode
  • Share read access; other jobs can read the dataset too
  • Delete the dataset after the step
  • Create a new dataset

4. Why is SYSOUT needed in a DFSORT step?

  • It holds the sorted output
  • It receives messages and diagnostics from DFSORT
  • It replaces SYSIN
  • It is optional and rarely used

5. What is in-stream data (DD *) used for in DFSORT JCL?

  • Defining the input dataset
  • Supplying control statements directly in the JCL
  • Allocating SORTOUT
  • Reserving sort work space