MainframeMaster

What is DFSORT?

DFSORT (Data Facility Sort) is IBM's high-performance sort, merge, and copy utility for z/OS. It is one of the most widely used mainframe utilities for organizing and transforming large datasets in batch jobs.

Definition of DFSORT

DFSORT is a utility program that runs as a batch step on IBM z/OS. You invoke it from JCL (Job Control Language) by specifying PGM=SORT or PGM=ICEMAN. It reads one or more input datasets, processes them according to control statements you provide (in the SYSIN stream), and writes one or more output datasets.

In simple terms: you give DFSORT an input file and a set of instructions (e.g. "sort by this field," "keep only these records," "reformat the output"). It does the work without your having to write a COBOL or other program. That makes it quick to set up and easy to change when requirements change.

What Does DFSORT Stand For?

DFSORT stands for Data Facility Sort. The name breaks down as:

  • Data Facility — It is part of IBM's "Data Facility" family of products that manage and process data on z/OS (other members include utilities for backup, migration, and data set management).
  • Sort — Its primary function is sorting: rearranging records into a specified order (e.g. by a key field, ascending or descending).

Although "Sort" is in the name, DFSORT does much more than sorting: it merges, copies, filters, reformats, summarizes, and can join files. So "Data Facility Sort" is the official name, but in practice people use DFSORT for a broad set of data operations.

Where Does DFSORT Run?

DFSORT runs on IBM z/OS (and compatible mainframe environments). It is the standard sort/merge utility supplied with z/OS. You run it as a batch job step: the job’s JCL defines the input and output datasets and the SYSIN data set (or in-stream data) that contains the DFSORT control statements.

You do not run DFSORT directly on Windows, Linux, or cloud platforms; those have their own sort tools. Mainframe applications and batch jobs on z/OS use DFSORT (or a licensed alternative such as Syncsort) when they need to sort, merge, or transform data.

How You Run DFSORT: JCL and Control Statements

To use DFSORT you need:

  • A JCL step that runs the program (e.g. EXEC PGM=SORT or EXEC PGM=ICEMAN).
  • DD statements for input and output: typically SORTIN (input), SORTOUT (output), SYSIN (control statements), and SYSOUT (messages). Merge uses SORTIN01, SORTIN02, etc.
  • Control statements in SYSIN that tell DFSORT what to do: SORT FIELDS, MERGE, INCLUDE, OMIT, INREC, OUTREC, OUTFIL, SUM, and others.

The following is a minimal example that sorts a fixed-length input by the first 20 bytes (character, ascending) and writes the same record layout to the output:

```jcl
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

SORT FIELDS=(1,20,CH,A) means: sort by a field starting at position 1, length 20, character format (CH), ascending (A). The rest of the record is carried along unchanged. Later tutorials cover other formats (e.g. packed decimal, binary) and multiple sort keys.
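
As a small illustration of multiple keys, the control statement below is a sketch that assumes a hypothetical layout in which positions 21-25 hold a 5-byte zoned-decimal amount (that field is not part of the example above). Keys are listed in priority order, each as position, length, format, direction.

```jcl
* Major key: bytes 1-20, character, ascending.
* Minor key: bytes 21-25, zoned decimal, descending (assumed field).
  SORT FIELDS=(1,20,CH,A,21,5,ZD,D)
```

Records with the same first 20 bytes would then be ordered among themselves by the amount, largest first.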

The Three Primary Operations: Sort, Merge, and Copy

DFSORT is built around three main operations. Understanding them helps you choose the right control statements.

Sort

Sort means reordering records of a single input (SORTIN) by one or more key fields. You use the SORT FIELDS= control statement to define the key (position, length, format, direction). DFSORT reads all records, sorts them in memory and/or work files, and writes them to SORTOUT in the new order. Use sort when your data is not already in the order you need.

Merge

Merge combines two or more inputs that are already sorted by the same key. You use MERGE FIELDS= and multiple input DD names (SORTIN01, SORTIN02, …). DFSORT does not re-sort the whole set; it merges the streams in one pass. Use merge when you have multiple sorted files and want one sorted result—it is more efficient than concatenating and sorting again.
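
A minimal merge step might look like the sketch below. The data set names and space values are placeholders, and both inputs are assumed to be already sorted ascending on the first 20 bytes with the same record layout.

```jcl
//MRGSTEP  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Each input must already be in order by the merge key.
//SORTIN01 DD DSN=MY.SORTED.JAN,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED.FEB,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(10,5))
//SYSIN    DD *
  MERGE FIELDS=(1,20,CH,A)
/*
```

The key in MERGE FIELDS should describe the order the inputs are already in; the statement does not sort them for you.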

Copy

Copy means reading input and writing it to output without changing the record order. You use OPTION COPY (or SORT FIELDS=COPY) when you only want to filter (INCLUDE/OMIT), reformat (INREC/OUTREC), or split (OUTFIL) without sorting. SUM is not part of a copy, because it works on records that a SORT or MERGE has grouped by key. Copy is faster than a sort when order does not matter.
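
The following sketch shows a copy that only filters. It assumes a hypothetical 2-byte state code at positions 21-22; the data set names are placeholders.

```jcl
//COPYSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.NY.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2))
//SYSIN    DD *
* Keep input order; write only records whose state code is NY.
  OPTION COPY
  INCLUDE COND=(21,2,CH,EQ,C'NY')
/*
```

Replacing INCLUDE with OMIT and the same condition would keep every record except the NY ones.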

Many jobs use a combination: for example, SORT with INCLUDE (filter before sort), OUTREC (reformat after sort), and OUTFIL (write multiple outputs or reports).
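
One way such a combined step could look is sketched below. The field positions (status flag at byte 30, state code at bytes 21-22), the DD names NYOUT and OTHEROUT, and the data set names are all assumptions made for illustration. DCB attributes are left out for brevity; add them as your site requires.

```jcl
//MULTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//NYOUT    DD DSN=MY.NY.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2))
//OTHEROUT DD DSN=MY.OTHER.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2))
//SYSIN    DD *
* 1. Filter before the sort: keep active records only.
  INCLUDE COND=(30,1,CH,EQ,C'A')
* 2. Sort the remaining records by the first 20 bytes.
  SORT FIELDS=(1,20,CH,A)
* 3. After the sort, keep only bytes 1-22 of each record.
  OUTREC BUILD=(1,22)
* 4. Split the output: NY records to NYOUT, the rest to OTHEROUT.
  OUTFIL FNAMES=NYOUT,INCLUDE=(21,2,CH,EQ,C'NY')
  OUTFIL FNAMES=OTHEROUT,SAVE
/*
```

Note that the OUTFIL condition refers to positions in the record as it looks after OUTREC, which is why the state code is kept at bytes 21-22 in the shortened record.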

Key Capabilities Beyond Basic Sort and Merge

DFSORT provides many features that let you avoid writing a custom program:

  • Filtering — INCLUDE and OMIT select or drop records by conditions (e.g. field equals a value, numeric comparisons, ranges).
  • Reformatting — INREC changes the record before sort (e.g. build a shorter record to sort). OUTREC changes it after sort (e.g. reorder fields, add constants, edit numbers).
  • Multiple outputs — OUTFIL can write several datasets or reports in one step (e.g. split by key, different layouts, headers/trailers).
  • Summarization — SUM removes duplicates and/or adds numeric fields for groups with the same key.
  • Joining — JOINKEYS (with REFORMAT) can join two files (inner, left, right, full outer) in one step.
  • Data types — It supports character (CH), packed decimal (PD), zoned decimal (ZD), binary (BI), and floating-point, so you can sort and compare numeric fields correctly.
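
To make the joining capability concrete, here is a hedged sketch of an inner join. It assumes two hypothetical 80-byte files that share a 10-byte key in positions 1-10; the data set names and the fields picked in REFORMAT are illustrative only.

```jcl
//JOINSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTJNF1 DD DSN=MY.CUSTOMER.DATA,DISP=SHR
//SORTJNF2 DD DSN=MY.ORDER.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.JOINED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2))
//SYSIN    DD *
* Match the two files on bytes 1-10, ascending.
  JOINKEYS FILE=F1,FIELDS=(1,10,A)
  JOINKEYS FILE=F2,FIELDS=(1,10,A)
* Build each joined record: all of F1 plus bytes 11-30 of F2.
  REFORMAT FIELDS=(F1:1,80,F2:11,20)
* No further sorting of the joined records.
  OPTION COPY
/*
```

With no JOIN statement, only records whose keys match in both files are written; a JOIN UNPAIRED statement is how the outer-join variants are requested.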

DFSORT and ICETOOL

ICETOOL is an enhanced interface that uses DFSORT underneath. You run PGM=ICETOOL and give it "operators" (e.g. COPY, COUNT, SORT, SPLICE, DISPLAY). ICETOOL can run multiple DFSORT operations in one step and is useful for reporting, counting, and multi-file operations. When you learn DFSORT control statements, you are also in a good position to use ICETOOL.
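
A minimal ICETOOL step could be sketched as follows. The DD names IN1, OUT1, and CTL1CNTL and the data set names are placeholders chosen for this illustration.

```jcl
//TOOLSTEP EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN1      DD DSN=MY.INPUT.DATA,DISP=SHR
//OUT1     DD DSN=MY.SORTED.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2))
//TOOLIN   DD *
* Report the number of records in IN1 (printed in TOOLMSG).
  COUNT FROM(IN1)
* Sort IN1 into OUT1 using the statements in CTL1CNTL.
  SORT FROM(IN1) TO(OUT1) USING(CTL1)
/*
//CTL1CNTL DD *
  SORT FIELDS=(1,20,CH,A)
/*
```

The USING(CTL1) operand points at the CTL1CNTL DD, which holds ordinary DFSORT control statements; that is why knowing those statements carries over directly to ICETOOL.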

Why Use DFSORT Instead of a Program?

Mainframe shops use DFSORT for several reasons:

  • Performance — It is highly optimized for large datasets and uses tuned buffers and work files.
  • No compile step — You change control statements and rerun the job; no need to change and compile COBOL or PL/I.
  • Standard and supported — It is part of z/OS, so it is available everywhere and supported by IBM.
  • Rich syntax — One utility handles sort, merge, copy, filter, reformat, sum, and join, so many batch needs are met without custom code.

Explain It Like I'm Five

Imagine you have a big pile of cards with names and numbers. You want them in order by name, or you want to throw away some cards, or you want to copy only certain ones into a new pile. DFSORT is like a very fast helper that reads your instructions (e.g. "sort by name," "only keep the ones from New York") and does it all in one go. You don’t have to write a whole program—you just tell it what you want in a short list of commands, and it does the work.

Exercises

  1. In your own words, what does "Data Facility Sort" mean and what is DFSORT used for?
  2. What is the difference between SORT and MERGE in DFSORT? When would you use each?
  3. In the sample JCL above, what does SORT FIELDS=(1,20,CH,A) do? What would change if you used D instead of A?
  4. Name three things DFSORT can do besides sorting records into a new order.
  5. What DD names are used in the example for input data, output data, and control statements?

Quiz

Test Your Knowledge

1. What does DFSORT stand for?

  • Data File Sort
  • Data Facility Sort
  • Direct File Sort
  • Disk Facility Sort

2. Which JCL program name is commonly used to run DFSORT?

  • DFSORT
  • ICEMAN
  • SORT
  • Both SORT and ICEMAN

3. What are the three primary operations DFSORT performs?

  • Read, Write, Delete
  • Sort, Merge, Copy
  • Filter, Transform, Load
  • Index, Search, Replace

4. Where does DFSORT typically run?

  • Windows
  • Linux only
  • IBM z/OS mainframes
  • Cloud only

5. What is ICETOOL in relation to DFSORT?

  • A competing product
  • An enhanced interface to DFSORT
  • A type of sort key
  • A JCL keyword