DFSORT (Data Facility Sort) is a high-performance sorting, merging, copying, and data manipulation utility for IBM mainframes. It is an essential tool for batch processing in the mainframe environment because it processes large volumes of data efficiently. In JCL it is invoked as PGM=SORT or under its alias program name, ICEMAN.
Here's a simple example of JCL to sort a sequential file by a character field:
```jcl
//SORTJOB  JOB (ACCT),'SORT EXAMPLE',CLASS=A,MSGCLASS=X
//*
//STEP010  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.FILE,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.FILE,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(10,5),RLSE),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```
This example sorts the records in MY.INPUT.FILE based on a character (CH) field starting at position 1 for a length of 10 bytes in ascending (A) order. The sorted output is written to MY.SORTED.FILE.
DFSORT requires several DD statements to function properly:
DD Name | Description | Required? |
---|---|---|
SYSOUT | Messages and statistics produced by DFSORT | Yes |
SORTIN | Input dataset (for SORT operations) | Yes (for SORT) |
SORTOUT | Output dataset where sorted data is written | Yes (unless OUTFIL data sets are used instead) |
SYSIN | Control statements defining the sort operation | Yes |
SORTWKnn | Work files for sorting (nn is 01-32) | Optional (dynamically allocated) |
SORTINnn | Input files for MERGE operations (nn is 01-16) | Yes (for MERGE) |
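
When dynamic allocation is not wanted, the work data sets listed in the table can be coded explicitly. The fragment below is only a sketch: the unit name SYSDA and the space amounts are assumptions and should be replaced with values that suit the site and the input size.

```jcl
//* Hypothetical explicit work data sets; sizes are placeholders
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(20,10))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(20,10))
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(20,10))
```

These DD statements would go in the same step as SORTIN and SORTOUT.
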
The SORT control statement specifies the fields to sort on and the order of the sort:
```jcl
  SORT FIELDS=(start,length,format,order,...)
```

Where:
- start = Starting position of the field (first byte is position 1)
- length = Length of the field in bytes
- format = Data format (CH, BI, FI, PD, ZD, etc.)
- order = Sort order (A for ascending, D for descending)

Format | Description | Example |
---|---|---|
CH | Character | Names, addresses, alphanumeric data |
BI | Binary (unsigned) | Unsigned binary values |
FI | Fixed-point | Signed binary numbers |
PD | Packed decimal | COMP-3 fields in COBOL |
ZD | Zoned decimal | DISPLAY numeric fields in COBOL |
AC | ASCII character | ASCII text (rather than EBCDIC) |

```jcl
//SYSIN    DD *
  SORT FIELDS=(10,5,CH,A,25,4,PD,D,5,3,ZD,A)
/*
```

This example sorts records by three fields:
1. Characters in positions 10-14 (ascending)
2. Packed decimal in positions 25-28 (descending)
3. Zoned decimal in positions 5-7 (ascending)

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
  INCLUDE COND=(15,3,CH,EQ,C'ABC',AND,
                25,2,BI,GT,X'0005')
/*
```

This example sorts records but only includes those where positions 15-17 contain 'ABC' AND the binary value in positions 25-26 is greater than 5.
```jcl
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
  OUTREC FIELDS=(1,20,            /* First 20 bytes */
                 C'CUSTOMER: ',   /* Literal */
                 21,15,           /* Next 15 bytes */
                 45,10,           /* 10 bytes from pos 45 */
                 X'40',           /* Hex constant (space) */
                 55,8)            /* Last 8 bytes */
/*
```

This example sorts the records and then reformats each output record by combining selected input fields with constants. The OUTREC statement defines the new layout for the output file, which here is 64 bytes long (20 + 10 + 15 + 10 + 1 + 8).
```jcl
//MERGEJOB JOB (ACCT),'MERGE EXAMPLE',CLASS=A,MSGCLASS=X
//*
//STEP010  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=SORTED.FILE.ONE,DISP=SHR
//SORTIN02 DD DSN=SORTED.FILE.TWO,DISP=SHR
//SORTIN03 DD DSN=SORTED.FILE.THREE,DISP=SHR
//SORTOUT  DD DSN=MERGED.OUTPUT.FILE,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(15,5),RLSE),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A,25,5,PD,A)
/*
```

This example merges three pre-sorted files into a single output file. The MERGE control statement specifies the sort fields, which must match how the input files were originally sorted.
Filter records based on conditions:
```jcl
  INCLUDE COND=(5,3,CH,EQ,C'YES')
  OMIT COND=(1,1,BI,EQ,X'FF')
```

INCLUDE keeps matching records; OMIT removes them. The two statements are mutually exclusive, so code only one of them in a given step.
Reformat records:
```jcl
  INREC FIELDS=(1,10,20,15,C'FIXED TEXT')
```

INREC reformats records before sorting; OUTREC reformats them after sorting.
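
Because INREC runs before the sort, the SORT statement has to refer to field positions in the reformatted record rather than in the original input. The following is a minimal sketch that assumes fixed-length records and reuses the INREC layout shown above; the choice of sort key is illustrative.

```jcl
//SYSIN    DD *
* INREC builds a 35-byte record: input 1-10, input 20-34, then a literal
  INREC FIELDS=(1,10,20,15,C'FIXED TEXT')
* Input positions 20-34 now sit at 11-25, so the key is coded as 11,15
  SORT FIELDS=(11,15,CH,A)
/*
```

Lines with an asterisk in column 1 are DFSORT comment statements.
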
Summarize duplicate records:
```jcl
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=(25,4,ZD,30,6,PD)
```

When several records have identical sort keys, SUM adds the specified numeric fields together and keeps a single record containing the totals.
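
A common variant, shown here as a sketch: SUM FIELDS=NONE performs no arithmetic and simply keeps one record for each distinct key, discarding the others. The key position and length are illustrative.

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
* Keep one record per distinct 10-byte key; no fields are summed
  SUM FIELDS=NONE
/*
```

Which of the duplicate records survives is unpredictable unless the EQUALS option is in effect, in which case the first record for each key is retained.
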
Control DFSORT processing:
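```jcl
  OPTION DYNALLOC=SYSDA,MAINSIZE=MAX,
         EQUALS,FILSZ=E50000
```

This example requests dynamically allocated work data sets on SYSDA (DYNALLOC), the maximum available main storage (MAINSIZE), preservation of the input order for records with equal keys (EQUALS, coded without a value on the OPTION statement), and an estimated input size of 50,000 records (FILSZ).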
The SORT utility (DFSORT or ICEMAN) is a program that allows you to sort, merge, copy, or transform data in mainframe environments. It can handle various data sources including sequential files, VSAM files, and PDS members. SORT is highly optimized for performance and can process large volumes of data efficiently.
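
The copy function mentioned above needs no sort keys. A minimal sketch, assuming the same SORTIN and SORTOUT DD statements as in the sort examples:

```jcl
//SYSIN    DD *
* Copy SORTIN to SORTOUT in the original record order
  OPTION COPY
/*
```

Coding SORT FIELDS=COPY instead of OPTION COPY has the same effect.
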
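The required DD statements for SORT are:
- SYSOUT: messages and statistics
- SORTIN: the input dataset
- SORTOUT: the sorted output dataset

Control statements are normally supplied through SYSIN. SORTWKnn work files are optional, and MERGE operations read their inputs from SORTINnn (SORTIN01, SORTIN02, and so on) rather than from SORTIN.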
Sort fields are specified using the SORT control statement with the FIELDS parameter:
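```jcl
  SORT FIELDS=(start,length,format,order,...)
```

Where:
- start is the starting position of the field (the first byte is position 1)
- length is the length of the field in bytes
- format is the data format (CH, BI, FI, PD, ZD, etc.)
- order is A for ascending or D for descending

Multiple fields can be specified within the parentheses, separated by commas.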
SORT reads all input data from a single source (SORTIN) and sorts it according to specified keys. MERGE combines multiple pre-sorted input sources (SORTIN01, SORTIN02, etc.) into a single sorted output. The key difference is that MERGE requires each input to already be sorted in the same order as the merge operation.
You can include or exclude records using the INCLUDE and OMIT statements:
```jcl
  INCLUDE COND=(logic_expression)
  OMIT COND=(logic_expression)
```

Logic expressions can use relational operators (EQ, NE, GT, LT, GE, LE) and logical operators (AND, OR). For example, INCLUDE COND=(5,4,CH,EQ,C'ABCD') includes only records where positions 5-8 contain "ABCD".
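
Conditions joined with OR, and grouped with parentheses, follow the same pattern. The sketch below uses illustrative field positions and values that are not taken from any real record layout.

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
* Keep records with 'ABCD' in 5-8, or with 'X' in 20 and 'Y' in 21
  INCLUDE COND=(5,4,CH,EQ,C'ABCD',OR,
               (20,1,CH,EQ,C'X',AND,21,1,CH,EQ,C'Y'))
/*
```

The first line of the INCLUDE statement ends with a comma, which tells DFSORT that the statement continues on the next line.
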
Yes, DFSORT can modify data during processing using the OUTREC or INREC statements. These allow you to select and reorder fields and to insert character or hexadecimal constants into each record.
INREC reformats records before sorting, while OUTREC reformats after sorting but before output.
To improve SORT performance: