MainframeMaster

SORT Statement

The SORT control statement tells DFSORT to sort the input records by one or more keys. You use SORT FIELDS= to define those keys: where each key is in the record (position and length), how to interpret it (character, packed decimal, binary, etc.), and whether to sort ascending or descending. This page covers the full syntax of the SORT statement, the meaning of each parameter, single and multiple keys, and how SORT relates to MERGE and OPTION COPY.


SORT FIELDS= Syntax

The basic form of the SORT statement is:

SORT FIELDS=(start,length,format,direction,...)

Each key is defined by four values: start, length, format, and direction. For multiple keys, you repeat the four values for each key (e.g. key1 start, key1 length, key1 format, key1 direction, key2 start, key2 length, key2 format, key2 direction).
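
To show where the statement usually lives, here is a minimal sketch of a sort step. DFSORT normally reads its control statements from the SYSIN DD. The step name, dataset names, and space allocation are placeholders (assumptions, not from this page), and PGM=SORT assumes the usual program name at your site.

```text
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.DATASET,DISP=SHR
//SORTOUT  DD DSN=YOUR.OUTPUT.DATASET,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```

The control statement starts in column 2 or later; column 1 is left blank.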

Start (position)

Start is the starting byte position of the key in the record. Positions are 1-based: the first byte of the record is position 1, so start=1 means the key begins at the first byte and start=10 means it begins at the 10th byte. The value must be a positive integer. For variable-length (VB) records, the 4-byte record descriptor word (RDW) occupies positions 1–4, so the first data byte is position 5. If you use INREC to reformat the record before the sort, the positions in SORT FIELDS refer to the reformatted record (the record after INREC), not the original input.

Length

Length is the length of the key in bytes. So (1,10,...) means a 10-byte key starting at position 1 (bytes 1 through 10). The key must lie entirely within the record; start + length - 1 must not exceed the record length. Length is always in bytes, regardless of format (character, numeric, etc.).

Format

Format tells DFSORT how to interpret the bytes so that the correct sort order is produced. Common formats:

  • CH: Character (alphanumeric). Bytes are compared in EBCDIC order (or the collating sequence in effect). Use for names, IDs, and any non-numeric key. In EBCDIC, lowercase letters sort before uppercase letters, and letters sort before digits.
  • PD: Packed decimal. Used for COBOL COMP-3 (PACKED-DECIMAL). Each byte holds two decimal digits, except the last, which holds one digit and the sign. DFSORT compares the signed numeric value. Use for numeric keys stored in packed form.
  • ZD: Zoned decimal. Used for COBOL DISPLAY numeric. Each byte holds one digit; the sign is carried in the high-order (zone) half of the last byte (C or F for positive, D for negative). Use for numeric keys stored in zoned form.
  • BI: Unsigned binary. The bytes are compared as an unsigned binary number. Use for unsigned binary fields such as COBOL PIC 9(n) COMP, typically 2, 4, or 8 bytes long.
  • FI: Fixed-point (signed binary). The first bit is treated as the sign. Use for signed binary fields such as COBOL PIC S9(n) COMP (also written COMP-4 or BINARY), typically 2, 4, or 8 bytes long. Sorting a signed field as BI puts negative values after positive ones.

Choosing the wrong format can give the wrong sort order. For example, sorting a packed-decimal field as CH compares raw byte values rather than the numeric value, so negative numbers can sort after positive ones.
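
As an illustration of matching format to field definition, the sketch below assumes a hypothetical COBOL record layout (the field names and positions are invented for this example). The comment lines, which have an asterisk in column 1, map each field to its byte range and DFSORT format; the SORT statement then uses three of them as keys.

```text
* Hypothetical COBOL layout and the matching DFSORT format:
*   CUST-NAME  PIC X(20)         bytes  1-20  CH
*   BALANCE    PIC S9(7) COMP-3  bytes 21-24  PD
*   ORDER-QTY  PIC S9(5)         bytes 25-29  ZD
*   BRANCH-NO  PIC S9(4) COMP    bytes 30-31  FI
  SORT FIELDS=(30,2,FI,A,21,4,PD,D,1,20,CH,A)
```

BRANCH-NO is a signed binary field, so it is sorted as FI; BALANCE occupies 4 bytes because seven digits plus the sign pack into eight half-bytes.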

Direction

Direction is either A (ascending) or D (descending). A means low-to-high: smaller values (or earlier in collating sequence) come first. D means high-to-low: larger values (or later in collating sequence) come first. You can mix: e.g. first key ascending, second key descending.

Single-Key Examples

SORT FIELDS=(1,20,CH,A)

Sorts by the first 20 bytes of the record, character format, ascending. Records are ordered by the EBCDIC byte values of that 20-byte field, so keys beginning with letters come before keys beginning with digits.

SORT FIELDS=(25,4,PD,D)

Sorts by bytes 25–28 as packed decimal, descending. So the largest numeric value in that field comes first.

SORT FIELDS=(10,8,ZD,A)

Sorts by bytes 10–17 as zoned decimal, ascending. Numeric order from smallest to largest.

Multiple Sort Keys

When you specify more than one key, DFSORT sorts by the first key; when two records have the same value in the first key, it uses the second key to break the tie, then the third, and so on. Example:

SORT FIELDS=(1,10,CH,A,11,5,PD,D,16,20,CH,A)

Primary key: bytes 1–10, character, ascending. Secondary key: bytes 11–15, packed decimal, descending. Tertiary key: bytes 16–35, character, ascending. So you might sort by name (CH A), then by amount (PD D), then by another field (CH A).
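
The tie-breaking rule can be traced by hand on a small case. The sketch below assumes hypothetical 8-byte records with a name in bytes 1–5 and an unsigned zoned-decimal amount in bytes 6–8; the data is made up for illustration.

```text
  SORT FIELDS=(1,5,CH,A,6,3,ZD,D)

Input order     Output order
SMITH100        JONES250
JONES250        JONES050
SMITH300        SMITH300
JONES050        SMITH100
```

The first key puts both JONES records ahead of both SMITH records. Within each name, the second key is descending, so the larger amount comes first.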

SORT vs MERGE vs OPTION COPY

SORT is used when you have a single input (SORTIN) and you want to reorder the records by key. DFSORT reads all records, sorts them, and writes to SORTOUT. Use MERGE when you have two or more inputs (SORTIN01, SORTIN02, …) that are already sorted by the same key; DFSORT merges them into one sorted stream without re-sorting. Use OPTION COPY when you do not want to sort at all—you just want to copy (and optionally filter or reformat) records from input to output in input order. So: SORT = one input, reorder; MERGE = multiple sorted inputs, combine; OPTION COPY = no reordering.
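
The three cases might look like this in practice. These are three alternative control decks, one per job step, shown together only for comparison; the key position and length are arbitrary examples.

```text
* Alternative 1: reorder a single input (SORTIN) by key
  SORT FIELDS=(1,10,CH,A)
* Alternative 2: combine SORTIN01, SORTIN02, ... already in key order
  MERGE FIELDS=(1,10,CH,A)
* Alternative 3: copy records in input order, no keys
  OPTION COPY
```

SORT FIELDS=COPY is an equivalent way to request a copy.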

Record Layout and INREC

The positions in SORT FIELDS refer to the record as seen by the sort phase. If you use INREC, the record is reformatted before the sort; so the record layout that the sort sees is the output of INREC. For example, if INREC builds a 50-byte record from an 80-byte input, then SORT FIELDS positions are 1–50, not 1–80. If you do not use INREC, the record is the same as the input (SORTIN) record, and positions refer to that layout.
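
A minimal sketch of the position shift, assuming a fixed-length 80-byte input and an arbitrary choice of fields to keep:

```text
* Build a 30-byte record: input bytes 1-20, then input bytes 61-70
  INREC BUILD=(1,20,61,10)
* The sort sees the 30-byte record, so the second field is now at 21-30
  SORT FIELDS=(21,10,CH,A)
```

Coding SORT FIELDS=(61,10,CH,A) here would point past the end of the reformatted record, because position 61 no longer exists after INREC.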

Explain It Like I'm Five

The SORT statement is like telling the sorter: "Line up the cards by this part of the card." Start and length say which part (e.g. the first 10 letters). Format says how to read that part—as letters (CH) or as a number (PD, ZD, BI). Direction says whether you want the smallest first (A) or the biggest first (D). If two cards have the same first part, you can say "then sort by this other part"—that's a second key. So SORT FIELDS is just: where is the key, how long is it, how do we read it, and do we want A or D?

Exercises

  1. Write a SORT FIELDS that sorts by bytes 5–14 as character ascending.
  2. What is the difference between sorting a numeric field as CH versus PD? When would each be correct?
  3. Write SORT FIELDS for: primary key bytes 1–8 character ascending, secondary key bytes 9–12 packed decimal descending.
  4. If you use INREC to shorten the record to 40 bytes, what is the largest value that start + length - 1 can have in SORT FIELDS?

Quiz

Test Your Knowledge

1. What does SORT FIELDS=(1,10,CH,A) do?

  • Sorts by bytes 1-10, character format, descending
  • Sorts by bytes 1-10, character format, ascending
  • Merges 10 files
  • Copies only the first 10 bytes

2. What does the fourth parameter in SORT FIELDS= (A or D) mean?

  • Record length
  • Ascending (A) or Descending (D) order
  • Number of keys
  • Format type

3. Which format is used for character (alphanumeric) data?

  • PD
  • BI
  • CH
  • ZD

4. How do you specify a second sort key in SORT FIELDS?

  • Use a second SORT statement
  • Add another position,length,format,direction group after the first, inside the same parentheses
  • Use MERGE
  • You cannot have two keys

5. When do you use MERGE instead of SORT?

  • When you have one input file
  • When you have two or more pre-sorted inputs to combine
  • When you want descending order
  • When you use INCLUDE