What is the full syntax of SORT FIELDS?

SORT FIELDS=(start,length,format,direction,...). Start is the starting byte position (1-based), length is the key length in bytes, format is CH/PD/ZD/BI/FI/FL (character, packed, zoned, binary, fixed-point, float), and direction is A (ascending) or D (descending). For multiple keys, repeat the four values for each key.

What is the difference between CH and ZD when sorting numbers?

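CH compares the raw bytes in character (EBCDIC) collating order and takes no account of the sign. For unsigned digits of equal length this happens to match numeric order, because the EBCDIC digits 0 through 9 are X'F0' through X'F9', so 1 still sorts before 9. The difference appears with signed data: in zoned decimal the sign is stored in the zone (high nibble) of the last byte, so under CH the value -1 (last byte X'D1') sorts before -5 (last byte X'D5'), whereas ZD compares numeric value and puts -5 first. For numeric order, use the format that matches how the data is stored: ZD for zoned, PD for packed, BI or FI for binary.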

What format do I use for COMP-3 (packed decimal) fields?

Use PD (packed decimal). COMP-3 is stored in packed form with two digits per byte (except the last, which has digit and sign). Specifying PD ensures DFSORT compares the numeric value correctly.

Do SORT FIELDS positions refer to the input or the INREC record?

They refer to the record as seen by the sort phase. If you use INREC, that is the reformatted (possibly shortened) record, so positions are 1 to the length of the INREC output. If you do not use INREC, positions refer to the original SORTIN record.

Can I have more than one sort key?

Yes. Code multiple 4-tuples: SORT FIELDS=(pos1,len1,fmt1,ord1,pos2,len2,fmt2,ord2,...). Records are sorted by the first key; when equal, the second key is used, then the third, and so on. Each key can have a different format and direction.

Sort Fields Syntax

This page is a deep dive on the syntax of SORT FIELDS=: the four parts of each key (position, length, format, direction), what each format code really means, how position and length interact with the record layout, and why choosing the wrong format produces wrong sort order. You use SORT FIELDS= to tell DFSORT where each sort key is, how long it is, how to interpret it (character vs numeric and which numeric encoding), and whether to sort ascending or descending. Getting the format right is critical—sorting a numeric field as character can make 100 come before 20; sorting a character field as numeric can cause garbage order or errors. Here we go through every common format, the rules for position and length, and how multiple keys are specified.

The Four Parts of Every Sort Key

Each sort key in SORT FIELDS= is defined by exactly four values: position, length, format, and direction. You cannot omit any of them. For one key you write (position, length, format, direction). For two keys you write (pos1, len1, fmt1, dir1, pos2, len2, fmt2, dir2), and so on. The order of keys is the order of significance: the first key is the primary sort key; when two records have the same value in the first key, the second key breaks the tie; then the third, and so on.

Position (Start)

Position is the starting byte of the key in the record. Positions are 1-based: the first byte of the record is position 1, the second is position 2, and so on. So position 1 means the key starts at the very beginning of the record; position 10 means the key starts at the 10th byte. Position must be a positive integer (1 or greater). The record that position refers to is the record as seen by the sort phase—so if you use INREC, the record may be shorter than the original input, and positions 1, 2, 3, … refer to the reformatted record. If you do not use INREC, positions refer to the original SORTIN record. This is important: after INREC, an 80-byte input might become a 40-byte record; then valid positions are 1 through 40, and byte 50 of the original no longer exists for the sort.

Length

Length is the number of bytes the key occupies. So (1, 10, …) means a 10-byte key occupying bytes 1 through 10. Length is always in bytes, regardless of format. For character (CH) data, one byte is one character. For packed decimal (PD), each byte holds two decimal digits, except the last byte, which holds one digit and the sign; a 4-byte PD field therefore holds up to 7 digits plus sign, but in SORT FIELDS you still specify length as 4 (bytes). For zoned decimal (ZD), each digit takes one byte, so length 5 means 5 bytes and 5 digits. The key must fit entirely within the record: position + length − 1 must not exceed the record length. For example, in an 80-byte record, (75, 10, CH, A) is invalid because 75 + 10 − 1 = 84 > 80, while (75, 6, CH, A) is valid because 75 + 6 − 1 = 80.
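The position and length rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not DFSORT itself; `key_fits` and `extract_key` are hypothetical helper names introduced here.

```python
def key_fits(position: int, length: int, record_length: int) -> bool:
    """Return True if a key at 1-based `position` with `length` bytes lies inside the record."""
    if position < 1 or length < 1:
        return False
    return position + length - 1 <= record_length


def extract_key(record: bytes, position: int, length: int) -> bytes:
    """Slice out the key bytes. SORT FIELDS positions are 1-based; Python slices are 0-based."""
    if not key_fits(position, length, len(record)):
        raise ValueError("key extends past the end of the record")
    start = position - 1
    return record[start:start + length]


record = bytes(range(80))            # dummy 80-byte record: byte i holds the value i
print(key_fits(75, 10, 80))          # False: the key would end at byte 84
print(key_fits(75, 6, 80))           # True: the key ends exactly at byte 80
print(extract_key(record, 1, 10))    # the first 10 bytes of the record
```

The only subtlety is the off-by-one: position 1 in SORT FIELDS corresponds to index 0 in the slice.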

Format: How DFSORT Interprets the Bytes

Format tells DFSORT how to interpret the key bytes so that the comparison produces the correct order. If you choose the wrong format, the sort order will be wrong. For example, if a numeric amount is stored in packed decimal and you specify CH (character), DFSORT compares the raw byte values rather than the numeric values, so the sign is effectively ignored and negative amounts end up mixed in among positive ones. Below we explain each common format and when to use it.

CH — Character

CH means the key is character (alphanumeric) data. DFSORT compares the bytes in byte order—typically EBCDIC on the mainframe. So the sort order follows the collating sequence of the encoding (e.g. A before B, 0 before 9 in EBCDIC). Use CH for names, IDs, codes, and any key that is not stored as a numeric type (PD, ZD, BI). If you use CH for a numeric field stored as display digits (e.g. "00123" in EBCDIC), the order will be correct for positive numbers of the same length because the character order of digits 0–9 matches numeric order; but for different lengths or negative numbers, CH can give wrong numeric order. For true numeric comparison of zoned or packed fields, use ZD or PD.
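To get a feel for EBCDIC collating order, the following Python sketch sorts strings by their encoded bytes. It assumes code page 037 (Python's built-in `cp037` codec) as a stand-in for whatever EBCDIC code page your data actually uses; `ch_key` is a hypothetical helper name.

```python
def ch_key(text: str) -> bytes:
    """Encode text as EBCDIC (code page 037 assumed) so Python compares it byte by byte."""
    return text.encode("cp037")


names = ["SMITH", "smith", "9LIVES", "ADAMS"]
print(sorted(names, key=ch_key))
# ['smith', 'ADAMS', 'SMITH', '9LIVES']: lowercase, then uppercase, then digits

# Left-justified, blank-padded numbers compared as characters:
amounts = ["100  ", "20   ", "3    "]
print(sorted(amounts, key=ch_key))
# ['100  ', '20   ', '3    ']: character order, not numeric order
```

Note that this ordering differs from ASCII, where digits come before letters and uppercase comes before lowercase.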

ZD — Zoned Decimal

ZD means zoned decimal. Each byte holds one digit (0–9) in the low nibble; the high nibble is the "zone". In the last byte the zone carries the sign (typically C or F for positive, D for negative). This is how COBOL DISPLAY numeric data is stored. DFSORT interprets the key as a signed decimal number and compares numeric value. So 00001 sorts before 00002, and negative numbers sort before positive ones when ascending. Use ZD whenever the key is stored in zoned (display) form. The length you specify is the number of bytes (e.g. 5 for a PIC 9(5) or S9(5) DISPLAY field).
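As a rough illustration of the zoned layout, the Python sketch below decodes a zoned field and compares byte order (what CH would see) with numeric order (what ZD would see). `zd_to_int` is a hypothetical helper, and the set of sign nibbles it accepts is a simplifying assumption; it is not a description of DFSORT's internals.

```python
def zd_to_int(field: bytes) -> int:
    """Decode a zoned-decimal field: low nibble = digit, zone of the last byte = sign."""
    value = 0
    for byte in field:
        digit = byte & 0x0F
        if digit > 9:
            raise ValueError("invalid zoned digit")
        value = value * 10 + digit
    sign_nibble = field[-1] >> 4
    # Assumption: D (and the less common B) mean negative; anything else is positive.
    return -value if sign_nibble in (0x0D, 0x0B) else value


# Four 5-byte zoned fields holding 7, -1, -5 and 12
fields = [bytes.fromhex(h) for h in ("F0F0F0F0F7", "F0F0F0F0D1", "F0F0F0F0D5", "F0F0F0F1C2")]

by_ch = [zd_to_int(f) for f in sorted(fields)]                 # raw byte order
by_zd = [zd_to_int(f) for f in sorted(fields, key=zd_to_int)]  # numeric order
print(by_ch)  # [-1, -5, 7, 12]
print(by_zd)  # [-5, -1, 7, 12]
```

The two orders agree for the positive values and differ only for the negatives, which is why a CH sort of zoned data can look correct until signed values appear.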

PD — Packed Decimal

PD means packed decimal. Each byte holds two decimal digits in the two nibbles, except the last byte which holds one digit and the sign (e.g. C for positive, D for negative). This is how COBOL COMP-3 (and similar) data is stored. DFSORT interprets the key as a signed packed decimal number and compares numeric value. Use PD whenever the key is stored in packed form. The length in SORT FIELDS is the number of bytes (e.g. 4 bytes for a typical PIC S9(7) COMP-3, which uses 4 bytes: 7 digits + sign). If you specify PD for a field that is actually zoned or character, the comparison will be wrong because the byte layout is different.
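The byte-length arithmetic for packed fields can be sketched as follows. `pd_length` and `int_to_pd` are hypothetical helpers written for this page; they assume the usual C (positive) and D (negative) sign nibbles.

```python
def pd_length(digits: int) -> int:
    """Bytes needed for a packed field with `digits` decimal digits plus a sign nibble."""
    return digits // 2 + 1


def int_to_pd(value: int, digits: int) -> bytes:
    """Pack a signed integer into COMP-3 style bytes (sign nibble C = positive, D = negative)."""
    text = str(abs(value))
    if len(text) > digits:
        raise ValueError("value does not fit in the field")
    total_nibbles = pd_length(digits) * 2
    sign = "D" if value < 0 else "C"
    # Left-pad with zeros so the digits plus the sign nibble fill whole bytes.
    nibbles = text.rjust(total_nibbles - 1, "0") + sign
    return bytes.fromhex(nibbles)


print(pd_length(7))                      # 4, so a PIC S9(7) COMP-3 key is coded with length 4
print(int_to_pd(1234567, 7).hex())       # 1234567c
print(int_to_pd(-20, 7).hex())           # 0000020d
```

The value returned by `pd_length` is what goes in the length slot of SORT FIELDS, never the digit count.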

BI — Binary

BI means unsigned binary. The key bytes are compared as an unsigned binary integer. You must specify the correct length: typically 2 bytes (halfword) or 4 bytes (fullword) for an unsigned COBOL COMP or COMP-4 field. Use BI for unsigned binary integer keys. If the data is signed, use the signed form (FI) instead, because BI treats the sign bit as part of the magnitude and negative values then sort after positive ones; see your documentation for details. Using BI for a packed or zoned field will produce incorrect order because the bit pattern is not a binary integer.

FI — Fixed-Point (Signed Binary)

FI means fixed-point (signed binary). Similar to BI but the value is treated as signed. Use when the key is a signed binary integer (e.g. COBOL S9(n) COMP). Length is typically 2 or 4 bytes. The exact support (FI vs BI for signed) is product-dependent; refer to your DFSORT manual.
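The practical difference between an unsigned and a signed reading of the same bytes can be shown with Python's `int.from_bytes`. This is a sketch of the two interpretations only; `bi_key` and `fi_key` are hypothetical names, and big-endian byte order is assumed because that is how mainframe binary fields are stored.

```python
def bi_key(field: bytes) -> int:
    """Read the bytes as an unsigned big-endian integer (the BI view)."""
    return int.from_bytes(field, "big", signed=False)


def fi_key(field: bytes) -> int:
    """Read the bytes as a signed two's-complement big-endian integer (the FI view)."""
    return int.from_bytes(field, "big", signed=True)


# Four halfwords; read as signed they are 1, -1, 2 and -32768
halfwords = [bytes.fromhex(h) for h in ("0001", "FFFF", "0002", "8000")]

print([f.hex() for f in sorted(halfwords, key=bi_key)])  # ['0001', '0002', '8000', 'ffff']
print([fi_key(f) for f in sorted(halfwords, key=fi_key)])  # [-32768, -1, 1, 2]
```

Under the unsigned reading the negative values land at the end, which is the symptom to look for when a signed field has been sorted with the wrong format.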

FL — Floating-Point

FL means floating-point. The key is interpreted as a floating-point number (e.g. IEEE or IBM hex float). Length and format details depend on the product (e.g. 4 or 8 bytes). Use FL only when the key is actually stored in a floating-point format. Floating-point sorting has subtleties (e.g. negative zero, NaN); see the floating-point sorting tutorial and your product docs.

Direction: A and D

The fourth value is direction: A for ascending or D for descending. Ascending (A) means low-to-high: smaller values (or earlier in collating sequence for CH) come first. Descending (D) means high-to-low: larger values (or later in collating sequence) come first. You can mix: for example, primary key ascending and secondary key descending. So SORT FIELDS=(1,10,CH,A,11,4,PD,D) sorts by the first 10 bytes character ascending, then by the 4-byte packed field descending when the first key is equal.

Why Format Matters: Wrong Format, Wrong Order

If your data is packed decimal (COMP-3) and you specify CH, DFSORT will compare the raw bytes as characters. The sign sits in the low nibble of the last byte, where a byte-by-byte comparison gives it almost no weight, so negative values are ordered by magnitude and interleaved with positive ones: -20 can sort after 5, and -300 after 100. Similarly, if your data is zoned decimal (DISPLAY) and you specify PD, DFSORT will interpret the bytes as packed, which is wrong because each zoned byte holds one digit, not two. So always match the format to the way the key is actually stored in the record. When in doubt, check the program or copybook that writes the data (e.g. COBOL PIC clause: DISPLAY → ZD, COMP-3 → PD, COMP → BI/FI).
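A small Python experiment makes the packed-as-character problem concrete. It sorts the same four packed fields once by raw bytes (roughly what CH does) and once by decoded value (roughly what PD does). `pd_to_int` is a hypothetical decoder that assumes D or B marks a negative value.

```python
def pd_to_int(field: bytes) -> int:
    """Decode a packed-decimal field: every nibble is a digit except the last, which is the sign."""
    nibbles = field.hex()              # one hex character per nibble, lowercase
    magnitude = int(nibbles[:-1])      # raises ValueError if a digit nibble is not 0-9
    return -magnitude if nibbles[-1] in "bd" else magnitude


# Four 4-byte packed fields holding 100, -20, 5 and -300
amounts = [bytes.fromhex(h) for h in ("0000100C", "0000020D", "0000005C", "0000300D")]

as_ch = [pd_to_int(f) for f in sorted(amounts)]                 # raw byte comparison
as_pd = [pd_to_int(f) for f in sorted(amounts, key=pd_to_int)]  # numeric comparison
print(as_ch)  # [5, -20, 100, -300]
print(as_pd)  # [-300, -20, 5, 100]
```

With only positive values and a consistent sign nibble the two orders would coincide, so this kind of mistake often goes unnoticed until negative amounts show up in the data.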

Multiple Keys: Left to Right

For multiple sort keys, you list one 4-tuple after another: (pos1, len1, fmt1, dir1, pos2, len2, fmt2, dir2, …). There is no separate keyword between keys—just the comma-separated list. Keys can overlap in the record (e.g. key1 bytes 1–10, key2 bytes 5–8) though usually they are distinct. The primary key is the first 4-tuple; when two records are equal on the primary key, the second key is used; when still equal, the third, and so on. Each key can have a different format and direction. Example: SORT FIELDS=(1,20,CH,A,21,4,PD,D,25,8,ZD,A) — primary: bytes 1–20 character ascending; secondary: bytes 21–24 packed descending; tertiary: bytes 25–32 zoned ascending.
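The way keys combine can be imitated in Python with a stable sort applied from the least significant key to the most significant. This is a sketch of the semantics only, not of how DFSORT is implemented; `sort_records`, the decoders, and the 5-byte sample layout are all made up for the illustration, and code page 037 is assumed for the character data.

```python
from typing import List, Tuple


def pd_to_int(field: bytes) -> int:
    """Decode packed decimal; the last nibble is the sign (D or B assumed negative)."""
    nibbles = field.hex()
    magnitude = int(nibbles[:-1])
    return -magnitude if nibbles[-1] in "bd" else magnitude


def zd_to_int(field: bytes) -> int:
    """Decode zoned decimal; the zone of the last byte is the sign."""
    magnitude = int("".join(str(b & 0x0F) for b in field))
    return -magnitude if field[-1] >> 4 in (0x0B, 0x0D) else magnitude


DECODERS = {"CH": bytes, "PD": pd_to_int, "ZD": zd_to_int}
Key = Tuple[int, int, str, str]  # (position, length, format, direction)


def sort_records(records: List[bytes], keys: List[Key]) -> List[bytes]:
    ordered = list(records)
    # Sort by the last key first; stability keeps earlier passes as tie-breakers.
    for position, length, fmt, direction in reversed(keys):
        decode, start = DECODERS[fmt], position - 1
        ordered.sort(key=lambda r: decode(r[start:start + length]),
                     reverse=(direction == "D"))
    return ordered


def rec(name: str, amount_hex: str) -> bytes:
    """Build a sample record: 3 EBCDIC characters followed by a 2-byte packed amount."""
    return name.encode("cp037") + bytes.fromhex(amount_hex)


records = [rec("BOB", "010C"), rec("AMY", "005C"), rec("AMY", "020C"), rec("BOB", "003D")]
# Equivalent in spirit to SORT FIELDS=(1,3,CH,A,4,2,PD,D)
result = sort_records(records, [(1, 3, "CH", "A"), (4, 2, "PD", "D")])
print([(r[:3].decode("cp037"), pd_to_int(r[3:])) for r in result])
# [('AMY', 20), ('AMY', 5), ('BOB', 10), ('BOB', -3)]
```

The name decides the order first; the amount only matters within a group of equal names, where the larger amount comes first because that key is descending.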

Continuation and Statement Length

Control statements in SYSIN are 80-byte card images, and the statement text is limited to part of each line (typically columns 2 through 71, with column 1 left blank unless you code a label). If your SORT FIELDS= list is long, you can continue on the next line by ending the current line with a comma and putting the rest on the following line. Blanks can be used for readability. Never split a single value, such as a number or format code, across lines, because the parser will misread it. Check your DFSORT documentation for the exact continuation rules.

Record Layout and INREC

The record that SORT FIELDS= sees is the record after INREC (if INREC is present) or the original input record (if not). So if INREC builds a 40-byte record from an 80-byte input, all positions in SORT FIELDS must be between 1 and 40. The original positions 41–80 no longer exist for the sort. This is a common source of errors: after adding or changing INREC, forgetting to update SORT FIELDS positions so they refer to the new layout. Similarly, the record length that matters for the "position + length − 1" check is the length of the record at sort time (the INREC length or the input record length).
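When INREC rearranges the record, it can help to work out where each original byte ends up. The Python sketch below does that bookkeeping for the simple case where the new record is built only from (start, length) pieces of the input. `remap_position` is a hypothetical helper, and the layout shown (in the spirit of INREC BUILD=(1,10,41,30)) is an assumed example.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # (1-based start in the input record, length in bytes)


def remap_position(segments: List[Segment], original_position: int) -> Optional[int]:
    """Return the 1-based position of an input byte in the rebuilt record, or None if dropped."""
    new_start = 1
    for start, length in segments:
        if start <= original_position < start + length:
            return new_start + (original_position - start)
        new_start += length
    return None


# Assumed layout: keep input bytes 1-10 and 41-70, giving a 40-byte record
segments = [(1, 10), (41, 30)]

print(remap_position(segments, 5))    # 5: unchanged, still inside the first piece
print(remap_position(segments, 41))   # 11: first byte of the second piece
print(remap_position(segments, 20))   # None: this byte no longer exists for the sort
```

A key that sat at position 41 of the input would therefore be coded with position 11 in SORT FIELDS once this INREC is in place.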

Examples With Explanation

  SORT FIELDS=(1,10,CH,A)

One key: bytes 1–10, character format, ascending. Records are ordered by the first 10 bytes in EBCDIC order. Good for names, IDs, or any 10-byte character key.

  SORT FIELDS=(25,4,PD,D)

One key: bytes 25–28 as packed decimal, descending. The largest numeric value in that 4-byte packed field comes first. Use when the field is COMP-3.

  SORT FIELDS=(1,8,CH,A,9,5,ZD,D,14,20,CH,A)

Three keys: (1) bytes 1–8 character ascending, (2) bytes 9–13 zoned decimal descending, (3) bytes 14–33 character ascending. Ties on the first key are broken by the second; ties on the first two are broken by the third.

Explain It Like I'm Five

Sort fields syntax is like telling the sorter: "Look at this part of the card (position and length), read it as letters or as a number (format), and line them up from small to big or big to small (A or D)." If you say "read it as a number" but the part is actually letters, the order will be wrong—like sorting people by height when you accidentally looked at their shoe size. So position and length say where to look, and format says how to read it. Getting format right is what makes the sort order make sense.

Exercises

Write SORT FIELDS= for: primary key bytes 1–15 character ascending, secondary key bytes 16–19 packed decimal descending. What is the total number of parameters?
Your record is 80 bytes. Is SORT FIELDS=(70,15,CH,A) valid? Why or why not?
A COBOL program writes a field as PIC S9(5) COMP-3. What format and length (in bytes) would you use in SORT FIELDS for that field? (Hint: COMP-3 uses roughly (digits+1)/2 bytes.)
What happens to sort order if you specify CH for a field that is actually stored as PD?

Quiz

Test Your Knowledge

1. Why must the format in SORT FIELDS match how the data is stored?

It does not matter
If you use the wrong format, DFSORT compares bytes as the wrong type and the sort order can be wrong (e.g. numeric order vs character order)
Only CH matters
Format is only for display

2. What does length mean for a PD (packed decimal) key?

Number of digits
Number of bytes in the key (same as for CH or any format); PD stores two digits per byte except the last
Number of records
PD has no length

3. Positions in SORT FIELDS are 1-based. What does position 1 mean?

The second byte
The first byte of the record (or of the INREC record if INREC is used)
The sort key number
The block number

4. You have a 50-byte record after INREC. Which constraint applies to start and length in SORT FIELDS?

1 to 50 for start; length can be any value
Start can be 1 to 50; start + length - 1 must not exceed 50
Only 1,50 is valid
Positions refer to the original input

5. Which format would you use for a COBOL DISPLAY numeric field (e.g. PIC 9(5))?

PD
BI
ZD
CH

Related Concepts

Numeric vs character sorts

Packed decimal sorting

Zoned decimal sorting