This page is a deep dive on the syntax of SORT FIELDS=: the four parts of each key (position, length, format, direction), what each format code really means, how position and length interact with the record layout, and why choosing the wrong format produces wrong sort order. You use SORT FIELDS= to tell DFSORT where each sort key is, how long it is, how to interpret it (character vs numeric and which numeric encoding), and whether to sort ascending or descending. Getting the format right is critical—sorting a numeric field as character can make 100 come before 20; sorting a character field as numeric can cause garbage order or errors. Here we go through every common format, the rules for position and length, and how multiple keys are specified.
Each sort key in SORT FIELDS= is defined by exactly four values: position, length, format, and direction. You cannot omit any of them. For one key you write (position, length, format, direction). For two keys you write (pos1, len1, fmt1, dir1, pos2, len2, fmt2, dir2), and so on. The order of keys is the order of significance: the first key is the primary sort key; when two records have the same value in the first key, the second key breaks the tie; then the third, and so on.
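The grouping into sets of four can be sketched in a few lines of Python. This is only an illustration of the rule described above, not DFSORT's own parser: the names `SortKey` and `parse_sort_fields` are hypothetical, and the sketch assumes the plain four-values-per-key form and ignores other forms a product may accept.

```python
from typing import List, NamedTuple


class SortKey(NamedTuple):
    position: int   # 1-based starting byte
    length: int     # key length in bytes
    fmt: str        # format code such as CH, ZD, PD, BI, FI
    direction: str  # A (ascending) or D (descending)


def parse_sort_fields(statement: str) -> List[SortKey]:
    """Split the operand list of SORT FIELDS=(...) into 4-value keys."""
    start = statement.index("(") + 1
    end = statement.rindex(")")
    values = [v.strip() for v in statement[start:end].split(",")]
    if len(values) % 4 != 0:
        raise ValueError("each key needs position, length, format, direction")
    keys = []
    for i in range(0, len(values), 4):
        pos, length, fmt, direction = values[i:i + 4]
        if direction not in ("A", "D"):
            raise ValueError(f"direction must be A or D, got {direction!r}")
        keys.append(SortKey(int(pos), int(length), fmt, direction))
    return keys


keys = parse_sort_fields("SORT FIELDS=(1,10,CH,A,11,4,PD,D)")
print(keys[0])  # primary key: position 1, length 10, CH, A
print(keys[1])  # secondary key: position 11, length 4, PD, D
```

The order of the returned list is the order of significance: the first element is the primary key.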
Position is the starting byte of the key in the record. Positions are 1-based: the first byte of the record is position 1, the second is position 2, and so on. So position 1 means the key starts at the very beginning of the record; position 10 means the key starts at the 10th byte. Position must be a positive integer (1 or greater). The record that position refers to is the record as seen by the sort phase—so if you use INREC, the record may be shorter than the original input, and positions 1, 2, 3, … refer to the reformatted record. If you do not use INREC, positions refer to the original SORTIN record. This is important: after INREC, an 80-byte input might become a 40-byte record; then valid positions are 1 through 40, and byte 50 of the original no longer exists for the sort.
Length is the number of bytes the key occupies, so (1, 10, …) means a 10-byte key occupying bytes 1 through 10. Length is always in bytes, regardless of format. For character (CH) data, one byte is one character. For packed decimal (PD), each byte holds two decimal digits, except the last byte, which holds one digit and the sign; a 4-byte PD field therefore holds up to 7 digits plus the sign, but in SORT FIELDS you still specify the length as 4 (bytes). For zoned decimal (ZD), each byte holds one digit, so length 5 means 5 bytes and 5 digits. The key must fit entirely within the record: position + length − 1 must not exceed the record length. For example, in an 80-byte record, (75, 10, CH, A) is invalid because 75 + 10 − 1 = 84 > 80, whereas (75, 6, CH, A) is valid because 75 + 6 − 1 = 80.
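The position and length arithmetic can be checked with a small Python sketch. It models a fixed-length record as a `bytes` object; `key_fits` and `extract_key` are hypothetical helper names used only for illustration. The one detail worth noticing is the shift from 1-based positions to Python's 0-based slices.

```python
def key_fits(position: int, length: int, record_length: int) -> bool:
    """True when a key at 1-based `position` lies fully inside the record."""
    return position >= 1 and length >= 1 and position + length - 1 <= record_length


def extract_key(record: bytes, position: int, length: int) -> bytes:
    """Return the key bytes; position is 1-based, Python slices are 0-based."""
    if not key_fits(position, length, len(record)):
        raise ValueError("key extends past the end of the record")
    start = position - 1
    return record[start:start + length]


record = bytes(range(80))           # stand-in 80-byte record: byte i holds value i
print(key_fits(75, 10, 80))         # False: 75 + 10 - 1 = 84
print(key_fits(75, 6, 80))          # True:  75 + 6 - 1 = 80
print(extract_key(record, 75, 6))   # the last six bytes of the record
```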
Format tells DFSORT how to interpret the key bytes so that the comparison produces the correct order. If you choose the wrong format, the sort order will be wrong. For example, if you have a numeric amount stored in packed decimal and you specify CH (character), DFSORT will compare the raw byte values (EBCDIC), not the numeric values—so you might see amounts sorted as if they were random characters. Below we explain each common format and when to use it.
CH means the key is character (alphanumeric) data. DFSORT compares the key byte by byte, left to right; on the mainframe the data is normally EBCDIC, so the sort order follows the EBCDIC collating sequence (e.g. A before B, 0 before 9). Use CH for names, IDs, codes, and any key that is not stored as a numeric type (PD, ZD, BI). If you use CH for a numeric field stored as display digits (e.g. "00123" in EBCDIC), the order is correct for unsigned values that are right-aligned and zero-padded to the same length, because the character order of the digits 0–9 matches numeric order. For values that are left-justified or blank-padded, or for negative numbers, CH can give the wrong numeric order: "100" sorts before "20 " because the comparison is decided at the first differing byte. For true numeric comparison of zoned or packed fields, use ZD or PD.
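To see what a byte-by-byte character comparison does, the sketch below encodes strings with Python's `cp037` codec and sorts on the resulting bytes. Code page 037 is an assumption made for the example (your data may use a different EBCDIC code page), and `ebcdic_key` is a hypothetical helper name.

```python
def ebcdic_key(text: str) -> bytes:
    """Encode text as EBCDIC (code page 037) so bytes compare like a CH key."""
    return text.encode("cp037")


names = ["B2", "a1", "A1", "11"]
print(sorted(names))                   # ASCII order:  ['11', 'A1', 'B2', 'a1']
print(sorted(names, key=ebcdic_key))   # EBCDIC order: ['a1', 'A1', 'B2', '11']

# Zero-padded digits of equal length sort numerically under a character compare.
print(sorted(["00123", "00045"], key=ebcdic_key))   # ['00045', '00123']

# Left-justified, blank-padded digits do not: '1' is lower than '2'.
print(sorted(["20 ", "100"], key=ebcdic_key))       # ['100', '20 ']
```

In this code page lowercase letters collate before uppercase letters, and both collate before digits, which is the reverse of what an ASCII-based test on a workstation would show.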
ZD means zoned decimal. Each byte holds one digit (0–9) in the low nibble; the high nibble is the "zone", and the zone of the last byte carries the sign (e.g. F for positive, D for negative). This is how COBOL DISPLAY numeric data is stored. DFSORT interprets the key as a signed decimal number and compares by numeric value, so 00001 sorts before 00002, and in ascending order negative numbers sort before positive ones. Use ZD whenever the key is stored in zoned (display) form. The length you specify is the number of bytes (e.g. 5 for a PIC 9(5) or S9(5) DISPLAY field).
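The zoned layout described above can be decoded with a short Python function. This is a simplified sketch: `decode_zd` is a hypothetical name, and it assumes that only a D zone in the last byte marks a negative value, treating every other zone as positive.

```python
def decode_zd(key: bytes) -> int:
    """Decode a zoned-decimal key: one digit per byte, sign in the last zone."""
    if not key:
        raise ValueError("empty key")
    value = 0
    for byte in key:
        digit = byte & 0x0F            # low nibble holds the digit
        if digit > 9:
            raise ValueError(f"invalid digit nibble in byte {byte:#04x}")
        value = value * 10 + digit
    sign_nibble = key[-1] >> 4         # high nibble (zone) of the last byte
    return -value if sign_nibble == 0x0D else value   # assumption: only D is negative


plus_123 = bytes([0xF0, 0xF0, 0xF1, 0xF2, 0xF3])   # +00123
minus_123 = bytes([0xF0, 0xF0, 0xF1, 0xF2, 0xD3])  # -00123
plus_2 = bytes([0xF0, 0xF0, 0xF0, 0xF0, 0xF2])     # +00002

records = [plus_123, minus_123, plus_2]
print([decode_zd(r) for r in sorted(records, key=decode_zd)])  # [-123, 2, 123]
```

Sorting on the decoded value mimics an ascending ZD key: the negative value comes first.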
PD means packed decimal. Each byte holds two decimal digits in the two nibbles, except the last byte which holds one digit and the sign (e.g. C for positive, D for negative). This is how COBOL COMP-3 (and similar) data is stored. DFSORT interprets the key as a signed packed decimal number and compares numeric value. Use PD whenever the key is stored in packed form. The length in SORT FIELDS is the number of bytes (e.g. 4 bytes for a typical PIC S9(7) COMP-3, which uses 4 bytes: 7 digits + sign). If you specify PD for a field that is actually zoned or character, the comparison will be wrong because the byte layout is different.
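The packed layout and the byte-length rule can be sketched in Python as follows. `pd_byte_length` and `decode_pd` are hypothetical helper names, and the decoder makes the simplifying assumption that a sign nibble of D means negative and any other sign nibble means positive.

```python
def pd_byte_length(digits: int) -> int:
    """Bytes needed for a packed field with `digits` digits plus a sign nibble."""
    return digits // 2 + 1


def decode_pd(key: bytes) -> int:
    """Decode packed decimal: two digits per byte, last nibble is the sign."""
    if not key:
        raise ValueError("empty key")
    nibbles = []
    for byte in key:
        nibbles.append(byte >> 4)      # high nibble
        nibbles.append(byte & 0x0F)    # low nibble
    sign = nibbles.pop()               # last nibble is the sign, not a digit
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit nibble")
    value = 0
    for n in nibbles:
        value = value * 10 + n
    return -value if sign == 0x0D else value   # assumption: only D is negative


print(pd_byte_length(7))                           # 4
print(decode_pd(bytes([0x12, 0x34, 0x56, 0x7C])))  # 1234567
print(decode_pd(bytes([0x00, 0x00, 0x12, 0x3D])))  # -123
```

The first call confirms the figure in the text: a 7-digit signed packed field occupies 4 bytes, which is the length to code in SORT FIELDS.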
BI means unsigned binary. The key is interpreted as an unsigned binary integer, and DFSORT compares by numeric value. You must specify the correct length: typically 2 bytes (halfword) or 4 bytes (fullword) for COBOL COMP or COMP-4. Use BI for unsigned binary integer keys. If the data is signed, BI places negative values after positive ones because the sign bit is read as part of the magnitude; use a signed format such as FI if your product supports it (see your documentation). Using BI for a packed or zoned field will produce incorrect order because the bit pattern is not a binary integer.
FI means fixed-point (signed binary). Similar to BI but the value is treated as signed. Use when the key is a signed binary integer (e.g. COBOL S9(n) COMP). Length is typically 2 or 4 bytes. The exact support (FI vs BI for signed) is product-dependent; refer to your DFSORT manual.
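The difference between reading the same bytes as unsigned and as signed binary is easy to demonstrate in Python. The sketch assumes big-endian, two's-complement integers, as used on the mainframe; `decode_bi` and `decode_fi` are hypothetical names chosen to mirror the format codes.

```python
def decode_bi(key: bytes) -> int:
    """Unsigned binary, big-endian (most significant byte first)."""
    return int.from_bytes(key, byteorder="big", signed=False)


def decode_fi(key: bytes) -> int:
    """Signed two's-complement binary, big-endian."""
    return int.from_bytes(key, byteorder="big", signed=True)


halfwords = [bytes([0x00, 0x05]), bytes([0xFF, 0xFF]), bytes([0x00, 0x64])]
print(sorted(decode_bi(h) for h in halfwords))  # [5, 100, 65535]
print(sorted(decode_fi(h) for h in halfwords))  # [-1, 5, 100]
```

The halfword X'FFFF' is the largest value when read as unsigned and the smallest (-1) when read as signed, so the choice of format changes where that record lands.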
FL means floating-point. The key is interpreted as a floating-point number (e.g. IEEE or IBM hex float). Length and format details depend on the product (e.g. 4 or 8 bytes). Use FL only when the key is actually stored in a floating-point format. Floating-point sorting has subtleties (e.g. negative zero, NaN); see the floating-point sorting tutorial and your product docs.
The fourth value is direction: A for ascending or D for descending. Ascending (A) means low-to-high: smaller values (or earlier in collating sequence for CH) come first. Descending (D) means high-to-low: larger values (or later in collating sequence) come first. You can mix: for example, primary key ascending and secondary key descending. So SORT FIELDS=(1,10,CH,A,11,4,PD,D) sorts by the first 10 bytes character ascending, then by the 4-byte packed field descending when the first key is equal.
If your data is packed decimal (COMP-3) and you specify CH, DFSORT compares the raw bytes as characters. The sign sits in the last nibble of the field, so a byte-by-byte comparison effectively orders the records by magnitude and only reaches the sign at the very end: −100 ends up after 20 instead of before it, and the result is not in numeric order. Similarly, if your data is zoned decimal (DISPLAY) and you specify PD, DFSORT interprets the bytes as packed, which is wrong because each zoned byte holds one digit, not two. So always match the format to the way the key is actually stored in the record. When in doubt, check the program or copybook that writes the data (e.g. COBOL PIC clause: DISPLAY → ZD, COMP-3 → PD, COMP → BI/FI).
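To make the packed-as-character mismatch concrete, the sketch below packs a few signed amounts into 3-byte packed fields and sorts them twice: once on the raw bytes (what a character compare does) and once on the decoded value. `encode_pd` and `decode_pd` are hypothetical helpers, and the sign convention (C positive, D negative) is an assumption of the example.

```python
def encode_pd(value: int, length: int) -> bytes:
    """Pack an integer into `length` bytes of packed decimal (sign C or D)."""
    digits = str(abs(value)).rjust(length * 2 - 1, "0")
    if len(digits) > length * 2 - 1:
        raise ValueError("value does not fit")
    sign = "D" if value < 0 else "C"
    return bytes.fromhex(digits + sign)


def decode_pd(key: bytes) -> int:
    """Decode packed decimal by reading the hex digits and the sign nibble."""
    hex_text = key.hex()
    magnitude = int(hex_text[:-1])
    return -magnitude if hex_text[-1] == "d" else magnitude


amounts = [20, -100, 5, -7]
records = [encode_pd(a, 3) for a in amounts]

as_ch = [decode_pd(r) for r in sorted(records)]                 # raw byte compare
as_pd = [decode_pd(r) for r in sorted(records, key=decode_pd)]  # numeric compare
print(as_ch)  # [5, -7, 20, -100]
print(as_pd)  # [-100, -7, 5, 20]
```

The raw-byte order is driven by magnitude, with the sign nibble acting only as a final tie-breaker, so the negative amounts are scattered among the positive ones.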
For multiple sort keys, you list one 4-tuple after another: (pos1, len1, fmt1, dir1, pos2, len2, fmt2, dir2, …). There is no separate keyword between keys—just the comma-separated list. Keys can overlap in the record (e.g. key1 bytes 1–10, key2 bytes 5–8) though usually they are distinct. The primary key is the first 4-tuple; when two records are equal on the primary key, the second key is used; when still equal, the third, and so on. Each key can have a different format and direction. Example: SORT FIELDS=(1,20,CH,A,21,4,PD,D,25,8,ZD,A) — primary: bytes 1–20 character ascending; secondary: bytes 21–24 packed descending; tertiary: bytes 25–32 zoned ascending.
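The tie-breaking behavior of the three-key example can be simulated in Python. This is a sketch of the comparison logic only, under stated assumptions: the record layout and sample data are invented for the illustration, the helper names are hypothetical, character data is encoded with code page 037, and D is the only negative sign code.

```python
from functools import cmp_to_key


def decode_pd(key: bytes) -> int:
    hex_text = key.hex()
    magnitude = int(hex_text[:-1])
    return -magnitude if hex_text[-1] == "d" else magnitude


def decode_zd(key: bytes) -> int:
    magnitude = int("".join(str(b & 0x0F) for b in key))
    return -magnitude if key[-1] >> 4 == 0x0D else magnitude


DECODERS = {"CH": bytes, "PD": decode_pd, "ZD": decode_zd}

# (position, length, format, direction), mirroring
# SORT FIELDS=(1,20,CH,A,21,4,PD,D,25,8,ZD,A)
KEYS = [(1, 20, "CH", "A"), (21, 4, "PD", "D"), (25, 8, "ZD", "A")]


def compare(rec_a: bytes, rec_b: bytes) -> int:
    """Compare key by key; the first key that differs decides the order."""
    for position, length, fmt, direction in KEYS:
        start = position - 1
        a = DECODERS[fmt](rec_a[start:start + length])
        b = DECODERS[fmt](rec_b[start:start + length])
        if a != b:
            result = -1 if a < b else 1
            return result if direction == "A" else -result
    return 0


def make_record(name: str, amount_pd_hex: str, seq: int) -> bytes:
    """Build a 32-byte test record: CH name, 4-byte PD, 8-byte ZD."""
    return (name.ljust(20).encode("cp037")
            + bytes.fromhex(amount_pd_hex)
            + str(seq).rjust(8, "0").encode("cp037"))


records = [
    make_record("SMITH", "0000100C", 2),
    make_record("JONES", "0000050C", 1),
    make_record("SMITH", "0000900C", 3),
    make_record("SMITH", "0000900C", 1),
]
ordered = sorted(records, key=cmp_to_key(compare))
for rec in ordered:
    print(rec[:20].decode("cp037").strip(), decode_pd(rec[20:24]), decode_zd(rec[24:32]))
# JONES 50 1
# SMITH 900 1
# SMITH 900 3
# SMITH 100 2
```

The three SMITH records tie on the primary key, so the packed amount (descending) orders them, and the two records with amount 900 are finally separated by the zoned sequence number (ascending).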
Control statements in SYSIN are often limited to 72 characters per line (or similar). If your SORT FIELDS= list is long, you can continue on the next line by ending the current line with a comma and putting the rest on the following line. Blanks can be used for readability. Ensure that no required value is split across lines in a way that the parser misreads (e.g. a number or format code). Check your DFSORT documentation for the exact continuation rules.
The record that SORT FIELDS= sees is the record after INREC (if INREC is present) or the original input record (if not). So if INREC builds a 40-byte record from an 80-byte input, all positions in SORT FIELDS must be between 1 and 40. The original positions 41–80 no longer exist for the sort. This is a common source of errors: after adding or changing INREC, forgetting to update SORT FIELDS positions so they refer to the new layout. Similarly, the record length that matters for the "position + length − 1" check is the length of the record at sort time (the INREC length or the input record length).
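The position shift caused by reformatting can be worked through with a small Python model. It assumes a hypothetical reformatting step that keeps original bytes 1 to 20 and 61 to 80 of an 80-byte record; `BUILD`, `reformat`, and `new_position` are illustrative names, not DFSORT syntax.

```python
# Hypothetical reformatting: keep original bytes 1-20 and 61-80 (40 bytes total).
# Each tuple is (original 1-based position, length), in output order.
BUILD = [(1, 20), (61, 20)]


def reformat(record: bytes) -> bytes:
    """Build the shorter record that the sort phase would see."""
    return b"".join(record[p - 1:p - 1 + n] for p, n in BUILD)


def new_position(original_position: int) -> int:
    """Map an original byte position to its position in the reformatted record."""
    out = 1
    for p, n in BUILD:
        if p <= original_position < p + n:
            return out + (original_position - p)
        out += n
    raise ValueError(f"byte {original_position} was dropped by the reformatting")


original = bytes(range(80))
sort_record = reformat(original)
print(len(sort_record))      # 40
print(new_position(61))      # 21: a key at original byte 61 now starts at 21
```

A key that used to be coded as position 61 must be coded as position 21 once this reformatting is in place, and a key at original position 50 can no longer be referenced at all.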
SORT FIELDS=(1,10,CH,A)
One key: bytes 1–10, character format, ascending. Records are ordered by the first 10 bytes in EBCDIC order. Good for names, IDs, or any 10-byte character key.
SORT FIELDS=(25,4,PD,D)
One key: bytes 25–28 as packed decimal, descending. The largest numeric value in that 4-byte packed field comes first. Use when the field is COMP-3.
SORT FIELDS=(1,8,CH,A,9,5,ZD,D,14,20,CH,A)
Three keys: (1) bytes 1–8 character ascending, (2) bytes 9–13 zoned decimal descending, (3) bytes 14–33 character ascending. Ties on the first key are broken by the second; ties on the first two are broken by the third.
Sort fields syntax is like telling the sorter: "Look at this part of the card (position and length), read it as letters or as a number (format), and line them up from small to big or big to small (A or D)." If you say "read it as a number" but the part is actually letters, the order will be wrong—like sorting people by height when you accidentally looked at their shoe size. So position and length say where to look, and format says how to read it. Getting format right is what makes the sort order make sense.
1. Why must the format in SORT FIELDS match how the data is stored?
2. What does length mean for a PD (packed decimal) key?
3. Positions in SORT FIELDS are 1-based. What does position 1 mean?
4. You have a 50-byte record after INREC. What is the largest valid value of position + length − 1 for a key in SORT FIELDS?
5. Which format would you use for a COBOL DISPLAY numeric field (e.g. PIC 9(5))?