In DFSORT, every sort key and many control fields are described by a format that tells the program how to interpret the bytes. CH stands for character (sometimes called alphanumeric). When you specify CH, DFSORT does not treat the field as a number; it compares the bytes in collating sequence (on z/OS, typically EBCDIC), one byte at a time from left to right. CH is the right choice for names, IDs, codes, and any text data. Using CH for numeric data can produce the wrong order (e.g. "10" before "9"). This page explains what CH is, how it behaves, when to use it, and how it differs from numeric formats like ZD and PD.
CH tells DFSORT: "this field is character data." The sort does not interpret the bytes as a number (packed, zoned, or binary). Instead, it compares the key byte by byte, from the first byte to the last, using the collating sequence of the job. On IBM z/OS that is usually EBCDIC (Extended Binary Coded Decimal Interchange Code). Whichever record has the smaller byte value at the first position where the two keys differ is ordered first in ascending sort; for descending, the order is reversed. So CH gives you lexicographic (dictionary-style) order based on the character set, not numeric order.
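To make the "order based on the character set" point concrete, here is a minimal Python sketch, not DFSORT itself. It assumes Python's built-in `cp037` codec is a reasonable stand-in for the job's EBCDIC code page; the helper name `ch_key` is hypothetical.

```python
# Sketch only: the "cp037" codec stands in for the job's EBCDIC code page.
def ch_key(text: str) -> bytes:
    """Return the EBCDIC bytes that a byte-by-byte comparison would see."""
    return text.encode("cp037")

words = ["Zed", "apple", "9lives"]

# In EBCDIC, lowercase letters sort below uppercase, and digits sort highest.
ebcdic_order = sorted(words, key=ch_key)

# In ASCII the classes are reversed: digits, then uppercase, then lowercase.
ascii_order = sorted(words, key=lambda w: w.encode("ascii"))

print(ebcdic_order)  # ['apple', 'Zed', '9lives']
print(ascii_order)   # ['9lives', 'Zed', 'apple']
```

The same three strings come out in opposite orders, which is why results from a z/OS sort can look surprising next to output sorted on an ASCII platform.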
The most common place you use CH is in SORT FIELDS. The format is:
SORT FIELDS=(start,length,CH,direction)
Example: sort on a 20-byte name at position 1, ascending:
SORT FIELDS=(1,20,CH,A)
For multiple sort keys, you repeat the (position, length, format, direction) pattern. Example: sort by last name (1,20,CH,A) then first name (21,15,CH,A):
SORT FIELDS=(1,20,CH,A,21,15,CH,A)
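The two-key specification above can be imitated in Python to see the resulting order. This is a rough sketch under stated assumptions: records are fixed-width strings, `cp037` approximates the EBCDIC code page, and the helpers `ch_field` and `make_record` and the sample names are invented for illustration.

```python
def ch_field(record: str, start: int, length: int) -> bytes:
    """Slice a 1-based (start, length) field and return its EBCDIC bytes."""
    return record[start - 1 : start - 1 + length].encode("cp037")

def make_record(last: str, first: str) -> str:
    # Hypothetical layout: last name in columns 1-20, first name in 21-35.
    return last.ljust(20) + first.ljust(15)

records = [
    make_record("SMITH", "ZOE"),
    make_record("JONES", "MARY"),
    make_record("SMITH", "ADAM"),
]

# Similar in spirit to SORT FIELDS=(1,20,CH,A,21,15,CH,A):
# the first key decides, and the second key only breaks ties.
ordered = sorted(records, key=lambda r: (ch_field(r, 1, 20), ch_field(r, 21, 15)))

names = [(r[:20].strip(), r[20:35].strip()) for r in ordered]
print(names)  # [('JONES', 'MARY'), ('SMITH', 'ADAM'), ('SMITH', 'ZOE')]
```

The second key matters only for the two SMITH records, where the first key compares equal.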
With CH, DFSORT compares the two keys like this: start at the first byte of the field; if the bytes are equal, move to the next byte; repeat until a byte differs or the field length is exhausted. The key with the smaller byte value at the first differing position is ordered first (ascending). If all bytes are equal, the two records are equal for sort purposes, and their relative order may depend on OPTION EQUALS/NOEQUALS. Because both keys occupy the same fixed-length field, a shorter value is normally padded with blanks: "ABC" in a 4-byte field is stored as "ABC ", and the EBCDIC blank (0x40) is lower than any letter or digit, so "ABC" comes before "ABCD" in ascending order.
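The comparison walk described here can be sketched in a few lines of Python. This is an illustration of the idea, not DFSORT's implementation; it assumes values are blank-padded to the field length and uses `cp037` as the EBCDIC code page. The function names are hypothetical.

```python
from typing import Optional

def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the 0-based index of the first differing byte, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None

def compare_ch(a: str, b: str, length: int) -> int:
    """Return -1, 0, or 1 for two values held in a blank-padded field."""
    ka = a.ljust(length)[:length].encode("cp037")
    kb = b.ljust(length)[:length].encode("cp037")
    i = first_difference(ka, kb)
    if i is None:
        return 0  # every byte matched: equal for sort purposes
    return -1 if ka[i] < kb[i] else 1

print(compare_ch("ABC", "ABCD", 4))  # -1: blank (0x40) is below 'D' (0xC4)
print(compare_ch("ABD", "ABCD", 4))  # 1: 'D' (0xC4) is above 'C' (0xC3) at byte 3
print(compare_ch("ABC", "ABC", 4))   # 0
```

For a descending key, the sign of the result would simply be reversed.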
In standard EBCDIC, the character codes for the digits 0 through 9 are in ascending numeric order (0xF0 through 0xF9). So for fixed-length strings that contain only digits and no sign (e.g. an 8-digit date YYYYMMDD or a zero-padded 5-digit ID), comparing byte by byte in EBCDIC gives the same order as comparing the numbers. That is why some shops use CH for such fields. But as soon as the values are not padded to a common width (e.g. left-justified "9" and "10"), the first byte of "10" is "1", which is less than "9" in EBCDIC, so "10" sorts before "9", which is wrong for numeric order. Negative numbers (with a sign in the last byte) and leading spaces can also break the correspondence. So for reliability with numeric data, use the numeric format (ZD, PD, BI) that matches how the field is stored.
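A small Python sketch shows both cases side by side. It assumes a 5-byte blank-padded field and uses `cp037` as the EBCDIC code page; `ch_sorted` is a hypothetical helper, not a DFSORT feature.

```python
FIELD_LENGTH = 5  # assumed width of the key field

def ch_sorted(values):
    """Sort values as blank-padded EBCDIC byte strings (character order)."""
    return sorted(values, key=lambda v: v.ljust(FIELD_LENGTH).encode("cp037"))

padded = ["00010", "00009", "00100"]   # zero-padded to a common width
unpadded = ["10", "9", "100"]          # left-justified, different widths

print(ch_sorted(padded))        # ['00009', '00010', '00100']
print(ch_sorted(unpadded))      # ['10', '100', '9']
print(sorted(unpadded, key=int))  # ['9', '10', '100']
```

With zero-padding, character order and numeric order agree; without it, "9" lands last because its first byte is the highest digit.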
| Use case | Reason |
|---|---|
| Names (person, product, etc.) | Data is text; byte order is the correct order. |
| Alphanumeric IDs or codes | IDs are compared as strings, not as numbers. |
| Address lines, descriptions | Plain text; no numeric meaning. |
| Fixed-length date as string (e.g. YYYYMMDD) | Can work if all values are same length and positive; ZD is safer for dates stored as numbers. |
| Keys that mix letters and digits | CH preserves lexicographic order; numeric formats do not apply. |
CH is not only for SORT FIELDS. Whenever DFSORT needs to know the format of a field—for comparison, building a new field, or writing output—you can specify CH. Examples:
The order you get with CH depends on the collating sequence. By default that is the job's character set (e.g. EBCDIC). You can change it with ALTSEQ (alternate collating sequence) to get case-insensitive sorts or a custom order. Note that the ALTSEQ table applies to fields coded with format AQ; it applies to CH fields only when the CHALT option is in effect. Either way, the key is still compared byte by byte; the table only changes which byte value is considered "less than" or "greater than" another, and it does not alter the data in the record. So CH plus ALTSEQ is still character comparison, not numeric.
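The effect of an alternate collating sequence can be imitated in Python with a 256-byte translation table applied to the key only. This is a sketch of the concept rather than of the ALTSEQ statement's syntax; the table here folds EBCDIC lowercase onto uppercase, `cp037` is an assumed code page, and the function names are hypothetical.

```python
def fold_case_table() -> bytes:
    """Build a 256-byte table that maps EBCDIC a-z onto A-Z, others unchanged."""
    table = bytearray(range(256))
    for lower in "abcdefghijklmnopqrstuvwxyz":
        src = lower.encode("cp037")[0]
        dst = lower.upper().encode("cp037")[0]
        table[src] = dst
    return bytes(table)

TABLE = fold_case_table()

def alt_key(text: str) -> bytes:
    # Translate only the comparison key; the original text is left untouched.
    return text.encode("cp037").translate(TABLE)

names = ["bob", "Carl", "alice", "Bea"]

plain = sorted(names, key=lambda n: n.encode("cp037"))
folded = sorted(names, key=alt_key)

print(plain)   # ['alice', 'bob', 'Bea', 'Carl']
print(folded)  # ['alice', 'Bea', 'bob', 'Carl']
```

The comparison is still byte by byte in both cases; only the values being compared differ, and the output keeps each name's original case.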
Imagine sorting word cards. We look at the first letter: if one card has "A" and another "B", the A card goes first. If the first letters are the same, we look at the second letter, and so on. We never try to "add up" the letters as a number—we just compare them in ABC order. CH does that with the bytes in your field: first byte, then second, and so on, using the computer's letter order (EBCDIC). So CH is for names and words. When the field is really a number (like money or quantity), we use a different rule (ZD or PD) so that 9 comes before 10.
1. What does CH mean in a DFSORT SORT FIELDS specification?
2. For a 10-byte name field starting at position 1, which SORT FIELDS specification is correct for ascending order?
3. When can sorting digits with CH produce the same order as numeric sort?
4. Why might "10" sort before "9" when using CH?
5. Where can CH be used besides SORT FIELDS?