What is CH format in DFSORT?

CH (character) is a DFSORT data format that tells the sort to treat the field as character (alphanumeric) data. The bytes are compared in collating sequence—on z/OS usually EBCDIC—byte by byte from left to right. No numeric interpretation is applied. Use CH for names, codes, IDs, and text.

When should I use CH vs ZD or PD in DFSORT?

Use CH when the sort key is character data: names, alphanumeric codes, or text. Use ZD when the key is zoned decimal (e.g. COBOL DISPLAY numeric). Use PD when the key is packed decimal (e.g. COMP-3). Using CH for numeric data can give wrong order when lengths differ or data is signed.

Does CH sort in EBCDIC order?

Yes. On z/OS, when you specify CH, DFSORT compares the key bytes in the collating sequence in effect, which is typically EBCDIC. The resulting order follows EBCDIC byte values: blanks, then most special characters, then lowercase a–z, then uppercase A–Z, then digits 0–9. The exact positions of some special characters depend on the code page in use.

Can I use CH for numeric fields in DFSORT?

Only when the field is fixed-length, positive digits only, and same length for all records (e.g. 8-digit date). Then EBCDIC digit order can match numeric order. For variable length, negatives, or leading spaces, CH gives wrong numeric order—use ZD or PD to match how the number is stored.

What is the SORT FIELDS syntax for a character key in DFSORT?

SORT FIELDS=(start,length,CH,A) for ascending, or SORT FIELDS=(start,length,CH,D) for descending. Example: SORT FIELDS=(1,20,CH,A) sorts on bytes 1–20 as character in ascending order. For multiple keys, append another start,length,format,order group for each key: SORT FIELDS=(1,10,CH,A,11,4,ZD,D).

CH (Character) Format - DFSORT Data Types

CH (Character) Format

In DFSORT, every sort key and many control fields are described by a format that tells the program how to interpret the bytes. CH stands for character (sometimes called alphanumeric). When you specify CH, DFSORT does not treat the field as a number; it compares the bytes in collating sequence—on z/OS typically EBCDIC—one byte at a time from left to right. CH is the right choice for names, IDs, codes, and any text data. Using CH for numeric data can produce wrong order (e.g. "10" before "9"). This page explains what CH is, how it behaves, when to use it, and how it differs from numeric formats like ZD and PD.

What CH Means

CH tells DFSORT: "this field is character data." The sort does not interpret the bytes as a number (packed, zoned, or binary). Instead, it compares the key byte by byte, from the first byte to the last, using the collating sequence of the job. On IBM z/OS that is usually EBCDIC (Extended Binary Coded Decimal Interchange Code). Whichever record has the smaller byte value at the first position where the two keys differ is ordered first in ascending sort; for descending, the order is reversed. So CH gives you lexicographic (dictionary-style) order based on the character set, not numeric order.
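As a rough illustration of the ordering described above, the following Python sketch encodes a few characters in EBCDIC and sorts them by byte value. It assumes code page 037 (Python's `cp037` codec) as a stand-in for the EBCDIC code page on your system, and the helper name `ebcdic_key` is made up for this example. It models the ordering only; it is not how DFSORT is implemented.

```python
def ebcdic_key(text: str) -> bytes:
    """Return the EBCDIC bytes (cp037 assumed) that a CH comparison would look at."""
    return text.encode("cp037")


samples = ["9", "A", "a", " ", "Z", "0", "z", "-"]

# Python compares bytes objects byte by byte, left to right, as unsigned values.
ordered = sorted(samples, key=ebcdic_key)

print(ordered)  # [' ', '-', 'a', 'z', 'A', 'Z', '0', '9']
for ch in ordered:
    print(repr(ch), hex(ebcdic_key(ch)[0]))
```

Note that lowercase letters sort before uppercase and digits sort after both, which is the opposite of ASCII order.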

Syntax: Where CH Appears

The most common place you use CH is in SORT FIELDS. The format is:

  SORT FIELDS=(start,length,CH,direction)

start — Starting position of the field in the record (1-based). The first byte of the record is position 1.
length — Length of the field in bytes. DFSORT will compare this many bytes.
CH — Format: character. No numeric conversion; byte-by-byte comparison in collating sequence.
direction — A for ascending, D for descending.

Example: sort on a 20-byte name at position 1, ascending:

  SORT FIELDS=(1,20,CH,A)

For multiple sort keys, you repeat the (position, length, format, direction) pattern. Example: sort by last name (1,20,CH,A) then first name (21,15,CH,A):

  SORT FIELDS=(1,20,CH,A,21,15,CH,A)
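To make the multi-key behavior concrete, here is a small Python model of a statement like the one above. The function `ch_sort`, the helper `make_record`, and the sample names are hypothetical, and code page 037 is assumed for EBCDIC. The sketch only mimics the resulting order for fixed-length records; it says nothing about how DFSORT works internally.

```python
from typing import List, Tuple

Key = Tuple[int, int, str]  # (start, length, direction); start is 1-based


def ch_sort(records: List[bytes], keys: List[Key]) -> List[bytes]:
    """Order fixed-length records on CH keys, the first key being the major key."""
    result = list(records)
    # Sort on the least significant key first. Python's sort is stable,
    # so the later passes on more significant keys take precedence.
    for start, length, direction in reversed(keys):
        result.sort(
            key=lambda rec: rec[start - 1:start - 1 + length],
            reverse=(direction == "D"),
        )
    return result


def make_record(last: str, first: str) -> bytes:
    """Build a 35-byte record: last name in 1-20, first name in 21-35, blank padded."""
    return (last.ljust(20) + first.ljust(15)).encode("cp037")


records = [
    make_record("SMITH", "ZOE"),
    make_record("JONES", "BOB"),
    make_record("SMITH", "ADAM"),
]

# Models SORT FIELDS=(1,20,CH,A,21,15,CH,A)
sorted_records = ch_sort(records, [(1, 20, "A"), (21, 15, "A")])
names = [
    (rec[:20].decode("cp037").strip(), rec[20:].decode("cp037").strip())
    for rec in sorted_records
]
print(names)  # [('JONES', 'BOB'), ('SMITH', 'ADAM'), ('SMITH', 'ZOE')]
```

The second key only matters when the first key is equal, which is why the two SMITH records are ordered by first name.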

How Comparison Works

With CH, DFSORT compares the two keys like this: start at the first byte of the key; if the bytes are equal, move to the next byte; repeat until a byte differs or the end of the key is reached. The key with the smaller byte value at the first differing position is ordered first (ascending). Both keys always have the length given in the control statement, so if every byte is equal the two records are equal for sort purposes (the order between them may depend on OPTION EQUALS/NOEQUALS). A shorter value stored in a fixed-length field is normally padded with blanks, so "ABC" in a 4-byte field is "ABC " and sorts before "ABCD" in ascending order, because the EBCDIC blank (0x40) is lower than any letter or digit.
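The comparison steps above can be sketched as a small Python function. The names `compare_ch` and `field` are invented for illustration, code page 037 is assumed, and shorter values are assumed to be blank-padded to the field length, as is usual for fixed-length character fields.

```python
def compare_ch(key_a: bytes, key_b: bytes) -> int:
    """Return -1, 0, or 1 for an ascending byte-by-byte comparison of two keys."""
    if len(key_a) != len(key_b):
        raise ValueError("keys must have the same length")
    for byte_a, byte_b in zip(key_a, key_b):
        if byte_a != byte_b:
            # The first differing byte decides the order.
            return -1 if byte_a < byte_b else 1
    return 0  # every byte matched: the keys are equal


def field(text: str, length: int) -> bytes:
    """Blank-pad text to the field length and encode it as EBCDIC (cp037 assumed)."""
    return text.ljust(length).encode("cp037")


print(compare_ch(field("ABC", 4), field("ABCD", 4)))   # -1: blank (0x40) < "D" (0xC4)
print(compare_ch(field("ABCD", 4), field("ABCD", 4)))  # 0: equal keys
print(compare_ch(field("abc", 4), field("ABC", 4)))    # -1: "a" (0x81) < "A" (0xC1)
```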

EBCDIC and digit order

In standard EBCDIC, the character codes for the digits 0 through 9 are in ascending numeric order (0xF0 through 0xF9). So for fixed-length fields that contain only digits and no sign (e.g. an 8-digit date YYYYMMDD or a zero-padded 5-digit ID), comparing byte by byte in EBCDIC gives the same order as comparing the numbers. That is why some shops use CH for such fields. But as soon as values have different numbers of digits and are not zero-padded (e.g. "9" vs "10", left-justified), the correspondence breaks: the first byte of "10" is "1", which is less than "9" in EBCDIC, so "10" sorts before "9", which is wrong for numeric order. Negative numbers (with a sign in the last byte) and leading spaces also break the correspondence. So for reliability with numeric data, use a numeric format (ZD, PD, BI) that matches the storage.
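The "9" versus "10" problem is easy to reproduce in a few lines of Python. This is a sketch under the assumption of code page 037; `as_field` is a hypothetical helper that lays a value out in a 3-byte field, either left-justified with blanks or right-justified with leading zeros.

```python
def as_field(value: str, width: int, pad: str = " ", right: bool = False) -> bytes:
    """Place value in a fixed-width field and encode it as EBCDIC (cp037 assumed)."""
    text = value.rjust(width, pad) if right else value.ljust(width, pad)
    return text.encode("cp037")


numbers = ["9", "10", "100", "25"]

# Left-justified, blank-padded: byte order does not match numeric order.
left_justified = sorted(numbers, key=lambda v: as_field(v, 3))
print(left_justified)  # ['10', '100', '25', '9']

# Right-justified, zero-padded: byte order matches numeric order.
zero_padded = sorted(numbers, key=lambda v: as_field(v, 3, pad="0", right=True))
print(zero_padded)  # ['9', '10', '25', '100']
```

Only the zero-padded, same-length layout gives numeric order under a character comparison.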

When to Use CH

Use CH when the field is character data
Use case	Reason
Names (person, product, etc.)	Data is text; byte order is the correct order.
Alphanumeric IDs or codes	IDs are compared as strings, not as numbers.
Address lines, descriptions	Plain text; no numeric meaning.
Fixed-length date as string (e.g. YYYYMMDD)	Can work if all values are same length and positive; ZD is safer for dates stored as numbers.
Keys that mix letters and digits	CH preserves lexicographic order; numeric formats do not apply.

When Not to Use CH

Packed decimal (COMP-3) — Use PD. The bytes are not character digits; they are nibbles. CH gives meaningless order.
Zoned decimal (DISPLAY numeric) — Use ZD if you want numeric order, especially with sign or variable length.
Binary integers — Use BI. Raw byte comparison does not match numeric value.
Variable-length or signed numeric data — Use the numeric format that matches storage (ZD, PD, etc.) so 9 comes before 10 and negatives are correct.
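To see why CH gives a meaningless order for packed decimal, the sketch below builds packed-decimal bytes by hand and sorts them as raw bytes. The function `pack_decimal` is a simplified, hypothetical encoder (sign nibble 0xC for positive, 0xD for negative), not a DFSORT or COBOL API.

```python
def pack_decimal(value: int, length: int) -> bytes:
    """Encode an integer as packed decimal: two digits per byte, sign in the last nibble."""
    sign = 0xC if value >= 0 else 0xD
    digit_count = length * 2 - 1
    digits = str(abs(value)).rjust(digit_count, "0")
    if len(digits) > digit_count:
        raise ValueError("value does not fit in the field")
    nibbles = [int(d) for d in digits] + [sign]
    # Combine each pair of nibbles into one byte.
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2))


values = [10, -20, 5, -1]

# What a character-style (raw byte) comparison would produce.
byte_order = sorted(values, key=lambda v: pack_decimal(v, 2))
print(byte_order)      # [-1, 5, 10, -20]

# What a numeric comparison (PD in DFSORT terms) should produce.
print(sorted(values))  # [-20, -1, 5, 10]
```

The sign lives in the last nibble, so a left-to-right byte comparison sees the magnitude first and never treats negatives as smaller.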

CH in Other Control Statements

CH is not only for SORT FIELDS. Whenever DFSORT needs to know the format of a field—for comparison, building a new field, or writing output—you can specify CH. Examples:

INCLUDE / OMIT: When you compare a field with a constant or with another field, you specify the format of the field. If the field is character data, the format is CH and the constant is a character string, for example C'NYC'.
INREC / OUTREC: When building or copying fields, you refer to input positions and lengths. Fields that are simply moved are copied byte for byte, so they behave as character data, and literal constants you add are character strings.
OUTFIL: Report fields and constants are often character; CH is implicit when you output literal text or unmodified character fields.
MERGE FIELDS: Same idea as SORT FIELDS: use CH for character merge keys.
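As a sketch of what a character comparison in INCLUDE amounts to, the following Python function keeps only records whose field equals a constant. The name `include_ch_eq` and the sample records are hypothetical, code page 037 is assumed, and the blank-padding of a shorter constant is an assumption of this model. The closest DFSORT statement would be along the lines of INCLUDE COND=(1,3,CH,EQ,C'NYC').

```python
from typing import List


def include_ch_eq(records: List[bytes], start: int, length: int, constant: str) -> List[bytes]:
    """Keep records whose character field at (start, length) equals the constant."""
    # Assumption: a shorter character constant is blank-padded to the field length.
    wanted = constant.ljust(length).encode("cp037")
    return [rec for rec in records if rec[start - 1:start - 1 + length] == wanted]


records = [text.encode("cp037") for text in ("NYC0001", "LAX0002", "NYC0003")]

kept = include_ch_eq(records, 1, 3, "NYC")
print([rec.decode("cp037") for rec in kept])  # ['NYC0001', 'NYC0003']
```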

Collating Sequence and ALTSEQ

The order you get with CH depends on the collating sequence, which by default is EBCDIC byte order. DFSORT's ALTSEQ statement defines an alternate collating sequence, which can give case-insensitive sorts or a custom order. The alternate sequence applies to fields specified with format AQ, and to CH fields only when the CHALT option is in effect; a plain CH field without CHALT still sorts in standard EBCDIC order. An alternate sequence does not change the fact that the key is compared byte by byte; it only changes which byte value is considered "less than" or "greater than" another. So a sort with ALTSEQ is still character comparison, not numeric.
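The effect of an alternate collating sequence can be modeled as a 256-entry translation table applied to each key byte before comparison. The sketch below, assuming code page 037, maps each lowercase letter to its uppercase counterpart, which is the usual way to get a case-insensitive order. The names `build_fold_table` and `FOLD` are invented for this example, and the model leaves the stored data unchanged, just as an alternate sequence affects only the comparison.

```python
import string


def build_fold_table() -> bytes:
    """Build a table in which each lowercase EBCDIC letter collates as its uppercase letter."""
    table = bytearray(range(256))  # identity mapping: every byte collates as itself
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase):
        table[lower.encode("cp037")[0]] = upper.encode("cp037")[0]
    return bytes(table)


FOLD = build_fold_table()
names = ["bob", "ALICE", "Carol", "alice"]

# Standard EBCDIC order: all lowercase letters come before uppercase.
plain = sorted(names, key=lambda n: n.ljust(5).encode("cp037"))
print(plain)   # ['alice', 'bob', 'ALICE', 'Carol']

# Alternate sequence: case is ignored when comparing; the data itself is unchanged.
folded = sorted(names, key=lambda n: n.ljust(5).encode("cp037").translate(FOLD))
print(folded)  # ['ALICE', 'alice', 'bob', 'Carol']
```

"ALICE" and "alice" compare as equal under the folded sequence, so their relative order is simply the input order here.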

Explain It Like I'm Five

Imagine sorting word cards. We look at the first letter: if one card has "A" and another "B", the A card goes first. If the first letters are the same, we look at the second letter, and so on. We never try to "add up" the letters as a number—we just compare them in ABC order. CH does that with the bytes in your field: first byte, then second, and so on, using the computer's letter order (EBCDIC). So CH is for names and words. When the field is really a number (like money or quantity), we use a different rule (ZD or PD) so that 9 comes before 10.

Exercises

Write a SORT FIELDS statement to sort on bytes 5–14 as character ascending, then bytes 15–18 as character descending.
Your file has a 6-byte customer ID at position 1 (digits only, fixed length). Could you use CH? What could go wrong if one day the ID has a leading space or a letter?
Sort a file by last name (positions 1–25) then first name (26–40), both CH ascending. Write the full SORT FIELDS control statement.
Why does sorting "9" and "10" with CH put "10" before "9"? What format would you use to get numeric order?

Quiz

Test Your Knowledge

1. What does CH mean in a DFSORT SORT FIELDS specification?

Compact Hexadecimal
Character: the field is compared byte-by-byte in collating sequence (e.g. EBCDIC)
Check digit
Column header

2. For a 10-byte name field starting at position 1, which SORT FIELDS specification is correct for ascending order?

SORT FIELDS=(1,10,ZD,A)
SORT FIELDS=(1,10,CH,A)
SORT FIELDS=(1,10,PD,A)
SORT FIELDS=(CH,1,10,A)

3. When can sorting digits with CH produce the same order as numeric sort?

Never
When the field is fixed-length, contains only positive digits (no sign), and all values have the same length
Only when OPTION EQUALS is used
Only for the primary key

4. Why might "10" sort before "9" when using CH?

CH always sorts descending
CH compares byte by byte; the first byte of "10" is "1" and of "9" is "9"; in EBCDIC "1" is less than "9", so "10" comes first
It is a bug in DFSORT
Only when the field is packed decimal

5. Where can CH be used besides SORT FIELDS?

Only in SORT FIELDS
In INCLUDE/OMIT, INREC, OUTREC, OUTFIL, and MERGE when specifying field format (e.g. comparison or build type)
Only in MERGE
Only in JOINKEYS

Related Concepts

Numeric vs character sorts

Collating sequence

Sort fields syntax

ZD format
