MainframeMaster

Case-Insensitive Sorting

Case-insensitive sorting treats uppercase and lowercase letters as equal for comparison, so "Smith", "SMITH", and "smith" count as the same key value and are ordered together. The default EBCDIC collating sequence with CH format does not do this: uppercase and lowercase letters have different byte values in EBCDIC, so "A" and "a" sort in different positions. To get case-insensitive order you can:

  1. Use ALTSEQ to define an alternate collating sequence that maps lowercase to the same comparison value as uppercase (or vice versa).
  2. Use INREC to translate the sort key to uppercase (or lowercase) before the sort, sort on the translated key, and then use OUTREC to output the original record if needed.
  3. Use a product-specific case-insensitive option if available.

This page explains the problem and these approaches.

Why Default CH Is Case-Sensitive

When you sort with CH, DFSORT compares the key bytes in EBCDIC order. In EBCDIC, the byte for "A" (0xC1) differs from the byte for "a" (0x81), so "SMITH" and "smith" already differ at the first byte (S is 0xE2, s is 0xA2) and one is "smaller" in the sequence. In the common EBCDIC code pages the lowercase letters a–z have lower code points than the uppercase letters A–Z, so lowercase sorts before uppercase, and the digits come after both. The default is therefore case-sensitive: "Apple" and "APPLE" do not compare as equal.
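
The byte values can be checked off the mainframe. The following is a minimal sketch in Python, not DFSORT itself. It assumes that code page 037 (Python's cp037 codec) stands in for your installation's EBCDIC code page, and the helper name ebcdic is made up for illustration. It sorts a few names by their raw EBCDIC bytes, which models a plain CH comparison.

```python
# Assumption: code page 037 stands in for the installation's EBCDIC code page.
CODEC = "cp037"


def ebcdic(text: str) -> bytes:
    """Encode text as EBCDIC bytes, the form a CH key has in the record."""
    return text.encode(CODEC)


# Byte values for a few letters.
for ch in "aAsS":
    print(ch, hex(ebcdic(ch)[0]))

# Sorting by the raw bytes models an unmodified, case-sensitive CH comparison.
names = ["SMITH", "apple", "smith", "BANANA", "Smith"]
by_bytes = sorted(names, key=ebcdic)
print(by_bytes)
```

Under this assumption the lowercase names come out ahead of the uppercase ones. To check a different code page, change CODEC to another EBCDIC codec that Python provides, such as cp500.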

Using ALTSEQ for Case-Insensitive Order

ALTSEQ lets you define an alternate collating sequence: you specify that certain input byte values are to be compared as if they were other values. For a case-insensitive sort, map each lowercase letter to the comparison value of its uppercase counterpart (or vice versa). In DFSORT the ALTSEQ table is applied to keys with format AQ, and to CH keys when the CHALT option is in effect; a CH key without CHALT still uses the standard EBCDIC sequence. With the table in use, "a" and "A" compare as equal. The actual record bytes are not changed; only the order used for comparison is. See the ALTSEQ control statement in your DFSORT manual for the syntax and the codes to map.
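
The mapping idea can be modeled outside DFSORT. The sketch below is Python, not DFSORT syntax, and again assumes code page 037; the names ALT_TABLE and alt_key are hypothetical. It builds a 256-entry table in which each lowercase code point is compared as its uppercase partner, sorts by the translated bytes, and leaves the records untouched. It also prints the lowercase-to-uppercase pairs as four hex digits each (from byte, then to byte), which is the shape the ALTSEQ CODE operand is generally documented to take; verify that against your manual before using the values.

```python
CODEC = "cp037"  # assumed EBCDIC code page
LOWER = "abcdefghijklmnopqrstuvwxyz"

# Start from the identity table (every byte compares as itself), then move
# each lowercase code point to the position of its uppercase partner.
table = bytearray(range(256))
pairs = []
for ch in LOWER:
    lo = ch.encode(CODEC)[0]
    up = ch.upper().encode(CODEC)[0]
    table[lo] = up
    pairs.append(f"{lo:02X}{up:02X}")  # from byte then to byte, as hex
ALT_TABLE = bytes(table)


def alt_key(record: bytes) -> bytes:
    """Comparison key under the alternate sequence; the record is not modified."""
    return record.translate(ALT_TABLE)


records = [s.encode(CODEC) for s in ["smith", "ADAMS", "SMITH", "adams", "Smith"]]
ordered = sorted(records, key=alt_key)
ordered_names = [r.decode(CODEC) for r in ordered]

print(",".join(pairs))
print(ordered_names)
```

Python's sort is stable, so records whose keys compare as equal stay in input order here. Whether DFSORT keeps the input order for equal keys depends on the EQUALS option, so do not rely on this model for that detail.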

Using INREC Translation

Another approach is to translate the sort key to uppercase (or lowercase) in INREC before the sort. For example, you build a reformatted record that keeps the original data and adds an uppercase copy of the name field (DFSORT provides the TRAN=LTOU function for lowercase-to-uppercase translation), and you sort on that uppercase copy. The sort order is then effectively case-insensitive because the key used for comparison is already normalized. In OUTREC you can drop the added copy and write only the original part of the record, so the output file contains the original mixed case in normalized-key order. This avoids ALTSEQ but requires more INREC logic, and the added key makes each record longer while it is being sorted.
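
The three stages can be modeled as follows. This is a Python sketch rather than DFSORT control statements, and the layout is an assumption chosen for illustration: 80-byte fixed-length records in code page 037 with the name in columns 1 to 20. The functions inrec and outrec are hypothetical stand-ins for the reformatting done before and after the sort.

```python
CODEC = "cp037"             # assumed EBCDIC code page
LRECL = 80                  # hypothetical fixed record length
KEY_START, KEY_LEN = 0, 20  # hypothetical name field: columns 1-20


def make_record(name: str, rest: str) -> bytes:
    """Build a fixed-length EBCDIC record: name in columns 1-20, data after it."""
    return (name.ljust(KEY_LEN) + rest).ljust(LRECL).encode(CODEC)


def inrec(record: bytes) -> bytes:
    """Before the sort: append an uppercase copy of the key after the record."""
    key = record[KEY_START:KEY_START + KEY_LEN]
    return record + key.decode(CODEC).upper().encode(CODEC)


def outrec(record: bytes) -> bytes:
    """After the sort: drop the appended key, leaving the original record."""
    return record[:LRECL]


original = [make_record(n, d) for n, d in
            [("smith", "001"), ("ADAMS", "002"), ("Smith", "003"), ("adams", "004")]]

extended = [inrec(r) for r in original]
extended.sort(key=lambda r: r[LRECL:LRECL + KEY_LEN])  # sort on the uppercase copy
result = [outrec(r) for r in extended]

for r in result:
    print(r.decode(CODEC)[:23])
```

The output records are byte-for-byte the input records, only reordered, which is why the output file does not have to be uppercased when the key is normalized in a separate copy.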

Product Options

Some sort products or releases may offer a direct "case-insensitive" or "fold case" option for CH keys. Check the Application Programming Guide for your product and release. If such an option is available, you might specify it in OPTION or in the SORT FIELDS definition so that the product handles case folding without an explicit ALTSEQ or INREC translation.

Explain It Like I'm Five

Case-insensitive sort is like saying "when we line up by name, we don’t care if the name is written in BIG letters or small letters—Smith and SMITH go together." The computer normally cares (because big S and small s are different codes). So we either tell it "treat small s like big S when comparing" (ALTSEQ) or we rewrite the name in one kind of letters before comparing (INREC), so that Smith and SMITH end up next to each other.

Exercises

  1. With default CH and EBCDIC, will "apple" sort before or after "BANANA"? How do you find out for your code page?
  2. What is one advantage of using INREC translation instead of ALTSEQ for case-insensitive sort?
  3. If you use INREC to uppercase the sort key, does the output record have to be uppercased? Why or why not?

Quiz

Test Your Knowledge

1. With default EBCDIC CH sort, how do "SMITH" and "smith" compare?

  • They are equal
  • In EBCDIC, uppercase and lowercase have different code positions; one will sort before the other (in common EBCDIC code pages, lowercase before uppercase)
  • Case is always ignored
  • DFSORT abends

2. How can you get case-insensitive sort order in DFSORT?

  • Use PD
  • Use ALTSEQ to map lowercase to uppercase (or vice versa) so that when CH comparison runs, both cases sort together
  • It is automatic
  • Use ZD

3. What does "case-insensitive" mean for sort order?

  • Only lowercase is used
  • Upper- and lowercase letters are treated as equal for comparison; "Apple" and "APPLE" sort next to each other
  • No letters are used
  • Only the first letter matters

4. Can INREC or OUTREC be used to achieve case-insensitive sort?

  • No
  • Yes—you can translate the sort key to uppercase (or lowercase) in INREC before the sort, then sort on that; output can be original or transformed
  • Only OUTREC
  • Only for numeric keys

5. Does case-insensitive sorting change the record content?

  • Always
  • Not necessarily—with ALTSEQ only the comparison order changes; with INREC translation you can sort on a translated key and still output the original record
  • Never
  • Only for CH