MainframeMaster

Binary Sorting

Binary (BI) in DFSORT is the format you use when your sort key is stored as a binary integer: typically a halfword (2 bytes, 16 bits) or fullword (4 bytes, 32 bits) in the machine's native integer format. This is how COBOL COMP and COMP-4 numeric fields are stored. When you specify BI in SORT FIELDS=, DFSORT interprets the key as an unsigned integer and compares by numeric value. The length you give is the number of bytes (2 for halfword, 4 for fullword). For signed integers, DFSORT provides FI (signed fixed-point), which places negative values before zero; if you run a different sort product, confirm the format codes in its documentation. Using PD, ZD, or CH on binary data, or BI on packed or zoned data, produces the wrong sort order because the encoding is different. This page explains when to use BI, how to choose the length, and how it differs from other formats.

What Is Binary (BI) Format?

Binary here means the key is stored in the record as a binary integer: the bytes represent a whole number in base 2, most significant byte first (big-endian on the mainframe). A 2-byte (halfword) value holds 0 to 65,535 unsigned or -32,768 to 32,767 signed; a 4-byte (fullword) value holds 0 to 4,294,967,295 unsigned or -2,147,483,648 to 2,147,483,647 signed. COBOL COMP and COMP-4 use this storage. DFSORT BI tells the sort to read the key as a binary integer and compare the numeric value, so 100 is greater than 99 and the sort order is correct integer order. BI reads the key as unsigned; FI (fixed-point) is the signed form. Check your DFSORT manual if you are unsure which one a field needs.
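To make the byte layout concrete, here is a small Python sketch (an off-mainframe illustration, not DFSORT itself). The helper names `encode_binary` and `value_range` are made up for this example; they show how 99 and 100 look as big-endian halfwords and what range each length can hold.

```python
from typing import Tuple


def encode_binary(value: int, length: int, signed: bool) -> bytes:
    """Return value as a big-endian binary integer occupying `length` bytes."""
    return value.to_bytes(length, byteorder="big", signed=signed)


def value_range(length: int, signed: bool) -> Tuple[int, int]:
    """Smallest and largest integer that fits in `length` bytes."""
    bits = 8 * length
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


halfword_99 = encode_binary(99, 2, signed=False)    # bytes 00 63
halfword_100 = encode_binary(100, 2, signed=False)  # bytes 00 64

print(halfword_99.hex(), halfword_100.hex())
print(value_range(2, signed=True), value_range(4, signed=False))
```

Because the most significant byte comes first, comparing the two halfwords as numbers gives the same answer as comparing 99 and 100 directly.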

When to Use BI

Use BI (or FI for signed fields) when the sort key is a binary integer: COBOL PIC 9(n) COMP, PIC S9(n) COMP (signed, so FI), COMP-4, or any field stored as a halfword or fullword integer. Do not use BI for packed decimal (COMP-3); use PD. Do not use BI for zoned decimal (DISPLAY); use ZD. Do not use BI for character data; use CH. Using the wrong format misinterprets the bytes.

Length: Halfword vs Fullword

On z/OS, a halfword is 2 bytes, a fullword is 4 bytes. So for PIC S9(4) COMP (typically halfword), use length 2. For PIC S9(9) COMP (typically fullword), use length 4. The copybook or program that defines the file will tell you the storage size. In SORT FIELDS you specify this byte length: e.g. SORT FIELDS=(20,2,BI,A) for a 2-byte binary key at position 20.
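The digits-to-bytes rule can be sketched as a small Python helper. This is only an illustration: `comp_byte_length` and `sort_field` are hypothetical names, and the 8-byte doubleword case for 10 to 18 digits is an assumption based on usual COBOL COMP storage that this page does not cover. Always confirm the real length against the copybook.

```python
def comp_byte_length(digits: int) -> int:
    """Storage size in bytes for a COBOL PIC 9(digits) COMP field (assumed mapping)."""
    if not 1 <= digits <= 18:
        raise ValueError("digits must be between 1 and 18")
    if digits <= 4:
        return 2   # halfword
    if digits <= 9:
        return 4   # fullword
    return 8       # doubleword (assumption, not covered on this page)


def sort_field(position: int, digits: int, fmt: str = "BI", order: str = "A") -> str:
    """Build a SORT FIELDS statement for one binary key."""
    return f"SORT FIELDS=({position},{comp_byte_length(digits)},{fmt},{order})"


print(sort_field(20, 4))   # 2-byte key at position 20
print(sort_field(11, 9, order="D"))
```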

BI vs FI (Signed vs Unsigned)

In DFSORT, BI is unsigned binary and FI is signed fixed-point. Signed binary integers are stored in two's complement, so a negative value has its high-order bit set; read as unsigned, it looks larger than every positive value. For keys that can be negative (e.g. COBOL S9(n) COMP), use FI so that -1 sorts before 0. BI gives the same order as FI only when every key value is non-negative. Other sort products may use different format codes, so check your documentation if you are not running DFSORT.
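The effect of reading signed keys as unsigned can be reproduced off the mainframe. The Python sketch below is an illustration of the two interpretations, not DFSORT's implementation: it stores a few signed halfwords, then sorts the same bytes once as unsigned and once as signed.

```python
values = [0, -1, 5, -300, 32767]

# Store each value as a signed big-endian halfword (two's complement).
keys = [v.to_bytes(2, "big", signed=True) for v in values]


def decode(key: bytes) -> int:
    """Show a key as the signed value the program originally stored."""
    return int.from_bytes(key, "big", signed=True)


# Unsigned interpretation: negative values have the high bit set and sort last.
unsigned_order = [decode(k) for k in
                  sorted(keys, key=lambda k: int.from_bytes(k, "big", signed=False))]

# Signed interpretation: negative values sort before zero.
signed_order = [decode(k) for k in
                sorted(keys, key=lambda k: int.from_bytes(k, "big", signed=True))]

print(unsigned_order)  # [0, 5, 32767, -300, -1]
print(signed_order)    # [-300, -1, 0, 5, 32767]
```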

BI vs PD/ZD/CH

Binary storage is not decimal: the bytes are not digits or nibbles but the raw integer in binary. If you use PD or ZD, DFSORT will try to interpret those bytes as packed or zoned decimal and the comparison will be wrong. If you use CH, the bytes are compared as characters with no notion of sign, so negative values do not sort in integer order. So for COMP/COMP-4 keys, always use BI (or FI) to get correct numeric order.
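To see why the formats are not interchangeable, the Python sketch below encodes the value +100 three ways. The helpers `packed_decimal` and `zoned_decimal` are simplified illustrations written for this example (positive sign C, negative sign D), not library functions.

```python
def packed_decimal(value: int, digits: int) -> bytes:
    """Encode as packed decimal: one digit per nibble, sign in the last nibble."""
    sign = 0x0C if value >= 0 else 0x0D
    text = str(abs(value)).zfill(digits)
    if len(text) % 2 == 0:          # digits plus sign nibble must fill whole bytes
        text = "0" + text
    nibbles = [int(c) for c in text] + [sign]
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def zoned_decimal(value: int, digits: int) -> bytes:
    """Encode as zoned decimal: one digit per byte, sign in the last byte's zone."""
    text = str(abs(value)).zfill(digits)
    zone = 0xC0 if value >= 0 else 0xD0
    body = [0xF0 | int(c) for c in text[:-1]]
    return bytes(body + [zone | int(text[-1])])


binary = (100).to_bytes(2, "big")     # 00 64
packed = packed_decimal(100, 3)       # 10 0c
zoned = zoned_decimal(100, 3)         # f1 f0 c0

print(binary.hex(), packed.hex(), zoned.hex())

# Reading the packed bytes as a binary integer gives a different number entirely.
misread = int.from_bytes(packed, "big")
print(misread)  # 4108, not 100
```

The same value produces three different byte patterns, so a format code that does not match the storage makes the sort compare numbers that were never in the data.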

Examples

SORT FIELDS=(15,2,BI,A)

Sort by a 2-byte binary key at position 15, ascending (e.g. halfword COMP).
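As a rough off-mainframe model of this statement, the Python sketch below extracts the key the way the position and length describe it (positions are 1-based) and sorts by the unsigned value. The record layout and the `key_bytes` helper are invented for illustration.

```python
def key_bytes(record: bytes, position: int, length: int) -> bytes:
    """Return the key at a 1-based position, as SORT FIELDS counts it."""
    start = position - 1
    return record[start:start + length]


def make_record(key_value: int) -> bytes:
    """Hypothetical record: 14 filler bytes, a 2-byte binary key, 4 trailing bytes."""
    return bytes(14) + key_value.to_bytes(2, "big") + bytes(4)


records = [make_record(v) for v in (300, 7, 65535, 256)]

# Equivalent in spirit to SORT FIELDS=(15,2,BI,A): unsigned, ascending.
ordered = sorted(records, key=lambda r: int.from_bytes(key_bytes(r, 15, 2), "big"))

sorted_keys = [int.from_bytes(key_bytes(r, 15, 2), "big") for r in ordered]
print(sorted_keys)  # [7, 256, 300, 65535]
```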

SORT FIELDS=(1,10,CH,A,11,4,BI,D)

Sort by bytes 1–10 as character ascending, then by bytes 11–14 as a 4-byte binary key descending (e.g. name, then fullword ID).
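The two-key ordering can be modelled in Python as a sketch. The names and IDs are invented sample data, `make_record` is a hypothetical helper, and the `cp037` codec stands in for EBCDIC so that the character key compares the way it would on the mainframe.

```python
def make_record(name: str, ident: int) -> bytes:
    """Hypothetical record: 10-byte EBCDIC name, then a 4-byte binary ID."""
    return name.ljust(10).encode("cp037") + ident.to_bytes(4, "big")


def sort_key(record: bytes):
    """Bytes 1-10 as character ascending, bytes 11-14 as unsigned binary descending."""
    name_key = record[0:10]                         # byte-wise comparison, like CH
    id_key = int.from_bytes(record[10:14], "big")   # unsigned value, like BI
    return (name_key, -id_key)                      # negate for descending order


records = [
    make_record("SMITH", 5),
    make_record("JONES", 70000),
    make_record("SMITH", 900),
    make_record("JONES", 3),
]

ordered = sorted(records, key=sort_key)
result = [(r[0:10].decode("cp037").rstrip(), int.from_bytes(r[10:14], "big"))
          for r in ordered]
print(result)
# [('JONES', 70000), ('JONES', 3), ('SMITH', 900), ('SMITH', 5)]
```

Within each name the larger ID comes first, which is what the D on the second key asks for.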

Explain It Like I'm Five

Binary is like the computer's "native" number: the number is stored as a whole number in a fixed-size box (2 or 4 bytes). The sort program needs to know "read this as that kind of number" so it can compare 99 and 100 correctly. If it read the box as letters or as a different kind of number, the order would be wrong. So we say "BI" so it reads the integer and sorts from smallest to biggest.

Exercises

  1. A field is PIC S9(9) COMP (fullword). What length do you use in SORT FIELDS? What format?
  2. Why is CH wrong for a COMP key?
  3. When might you use FI instead of BI? (Hint: signed integers.)

Quiz

Test Your Knowledge

1. What COBOL usage does DFSORT BI (binary) correspond to?

  • DISPLAY
  • COMP-3
  • COMP or COMP-4
  • INDEX

2. For a PIC S9(4) COMP field (halfword), what length do you specify in SORT FIELDS?

  • 4 digits
  • 2 bytes
  • 4 bytes
  • 1 byte

3. What is the difference between BI and FI in DFSORT?

  • There is no difference
  • BI is typically unsigned binary; FI is signed fixed-point (signed binary). Product docs may vary.
  • FI is for floating-point
  • BI is for packed

4. Why must you not use PD or ZD for a COMP field?

  • PD is faster
  • COMP is stored as binary bits, not decimal digits. PD/ZD expect decimal encoding; using them misinterprets the bytes and gives wrong order.
  • ZD is only for display
  • You can use PD

5. A fullword binary integer occupies how many bytes on the mainframe?

  • 2 bytes
  • 4 bytes
  • 8 bytes
  • 1 byte