MainframeMaster

BI (Binary) Format

In DFSORT, BI stands for binary (unsigned binary integer). When you specify BI, DFSORT treats the sort key as a single unsigned binary integer, typically stored in 2, 4, or 8 bytes (a halfword, fullword, or doubleword in mainframe terms). The bytes are not interpreted as characters (CH) or as decimal digits (PD, ZD); they are read as one integer value, and the sort order is numeric. BI is the right format for unsigned COBOL COMP and COMP-4 fields and for any other key stored as an unsigned binary integer. For signed binary integers (which can be negative), use FI (signed fixed-point) instead so that negative numbers sort in correct numeric order. This page explains BI in detail: allowed lengths, how comparison works, when to use BI vs FI/PD/ZD/CH, and how to avoid common mistakes.

What BI Means

BI tells DFSORT: "this field is an unsigned binary integer." The program reads the specified number of bytes (2, 4, or 8), interprets them as one integer in the machine's binary format (big-endian on z/OS), and compares that numeric value. In ascending order the smaller integer comes first; in descending order the larger one comes first. No conversion to or from character or decimal is done, because the key is already in binary form in the record. So BI is for keys that are stored as unsigned binary integers, such as unsigned COBOL COMP (or COMP-4) fields: halfword (2 bytes) for smaller integers and fullword (4 bytes) for larger ones. Doubleword (8 bytes) is also supported in many products for 64-bit integers.
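The interpretation described above can be imitated outside DFSORT. The following is a minimal Python sketch, not DFSORT itself: the helper name `bi_value` and the sample keys are made up for illustration. It reads each field as one unsigned big-endian integer and orders the fields by that value.

```python
def bi_value(field: bytes) -> int:
    """Interpret a byte string as one unsigned big-endian integer (BI-style)."""
    return int.from_bytes(field, byteorder="big", signed=False)


# Three hypothetical 4-byte keys as they might be stored in a record.
keys = [
    bytes.fromhex("00000100"),  # 256
    bytes.fromhex("0000000A"),  # 10
    bytes.fromhex("00010000"),  # 65536
]

# Ascending order by numeric value, as BI with direction A would give.
ascending = sorted(keys, key=bi_value)
values = [bi_value(k) for k in ascending]
print(values)  # [10, 256, 65536]
```

No character or decimal conversion happens anywhere in the sketch: the bytes are read directly as an integer, which mirrors the point made in the paragraph above.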

Allowed Lengths

For binary integer keys, BI is normally specified with one of these lengths:

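BI field lengths

| Length (bytes) | Common name | Unsigned value range (typical) | COBOL example |
| --- | --- | --- | --- |
| 2 | Halfword | 0 to 65,535 | PIC 9(4) COMP |
| 4 | Fullword | 0 to 4,294,967,295 | PIC 9(9) COMP |
| 8 | Doubleword | 0 to 18,446,744,073,709,551,615 | PIC 9(18) COMP, 8-byte binary |

Signed declarations (PIC S9(4) COMP, PIC S9(9) COMP, PIC S9(18) COMP) occupy the same number of bytes, but they can hold negative values and should be sorted with FI rather than BI.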

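The length in SORT FIELDS is always in bytes, not digits. So for a COBOL field PIC 9(9) COMP, you specify length 4 (four bytes), not 9. The length must match the field's actual storage size: if you specify 3 or 5 for a fullword, DFSORT compares only part of the integer or pulls in bytes from a neighboring field, and the resulting order is wrong. Your sort product may accept other BI lengths (check its documentation for the exact limits), but for COBOL binary fields stick to 2, 4, or 8.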
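The digits-to-bytes step is easy to get wrong, so here is a small Python sketch of it. It assumes the usual IBM COBOL storage rule for binary items (1 to 4 digits in 2 bytes, 5 to 9 digits in 4 bytes, 10 to 18 digits in 8 bytes); the function name `comp_storage_bytes` is hypothetical.

```python
def comp_storage_bytes(digits: int) -> int:
    """Return the storage length in bytes of a COBOL COMP item.

    Assumes the common IBM COBOL rule: 1-4 digits -> 2 bytes,
    5-9 digits -> 4 bytes, 10-18 digits -> 8 bytes.
    """
    if 1 <= digits <= 4:
        return 2
    if 5 <= digits <= 9:
        return 4
    if 10 <= digits <= 18:
        return 8
    raise ValueError(f"unsupported digit count: {digits}")


# PIC 9(9) COMP at position 50: the length is 4 bytes, not 9.
start = 50
length = comp_storage_bytes(9)
sort_statement = f"SORT FIELDS=({start},{length},BI,A)"
print(sort_statement)  # SORT FIELDS=(50,4,BI,A)
```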

Syntax: SORT FIELDS with BI

Use BI in SORT FIELDS like this:

    SORT FIELDS=(start,length,BI,direction)

  • start — Starting position of the binary field in the record (1-based).
  • length — 2, 4, or 8 (bytes). Must match the actual storage size of the field.
  • BI — Format: unsigned binary integer.
  • direction — A for ascending, D for descending.

Example: sort by a 4-byte binary key at position 20, ascending:

    SORT FIELDS=(20,4,BI,A)

Example: primary key 2-byte binary at position 1, secondary key 4-byte binary at position 5, both ascending:

    SORT FIELDS=(1,2,BI,A,5,4,BI,A)
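To see what the two-key statement does to the data, the following Python sketch imitates it on a handful of made-up records. It is only an illustration of the ordering, not of how DFSORT works internally; `bi_key`, `make_record`, and the record layout (2-byte key, 2 filler bytes, 4-byte key) are assumptions chosen to match positions 1 and 5.

```python
def bi_key(record: bytes, start: int, length: int) -> int:
    """Extract an unsigned big-endian integer; start is 1-based as in SORT FIELDS."""
    offset = start - 1
    return int.from_bytes(record[offset:offset + length], "big")


def make_record(key1: int, key2: int) -> bytes:
    """Build a hypothetical 8-byte record: key1 at 1-2, filler at 3-4, key2 at 5-8."""
    return key1.to_bytes(2, "big") + b"\x00\x00" + key2.to_bytes(4, "big")


records = [make_record(2, 5), make_record(1, 300), make_record(1, 7), make_record(2, 1)]

# Primary key (1,2,BI,A), then secondary key (5,4,BI,A).
ordered = sorted(records, key=lambda r: (bi_key(r, 1, 2), bi_key(r, 5, 4)))
pairs = [(bi_key(r, 1, 2), bi_key(r, 5, 4)) for r in ordered]
print(pairs)  # [(1, 7), (1, 300), (2, 1), (2, 5)]
```

The secondary key only decides the order among records whose primary key is equal, which is why (1, 300) still comes before (2, 1).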

BI vs FI: Unsigned vs Signed

BI is unsigned: the entire bit pattern is interpreted as a non-negative integer. FI (fixed-point / signed binary) is signed: the high-order bit is the sign bit, and the value can be negative. If your COBOL field is PIC 9(n) COMP (no sign), it holds only zero and positive values—BI is correct. If the field is PIC S9(n) COMP (signed), it can hold negative numbers. With BI, a negative value (e.g. -1) would be interpreted as a very large unsigned number (e.g. 4,294,967,295 for a 4-byte -1), so it would sort after all positive numbers instead of before them. For signed binary keys you should use FI so that DFSORT compares the signed integer value and negative numbers order correctly relative to zero and positives. Product documentation may use slightly different names (e.g. "signed binary") but the idea is the same: use the format that matches whether your data is unsigned or signed.
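The effect of choosing BI for signed data can be reproduced with a short Python sketch. It assumes 4-byte two's-complement fields, as used for signed binary on z/OS; the helper names `as_bi` and `as_fi` are hypothetical stand-ins for the two interpretations.

```python
def as_bi(field: bytes) -> int:
    """Read the bytes as an unsigned integer (BI-style)."""
    return int.from_bytes(field, "big", signed=False)


def as_fi(field: bytes) -> int:
    """Read the bytes as a signed two's-complement integer (FI-style)."""
    return int.from_bytes(field, "big", signed=True)


# Four signed values stored as 4-byte binary fields.
fields = [v.to_bytes(4, "big", signed=True) for v in (-1, 0, 5, -300)]

print(as_bi(fields[0]))  # 4294967295: how -1 looks when read as unsigned

# Sort by each interpretation, then show the real (signed) values in that order.
bi_order = [as_fi(f) for f in sorted(fields, key=as_bi)]
fi_order = [as_fi(f) for f in sorted(fields, key=as_fi)]
print(bi_order)  # [0, 5, -300, -1]
print(fi_order)  # [-300, -1, 0, 5]
```

With the unsigned reading, the negative values land after all the positive ones; with the signed reading they come first, which is the order most applications expect.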

When to Use BI

Use BI when the field is unsigned binary integer

| Use case | Reason |
| --- | --- |
| COBOL COMP or COMP-4 (unsigned) | Storage is binary integer; length 2 or 4 (or 8) bytes. |
| Binary sequence or counter (no negatives) | BI compares numeric value correctly. |
| Keys stored as halfword/fullword/doubleword | BI is the format for raw binary integer keys. |

When Not to Use BI

  • Packed decimal (COMP-3) — Use PD. The bytes are packed decimal nibbles, not binary.
  • Zoned decimal (DISPLAY numeric) — Use ZD. One digit per byte plus sign.
  • Character or alphanumeric — Use CH. Byte-by-byte collating sequence.
  • Signed binary (negative values possible) — Use FI so negative numbers sort correctly.
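The two lists above amount to a small lookup from how the field is declared to the format code. A minimal Python sketch of that lookup follows; the function name and the labels "DISPLAY-NUMERIC" and "ALPHANUMERIC" are made up for this example and are not COBOL or DFSORT keywords.

```python
def dfsort_format(usage: str, signed: bool = False) -> str:
    """Suggest a DFSORT format code for a field, following the rules above."""
    usage = usage.upper()
    if usage in ("COMP", "COMP-4", "BINARY"):
        # Binary integer: FI if negative values are possible, else BI.
        return "FI" if signed else "BI"
    if usage in ("COMP-3", "PACKED-DECIMAL"):
        return "PD"
    if usage == "DISPLAY-NUMERIC":   # hypothetical label for zoned decimal
        return "ZD"
    if usage == "ALPHANUMERIC":      # hypothetical label for PIC X data
        return "CH"
    raise ValueError(f"no rule for usage {usage!r}")


print(dfsort_format("COMP"))               # BI
print(dfsort_format("COMP", signed=True))  # FI
print(dfsort_format("COMP-3"))             # PD
```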

Comparing BI to Other Formats

Using the wrong format on a binary field misdescribes the data and can produce the wrong order. If you use CH, DFSORT treats the field as character data and compares it byte by byte in collating-sequence order rather than as one integer. The byte pattern for the integer 256 (X'0100' in 2-byte binary) is nothing like the characters "256" (X'F2F5F6' in EBCDIC), so the field is not character data, and any character-specific processing (for example an alternate collating sequence or locale) or a signed value breaks the match between character order and numeric order. If you use PD, DFSORT expects two decimal digits per byte, with the last byte holding one digit and the sign nibble. Binary bytes do not have that layout, so the "number" DFSORT derives is wrong and the sort order is wrong. If you use ZD, it expects one digit per byte (zoned); again, the bytes are not zoned decimal. So for any key that is stored in binary integer form, specify BI (or FI for signed) and the correct byte length.
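The difference in byte layout can be made visible with a short Python sketch. It uses Python's `cp037` codec as one common EBCDIC code page (an assumption; your system may use another), and the helper `nibbles` is hypothetical.

```python
def nibbles(field: bytes) -> list:
    """Split each byte into its high and low 4-bit halves."""
    out = []
    for b in field:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out


# The integer 256 as 2-byte binary versus the characters "256" in EBCDIC.
binary_256 = (256).to_bytes(2, "big")
char_256 = "256".encode("cp037")
print(binary_256.hex())  # 0100
print(char_256.hex())    # f2f5f6

# The integer 4660 is X'1234' in binary. A packed-decimal reader would take
# the nibbles 1, 2, 3 as digits and the final nibble 4 as the sign position.
binary_4660 = (4660).to_bytes(2, "big")
pd_view = nibbles(binary_4660)
print(pd_view)           # [1, 2, 3, 4]
```

In the second case the packed-decimal reading sees the digits 123 where the field actually holds 4660, which is the kind of misinterpretation the paragraph above describes.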

BI in INCLUDE, OMIT, and Build Statements

Wherever DFSORT needs to interpret a field (for comparison in INCLUDE/OMIT, or when building or editing records in INREC/OUTREC), you can specify the format. For a binary field, use BI (or FI for signed) so that numeric comparisons and constants are correct. For example, to include only records where a 4-byte binary at position 10 is greater than 1000, you use a numeric test with format BI and a decimal constant, such as INCLUDE COND=(10,4,BI,GT,1000); check the exact syntax for your product. Using CH or PD for that field would compare or interpret the bytes incorrectly.
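The filter in that example (a statement along the lines of INCLUDE COND=(10,4,BI,GT,1000), to be checked against your product's documentation) can be imitated in Python to show which records it would keep. This is a sketch only; `include_bi_gt`, `make_record`, and the 13-byte record layout are assumptions made for the illustration.

```python
def include_bi_gt(records, start, length, constant):
    """Keep records whose unsigned binary field at start (1-based) exceeds constant."""
    offset = start - 1
    return [
        r for r in records
        if int.from_bytes(r[offset:offset + length], "big") > constant
    ]


def make_record(amount: int) -> bytes:
    """Hypothetical record: 9 bytes of other data, then a 4-byte binary at position 10."""
    return b" " * 9 + amount.to_bytes(4, "big")


records = [make_record(v) for v in (999, 1000, 1001, 70000)]
kept = include_bi_gt(records, start=10, length=4, constant=1000)
kept_values = [int.from_bytes(r[9:13], "big") for r in kept]
print(kept_values)  # [1001, 70000]
```

The record holding exactly 1000 is dropped because the test is "greater than", not "greater than or equal".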

Explain It Like I'm Five

Some numbers in the computer are stored like "secret code": not as the digits 1, 2, 3, but as a pattern of bits (binary). BI is the rule that says: "this piece of the record is one of those secret-code numbers." The sort then compares the actual numbers (smaller first, or bigger first), not the letters or digits you might see in a printout. So we use BI when the key is stored in that binary form—like many ID or counter fields in mainframe files. If we used the "character" rule (CH) or the "decimal" rule (PD) on that same piece, the sort would get the order wrong because it would be reading the secret code as if it were letters or decimal digits.

Exercises

  1. A record has a 2-byte binary at position 8 and a 4-byte binary at position 20. Write SORT FIELDS for ascending on the first key and descending on the second.
  2. Your COBOL copybook has 05 CUST-CTR PIC S9(9) COMP. At what position does it start (you may assume 1 for this exercise), and what length and format do you use in SORT FIELDS? Should you use BI or FI?
  3. Why does using CH on a binary field produce wrong sort order? Explain in terms of how the bytes are stored vs how CH compares them.
  4. What is the largest unsigned value that can fit in a 4-byte BI field? What happens if you use BI for a field that sometimes contains negative values (signed)?

Quiz

Test Your Knowledge

1. What does BI mean in DFSORT?

  • Binary input
  • Binary (unsigned binary integer): the field is interpreted as a binary integer and compared by numeric value
  • Byte index
  • Block identifier

2. Which lengths are valid for BI in DFSORT?

  • 1, 2, 3, or 4 bytes only
  • 2, 4, or 8 bytes (halfword, fullword, doubleword)
  • Any length
  • Only 4 bytes

3. Your COBOL record has an unsigned PIC 9(9) COMP field at position 50. What SORT FIELDS do you use for ascending numeric order?

  • SORT FIELDS=(50,9,BI,A)
  • SORT FIELDS=(50,4,BI,A)
  • SORT FIELDS=(50,9,PD,A)
  • SORT FIELDS=(50,4,CH,A)

4. What is the main difference between BI and FI in DFSORT?

  • There is no difference
  • BI is unsigned binary integer; FI is signed fixed-point (signed binary). For negative values, FI gives correct numeric order.
  • FI is for floating-point
  • BI is only for 2-byte fields

5. Why must you not use CH or PD for a binary (COMP) field?

  • CH and PD are slower
  • Binary storage is base-2 integer encoding; CH compares raw bytes (wrong order) and PD expects packed decimal nibbles (wrong interpretation). Use BI or FI.
  • CH is only for the first key
  • PD is only for output