MainframeMaster

Summing Numeric Fields

Summing numeric fields in DFSORT means adding up the values in one or more numeric fields for each group of records that share the same sort key. You specify each field to sum with SUM FIELDS=(position, length, format)—and for multiple fields, you list several such triples. Position is the starting byte (1-based) in the record, length is the number of bytes, and format tells DFSORT how to interpret those bytes (PD, ZD, BI, FI, or FL). The format must match how the data is actually stored; otherwise the sum will be wrong or the job may abend. This page goes into depth on position and length, each numeric format and when to use it, summing multiple fields, record layout after INREC, and how to avoid overflow or convert character data before summing.


Position and Length: Where Is the Field?

Every field you sum is identified by where it starts in the record and how many bytes it occupies. Position is the starting byte number and is 1-based (byte 1 is the first byte of the record). For variable-length records, bytes 1–4 hold the record descriptor word (RDW), so the data starts at byte 5. Length is the number of consecutive bytes. So (11, 5, ZD) means “the field starting at byte 11, 5 bytes long, in zoned decimal format”, i.e. bytes 11 through 15.
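As a rough illustration of the (position, length) convention, the Python sketch below selects the same bytes that a 1-based position and a length address. The function name `field_bytes` and the sample record are made up for this example; it models the addressing only, not DFSORT itself.

```python
def field_bytes(record: bytes, position: int, length: int) -> bytes:
    """Return the bytes addressed by a 1-based position and a length."""
    start = position - 1                      # byte 1 is index 0 in Python
    if position < 1 or length < 1 or start + length > len(record):
        raise ValueError("field lies outside the record")
    return record[start:start + length]


# Hypothetical 25-byte record: 10-byte key, 5-byte amount at 11-15, 10 filler bytes.
record = b"CUST000001" + b"00042" + b"." * 10

print(field_bytes(record, 11, 5))   # the 5 bytes at positions 11 through 15
```

An off-by-one in either number shifts or widens the slice, which is how a neighbouring field ends up inside the sum.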

If you use INREC, the record that SORT and SUM see is the record after INREC. So all positions in SORT FIELDS= and SUM FIELDS= refer to that reformatted record, not the original input. If your input has the amount at bytes 50–54 and you use INREC to move it to bytes 21–25, you would specify SUM FIELDS=(21,5,ZD) (or PD if you converted to packed). Getting the position or length wrong, for example pointing to another field or including an extra byte, causes incorrect sums or data exceptions (S0C7 abends). Always verify against your record layout or copybook.

Numeric Formats: What Each One Means

The same bytes in storage represent different numbers depending on the format. DFSORT uses the format to interpret the bytes as a numeric value before adding. You must specify the format that matches how the data is stored.

Numeric formats for SUM FIELDS=

| Format | Name | Storage | Length example | Typical use |
|---|---|---|---|---|
| PD | Packed decimal | Two digits per byte; sign in the last half-byte | 4 bytes = 7 digits + sign | COMP-3, amounts, quantities |
| ZD | Zoned decimal | One digit per byte; sign in the high half of the last byte (C/D/F) | 5 bytes = 5 digits, sign included | DISPLAY numeric, display fields |
| BI | Unsigned binary | Unsigned binary integer | 2 bytes (halfword) or 4 bytes (fullword) | Unsigned COMP, COMP-4, counts, IDs |
| FI | Signed fixed-point | Two's complement binary integer | 2 bytes (halfword) or 4 bytes (fullword) | Signed COMP, COMP-4 |
| FL | Floating-point | Signed floating-point | 4 or 8 bytes typically | Scientific or float data |

PD — Packed Decimal

Packed decimal stores two decimal digits per byte, except that the rightmost half-byte holds the sign (C or F for positive, D for negative). So a 4-byte packed field holds 7 digits plus sign. The length you specify is the byte length (e.g. 4). PD is very common for amounts and quantities in mainframe files (e.g. COBOL COMP-3). If the data is packed and you specify ZD or BI, the bytes will be misinterpreted and the sum will be wrong or you may get an abend.
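To make the packed layout concrete, here is a minimal Python sketch that decodes a packed-decimal field by hand and then reads the same bytes as a plain binary integer. The helper name `unpack_pd` is hypothetical, and the sketch only illustrates the byte layout described above.

```python
def unpack_pd(field: bytes) -> int:
    """Decode a packed-decimal (COMP-3 style) field into an int."""
    nibbles = []
    for byte in field:
        nibbles += [byte >> 4, byte & 0x0F]   # two half-bytes per byte
    *digits, sign = nibbles                    # the last half-byte is the sign
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError("not valid packed decimal")
    value = int("".join(str(d) for d in digits))
    return -value if sign in (0x0B, 0x0D) else value


amount = bytes.fromhex("0012345C")             # 4 bytes: 7 digits + sign C
print(unpack_pd(amount))                       # the bytes read as PD
print(int.from_bytes(amount, "big"))           # the same bytes read as binary
```

The two printed values differ, which is the misinterpretation that a wrong format code produces.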

ZD — Zoned Decimal

Zoned decimal uses one byte per digit. The high half-byte (zone) of the last byte carries the sign (e.g. C or F for positive, D for negative). So a 5-byte ZD field holds 5 digits, with the sign included in the last byte. Length is the number of bytes (same as the digit count). ZD is common when the numeric data is stored in “display” or character-like form (e.g. COBOL DISPLAY numeric). CH is not a valid format for SUM FIELDS=, so a character field that only looks like a number (e.g. "12,345" or "+12345") cannot be summed as it stands; convert it to ZD or PD in INREC first, then sum the converted field.
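The zoned layout can be sketched the same way. The helper below (`unzone`, a name invented for this example) takes the low half-byte of every byte as a digit and the high half-byte of the last byte as the sign; it is an illustration of the format, not DFSORT's own validation logic.

```python
def unzone(field: bytes) -> int:
    """Decode a zoned-decimal field into an int."""
    digits = [byte & 0x0F for byte in field]   # low half-byte of each byte is a digit
    sign = field[-1] >> 4                      # high half-byte of the last byte is the sign
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError("not valid zoned decimal")
    value = int("".join(str(d) for d in digits))
    return -value if sign in (0x0B, 0x0D) else value


print(unzone(bytes.fromhex("F1F2F3F4C5")))   # positive, sign C
print(unzone(bytes.fromhex("F1F2F3F4D5")))   # negative, sign D
```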

BI — Binary

Binary formats cover integers stored in base 2: halfword (2 bytes), fullword (4 bytes), or doubleword (8 bytes, if supported). DFSORT treats BI as unsigned binary, so every bit contributes to the magnitude. Use BI for unsigned binary fields (e.g. COBOL COMP or COMP-4 declared without a sign); a signed two's complement field belongs under FI instead. Binary summing is exact for integers; specify the correct length or you will read the wrong value.
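The difference between an unsigned and a signed reading of the same binary field matters here, because DFSORT treats BI as unsigned and FI as signed. A small Python sketch, using a made-up 2-byte value and big-endian byte order as on the mainframe:

```python
raw = bytes.fromhex("FFFE")                            # a 2-byte (halfword) field

as_bi = int.from_bytes(raw, "big", signed=False)       # unsigned reading (BI)
as_fi = int.from_bytes(raw, "big", signed=True)        # two's complement reading (FI)
print(as_bi, as_fi)
```

A small negative number stored in a signed field becomes a large positive one if it is summed as BI, so the choice between BI and FI should follow the field's declaration.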

FI and FL

FI (signed fixed-point) and FL (floating-point) are less common in typical batch reporting. FI is the signed counterpart of BI: a two's complement binary integer, such as a signed COBOL COMP field. FL applies to floating-point data. Use them when your data is actually in those formats, and check the supported lengths in your DFSORT/ICETOOL documentation.

One Field: Basic Example

Input: fixed-length records with department code in bytes 1–10 and a 5-byte zoned decimal “sales” amount in bytes 21–25. Requirement: one record per department with the sum of sales.

```text
 SORT FIELDS=(1,10,CH,A)
 SUM FIELDS=(21,5,ZD)
```

Records are sorted by department (1–10). For each department, the 5-byte ZD field at 21–25 is summed. The output has one record per department; that record is based on the first record of the group, with bytes 21–25 replaced by the sum. Which record counts as “first” is only guaranteed to follow input order when EQUALS is in effect. So you get one total per department in the same record layout.
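The behaviour described above can be modelled in a few lines of Python. This is a simplified sketch, not DFSORT: it uses text records with plain digits instead of real zoned bytes, assumes non-negative amounts, and raises on overflow rather than modelling DFSORT's overflow handling. The names `sort_and_sum` and `make` are invented for the example.

```python
from itertools import groupby


def sort_and_sum(records, key_pos, key_len, sum_pos, sum_len):
    """Sort by a character key, then keep one record per key with the total."""
    def key(rec):
        return rec[key_pos - 1:key_pos - 1 + key_len]

    start, end = sum_pos - 1, sum_pos - 1 + sum_len
    output = []
    # sorted() is stable, so the first record of each group is the first in input order.
    for _, group in groupby(sorted(records, key=key), key=key):
        group = list(group)
        total = sum(int(rec[start:end]) for rec in group)
        digits = f"{total:0{sum_len}d}"
        if len(digits) > sum_len:
            # Simplification: this sketch just stops when the total does not fit.
            raise OverflowError("total does not fit in the summed field")
        first = group[0]
        output.append(first[:start] + digits + first[end:])
    return output


def make(dept, sales):
    """Build a hypothetical 25-byte record: dept at 1-10, filler 11-20, sales at 21-25."""
    return f"{dept:<10}{'':<10}{sales:05d}"


records = [make("SALES", 100), make("HR", 50), make("SALES", 250)]
for rec in sort_and_sum(records, 1, 10, 21, 5):
    print(rec)
```

The two SALES records collapse into one whose bytes 21–25 hold the total, while every other byte comes from the first SALES record.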

Summing Multiple Fields

You can sum as many numeric fields as you need in a single SUM FIELDS= by listing each as (position, length, format). DFSORT adds each field independently per group and writes all totals into the output record in the same positions. Non-summed positions usually retain the value from the first record of the group.

Example: sum two amounts and a quantity

Record layout: key at 1–8, amount1 (4-byte PD) at 9–12, amount2 (4-byte PD) at 13–16, quantity (2-byte BI) at 17–18. One record per key with sum of amount1, amount2, and quantity.

```text
 SORT FIELDS=(1,8,CH,A)
 SUM FIELDS=(9,4,PD,13,4,PD,17,2,BI)
```

The output has one record per unique (1–8). Positions 9–12 hold the sum of amount1, 13–16 the sum of amount2, and 17–18 the sum of quantity. Bytes 1–8 (the key) and any bytes after 18 come from the first record of the group.
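Because a multi-field SUM FIELDS= is just the triples written one after another, it can help to generate the statement from a list of fields and check it on the way. The helper below is a hypothetical convenience, not part of DFSORT; its overlap check is a sanity check added for this sketch.

```python
SUM_FORMATS = {"PD", "ZD", "BI", "FI", "FL"}


def sum_fields(*fields):
    """Build a SUM FIELDS= statement from (position, length, format) triples."""
    parts = []
    previous_end = 0
    for position, length, fmt in sorted(fields):
        if fmt not in SUM_FORMATS:
            raise ValueError(f"{fmt} is not a numeric SUM format")
        if position <= previous_end:
            raise ValueError(f"field at {position} overlaps the previous field")
        previous_end = position + length - 1
        parts += [str(position), str(length), fmt]
    return " SUM FIELDS=(" + ",".join(parts) + ")"


print(sum_fields((9, 4, "PD"), (13, 4, "PD"), (17, 2, "BI")))
```

The leading blank in the generated line keeps the statement out of column 1, as control statements require.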

Record Layout and INREC

If the input record does not have the numeric fields in a summable form or in the positions you want, use INREC first. INREC runs before the sort; SORT and SUM see the record after INREC. So you can move fields, convert character to numeric (e.g. build a ZD or PD field), or rearrange the layout. The positions in SORT FIELDS= and SUM FIELDS= then refer to the INREC output. For example, if the input has a 6-byte character amount at 40–45, you might use INREC to convert it to a 5-byte ZD at positions 21–25 (provided no value needs more than 5 digits), then SORT FIELDS=(1,10,CH,A) and SUM FIELDS=(21,5,ZD). Always ensure the summed field length is sufficient for the largest possible total (see “Overflow and Field Size”).
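The reformatting step in that example can be modelled in Python to show what the record looks like before and after. This is a sketch under assumptions: the input is EBCDIC (code page 037), the amount is unsigned text with optional leading blanks, and the converted field uses F as the positive sign. The names `to_zd` and `reformat` are invented; the real conversion is done by INREC.

```python
def to_zd(value: int, length: int) -> bytes:
    """Encode an int as zoned decimal: zone F on leading digits, sign in the last zone."""
    digits = f"{abs(value):0{length}d}"
    if len(digits) > length:
        raise OverflowError(f"{value} needs more than {length} digits")
    sign = 0xD0 if value < 0 else 0xF0         # assumption: F marks a positive value
    body = [0xF0 | int(d) for d in digits[:-1]]
    return bytes(body + [sign | int(digits[-1])])


def reformat(record: bytes) -> bytes:
    """Model of the INREC step: keep bytes 1-20, put the amount at 21-25 as ZD."""
    text_amount = record[39:45].decode("cp037")  # input positions 40-45, EBCDIC text
    return record[:20] + to_zd(int(text_amount), 5)


original = ("DEPT01".ljust(39) + "  1234").encode("cp037")   # hypothetical 45-byte input
converted = reformat(original)
print(len(converted), converted[20:25].hex().upper())
```

After this step the amount lives at 21–25 of a 25-byte record, which is why SUM FIELDS= must use 21 and not 40.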

Overflow and Field Size

The sum is written back into the same byte range as the input field. If the total is too large to fit (e.g. you sum 10,000 records and the total needs more digits than the field allows), overflow occurs. DFSORT does not truncate the total: the two records whose sum would overflow are left unsummed, both stay in the output, and DFSORT issues a message, so a key can appear more than once. The OVFLO option (OVFLO=RC0, RC4, or RC16) sets the return code for this condition; with OVFLO=RC16 the step terminates so you can increase the field size (e.g. define a longer field in INREC and sum that) or correct the data. VLSHRT and NOVLSHRT govern short variable-length records, not overflow. For reliable totals, plan the field size for the maximum possible sum.
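A quick capacity check before running the job can show whether a field is wide enough. The sketch below covers PD and ZD only and assumes the worst case, in which every record of one group carries the largest possible value; the function names are hypothetical.

```python
def max_total(length: int, fmt: str) -> int:
    """Largest magnitude a decimal field of this byte length can hold."""
    if fmt == "PD":
        digits = 2 * length - 1     # two digits per byte, minus the sign half-byte
    elif fmt == "ZD":
        digits = length             # one digit per byte
    else:
        raise ValueError("this sketch covers PD and ZD only")
    return 10 ** digits - 1


def total_fits(length, fmt, largest_value, record_count):
    """Worst case: every record in one group carries the largest value."""
    return largest_value * record_count <= max_total(length, fmt)


print(total_fits(5, "ZD", 99_999, 10_000))   # 5-byte ZD summed over 10,000 records
print(total_fits(8, "PD", 99_999, 10_000))   # widened to an 8-byte PD field
```

If the check fails, widening the field in INREC before the sum is the usual remedy.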

Explain It Like I'm Five

Imagine a stack of receipts. Each receipt has a store name and a number (the amount). You first sort the receipts so all “Store A” receipts are together. Then for Store A you add up all the amounts and write the total on one line. You do the same for Store B, and so on. In the computer, the “store” is the sort key, and “add up the amounts” is SUM FIELDS=. The computer has to know where the amount is on each receipt (position) and how long it is (length), and whether the number is written in “packed” or “zoned” or “binary” (format). If it uses the wrong kind of number, the total will be wrong. If the total is too big to fit in the box, the computer does not squeeze it in: it leaves those receipts un-added and tells you, so you can use a bigger box.

Exercises

  1. Your record has customer ID at 1–8 and a 4-byte packed decimal “balance” at 25–28. Write SORT and SUM to get one record per customer with the sum of balance.
  2. Sum three fields: 5-byte ZD at 20, 4-byte PD at 30, and 2-byte BI at 40. Sort by bytes 1–10. Write the full SUM FIELDS= line.
  3. Why can’t you use SUM FIELDS=(50,6,CH) to sum a 6-byte character field that contains digits? What would you do instead?
  4. If you use INREC to move the amount from input positions 100–104 to output positions 15–19 and convert to ZD, what SUM FIELDS= would you use to sum that amount? Do the SORT FIELDS= positions refer to the input record or to the INREC output?

Quiz

Test Your Knowledge

1. What does the "position" in SUM FIELDS=(position,length,format) refer to?

  • The position in the output record only
  • The starting byte position (1-based) of the field in the record that SORT/SUM sees—after INREC if used
  • The column on the control card
  • The group number

2. Why must the format in SUM FIELDS= match how the data is stored?

  • Format is only for display
  • DFSORT interprets the same bytes differently for PD vs ZD vs BI—wrong format gives wrong numeric value and wrong sum, or S0C7
  • Only PD is correct for summing
  • Format affects only the output

3. Can you sum a character field that contains digits (e.g. "01234") with SUM FIELDS=?

  • Yes, use CH format
  • No—SUM FIELDS= accepts only numeric formats (PD, ZD, BI, FI, FL). Use INREC to convert the character field to ZD (or PD) first, then sum the converted field
  • Yes, with SUM FIELDS=(pos,5,ZD)
  • Only if it has a sign

4. How do you sum three different numeric fields in one SUM statement?

  • Use three separate SUM statements
  • List all three in SUM FIELDS= as (pos1,len1,fmt1,pos2,len2,fmt2,pos3,len3,fmt3)—each field is (position, length, format)
  • Only two fields are allowed
  • Use SUM FIELDS= three times

5. What happens to the output field length when you sum many large numbers?

  • DFSORT automatically expands the field
  • The sum is placed in the same byte range as the input field. If the total is too large to fit (overflow), the records involved are left unsummed and the OVFLO option sets the return code
  • The field is always doubled
  • Nothing; overflow is ignored