How do you calculate average per group in DFSORT?

Average is sum divided by count. Use SUM FIELDS= to get the sum of the numeric field per group. To get count in the same record, add a field set to 1 in every record (e.g. via INREC) and sum that field too—the sum of 1s is the count. Then average = sum / count. The division may require OUTREC arithmetic (if supported) or a second step or program.

Does DFSORT have an AVG or average function in SUM?

DFSORT SUM FIELDS= does not have a built-in AVG option. You compute average by obtaining the sum and the count (e.g. by summing a constant 1 field) and then dividing. So you use SUM to get sum and count, then either OUTREC/OUTFIL arithmetic (product-dependent) or a second step to compute average = sum/count.

How do you get both sum and count per group in one SUM statement?

Sum your numeric field as usual in SUM FIELDS=. For count, add a numeric field that contains 1 in every record—use INREC to overlay or build that field (e.g. a 1-byte or 2-byte PD or ZD with value 1). Include that field in SUM FIELDS=. The sum of that field is the record count per group. Your output record then has both the sum of the real field and the count (sum of 1s).

Can you get section average in an OUTFIL report?

Use OUTFIL with SECTIONS= and TRAILER3= to get section (group) totals and counts: TOTAL= for the sum and COUNT= for the record count in each section. The section average is total/count. Some products allow formatting or computing this in the trailer; otherwise the trailer gives you total and count and you can derive the average in reporting logic or a follow-up step.

What if my DFSORT does not support division in OUTREC?

Output the record with the sum and count (from SUM). Then use a second step: a short program (e.g. COBOL, REXX) or a second SORT with INREC/OVERLAY that reads the record and overlays the average field with sum/count, or use a reporting tool that computes average from the sum and count fields. The first SORT step does the grouping and sum/count; the second step or program does the division.

DFSORT AVG Calculation Techniques - Average Per Group Sum Count

AVG Calculation Techniques

DFSORT does not provide a built-in AVG (average) function in the SUM statement. To get an average per group, you use the formula average = sum / count: you need the sum of the numeric field and the count of records in each group. You can get the sum with SUM FIELDS=. To get the count in the same aggregated record, a common technique is to add a field that is always 1 in every record (e.g. via INREC), then include that field in SUM FIELDS=—the sum of 1s is the record count. The output record then has both the total and the count; computing the average (total divided by count) may be done in OUTREC if your product supports arithmetic, or in a second step or program. This page covers the sum-and-count approach, the constant-1 trick, and options for doing the division.

SUM Statement

Average = Sum / Count

The average of a numeric field over a group of records is the sum of that field divided by the number of records. So you need two values per group: the sum of the field and the count of records. SUM FIELDS= gives you the sum. It does not add a count field. So you must obtain the count by some other means and then perform the division.

Getting the Count: Sum a Constant 1

A standard technique is to add a numeric field that contains the value 1 in every input record. When you SUM that field, the result is the number of records in the group (each record contributes 1). So you use INREC to build or overlay a field—for example a 1-byte or 2-byte packed or zoned decimal with value 1—and then list that field in SUM FIELDS= along with your real numeric field. The aggregated record then has: sum of your field (e.g. amount) and sum of the 1s (count). From that you can compute average = amount_sum / count.

Example: INREC add count field, then SUM

Input: 80-byte records, key at 1–8, amount (5-byte ZD) at 21–25. You want one record per key with sum of amount, count, and (conceptually) average. Add a count field at a fixed position (e.g. 26–27 as 2-byte PD with value 1) via INREC, then sort and sum both amount and that field.

text

1
2
3
  INREC FIELDS=(1,80,26,2,PD,1,0)
  SORT FIELDS=(1,8,CH,A)
  SUM FIELDS=(21,5,ZD,26,2,PD)

The exact INREC syntax for “overlay position 26–27 with packed 1” is product-dependent (the example shows the idea: 26,2,PD,1,0 might be BUILD or OVERLAY with a constant). After SUM, positions 21–25 hold the sum of amount and 26–27 hold the sum of the 1s (the count). You then need to compute average = (21–25) / (26–27). If your DFSORT supports arithmetic in OUTREC (e.g. DIV or similar), you can add an OUTREC step that overlays an “average” field with that quotient. Otherwise, write this record and use a second step or program to compute the average.

Steps for average per group
Step	What	How
1	Sum the numeric field	SUM FIELDS=(pos, len, format) for the amount (or other field)
2	Get count per group	INREC add a field = 1; SUM that field too—sum of 1s = count
3	Compute average	Average = sum / count; use OUTREC arithmetic if supported, else second step or program

Record Layout After INREC

When you add a “constant 1” field with INREC, the positions in SORT FIELDS= and SUM FIELDS= refer to the record after INREC. So you must build a layout where the key, the amount, and the new count field are at known positions. For example: INREC FIELDS=(1,25,26,2,PD,1,0,28,53) might copy bytes 1–25, put a 2-byte packed 1 at 26–27, and copy the rest. Then SUM FIELDS=(21,5,ZD,26,2,PD) sums the amount at 21–25 and the count field at 26–27. Check your product manual for the exact OVERLAY or BUILD syntax for constants.

Computing the Division (Average)

Once you have sum and count in the same record, average = sum / count. Some DFSORT or ICETOOL versions support arithmetic in OUTREC or OUTFIL BUILD (e.g. divide one field by another and put the result in a target position). If so, you can add an OUTREC that overlays the average field. If not, your options are: (1) a second SORT step that reads the aggregated file and uses INREC/OVERLAY with arithmetic (if supported in that product); (2) a short program (COBOL, REXX, etc.) that reads the record, computes sum/count, and writes it back; (3) leave the record with sum and count and let a downstream reporting or analytics tool compute the average. For report-only output, OUTFIL SECTIONS with TRAILER3 can give you section total and section count; the average for the section is total/count, which you may be able to format in the trailer or derive in the report writer.

OUTFIL SECTIONS: Section Total and Count

When you need section subtotals (e.g. per department), you can use OUTFIL with SECTIONS= and TRAILER3=. In TRAILER3 you can specify COUNT= for the record count in that section and TOTAL= (or TOT=) for the sum of a numeric field. So each section gets a trailer line with total and count; the section average is total/count. Some products allow you to include a computed average in the trailer; otherwise you have the two numbers and can derive the average in reporting. This is useful when you are building a report and do not need the average in the same data record as the detail.

Rounding and Precision

When you divide sum by count, the result may have decimal places. Decide how many digits you need and whether to round. If you are writing the average to a fixed-length field (e.g. 5 bytes ZD), ensure the quotient fits and consider rounding (product-dependent). If you use a program for the division, you can control rounding and precision there.

Explain It Like I'm Five

Imagine you have three jars of coins. Jar 1 has 6 coins, jar 2 has 10, jar 3 has 8. The “average per jar” is total coins (24) divided by number of jars (3), so 8. The computer does the same: first it adds up the numbers (sum) and counts how many records there are (count). To get the count, we trick it: we put a 1 in every record and add those 1s up—that’s the count. Then we divide the sum by the count to get the average. Sometimes the computer can do the division in the same job; sometimes we need a second step to do the division.

Exercises

Your record has customer at 1–8 and amount (4-byte PD) at 20–23. Describe how you would get one record per customer with sum of amount, count of records, and average amount (conceptually). What INREC would you add?
Why is “sum of a field that is 1 in every record” equal to the record count for the group?
If your DFSORT does not support division in OUTREC, what are two ways to still produce a record that contains the average?
For a report with section subtotals, how do TRAILER3 COUNT= and TOTAL= help you show a section average?

Quiz

Test Your Knowledge

1. How do you get a record count per group when using SUM for aggregation?

SUM automatically adds a count field
Add a field that is always 1 in every record (e.g. via INREC), then include that field in SUM FIELDS=—the sum of 1s is the count
Use COUNT= in SUM
Only OUTFIL can count

2. What is the formula for average per group?

Sum only
Average = sum of the field / count of records in the group
Average = count / sum
SUM FIELDS= has an AVG option

3. Can you compute the average in a single DFSORT step?

No, always two steps
It depends on the product: some DFSORT/ICETOOL versions support arithmetic in OUTREC (e.g. divide sum by count); otherwise you need a second step or program to do the division
Yes, with SUM FIELDS=AVG
Only with ICETOOL

4. What is the purpose of adding a "constant 1" field before SUM when you need average?

To pad the record
So that when you SUM that field, the result is the number of records in the group (count)—each record contributes 1, so the sum is the count
To avoid overflow
Only for reporting

5. For report-style output with section averages, what OUTFIL feature is often used?

Only SUM
SECTIONS= with TRAILER3= can give section count and section total (COUNT=, TOTAL=); the average for the section is total/count—some products allow expressing this in the trailer or you compute it separately
INCLUDE only
OUTREC only