Writing Summary Datasets

When you use DFSORT to produce summary information, you need to decide where that summary goes: into the same dataset as the main output, into a separate summary dataset, or into a report dataset that mixes detail and summary lines. This page explains how to write summary data to datasets: (1) using SUM to write one record per group to SORTOUT, (2) using OUTFIL with multiple FNAMES= to send detail to one dataset and a report (with headers and trailers) to another, and (3) what to consider when allocating and sizing summary and report datasets.


Where Summary Data Can Go

Summary information in DFSORT can be written in two main forms:

  • Summary records — One record per group (e.g. one per department) with numeric fields containing totals. This is the output of SUM FIELDS=. It goes to the main sort output (SORTOUT in a SORT step). The result is a dataset with fewer records than the input, same record layout, with aggregated values in the summed fields.
  • Report summary lines — Trailer (and optionally header) lines that contain counts and totals, e.g. “TOTAL RECORDS: 1000” and “TOTAL AMOUNT: 50000”. These are produced by OUTFIL with TRAILER1= and TRAILER3= (and HEADER1, etc.). They are written to whatever dataset(s) you name in OUTFIL FNAMES=.

So “writing summary datasets” can mean: (a) writing the SUM output to a dataset (SORTOUT), or (b) writing an OUTFIL report (detail plus summary lines) to a dataset, or (c) writing detail to one dataset and the report (with summary lines) to another.

Writing SUM Output to SORTOUT

When you use SUM FIELDS=, DFSORT produces one record per group. That output is written to the same output as a normal sort: SORTOUT. So the dataset you allocate to SORTOUT becomes your summary dataset. You do not need a separate DD for the summary; SORTOUT holds the summary records.

Example: sort by department (1,5), sum amount (20,6,PD). SORTOUT receives one record per department.

```text
  SORT FIELDS=(1,5,CH,A)
  SUM FIELDS=(20,6,PD)
```

In JCL you allocate SORTOUT with the correct record length and format. The record layout is the same as the input (or as after INREC if you use it). So if input is RECFM=FB, LRECL=80, allocate SORTOUT with RECFM=FB, LRECL=80. The number of records will be the number of distinct key values (e.g. number of departments), not the number of input records. So you need less space: if you have 50,000 input records and 20 departments, SORTOUT has 20 records.
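To make the allocation concrete, here is a minimal sketch of a complete step, assuming FB, LRECL=80 input. The step name, the dataset names (HLQ.SALES.DETAIL, HLQ.SALES.DEPTSUM) and the SPACE values are hypothetical placeholders; a job card and any site-specific UNIT or SMS parameters are omitted.

```jcl
//SUMSTEP  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Input: FB, LRECL=80 (hypothetical dataset name)
//SORTIN   DD DSN=HLQ.SALES.DETAIL,DISP=SHR
//* Output: one record per department, same layout as the input
//SORTOUT  DD DSN=HLQ.SALES.DEPTSUM,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(TRK,(1,1),RLSE),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,5,CH,A)
  SUM FIELDS=(20,6,PD)
/*
```

With 20 departments, SORTOUT holds 20 records of 80 bytes (1,600 bytes in total), so a single track is more than enough here.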

Allocating the Summary Dataset for SUM

For SUM output you must:

  • Use the same RECFM and LRECL as the sort output record (input record or INREC/OUTREC layout).
  • Estimate the number of records = number of distinct values in the sort key. So primary and secondary space can be much smaller than for the input file.
  • Use an appropriate BLKSIZE for the record length (e.g. 27920 for LRECL=80 FB).

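If you underestimate the record count, the step can abend with an out-of-space condition (for example, a B37 or D37 abend). This can happen when there are more distinct keys than expected, or when a summed field overflows: DFSORT leaves the affected records unsummed, so a group can produce more than one output record. It is safer to estimate high for the number of groups or to use a secondary allocation that allows for growth.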
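One way to express "estimate high" directly in JCL is to request space in records rather than tracks. The sketch below assumes about 200 groups of 80-byte records; the dataset name and the quantities are hypothetical. With AVGREC=U, the first SPACE value is the average record length and the primary and secondary quantities are record counts.

```jcl
//* Hypothetical estimate: about 200 groups expected, 80-byte records.
//* With AVGREC=U the SPACE quantities are record counts:
//*   primary 400 records (twice the estimate), secondary 200 records.
//SORTOUT  DD DSN=HLQ.SALES.DEPTSUM,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(80,(400,200),RLSE),AVGREC=U,
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
```

RLSE releases whatever space is left unused when the dataset is closed, so a generous primary quantity costs little for a small summary file.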

Writing Detail and Report (with Summary) to Different Datasets

Often you want both: a detail dataset (every record, possibly reformatted) and a report dataset (same detail plus headers and trailers with counts and totals). You can do that in one step with OUTFIL and multiple FNAMES=.

Each OUTFIL can name one or more DDs in FNAMES=. So you can have one OUTFIL that writes only the detail records to DD DETAIL, and another OUTFIL that writes the full report (detail + HEADER1 + TRAILER1, and optionally TRAILER3) to DD REPORT.

```text
  SORT FIELDS=(1,5,CH,A)
  OUTFIL FNAMES=DETAIL,OUTREC=(1,80)
  OUTFIL FNAMES=REPORT,
    OUTREC=(1,80),
    HEADER1=(1:'Sales Report',/,5:DATE=(MD4-)),
    TRAILER1=(1:'TOTAL RECORDS: ',COUNT=(M11,LENGTH=7),
              ' TOTAL: ',TOTAL=(20,6,PD,LENGTH=12))
```

Here DETAIL receives only the 80-byte data records. REPORT receives the same data records plus a header (title and date) and a trailer line with record count and sum of the field at 20,6,PD. So you get two datasets: one detail-only, one report with summary at the end. In JCL you allocate both DETAIL and REPORT to the desired datasets (e.g. DISP=(NEW,CATLG,DELETE), SPACE=..., DCB=...).
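A sketch of the matching JCL might look like the following. The dataset names and SPACE values are hypothetical, and the REPORT attributes assume DFSORT's default report behavior, in which each report record starts with a one-byte ANSI carriage-control character (hence RECFM=FBA and LRECL=81 for 80 bytes of data). If your installation behaves differently, adjust the DCB accordingly.

```jcl
//RPTSTEP  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=HLQ.SALES.DETAIL,DISP=SHR
//* Detail-only output: same 80-byte layout as the input
//DETAIL   DD DSN=HLQ.SALES.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE),
//            DCB=(RECFM=FB,LRECL=80)
//* Report output: 80 data bytes plus 1 carriage-control byte (assumed)
//REPORT   DD DSN=HLQ.SALES.REPORT,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE),
//            DCB=(RECFM=FBA,LRECL=81)
//SYSIN    DD *
  SORT FIELDS=(1,5,CH,A)
  OUTFIL FNAMES=DETAIL,OUTREC=(1,80)
  OUTFIL FNAMES=REPORT,
    OUTREC=(1,80),
    HEADER1=(1:'Sales Report',/,5:DATE=(MD4-)),
    TRAILER1=(1:'TOTAL RECORDS: ',COUNT=(M11,LENGTH=7),
              ' TOTAL: ',TOTAL=(20,6,PD,LENGTH=12))
/*
```

No SORTOUT DD is coded because every output record is routed through an OUTFIL statement with its own FNAMES=.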

Report Dataset Layout and Allocation

The report dataset (e.g. REPORT) must hold both data lines and header/trailer lines, so it is usually fixed-length or variable-length with an LRECL large enough for the longest line (e.g. 121 or 133 for a print-style report). By default DFSORT puts an ANSI carriage-control character in the first byte of each report record, so the record format is typically FBA or VBA and the LRECL includes that extra byte; the OUTFIL parameter REMOVECC drops it. The number of records = number of data records + number of header lines + number of trailer lines (e.g. one HEADER1 block, one TRAILER1 line, and one set of TRAILER3 lines per section if you use SECTIONS=). So the record count is roughly the detail count plus a small number of summary lines.
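As an illustration of where section trailers come from, the following sketch adds SECTIONS= to the report OUTFIL so that a TRAILER3 line is written each time the department in bytes 1-5 changes. The literal text and the SKIP=1L spacing are assumptions chosen for the example.

```jcl
//SYSIN    DD *
* One TRAILER3 line per department, one TRAILER1 line at the end
  SORT FIELDS=(1,5,CH,A)
  OUTFIL FNAMES=REPORT,
    OUTREC=(1,80),
    SECTIONS=(1,5,SKIP=1L,
      TRAILER3=(1:'DEPT TOTAL: ',TOTAL=(20,6,PD,LENGTH=12))),
    TRAILER1=(1:'GRAND TOTAL: ',TOTAL=(20,6,PD,LENGTH=12))
/*
```

With 50,000 detail records and 20 departments, this report would hold the 50,000 data lines plus 20 section trailers and 1 grand-total line, along with the blank lines produced by SKIP=1L.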

Summary-Only Output (Trailers Only)

In some cases you want a small dataset that contains only the summary, e.g. one line with “TOTAL RECORDS: 1000” and “TOTAL: 50000”. DFSORT can do this in one step: code NODETAIL on the OUTFIL statement. The data records are still read and included in the counts and totals, but they are not written, so the dataset receives only the header and trailer lines (HEADER1, TRAILER1, and TRAILER3 if you use SECTIONS=). If you use another sort product, check its manual for an equivalent option; where none exists, a second step (e.g. a small program or another utility) that reads the full report and writes only the trailer lines to a summary dataset is the usual fallback.
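DFSORT's OUTFIL statement has a NODETAIL parameter that suppresses the data records while still counting and totaling them. A minimal sketch of a trailer-only output, assuming a hypothetical DD named SUMMARY and using REMOVECC so the line carries no carriage-control byte:

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,5,CH,A)
* SUMMARY (a hypothetical DD name) receives only the trailer line
  OUTFIL FNAMES=SUMMARY,NODETAIL,REMOVECC,
    TRAILER1=(1:'TOTAL RECORDS: ',COUNT=(M11,LENGTH=7),
              ' TOTAL: ',TOTAL=(20,6,PD,LENGTH=12))
/*
```

The SUMMARY dataset then contains a single line, so a one-track allocation is normally sufficient.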

Using FNAMES= for Multiple Outputs

FNAMES= accepts a list of DD names. Each DD receives the same logical output (same OUTFIL options). So OUTFIL FNAMES=(A,B) writes the same report to both A and B. To write different content to different datasets, use multiple OUTFIL statements, each with its own FNAMES= and its own BUILD=, HEADER1=, TRAILER1=, etc.
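The difference between one OUTFIL with a list and several OUTFIL statements can be sketched as follows. REP1, REP2 and DETAIL are hypothetical DD names that would each need a DD statement in the step.

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,5,CH,A)
* Same report written to two DDs: one OUTFIL, two names
  OUTFIL FNAMES=(REP1,REP2),
    OUTREC=(1,80),
    TRAILER1=(1:'TOTAL RECORDS: ',COUNT=(M11,LENGTH=7))
* Different content: a separate OUTFIL with its own options
  OUTFIL FNAMES=DETAIL,OUTREC=(1,80)
/*
```

REP1 and REP2 end up identical, while DETAIL receives only the data records with no trailer.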

Ways to write summary-related output

| Goal | Method |
|---|---|
| Summary records (one per group) in a dataset | Use SUM FIELDS=; output goes to SORTOUT. Allocate SORTOUT as the summary dataset. |
| Detail in one file, report (detail + summary lines) in another | Use two OUTFILs: one FNAMES=DETAIL with OUTREC= only; one FNAMES=REPORT with OUTREC=, HEADER1=, TRAILER1= (and TRAILER3= if SECTIONS=). |
| Same report to two datasets | Use one OUTFIL FNAMES=(REP1,REP2) with the same BUILD/HEADER/TRAILER; allocate REP1 and REP2 to the two datasets. |
| Only grand total (or section totals) in a small file | Use OUTFIL with NODETAIL so only header/trailer lines are written; if your sort product has no such option, use a second step that extracts the trailer lines from the full report. |

ICETOOL and Writing Summary Datasets

If you run ICETOOL, the output of a SORT or COPY operation goes to the DD named in TO(...). So for a step that does SORT FROM(INDD) TO(OUTDD) with SUM in the control statements, the summary records go to OUTDD. The same allocation rules apply: OUTDD must have the correct RECFM, LRECL, and enough space for the number of summary records (groups).
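A minimal ICETOOL sketch of that arrangement is shown below. The dataset names, the SPACE values and the USING prefix SUMC are hypothetical; ICETOOL looks for the control statements in the DD whose name is the four-character USING prefix followed by CNTL.

```jcl
//TOOLSTEP EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//INDD     DD DSN=HLQ.SALES.DETAIL,DISP=SHR
//* Summary output: one record per group, same layout as the input
//OUTDD    DD DSN=HLQ.SALES.DEPTSUM,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(TRK,(1,1),RLSE),
//            DCB=(RECFM=FB,LRECL=80)
//TOOLIN   DD *
  SORT FROM(INDD) TO(OUTDD) USING(SUMC)
/*
//SUMCCNTL DD *
  SORT FIELDS=(1,5,CH,A)
  SUM FIELDS=(20,6,PD)
/*
```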

Explain It Like I'm Five

Imagine you have a list of lemonade sales and you want to save two things: (1) a short list that only has one line per stand with the total (summary records)—that’s like SUM writing to SORTOUT. (2) A full report that has the same detail plus a line at the bottom that says “TOTAL: $120” (summary line)—that’s like OUTFIL writing to REPORT with TRAILER1. You can give the short list to one friend (one dataset) and the full report to another (another dataset) by using two OUTFILs with different names. Writing summary datasets just means telling DFSORT which “box” (dataset) gets the summary records or the report with the summary line at the end.

Exercises

  1. Write control statements that sort by bytes 1–5 and sum the 6-byte packed-decimal field at position 20 (20,6,PD). How many records will be in SORTOUT if the input has 10,000 records and 200 distinct values in bytes 1–5?
  2. Write OUTFIL statements that send detail to DD DETAIL and a report (with TRAILER1 containing COUNT= and TOTAL= for position 20,6,PD) to DD REPORT. Allocate both DDs in JCL with RECFM=FB, LRECL=80 for DETAIL; what LRECL would you use for REPORT if the trailer line is about 60 characters?
  3. Why must you allocate SORTOUT with enough space for the number of groups (not the number of input records) when using SUM?
  4. Name two differences between the dataset produced by SUM (to SORTOUT) and the dataset produced by OUTFIL FNAMES=REPORT with TRAILER1.

Quiz

Test Your Knowledge

1. Where does SUM FIELDS= write its summary records (one per group)?

  • To a separate OUTFIL DD
  • To the main output dataset—SORTOUT for a SORT step, or the TO() DD for ICETOOL
  • Only to SYSOUT
  • To SORTIN

2. How do you write both detail records and a summary report (with trailers) to different datasets in one DFSORT step?

  • Use two steps
  • Use OUTFIL with multiple FNAMES—e.g. one OUTFIL for the detail dataset (OUTREC= data) and one OUTFIL for the report dataset (same data plus HEADER1/TRAILER1 with COUNT= and TOTAL=)
  • SUM writes to both
  • Only one output is allowed

3. What must you consider when allocating the output dataset for SUM summary records?

  • Only BLKSIZE matters
  • Record count will be much lower than input (one record per group), and record layout is the same as input (or as after INREC/OUTREC)—so LRECL and RECFM must match that layout
  • SUM always produces 80-byte records
  • You must use a PDS

4. Can you write a report that contains only the summary lines (e.g. TRAILER1 and TRAILER3) to a dataset without the detail records?

  • No, trailers are always with detail
  • Yes: code NODETAIL on the OUTFIL statement together with HEADER1/TRAILER1 (and optionally TRAILER3 via SECTIONS=); the data records still feed COUNT= and TOTAL= but are not written to the dataset
  • Only with ICETOOL
  • TRAILER1 cannot go to a dataset

5. When using OUTFIL FNAMES=REPORT for a summary report dataset, what typically goes into that dataset?

  • Only control statements
  • Data records (detail) plus header and trailer lines—so the dataset is a full report with title/headers, detail lines, and summary lines (COUNT=, TOTAL=) at the end or after sections
  • Only TRAILER1
  • Only the sort key