MainframeMaster

Pivoting Data

Pivoting data means reorganizing it so that information that appears in many rows (e.g. one record per customer per month) is presented in a different shape—for example, one row per customer with a column for each month, or one summary row per group with totals. In DFSORT you can achieve pivot-like results using SORT, SUM, and OUTFIL with SECTIONS= and trailers. For simple aggregation (one row per key with summed values), SUM is enough. For report-style summaries (one line per group with count or total), OUTFIL SECTIONS= with TRAILER3 is the tool. True row-to-column pivoting (turning 12 rows into one row with 12 columns) often requires multiple passes or a program. This page explains what pivoting means in a mainframe context and how to do it with DFSORT.

Advanced Tricks
Progress0 of 0 lessons

What Is Pivoting?

In batch and reporting, pivoting usually means one of two things. First: turning many detail rows into fewer summary rows—for example, one record per region with the sum of sales (many detail records per region become one row per region with a total). Second: turning row-based data into a column-based layout—for example, one record per customer with 12 columns for January through December (12 rows per customer become one row with 12 value columns). DFSORT supports the first kind well with SORT and SUM. It supports the second kind in a limited way: you can use multiple OUTFIL copies or passes to build pieces of the wide record, then combine them in a later step, or you may need a program or ICETOOL to assemble the pivoted row.

Pivoting with SUM: One Row per Key with Totals

The simplest pivot is aggregation: many rows with the same key become one row with summed (or min/max) values. Sort by the key that defines the groups (e.g. region, product code, or department). Then use SUM FIELDS=(position,length,format,...) to specify which numeric fields to sum. DFSORT keeps one record per unique sort key and adds the specified fields across all records in that group. The result is one “pivoted” row per key with totals. You can sum multiple fields in one SUM statement (e.g. total sales and total cost), so the output record can have several aggregated columns. This is the standard way to get summary rows without writing a program.

Example: One Row per Department with Total Amount

Input: fixed-length records with department in 1–4 and amount in 20–27 (packed decimal). Output: one record per department with department and total amount.

text
1
2
SORT FIELDS=(1,4,CH,A) SUM FIELDS=(20,8,PD)

SORT FIELDS=(1,4,CH,A) groups all records by department. SUM FIELDS=(20,8,PD) keeps one record per department and sums the amount in 20–27. The first record of each group is kept and its amount field is replaced by the sum of all amounts in that group. So you get one pivoted (aggregated) row per department.

Pivoting with OUTFIL SECTIONS: One Summary Line per Group

When you want a report where each group (e.g. each department) has its own summary line—count, total, average, or min/max—use OUTFIL with SECTIONS=. Sort by the group key first. Then in OUTFIL specify SECTIONS=(position,length,format,...) with the same key. HEADER3= prints at the start of each section; TRAILER3= prints at the end of each section and can contain COUNT=, TOTAL=, MIN=, MAX=, AVG=, and fixed text. So each group gets a trailer line that looks like a summary row. If you use NODETAIL, the detail records are suppressed and only the section headers and trailers are written—so the output is effectively a list of summary rows, one per group. That is a form of pivoting: many detail rows become one summary line per key.

Example: Summary Report with One Line per Department

Sorted by department (1–4). OUTFIL with SECTIONS and NODETAIL so only the section trailer (e.g. count and total) is written per department:

text
1
2
3
4
5
SORT FIELDS=(1,4,CH,A) OUTFIL FNAMES=REPORT, SECTIONS=(1,4,CH, TRAILER3=(1,4,C'DEPT ',5,4,X,9,8,COUNT=(10,8,ZD),17,20,TOTAL=(20,8,PD,M11)), NODETAIL

Each section is defined by the key in 1–4. TRAILER3 writes a line per department: literal “DEPT ”, the department value, the count of records (in 10,8,ZD format), and the total of the amount field (20,8,PD with edit mask M11). NODETAIL suppresses the detail records so the report contains only these summary lines—one pivoted line per department.

Row-to-Column Pivoting: The Limitation

True row-to-column pivoting—e.g. 12 input rows (one per month) becoming one output row with 12 columns—requires “collecting” multiple input records into one output record. DFSORT reads and writes records in a stream; it does not have a single control statement that says “group by customer and place month 1 value in column 20, month 2 in column 28, …”. So you have two main options. First: multiple OUTFIL copies. For each “column” (e.g. each month), use an OUTFIL with INCLUDE to select only that month’s records and BUILD= to put the value in a fixed output position. That gives you 12 small files (or 12 sections of one file), each with one value per customer in a fixed column. A second step (e.g. JOINKEYS or a program) can then merge these into one record per customer with 12 columns. Second: use a COBOL program or ICETOOL that reads the sorted data and, for each customer, reads the 12 rows and builds one wide output record. Many shops use a program for complex pivoting because it is easier to maintain than multiple OUTFIL and join steps.

Strategy Summary

Pivoting approaches in DFSORT
GoalApproach
One row per key with totalsSORT FIELDS=(key) SUM FIELDS=(pos,len,fmt,...); one aggregated record per key.
One summary line per group in a reportSORT by group key; OUTFIL SECTIONS=(key) TRAILER3=(COUNT=, TOTAL=, ...); use NODETAIL for summary-only.
One row per key with multiple value columnsMultiple passes or OUTFIL copies (one per "column" with INCLUDE + BUILD), then join; or use a program.

Explain It Like I'm Five

Imagine you have a list of lemonade sales: one line for each day of the week. Pivoting can mean two things. First: instead of 7 lines, you want one line that says “Total sales: $20.” So you “squash” all the lines into one line that has the total. That’s what SUM does: many rows with the same label become one row with a total. Second: you want one line that has Monday in one box, Tuesday in another, and so on—so the days become columns. DFSORT is really good at the first (totals). For the second (days as columns), you have to either build it in pieces (one box at a time) or ask another tool (a program) to put all the boxes in one row.

Exercises

  1. Input has (store_id, product_id, quantity). You want one row per store_id with total quantity. Write the SORT and SUM control statements.
  2. You need a report with one line per region showing “Region: xxx Count: nnn Total: ttt.” What OUTFIL feature do you use, and what must you do before it?
  3. Why is turning 12 rows (one per month) into one row with 12 columns harder in DFSORT than turning many rows into one row with a total?

Quiz

Test Your Knowledge

1. What is "pivoting" in the context of DFSORT and batch data?

  • Sorting by a key
  • Transforming row-based data (e.g. one row per customer per month) into a column-based layout (e.g. one row per customer with a column per month) or aggregating by a key to produce summary rows
  • Copying only
  • Merging two files

2. What DFSORT feature helps you build one summary row per group (e.g. one row per department with a count and total)?

  • INCLUDE only
  • OUTFIL with SECTIONS= and TRAILER3—sort by the group key, then use SECTIONS= with TRAILER3 to write a summary line (COUNT=, TOTAL=, etc.) per group; optionally NODETAIL to suppress detail rows
  • MERGE only
  • OMIT only

3. You have one record per (customer, month) with a sales amount. You want one record per customer with 12 columns (Jan–Dec). What is the main challenge in pure DFSORT?

  • DFSORT cannot sort by customer
  • Pure DFSORT does not have a single built-in "pivot" that turns N rows into one row with N columns; you typically need multiple passes (e.g. one pass per month or one OUTFIL per month with INCLUDE) or a program/ICETOOL to assemble the wide record
  • SUM cannot sum
  • OUTFIL cannot write multiple columns

4. When is SUM useful for a "pivoted" style output?

  • Only for removing duplicates
  • When you want one row per key with numeric totals (e.g. one row per region with sum of sales)—SUM FIELDS= sums numeric fields and collapses to one record per sort key; the result is an aggregated row per key
  • Only for MERGE
  • SUM cannot aggregate

5. What is the role of SORT FIELDS= when building a pivoted or summary report?

  • Optional
  • You must sort by the key that defines the groups (rows) in the pivoted output—so that all records for the same key are adjacent for SUM or SECTIONS= to work correctly
  • Only for COPY
  • Only the first field matters