Transposing records means turning row-based data into column-based layout: values that appear in separate rows (e.g. one record per month per customer) are reorganized so they appear as columns in a single row (e.g. one record per customer with columns for Jan, Feb, Mar, …). In a matrix sense, rows become columns and columns become rows. DFSORT does not have a single built-in "transpose" verb. You can achieve transposition using multiple OUTFIL copies (one per column) with INCLUDE and BUILD, then combine the results with JOINKEYS or a merge step. For variable or large numbers of columns, or when the logic is complex, a COBOL program or ICETOOL that reads sorted data and builds one wide record per key is often simpler and more maintainable. This tutorial explains what transposing means on the mainframe and how to do it with DFSORT.
In batch processing, transposing means reorganizing data so that values that were in multiple rows end up in multiple columns of a single row. For example: input has one record per customer per month (12 rows per customer); you want one record per customer with 12 columns (Jan, Feb, …, Dec). The "row" in the transposed output is identified by a key (e.g. customer ID); the "columns" are the values that came from separate input rows. DFSORT reads and writes records sequentially and does not have one statement that says "collect N records per key and output one record with N columns." So transposition is done by either building one output column at a time with multiple OUTFIL copies and then joining, or by using a program to assemble the wide record in one pass.
Before any transpose logic, sort by the key that will identify each row in the transposed output (e.g. customer ID, account number). That groups all records for the same key together. Without this, you cannot reliably assign which input record becomes column 1, column 2, and so on. Use SORT FIELDS=(start,length,format,direction) with that key. If you need a specific order within each key (e.g. month 1, 2, 3), add a secondary sort key (e.g. month field) so that the order of rows within each key is fixed. Then you can use sequence numbers or the secondary key in OUTFIL INCLUDE to select which record supplies which column.
To select "the first record per key," "the second record per key," and so on, you need a way to identify position within each key group. One method is to add a sequence number in INREC or OUTREC. For example, use OUTREC with IFTHEN to build a sequence number that resets for each key. DFSORT does not have a built-in "record number within group" in control statements; you can use OPTION EQUALS and then rely on the fact that records with the same key are adjacent, and use multiple passes or a user exit. A simpler approach for "month per customer" is to use the month value itself: OUTFIL copy 1 INCLUDE=(month_pos,2,CH,EQ,C'01'), copy 2 INCLUDE=(month_pos,2,CH,EQ,C'02'), and so on. Each copy selects one month and BUILD places the value in a fixed column. So you do not always need an explicit sequence number if you have a field that already indicates the column (e.g. month, period).
For each column in the transposed output, define one OUTFIL. Each OUTFIL uses INCLUDE to select only the records that supply that column (e.g. month=01 for column 1, month=02 for column 2). Then use BUILD= to place the selected value in a fixed position in the output record. All OUTFIL copies can write to different FNAMES (e.g. COL1, COL2, …) or you can use SPLIT or other options to separate them. The result is one dataset per column, each with the same keys and one value per key in a fixed position. A subsequent step (JOINKEYS or a program) can merge these into one record per key with all columns. Example: input has customer ID in 1–10, month in 11–12, amount in 13–20. You want one record per customer with amount for month 01 in bytes 21–28, month 02 in 29–36, etc. Sort by 1,10,CH and 11,2,CH. OUTFIL FNAMES=M01,BUILD=(1,10,21,8,13,8),INCLUDE=(11,2,CH,EQ,C'01'); OUTFIL FNAMES=M02,BUILD=(1,10,29,8,13,8),INCLUDE=(11,2,CH,EQ,C'02'); … Then join M01, M02, … on customer ID to form the wide record.
Input: fixed-length records with key in 1–5, type in 6 (values A, B, C), value in 7–14. Output: one record per key with value for A in 6–13, B in 14–21, C in 22–29. Sort by key and type so that A, B, C are in order within each key.
12345678910SORT FIELDS=(1,5,CH,A,6,1,CH,A) OUTFIL FNAMES=COLA, INCLUDE=(6,1,CH,EQ,C'A'), BUILD=(1,5,1,5,6,8,7,8) OUTFIL FNAMES=COLB, INCLUDE=(6,1,CH,EQ,C'B'), BUILD=(1,5,1,5,14,8,7,8) OUTFIL FNAMES=COLC, INCLUDE=(6,1,CH,EQ,C'C'), BUILD=(1,5,1,5,22,8,7,8)
Each OUTFIL selects one type (A, B, or C) and places the key (1–5) and the value (7–8 bytes) in a fixed position. COLA has key and value in 6–13; COLB has key and value in 14–21; COLC in 22–29. A JOINKEYS step on the key can merge COLA, COLB, COLC into one record per key with three value columns.
The maximum output record length (e.g. 32KB for some datasets) and the number of OUTFIL copies you can maintain limit how many columns you can transpose in pure DFSORT. For a large or variable number of columns, or when the layout is complex, a COBOL program or ICETOOL that reads the sorted input and, for each key, reads the N records and builds one wide output record is usually easier. The program has full control over positions and can handle variable numbers of values per key (e.g. with occurrence counts or delimiters). Some shops use DFSORT for the sort and a small program or ICETOOL for the actual transpose step.
| Goal | Approach |
|---|---|
| Fixed N rows → one row with N columns | Sort by row key. N OUTFIL copies: each INCLUDE selects Nth value (by sequence or field), BUILD to fixed column. Join/merge results. |
| Variable or many columns | Use a COBOL program or ICETOOL to read sorted input and build one wide record per key in one pass. |
| Add sequence per key for column selection | INREC or OUTREC to add sequence number (e.g. 1,2,3…) within each key group so OUTFIL can select by position. |
Imagine you have a list of fruits: one line says "apple," the next "banana," the next "cherry." Transposing is like turning that list sideways: instead of three lines, you want one line that says "apple banana cherry" in three boxes next to each other. DFSORT can do that by making three separate small lists: one that has only the apple in the first box, one that has only the banana in the second box, one that has only the cherry in the third box. Then you glue those three lists together so one row has all three. If you have lots of boxes, it's easier to ask another helper (a program) to do the gluing in one go.
1. What does "transposing" records mean in DFSORT?
2. What is a common DFSORT-only approach for transposing a fixed number of rows into one row with that many columns?
3. When might you use a COBOL program or ICETOOL instead of pure DFSORT for transposing?
4. What must you do before transposing with multiple OUTFIL copies?
5. What limits how many columns you can transpose to in pure DFSORT?