MainframeMaster

Splitting Records in DFSORT

Splitting records in DFSORT means writing the sorted (or copied) output to multiple output datasets instead of a single SORTOUT. You use the OUTFIL statement with FNAMES= to list the DD names of the output files and a split option to control how records are distributed: SPLIT (round-robin one record at a time), SPLIT=n (blocks of n in round-robin), SPLITBY=n (contiguous blocks of n cycling through the files), or SPLIT1R=n (first n records to the first file, next n to the second, and so on). Each option produces a different distribution—interleaved vs contiguous—so choosing the right one matters for downstream processing. This page explains each option, the syntax with FNAMES, how to combine splitting with BUILD/FIELDS, and typical use cases.


Why Split Output?

Sometimes you need sorted (or copied) data distributed across more than one dataset. Examples: sending the first 10,000 records to one application and the rest to another; dividing a large file into smaller chunks of equal size for parallel processing; or alternating records between two files for a round-robin workload. An OUTFIL statement with FNAMES writes to the DD names you list rather than to SORTOUT, and each OUTFIL defines one or more output streams. By specifying FNAMES=(dd1,dd2,...) and a split option, you control how many output files there are and how records are distributed among them.

OUTFIL and FNAMES

OUTFIL can include FNAMES= to name the output dataset(s). FNAMES=(OUT1,OUT2) means two output files; the DD names OUT1 and OUT2 must be defined in your JCL (e.g. //OUT1 DD ..., //OUT2 DD ...). With a single output you can use FNAMES=OUT1. When you list multiple names without a split option, every record is written to each listed file, so the files end up as identical copies. To distribute the records instead, add a split option: SPLIT, SPLIT=n, SPLITBY=n, or SPLIT1R=n.

SPLIT (Round-Robin by Record)

SPLIT with no number distributes records one at a time in round-robin order. With FNAMES=(A,B), record 1 goes to A, record 2 to B, record 3 to A, record 4 to B, and so on. The result is that both files contain interleaved records—every other record from the sorted stream. Use this when you want to divide the workload evenly across two (or more) files by alternating, not by range. Syntax:

  OUTFIL FNAMES=(OUT1,OUT2),SPLIT

Record order within each file is preserved (1st, 3rd, 5th... in OUT1; 2nd, 4th, 6th... in OUT2), but the records are not contiguous by key or by position in the original sorted sequence.
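The alternation can be checked with a small Python model. This is only a sketch of the distribution rule described above, not DFSORT itself; the function name `split_round_robin` is made up for illustration.

```python
def split_round_robin(records, n_files):
    """Model of OUTFIL SPLIT: records are dealt to the files one at a time."""
    files = [[] for _ in range(n_files)]
    for index, record in enumerate(records):
        # 0-based index modulo the file count picks the target file
        files[index % n_files].append(record)
    return files


# Seven records numbered 1..7, two output files (OUT1, OUT2)
out1, out2 = split_round_robin(list(range(1, 8)), 2)
print(out1)  # [1, 3, 5, 7]
print(out2)  # [2, 4, 6]
```

With an odd record count the first file simply ends up one record longer.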

SPLIT=n (Blocks in Round-Robin)

SPLIT=n distributes records in groups of n, cycling through the output files. With FNAMES=(OUT1,OUT2) and SPLIT=500, the first 500 records go to OUT1, the next 500 to OUT2, the next 500 to OUT1, and so on. Each file gets contiguous blocks of 500 records, but the blocks alternate between the two files. This is useful when you want roughly equal-sized chunks and are okay with the chunks being interleaved by file. IBM's DFSORT documentation describes SPLIT, SPLITBY=n, and SPLIT1R=n, so confirm that your sort product accepts SPLIT=n before relying on it; if it does not, SPLITBY=n gives the block round-robin behavior described here.

  OUTFIL FNAMES=(OUT1,OUT2),SPLIT=500

SPLITBY=n (Contiguous Blocks Cycling)

SPLITBY=n writes contiguous blocks of n records and assigns the blocks to the output files in rotation. With FNAMES=(OUT1,OUT2) and SPLITBY=500: first 500 records → OUT1, next 500 → OUT2, next 500 → OUT1, and so on. Each block is contiguous in the sorted order, but a file that receives several blocks has gaps between them. The difference from SPLIT1R is that SPLITBY keeps cycling through the list of files (OUT1, OUT2, OUT1, OUT2, ...), whereas SPLIT1R makes a single pass through the list and never returns to an earlier file.

  OUTFIL FNAMES=(OUT1,OUT2),SPLITBY=500
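To see where each block lands, here is a small Python model of the block rotation described above. It is a sketch of the rule, not DFSORT itself, and `split_by` is a hypothetical helper name.

```python
def split_by(records, n_files, n):
    """Model of OUTFIL SPLITBY=n: blocks of n records rotate through the files."""
    files = [[] for _ in range(n_files)]
    for index, record in enumerate(records):
        block = index // n                 # which block of n this record is in
        files[block % n_files].append(record)
    return files


# 1200 records numbered 1..1200, two files, blocks of 500
out1, out2 = split_by(list(range(1, 1201)), 2, 500)
print(len(out1), out1[0], out1[499], out1[500], out1[-1])  # 700 1 500 1001 1200
print(len(out2), out2[0], out2[-1])                        # 500 501 1000
```

In this example OUT1 holds two separate ranges (1 to 500 and 1001 to 1200), which is the gap that distinguishes SPLITBY from SPLIT1R.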

SPLIT1R=n (Sequential Fill)

SPLIT1R=n writes the first n records to the first file, the next n records to the second file, the next n to the third, and so on, making a single pass through the file list. It never cycles back to the first file; any records left over after every file has received n go to the last file. So with FNAMES=(T1,T2,T3,T4) and SPLIT1R=50: records 1–50 go to T1, 51–100 to T2, 101–150 to T3, 151–200 to T4. If you have 225 records, the last 25 also go to T4, which ends up with 75 records. Use SPLIT1R when you need each output dataset to hold one contiguous range of records, for example the first 1000 to one application and the rest to another, or equal-sized contiguous chunks for parallel jobs.

  OUTFIL FNAMES=(T1,T2,T3,T4),SPLIT1R=50
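The 225-record case can be worked through with a short Python model. This is a sketch that assumes leftover records go to the last file, as described above; `split_1r` is a hypothetical name, and the code only models the rule.

```python
def split_1r(records, n_files, n):
    """Model of OUTFIL SPLIT1R=n: one pass through the files, n records each.

    Assumption: records beyond n * n_files are appended to the last file.
    """
    files = [[] for _ in range(n_files)]
    for index, record in enumerate(records):
        # Cap the file number so overflow stays in the last file
        files[min(index // n, n_files - 1)].append(record)
    return files


t1, t2, t3, t4 = split_1r(list(range(1, 226)), 4, 50)
print(len(t1), len(t2), len(t3), len(t4))  # 50 50 50 75
print(t4[0], t4[-1])                       # 151 225
```

Each file holds one unbroken range of the sorted stream, which is what makes this option suitable for range-based hand-offs.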

Comparison of Split Methods

How each split option distributes records
| Option | Distribution | Typical use |
|---|---|---|
| SPLIT | Round-robin one record at a time | Alternate records across files (1→A, 2→B, 3→A, ...) |
| SPLIT=n | Groups of n records per file in round-robin | Each file gets n records at a time, cycling (e.g. 500 to OUT1, 500 to OUT2, 500 to OUT1, ...) |
| SPLITBY=n | Contiguous blocks of n, cycling through files | First n to file 1, next n to file 2, next n to file 1, etc. |
| SPLIT1R=n | Sequential: first n to file 1, next n to file 2, ... | Contiguous ranges per file; remainder often to last file |

Choose SPLIT for simple alternation (every other record). Choose SPLIT=n or SPLITBY=n when you want contiguous blocks of n that cycle across files. Choose SPLIT1R=n when you want the first n records in the first file, the next n in the second, etc., with no cycling—so each file gets one contiguous segment of the sorted stream.
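For a side-by-side view, the following Python sketch applies the three rules to the same ten records and three files. It models the behavior described on this page rather than running DFSORT; `target_file` and `layout` are hypothetical helpers, and the SPLIT1R branch assumes leftovers go to the last file.

```python
def target_file(index, n_files, option, n=1):
    """Return the 0-based output file for the record at 0-based index."""
    if option == "SPLIT":
        return index % n_files
    if option == "SPLITBY":
        return (index // n) % n_files
    if option == "SPLIT1R":
        return min(index // n, n_files - 1)  # assumption: overflow to last file
    raise ValueError(f"unknown option: {option}")


def layout(total, n_files, option, n=1):
    """List the 1-based record numbers each file would receive."""
    files = [[] for _ in range(n_files)]
    for index in range(total):
        files[target_file(index, n_files, option, n)].append(index + 1)
    return files


print(layout(10, 3, "SPLIT"))       # [[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]
print(layout(10, 3, "SPLITBY", 2))  # [[1, 2, 7, 8], [3, 4, 9, 10], [5, 6]]
print(layout(10, 3, "SPLIT1R", 2))  # [[1, 2], [3, 4], [5, 6, 7, 8, 9, 10]]
```

Only the SPLIT1R layout gives every file a single unbroken range.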

Multiple OUTFIL Statements

You can use more than one OUTFIL; each statement processes the output records independently. For example, to write the first 500 records to one file and the remainder to another, you can use two OUTFIL statements: one with FNAMES=OUT1 and ENDREC=500, and one with FNAMES=OUT2 and STARTREC=501 (omit STARTREC if OUT2 should receive the full set). STOPAFT is not an OUTFIL operand; it belongs to the SORT and OPTION statements and limits how many input records are processed. Alternatively, a single OUTFIL with FNAMES=(OUT1,OUT2) and SPLIT1R=500 sends the first 500 to OUT1 and the rest to OUT2. Check your sort product's manual for how leftover records are handled with SPLIT1R when the record count is not a multiple of n.
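OUTFIL's STARTREC= and ENDREC= operands select records by position, which is one way to get a "first 500 here, the rest there" layout from two OUTFIL statements. The Python sketch below models that selection only; `outfil_range` is a hypothetical helper, and positions are 1-based and inclusive.

```python
def outfil_range(records, startrec=1, endrec=None):
    """Model of OUTFIL STARTREC=/ENDREC=: keep records startrec..endrec."""
    end = len(records) if endrec is None else endrec
    return records[startrec - 1:end]


records = list(range(1, 1201))              # 1200 records numbered 1..1200
out1 = outfil_range(records, endrec=500)    # like OUTFIL FNAMES=OUT1,ENDREC=500
out2 = outfil_range(records, startrec=501)  # like OUTFIL FNAMES=OUT2,STARTREC=501
print(len(out1), out1[-1])  # 500 500
print(len(out2), out2[0])   # 700 501
```

Because each OUTFIL sees the whole stream, the two ranges are chosen independently and can overlap if you want them to.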

Combining Split with BUILD or FIELDS

You can combine FNAMES and a split option with BUILD= or FIELDS= (and other OUTFIL operands). The same reformatting is applied to each record; then the record is written to whichever output file the split logic selects. Example: split into two files and reformat each record:

  OUTFIL FNAMES=(OUT1,OUT2),SPLIT,BUILD=(1,20,25,10)

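Each record is reformatted to 30 bytes (the 20 bytes starting at position 1, followed by the 10 bytes starting at position 25, that is positions 25–34), then the records are written alternately to OUT1 and OUT2.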
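BUILD items of the form position,length use 1-based positions, so BUILD=(1,20,25,10) takes 20 bytes from position 1 and 10 bytes from position 25. The Python sketch below models that reformatting followed by the SPLIT alternation; `build` and `outfil_split_build` are hypothetical names, and fixed-length text records are assumed.

```python
def build(record, fields):
    """Concatenate (position, length) fields; positions are 1-based."""
    return "".join(record[pos - 1:pos - 1 + length] for pos, length in fields)


def outfil_split_build(records, n_files, fields):
    """Reformat every record, then deal the results out one at a time."""
    files = [[] for _ in range(n_files)]
    for index, record in enumerate(records):
        files[index % n_files].append(build(record, fields))
    return files


sample = "0123456789" * 4                 # one 40-byte record
print(build(sample, [(1, 20), (25, 10)])) # 012345678901234567894567890123

records = [str(i) * 40 for i in range(1, 5)]  # four 40-byte records
out1, out2 = outfil_split_build(records, 2, [(1, 20), (25, 10)])
print([len(r) for r in out1], [r[0] for r in out1])  # [30, 30] ['1', '3']
print([len(r) for r in out2], [r[0] for r in out2])  # [30, 30] ['2', '4']
```

The output record length is the sum of the field lengths (30 here), which is the LRECL the output DDs need to accommodate.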

JCL for Multiple Outputs

Each DD name in FNAMES must be defined in your JCL. Example for two output datasets:

//OUT1   DD DSN=MY.DATA.OUT1,DISP=(NEW,CATLG),...
//OUT2   DD DSN=MY.DATA.OUT2,DISP=(NEW,CATLG),...
//SYSIN  DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=(OUT1,OUT2),SPLIT1R=1000
/*

The first 1000 records go to OUT1, the rest to OUT2. LRECL and other DCB attributes should match what OUTFIL produces (e.g. same as input if no BUILD, or the length implied by BUILD).

Explain It Like I'm Five

Imagine you have a long line of toys (records) and two boxes (output files). SPLIT means: put the first toy in box 1, the second in box 2, the third in box 1, the fourth in box 2, taking turns. SPLIT1R=10 means: put the first 10 toys in box 1, then move on to box 2 and never go back, so box 2 gets toy 11 and every toy after it. With more boxes, each box gets 10 toys in turn (1–10, 11–20, and so on) and the last box takes whatever is left. Splitting records in DFSORT is like that: you choose whether to take turns per toy or fill one box at a time.

Exercises

  1. You have 2500 records and FNAMES=(A,B),SPLIT1R=1000. How many records go to A and how many to B?
  2. With FNAMES=(X,Y),SPLIT and 100 records, how many records are in X and in Y?
  3. When would you use SPLIT instead of SPLIT1R?
  4. Write OUTFIL control statements to send the first 500 records to FILE1 and the next 500 to FILE2, then repeat (next 500 to FILE1, next 500 to FILE2) for the rest. Which option do you use?

Quiz

Test Your Knowledge

1. With OUTFIL FNAMES=(A,B),SPLIT (no number), how are records distributed?

  • First half to A, second half to B
  • Round-robin: record 1→A, 2→B, 3→A, 4→B, ...
  • All to A then all to B
  • Only first record to each

2. What is the main difference between SPLITBY=500 and SPLIT1R=500?

  • They are the same
  • SPLITBY cycles through files every 500 records (OUT1, then OUT2, then OUT1...); SPLIT1R fills the first file with 500, then the next with 500, etc., contiguously
  • SPLIT1R is round-robin
  • SPLITBY uses only one file

3. If you have OUTFIL FNAMES=(OUT1,OUT2),SPLIT=500, how many records go to OUT1 and OUT2?

  • 500 each
  • It depends on total count: 500 records go to OUT1, next 500 to OUT2, next 500 to OUT1, etc., in groups of 500
  • All to OUT1
  • SPLIT=500 is invalid

4. Why would you use SPLIT1R instead of SPLIT?

  • SPLIT1R is faster
  • When you need each output dataset to contain a contiguous range of records (e.g. first 1000 to file 1, next 1000 to file 2) rather than interleaved records
  • SPLIT is deprecated
  • Only SPLIT1R supports more than two files

5. How do you specify multiple output datasets in OUTFIL?

  • Multiple OUTFIL statements only
  • FNAMES=(ddname1,ddname2,...) or a single ddname; each name corresponds to one output dataset
  • SPLIT only
  • OUTREC FIELDS=