SPLIT Datasets

Splitting output into multiple datasets in DFSORT means writing the sorted (or copied) stream to more than one file, with records distributed either by count (e.g. the first 500 to one file, the next 500 to another) or by key or condition (e.g. all DEPT1 records to one file, DEPT2 to another). For count-based splitting you use SPLIT, SPLITBY=n, or SPLIT1R=n together with FNAMES= listing multiple DD names. For key-based splitting you use multiple OUTFIL statements, each with INCLUDE= (or OMIT=), so that only records matching that condition go to that file. Both approaches require a DD statement in JCL for each output dataset. This page covers when to split by count vs by key, SPLIT/SPLITBY/SPLIT1R basics, and splitting by key with INCLUDE/OMIT. The SPLITBY topic covers SPLITBY and SPLIT1R in more detail.


Why Split Output?

You may need to split one large output into several smaller files for several reasons:

  • Size limits: a downstream program or interface accepts only a fixed number of records per file (e.g. 10,000), so you break the sorted output into 10,000-record chunks.
  • Organization by key: you want one file per department, region, or record type so that each file contains only records with that value.
  • Parallel processing: multiple small files can be processed in parallel by later steps.
  • Distribution: different files go to different destinations or applications.

DFSORT supports both count-based splitting (SPLIT, SPLITBY, SPLIT1R) and condition-based splitting (multiple OUTFILs with INCLUDE/OMIT).

Ways to split output

| Method | Description | Typical use |
| --- | --- | --- |
| SPLIT | Takes no count; writes one record at a time to each DD in FNAMES=(DD1,DD2,...) in turn (equivalent to SPLITBY=1) | Spreading records evenly across a fixed number of files |
| SPLITBY=n | Distributes blocks of n records across FNAMES=(DD1,DD2,...); cycles through the DDs | Rotating blocks across multiple files |
| SPLIT1R=n | Contiguous blocks of n records per file, one pass through the DD list; remainder goes to the last file | Fixed-size contiguous chunks per file |
| Multiple OUTFIL + INCLUDE | Each OUTFIL has INCLUDE=(key condition); different key values go to different files | One file per department, region, or record type |

Splitting by Record Count

When you split by count, records are distributed by position in the stream: the first n records go to the first file, the next n to the second, and so on. The exact keyword and syntax depend on your DFSORT product. Common patterns:

  • SPLIT: Takes no count. With FNAMES=(OUT1,OUT2,...) it writes one record at a time to each DD in turn: record 1 to OUT1, record 2 to OUT2, and so on, then back to OUT1. It behaves like SPLITBY=1.
  • SPLITBY=n: With FNAMES=(OUT1,OUT2,OUT3), the first n records go to OUT1, the next n to OUT2, the next n to OUT3, then the next n go to OUT1 again, so blocks of n records rotate across the files. Each OUTFIL statement is processed independently, so the rotation applies only among the DD names listed in that one statement. To stop a single-DD OUTFIL after a fixed number of records, use STARTREC= and ENDREC= instead.
  • SPLIT1R=n: Like SPLITBY=n but for one rotation only, which gives contiguous chunks: the first n to the first file, the next n to the second, and so on. Any records left after every file has received n go to the last file. See your manual for SPLIT1R.
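
As a quick trace of the three count-based keywords, the sketch below assumes IBM DFSORT semantics (SPLIT takes no count and behaves like SPLITBY=1) and a hypothetical seven-record input, R1 through R7, with three output DDs. The lines starting with * are comment statements showing where each record would land; they illustrate the rules above and are not output from a real run.

```text
* Assumed input: 7 records (R1..R7), three output DDs.
*
* OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLIT
*   OUT1: R1 R4 R7     OUT2: R2 R5     OUT3: R3 R6
*
* OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLITBY=2
*   OUT1: R1 R2 R7     OUT2: R3 R4     OUT3: R5 R6
*
* OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLIT1R=2
*   OUT1: R1 R2        OUT2: R3 R4     OUT3: R5 R6 R7
```

Only SPLIT1R keeps each file contiguous; SPLIT and SPLITBY return to OUT1 once the last DD has had its turn.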

Example: First 500 to OUT1, Next 500 to OUT2

Using one OUTFIL with SPLIT1R so that the first 500 records go to OUT1 and the next 500 to OUT2 (any records beyond the first 1,000 also go to OUT2, the last file):

```text
  SORT FIELDS=COPY
  OUTFIL FNAMES=(OUT1,OUT2),SPLIT1R=500
```

Or with STARTREC and ENDREC on two OUTFILs (first 500 to OUT1, remainder to OUT2):

```text
  OUTFIL FNAMES=OUT1,ENDREC=500
  OUTFIL FNAMES=OUT2,STARTREC=501
```

Your JCL must define //OUT1 DD ... and //OUT2 DD ... with appropriate DSN and LRECL.

Example: Rotating Blocks with SPLITBY and Multiple FNAMES

To send blocks of 500 records to OUT1, then OUT2, then OUT3, then repeat:

```text
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLITBY=500
```

Records 1–500 go to OUT1, 501–1000 to OUT2, 1001–1500 to OUT3, 1501–2000 to OUT1, and so on. So each file gets every third block of 500. Exact behavior is product-dependent; check your DFSORT documentation.
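
The rotation can also be written as simple arithmetic. This is a sketch under the cycling behavior described above; r (1-based record number), n (block size), and k (number of DD names in FNAMES=) are names introduced here for illustration.

```text
block(r) = ceil(r / n)
file(r)  = ((block(r) - 1) mod k) + 1

n = 500, k = 3:
  r = 1200:  block = ceil(1200 / 500) = 3
             file  = ((3 - 1) mod 3) + 1 = 3   -> OUT3
  r = 1501:  block = ceil(1501 / 500) = 4
             file  = ((4 - 1) mod 3) + 1 = 1   -> OUT1
```

Both results agree with the ranges listed above: record 1200 falls in 1001–1500 (OUT3) and record 1501 starts the second pass through OUT1.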

Splitting by Key (Condition)

To split by key (e.g. one file per department), use multiple OUTFIL statements. Each OUTFIL has FNAMES= for one output and INCLUDE=(position, length, format, operator, value) so only records that match go to that file. Example: bytes 30–34 contain department code.

```text
  SORT FIELDS=(1,10,CH,A)
  OUTFIL FNAMES=DEPT1,INCLUDE=(30,5,CH,EQ,C'DEPT1'),BUILD=(1,80)
  OUTFIL FNAMES=DEPT2,INCLUDE=(30,5,CH,EQ,C'DEPT2'),BUILD=(1,80)
  OUTFIL FNAMES=DEPT3,INCLUDE=(30,5,CH,EQ,C'DEPT3'),BUILD=(1,80)
  OUTFIL FNAMES=OTHER,SAVE,BUILD=(1,80)
```

Records with DEPT1 in 30–34 go to DEPT1; DEPT2 to DEPT2; DEPT3 to DEPT3; any other department goes to OTHER (SAVE). So you get four datasets split by the value in the control field. This is splitting by key, not by count—each file can have a different number of records.
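
When only one value needs its own file, INCLUDE= and OMIT= with the same condition give a two-way split. A minimal sketch, assuming the same department code in bytes 30–34; the DD name REST is hypothetical:

```text
* DEPT1 records go to DEPT1; every other record goes to REST.
  SORT FIELDS=(1,10,CH,A)
  OUTFIL FNAMES=DEPT1,INCLUDE=(30,5,CH,EQ,C'DEPT1')
  OUTFIL FNAMES=REST,OMIT=(30,5,CH,EQ,C'DEPT1')
```

Because the two conditions are exact opposites, each record lands in exactly one of the two files. With more than two targets, the SAVE form shown above avoids having to spell out the combined negative condition.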

Count vs Key: When to Use Which

  • Use count-based split when you need fixed-size chunks (e.g. 10,000 records per file for an API or transmission limit), or when you want to distribute records evenly across a fixed number of files regardless of content.
  • Use key-based split when you need one file per value (department, region, record type). Each file then contains only records with that value, and file sizes can vary.

JCL for Split Outputs

For every DD name used in FNAMES= (whether in one OUTFIL with a list or in multiple OUTFILs), you must have a DD statement in the same step. Specify DSN=, DISP=, and LRECL (and RECFM, SPACE) to match the record length and format that DFSORT will write. If you use BUILD= on the OUTFIL, the written length is the length of the built record; otherwise it is the full record length.
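
Putting the pieces together, a complete step might look like the sketch below. The step name, dataset names (HLQ.*), space values, and the 80-byte fixed-length record format are all assumptions for illustration; adjust them to your site standards and actual record layout.

```text
//SPLITSTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Input dataset (hypothetical name)
//SORTIN   DD DSN=HLQ.INPUT.DATA,DISP=SHR
//* One DD statement per name listed in FNAMES=
//OUT1     DD DSN=HLQ.SPLIT.OUT1,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE),
//            DCB=(RECFM=FB,LRECL=80)
//OUT2     DD DSN=HLQ.SPLIT.OUT2,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE),
//            DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=(OUT1,OUT2),SPLITBY=500
/*
```

If a DD name appears in FNAMES= but has no matching DD statement in the step, the step will not run as intended, so it is worth checking the two lists against each other before submitting.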

Explain It Like I'm Five

Imagine you have one long line of sorted cards and several boxes. Split by count: "Put the first 500 cards in box 1, the next 500 in box 2, the next 500 in box 1 again," and so on. Split by key: "Put all cards that say DEPT1 in box 1, all that say DEPT2 in box 2, and everything else in box 3." So SPLIT datasets = dividing one output into many files either by how many records go in each (count) or by what the record says (key).

Exercises

  1. You need the first 2000 records in file A and the next 2000 in file B. Write the OUTFIL (or OUTFILs) using SPLIT1R, or STARTREC and ENDREC.
  2. You have record type in bytes 1–2 ('01', '02', '03'). Write three OUTFILs so that each type goes to a different file (FILE01, FILE02, FILE03).
  3. What is the difference between splitting by count and splitting by key? Give one business reason for each.
  4. Look up your DFSORT manual: does SPLITBY with FNAMES=(A,B,C) cycle (A, B, C, A, B, C...) or only fill A then B then C once?

Quiz

Test Your Knowledge

1. What are the two main ways to split DFSORT output into multiple datasets?

  • Only SPLIT
  • Splitting by record count (SPLIT, SPLITBY, SPLIT1R) and splitting by key or condition (multiple OUTFILs with INCLUDE/OMIT)
  • Only INCLUDE
  • Only FNAMES

2. What does SPLIT1R=500 do with FNAMES=(OUT1,OUT2)?

  • Same as SPLITBY
  • First 500 records go to OUT1; the next 500, plus any remainder, go to OUT2
  • Round-robin
  • Only one file gets data

3. How do you split by key (e.g. one file per department)?

  • Use SPLITBY only
  • Use multiple OUTFIL statements, each with INCLUDE=(position,length,CH,EQ,C'DEPT1') etc., and different FNAMES=
  • Use SPLIT1R
  • Use one OUTFIL only

4. What is the purpose of listing multiple DD names in FNAMES=(OUT1,OUT2,OUT3) with SPLITBY?

  • Only one is used
  • Records are distributed across OUT1, OUT2, OUT3—e.g. first n to OUT1, next n to OUT2, next n to OUT3, then repeat
  • All get the same data
  • SPLITBY ignores FNAMES

5. When would you split by count instead of by key?

  • Never
  • When you need fixed-size chunks: e.g. break a large file into 10,000-record files for transmission or downstream limits, regardless of record content
  • Only for reports
  • Only when key is unknown