MainframeMaster

Multi-File Merges

DFSORT MERGE can combine more than two input streams in one step. You allocate one DD per input: SORTIN01, SORTIN02, SORTIN03, and so on (the exact maximum, e.g. SORTIN16 or higher, depends on your product). Each input must be pre-sorted by the same key as MERGE FIELDS=. This page covers how to set up and run multi-file merges: JCL, DD naming, whether input order matters, and typical use cases like combining daily files or parallel sort outputs.

DD Names for Multiple Inputs

For a two-input merge you use SORTIN01 and SORTIN02. For three inputs you add SORTIN03, for four SORTIN04, and so on. The pattern is SORTINnn, where nn is a two-digit number (01, 02, 03, …). Allocate one SORTINnn DD per input dataset. DFSORT reads each DD as a separate stream and merges the streams by the key you specify in MERGE FIELDS=. You do not use SORTIN for MERGE; SORTIN is the input DD for the SORT and COPY operations.

DD names by number of merge inputs

| Number of inputs | DD names                               |
|------------------|----------------------------------------|
| 2                | SORTIN01, SORTIN02                     |
| 3                | SORTIN01, SORTIN02, SORTIN03           |
| 4                | SORTIN01, SORTIN02, SORTIN03, SORTIN04 |
| n (up to limit)  | SORTIN01 through SORTINnn              |

Does the Order of Inputs Matter?

Which file you assign to SORTIN01, SORTIN02, or SORTIN03 does not change the key order of the merged output. MERGE always produces output in key order. At each step it compares the current record from each stream and writes the one that comes next in the sort key (smallest for ascending, largest for descending). So whether a record came from SORTIN01 or SORTIN05 does not matter; only its key value matters. You can assign the inputs to SORTIN01–SORTINnn in any order that is convenient (e.g. by date, by region, or alphabetically by dataset name). The one place the assignment can show up is the relative order of records with equal keys, which depends on the EQUALS option; if that order matters to you, check how it is set at your installation.

Maximum Number of Inputs

The maximum number of merge inputs depends on your DFSORT product level and installation. Many installations support at least SORTIN01 through SORTIN16 (16 inputs). Some support more. If you need to merge more files than the limit, you can do it in stages: for example, merge the first 16 into one file, then merge that result with the next batch, and so on. Or merge 8 and 8, then merge the two results. Check your shop’s documentation or ask your systems programmer for the exact limit.
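The staged approach can be sketched in JCL. This is a minimal illustration, not a tested job: it pretends the limit is three inputs so the listing stays short, and the dataset names (HLQ.FILEA.SORTED and so on), the temporary name &&GROUP1, UNIT=SYSDA, and the SPACE values are all assumptions. The JOB statement is omitted, as in the other examples on this page.

```jcl
//* Stage 1: merge the first group into a temporary dataset
//STAGE1   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=HLQ.FILEA.SORTED,DISP=SHR
//SORTIN02 DD DSN=HLQ.FILEB.SORTED,DISP=SHR
//SORTIN03 DD DSN=HLQ.FILEC.SORTED,DISP=SHR
//SORTOUT  DD DSN=&&GROUP1,DISP=(NEW,PASS),
//         UNIT=SYSDA,SPACE=(CYL,(20,10))
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
//* Stage 2: merge the stage 1 result with the remaining files
//STAGE2   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=&&GROUP1,DISP=(OLD,DELETE)
//SORTIN02 DD DSN=HLQ.FILED.SORTED,DISP=SHR
//SORTIN03 DD DSN=HLQ.FILEE.SORTED,DISP=SHR
//SORTOUT  DD DSN=HLQ.ALL.MERGED,DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(CYL,(40,20))
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
```

Both stages use the same MERGE FIELDS= key. The stage 1 output is itself a sorted stream, so stage 2 treats it like any other pre-sorted input.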

Example: Merging Four Sorted Files

Suppose you have four datasets, each sorted by bytes 1–10 character ascending. You want one combined output in the same order.

```jcl
//MERGE4   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=REGION.NORTH.SORTED,DISP=SHR
//SORTIN02 DD DSN=REGION.SOUTH.SORTED,DISP=SHR
//SORTIN03 DD DSN=REGION.EAST.SORTED,DISP=SHR
//SORTIN04 DD DSN=REGION.WEST.SORTED,DISP=SHR
//SORTOUT  DD DSN=ALL.REGIONS.MERGED,DISP=(NEW,CATLG,DELETE),
//         SPACE=(CYL,(20,10)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
```

Each of the four inputs must be sorted by bytes 1–10 character ascending. The merge step reads from all four streams and writes one output in key order. Which region is SORTIN01 vs SORTIN04 does not affect the final sequence; only the key in positions 1–10 determines order.

Use Case: Daily Files to Weekly

A common pattern is to have one sorted file per day (e.g. MON.SORTED, TUE.SORTED, …) and to merge them into one weekly file. Each daily file is already sorted by the same key (e.g. customer ID or transaction time). You allocate SORTIN01 through SORTIN07 (or SORTIN01–SORTIN05 for weekdays only), point each to the corresponding daily dataset, and run MERGE FIELDS= with that key. One MERGE step produces the combined weekly file in order. The same idea applies to merging monthly files into quarterly or yearly.
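As a rough sketch of the daily-to-weekly step, the JCL below merges seven daily files. It assumes the key is a customer ID in bytes 1–8, character ascending; that key position, the dataset names beyond MON.SORTED and TUE.SORTED, the output name WEEK.MERGED, and the UNIT and SPACE values are illustrative assumptions.

```jcl
//WEEKLY   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MON.SORTED,DISP=SHR
//SORTIN02 DD DSN=TUE.SORTED,DISP=SHR
//SORTIN03 DD DSN=WED.SORTED,DISP=SHR
//SORTIN04 DD DSN=THU.SORTED,DISP=SHR
//SORTIN05 DD DSN=FRI.SORTED,DISP=SHR
//SORTIN06 DD DSN=SAT.SORTED,DISP=SHR
//SORTIN07 DD DSN=SUN.SORTED,DISP=SHR
//SORTOUT  DD DSN=WEEK.MERGED,DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(CYL,(50,20))
//SYSIN    DD *
  MERGE FIELDS=(1,8,CH,A)
/*
```

For a weekdays-only merge, drop the SORTIN06 and SORTIN07 DD statements; the MERGE statement stays the same.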

Use Case: Parallel Sort Then Merge

When data is very large, some jobs split the input into partitions, run a SORT on each partition (e.g. in parallel or in separate steps), and then MERGE the sorted partitions. For example, 8 partitions produce 8 sorted datasets; you then MERGE them with SORTIN01–SORTIN08. That way the heavy work (full sort) is done in smaller pieces, and the final MERGE is a single pass that only combines already-sorted streams. Multi-file MERGE is what makes this pattern work.
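The pattern might look like the following sketch, reduced to two partitions for brevity. The partition dataset names, UNIT, and SPACE values are assumptions, and sort work space is left to installation defaults. The partition sorts are shown as steps in one job, which run one after another; running them in parallel would mean submitting them as separate jobs.

```jcl
//* Sort each partition separately; note SORTIN and SORT FIELDS=
//SORTP1   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=BIG.PART1.UNSORTED,DISP=SHR
//SORTOUT  DD DSN=BIG.PART1.SORTED,DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(CYL,(50,20))
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
//SORTP2   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=BIG.PART2.UNSORTED,DISP=SHR
//SORTOUT  DD DSN=BIG.PART2.SORTED,DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(CYL,(50,20))
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
//* Merge the sorted partitions; note SORTINnn and MERGE FIELDS=
//MERGEALL EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=BIG.PART1.SORTED,DISP=SHR
//SORTIN02 DD DSN=BIG.PART2.SORTED,DISP=SHR
//SORTOUT  DD DSN=BIG.ALL.SORTED,DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(CYL,(100,40))
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
```

The key in MERGE FIELDS= has to match the key in each SORT FIELDS= exactly (position, length, format, and direction); otherwise the merge inputs are not in the order MERGE expects.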

INCLUDE, OMIT, INREC, OUTREC with Multi-File MERGE

Filtering and reformatting work the same as with a two-input MERGE. INCLUDE or OMIT drops records before they enter the merge, and the same condition is applied to records from every input. INREC reformats each record before the merge; OUTREC reformats after the merge, when writing to SORTOUT. These statements apply uniformly to all inputs, so a single MERGE step cannot use a different filter or layout per stream; if one input needs its own filtering or reformatting, do that in an earlier step. If INREC moves or changes the key, MERGE FIELDS= must describe the key as it appears after INREC. The number of inputs does not change how these statements behave.
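A minimal sketch of INCLUDE and OUTREC in a three-input merge follows. It assumes 80-byte fixed-length records with the key in bytes 1–10 and a one-byte status code in byte 11, where 'A' marks an active record; the status field, the output dataset name, and the UNIT and SPACE values are hypothetical.

```jcl
//MERGEFLT EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=REGION.NORTH.SORTED,DISP=SHR
//SORTIN02 DD DSN=REGION.SOUTH.SORTED,DISP=SHR
//SORTIN03 DD DSN=REGION.EAST.SORTED,DISP=SHR
//SORTOUT  DD DSN=ACTIVE.REGIONS.MERGED,DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(CYL,(20,10))
//SYSIN    DD *
* Keep only records with 'A' in byte 11, from every input
  INCLUDE COND=(11,1,CH,EQ,C'A')
  MERGE FIELDS=(1,10,CH,A)
* After the merge: key, one blank, status byte, padded to 80
  OUTREC BUILD=(1,10,X,11,1,80:X)
/*
```

The INCLUDE condition is evaluated for records from all three inputs before they are merged. Because OUTREC runs after the merge, it can rearrange the record without affecting the key that MERGE FIELDS= refers to.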

Explain It Like I'm Five

You have four stacks of cards, and each stack is already in A-to-Z order. You want one big stack that is still A-to-Z. You don’t care which small stack a card came from. You just look at the top card of each of the four stacks, pick the one that comes first in the alphabet, put it on the result pile, and repeat. That’s a multi-file merge. It doesn’t matter if stack 1 is Monday’s cards and stack 4 is Thursday’s—the result is one combined A-to-Z pile. MERGE with SORTIN01 through SORTIN04 does the same thing with four (or more) files.

Exercises

  1. You have five sorted files. Write the JCL DD statements for the five merge inputs (dataset names can be FILE1 through FILE5).
  2. If your installation allows up to 16 merge inputs and you have 20 sorted files, how could you still produce one merged output? (Hint: merge in stages.)
  3. Does assigning the file with the smallest keys to SORTIN01 make the merge faster? Why or why not?
  4. You merge six files by bytes 5–12 PD ascending. One file was sorted by bytes 5–12 CH ascending. What risk do you run? What should you do?

Quiz

Test Your Knowledge

1. To merge four sorted files, which DD names do you use?

  • SORTIN and SORTIN02, SORTIN03, SORTIN04
  • SORTIN01, SORTIN02, SORTIN03, SORTIN04
  • SORTIN four times
  • SORTIN01 and SORTIN02 only (concatenate the rest)

2. Does the order of SORTIN01, SORTIN02, etc. affect the merged output order?

  • Yes; records are interleaved in DD order
  • No; MERGE outputs in key order regardless of which DD a record came from
  • Only SORTIN01 order matters
  • Yes; SORTIN01 records always appear first

3. What is the maximum number of merge inputs in DFSORT?

  • 2
  • 16
  • 32
  • It is product-dependent; commonly 16 or more (SORTIN01–SORTIN16); check your installation

4. You have six daily files, each sorted by date. How do you produce one weekly merged file?

  • Concatenate all six into SORTIN and run SORT
  • Allocate SORTIN01 through SORTIN06, each pointing to one daily file, and run MERGE FIELDS= with the date key
  • Run MERGE five times (merge 1+2, then 3, then 4, etc.)
  • Use JOINKEYS

5. Can you use different RECFM or LRECL for each SORTINnn in a MERGE?

  • No; all must be the same
  • Yes; DFSORT merges by key; record format can differ but key position/format must be consistent
  • Only SORTIN01 format is used
  • RECFM must match but LRECL can differ