DFSORT MERGE combines two or more inputs that are each already sorted by the same key. For the merge to produce correct output, every input must be in key order: same key position, length, format, and direction. This page explains how to prepare and verify datasets for MERGE: matching the key across inputs, when to run a prior SORT, and how to avoid wrong order or inconsistent results.
Pre-sorted means that within each input dataset, every record is in ascending (or descending) order by the merge key. The merge key is what you specify in MERGE FIELDS=: start position, length, format (CH, PD, ZD, BI, etc.), and direction (A or D). For example, if you code MERGE FIELDS=(1,10,CH,A), then each input must be sorted so that the bytes in positions 1–10 (interpreted as character) never decrease from one record to the next. DFSORT does not re-sort during a merge; it only reads the next record from each input and writes the one that comes first in key order. So if any input is out of order, the merge cannot produce correctly ordered output, and DFSORT may end the step with an out-of-sequence error (message ICE068A) instead of completing.
All inputs must be sorted by the same key. That means every input uses the same key start position and length, the same format, and the same direction.
So when you prepare inputs, use the exact same key specification (position, length, format, direction) in every SORT or process that produces those inputs, and then use that same specification in MERGE FIELDS=.
| Step | Action | Detail |
|---|---|---|
| 1 | Decide the merge key | Same position, length, format, and direction for MERGE FIELDS= |
| 2 | Ensure each input is sorted by that key | Run SORT with same SORT FIELDS= on any unsorted input |
| 3 | Use same key in MERGE FIELDS= | Match exactly what each input was sorted by |
| 4 | Allocate SORTIN01, SORTIN02, … | One DD per pre-sorted input |
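Run a SORT step on an input when it is unsorted, when it was sorted by a different key (position, length, format, or direction) than the one in MERGE FIELDS=, or when you cannot confirm how it was sorted.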
You do not need to sort an input again if you are certain it is already in the correct order by the same key as MERGE FIELDS=. For example, if a previous step in the same job produced the dataset with a SORT FIELDS=(1,10,CH,A) and your MERGE uses MERGE FIELDS=(1,10,CH,A), you can use it directly as SORTIN01 or SORTIN02.
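The case above, where one input is sorted earlier in the same job and another is already known to be in order, might be sketched as follows. This is a minimal illustration, not a definitive job: the dataset names (MY.NEW.INPUT, MY.MASTER.SORTED, MY.MASTER.MERGED), the temporary dataset &&NEWSRT, UNIT=SYSDA, and the space values are all hypothetical, and it assumes both inputs have the same record format and length and that DFSORT takes the output attributes from the input when no DCB is coded.

```jcl
//* Sketch only: dataset names, unit, and space values are hypothetical.
//* Step 1 sorts the new input by the merge key into a passed dataset.
//SORTNEW  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.NEW.INPUT,DISP=SHR
//SORTOUT  DD DSN=&&NEWSRT,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(5,2))
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
//* Step 2 merges it with a dataset already sorted by (1,10,CH,A).
//MERGE2   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=&&NEWSRT,DISP=(OLD,DELETE)
//SORTIN02 DD DSN=MY.MASTER.SORTED,DISP=SHR
//SORTOUT  DD DSN=MY.MASTER.MERGED,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(10,5))
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
```

Only the new input is sorted here. The existing dataset is used as-is because it is assumed to have been produced with the same key specification.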
Suppose you have two unsorted files, PART1 and PART2. You want one output merged by bytes 1–6 character ascending. First sort each part, then merge.
    //SORT1    EXEC PGM=SORT
    //SYSOUT   DD SYSOUT=*
    //SORTIN   DD DSN=MY.PART1,DISP=SHR
    //SORTOUT  DD DSN=MY.SORTED.PART1,DISP=(NEW,CATLG,DELETE),
    //            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
    //SYSIN    DD *
      SORT FIELDS=(1,6,CH,A)
    /*
    //SORT2    EXEC PGM=SORT
    //SYSOUT   DD SYSOUT=*
    //SORTIN   DD DSN=MY.PART2,DISP=SHR
    //SORTOUT  DD DSN=MY.SORTED.PART2,DISP=(NEW,CATLG,DELETE),
    //            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
    //SYSIN    DD *
      SORT FIELDS=(1,6,CH,A)
    /*
Both SORT steps use the same key: 1,6,CH,A. So MY.SORTED.PART1 and MY.SORTED.PART2 are both in order by bytes 1–6 character ascending.
    //MERGE1   EXEC PGM=SORT
    //SYSOUT   DD SYSOUT=*
    //SORTIN01 DD DSN=MY.SORTED.PART1,DISP=SHR
    //SORTIN02 DD DSN=MY.SORTED.PART2,DISP=SHR
    //SORTOUT  DD DSN=MY.MERGED.OUTPUT,DISP=(NEW,CATLG,DELETE),
    //            SPACE=(CYL,(10,5)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
    //SYSIN    DD *
      MERGE FIELDS=(1,6,CH,A)
    /*
MERGE FIELDS=(1,6,CH,A) matches the SORT FIELDS= used in both SORT steps (SORT1 and SORT2). The merge step only combines the two streams; it does not re-sort.
Format: If one input was sorted with format CH (character) and another with a numeric format such as PD (packed decimal) for the same byte positions, the resulting order can differ, because CH compares bytes from left to right while numeric formats compare values. For example, as left-justified character data "10" sorts before "2" (the first byte "1" is lower than "2"), whereas numerically 2 comes before 10. So when you prepare inputs, use the same format in every SORT as in MERGE FIELDS=.
Direction: Ascending (A) means smallest key first; descending (D) means largest key first. MERGE assumes all streams are in the same direction. If one input is A and another D, the “next” record from the D stream is actually moving the wrong way relative to the A stream, and the merged sequence will be incorrect. Always use the same direction when sorting each input.
If you are not sure whether a dataset is sorted correctly, the safest approach is to run a SORT step on it with the exact MERGE FIELDS= key. That guarantees correct order for the MERGE. Some shops use a separate “validation” step that sorts the input and compares the result with the original (for example, by checksum) to detect whether the input was already in order; a record count alone cannot show this, because sorting does not change the number of records. In practice, re-sorting with the merge key is a simple and reliable way to prepare data for MERGE.
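If re-sorting a large dataset is too expensive, one alternative sometimes used is a sequence check that runs MERGE against a single input and discards the output. The sketch below rests on assumptions you should confirm in your installation's DFSORT documentation: that a merge with only SORTIN01 is accepted, and that DFSORT reports an out-of-sequence record with message ICE068A and a nonzero return code. The step name CHKSEQ is hypothetical, and the dataset name is reused from the example above.

```jcl
//* Hypothetical sequence check: merge one input and discard the output.
//* Assumption: an out-of-sequence record produces message ICE068A
//* and a nonzero return code for this step.
//CHKSEQ   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.SORTED.PART1,DISP=SHR
//SORTOUT  DD DUMMY
//SYSIN    DD *
  MERGE FIELDS=(1,6,CH,A)
/*
```

The key in MERGE FIELDS= must be the exact key that the later merge will use. A clean run then suggests the dataset is in order by that key, and a failure tells you to run a SORT step first.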
If you merge by a multi-part key (e.g. a primary key at position 1, length 10, CH, ascending, and a secondary key at position 11, length 4, PD, descending), then every input must be sorted by that same multi-part key. When you run prior SORT steps, use SORT FIELDS=(1,10,CH,A,11,4,PD,D) (or the same positions/formats you use in MERGE FIELDS=). The entire key, including all positions, lengths, formats, and directions, must match across all inputs.
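As a brief illustration of the multi-part case, the two SYSIN fragments below belong to different steps: the first to every prior SORT step, the second to the MERGE step. The field list is the one from the paragraph above. The surrounding EXEC and DD statements are omitted and would follow the same pattern as the single-key example.

```jcl
//* SYSIN for each prior SORT step (one step per unsorted input)
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A,11,4,PD,D)
/*
//* SYSIN for the MERGE step: the field list is identical
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A,11,4,PD,D)
/*
```

Copying the field list verbatim from the SORT statement into the MERGE statement is a simple way to avoid a mismatch in any one of the eight values.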
You have two piles of cards. Each pile is already in order from A to Z. Before you can merge them into one big A-to-Z pile, you have to make sure both piles are really in A-to-Z order. If one pile was sorted by first name and the other by last name, when you merge you mix two different orderings and get a mess. So “pre-sorted for merge” means: both piles are sorted the same way (same rule, same direction). If you are not sure about a pile, sort it again with that same rule. Then you can merge them into one correct pile.
1. What must match across all inputs before you MERGE them?
2. Your MERGE inputs came from different jobs. How do you ensure they are sorted correctly for MERGE?
3. If one input is sorted ascending and another descending by the same key, what happens when you MERGE?
4. What is a safe way to prepare two unsorted files for a MERGE step?
5. Why is it important that key format (e.g. CH, PD) match across MERGE inputs?