What does pre-sorted mean for DFSORT MERGE?

Pre-sorted means each input dataset is already in ascending (or descending) order by the same key that you specify in MERGE FIELDS=. DFSORT does not sort the inputs; it only merges them. So you must ensure each input was produced by a sort (or other process) that used that exact key (position, length, format, and direction).

How do I prepare datasets for MERGE?

Ensure each dataset is sorted by the same key you will use in MERGE FIELDS=. If any dataset is unsorted or sorted by a different key, run a SORT step on it first with SORT FIELDS= equal to your MERGE FIELDS= (same position, length, format, direction). Then use those sorted outputs as SORTIN01, SORTIN02, etc. in the MERGE step.

Can MERGE inputs have different record lengths?

Typically MERGE is used with inputs that have the same or compatible record layout so that the merged output is consistent. DFSORT can merge fixed-length records; if record lengths differ, the output format (e.g. from OUTREC) must be defined accordingly. The key used for merging must be in the same position and format in each input.

What if I am not sure whether my dataset is sorted?

If in doubt, run a SORT step on it with the same SORT FIELDS= you will use in MERGE FIELDS=. That guarantees the dataset is in the correct order for the subsequent MERGE. The extra SORT step costs time but ensures correct merged output.

Do all MERGE inputs need to be sorted in the same direction (ascending or descending)?

Yes. MERGE FIELDS= specifies one direction (A or D) for the merge. All inputs must be sorted in that same direction. If one input is ascending and another descending, the merged result will not be correctly ordered.

Pre-Sorted Dataset Merging

DFSORT MERGE combines two or more inputs that are each already sorted by the same key. For the merge to produce correct output, every input must be in key order: same key position, length, format, and direction. This page explains how to prepare and verify datasets for MERGE: matching the key across inputs, when to run a prior SORT, and how to avoid wrong order or inconsistent results.

What “Pre-Sorted” Means for MERGE

Pre-sorted means that within each input dataset, every record is in ascending (or descending) order by the merge key. The merge key is what you specify in MERGE FIELDS=: start position, length, format (CH, PD, ZD, BI, etc.), and direction (A or D). For example, if you code MERGE FIELDS=(1,10,CH,A), then each input must be sorted so that the bytes in positions 1–10 (interpreted as character) never decrease from one record to the next. DFSORT does not re-sort; it only reads the next record from each stream and writes the one that comes first in key order. So if any input is out of order, the merged output will be wrong.

Key Must Match Across All Inputs

All inputs must be sorted by the same key. That means:

Position and length: The key must be in the same byte range in each input (e.g. bytes 1–10 in every file). If one file has the key in 1–10 and another in 11–20, they are not comparable for merge.
Format: The key must be interpreted the same way. Character (CH) sorts by character code; packed decimal (PD) and zoned decimal (ZD) sort by numeric value. If one input was sorted as CH and another as PD for the same bytes, the order may differ (e.g. "2" and "10" in CH order vs numeric order). Use the same format in the prior SORT (or source process) as in MERGE FIELDS=.
Direction: All inputs must be in the same direction—all ascending (A) or all descending (D). If one stream is ascending and another descending, the merge will interleave them incorrectly.

So when you prepare inputs, use the exact same key specification (position, length, format, direction) in every SORT or process that produces those inputs, and then use that same specification in MERGE FIELDS=.

Preparation Steps in Order

Steps to prepare datasets for MERGE
Step	Action	Detail
1	Decide the merge key	Same position, length, format, and direction for MERGE FIELDS=
2	Ensure each input is sorted by that key	Run SORT with same SORT FIELDS= on any unsorted input
3	Use same key in MERGE FIELDS=	Match exactly what each input was sorted by
4	Allocate SORTIN01, SORTIN02, …	One DD per pre-sorted input

When to Run a Prior SORT

Run a SORT step on an input when:

The dataset is unsorted or you do not know its order. Sort it with SORT FIELDS= equal to the key you will use in MERGE FIELDS=.
The dataset was sorted by a different key (different position, length, format, or direction). Re-sort it with the merge key.
The dataset was produced by another system or job and you cannot guarantee order. Sort it with the merge key to be safe.

You do not need to sort an input again if you are certain it is already in the correct order by the same key as MERGE FIELDS=. For example, if a previous step in the same job produced the dataset with a SORT FIELDS=(1,10,CH,A) and your MERGE uses MERGE FIELDS=(1,10,CH,A), you can use it directly as SORTIN01 or SORTIN02.
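The mixed case (one input known to be in order, one not) can be sketched as below. The dataset names MY.KNOWN.SORTED and MY.EXTERNAL.FILE and the temporary dataset &&EXTSRT are hypothetical, and the sketch assumes the same 1,10,CH,A key as above and an installation where space and unit defaults allow a temporary dataset to be allocated this way.

```jcl
//* MY.KNOWN.SORTED is assumed to be in 1,10,CH,A order already.
//* MY.EXTERNAL.FILE has unknown order, so it is sorted first.
//SORTEXT  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.EXTERNAL.FILE,DISP=SHR
//SORTOUT  DD DSN=&&EXTSRT,DISP=(NEW,PASS),
//            SPACE=(CYL,(5,2))
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
//* Merge the trusted input with the freshly sorted temporary one.
//MERGE1   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.KNOWN.SORTED,DISP=SHR
//SORTIN02 DD DSN=&&EXTSRT,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=MY.MERGED.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(10,5))
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
```

Only the input of uncertain order pays for a SORT; the trusted input goes straight into the MERGE step.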

Example: Two Unsorted Files to One Merged Output

Suppose you have two unsorted files, PART1 and PART2. You want one output merged by bytes 1–6 character ascending. First sort each part, then merge.

Step 1: Sort each input

jcl

//SORT1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.PART1,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.PART1,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,6,CH,A)
/*
//SORT2    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.PART2,DISP=SHR
//SORTOUT  DD DSN=MY.SORTED.PART2,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,2)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  SORT FIELDS=(1,6,CH,A)
/*

Both SORT steps use the same key: 1,6,CH,A. So MY.SORTED.PART1 and MY.SORTED.PART2 are both in order by bytes 1–6 character ascending.

Step 2: MERGE the sorted outputs

jcl

//MERGE1   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.SORTED.PART1,DISP=SHR
//SORTIN02 DD DSN=MY.SORTED.PART2,DISP=SHR
//SORTOUT  DD DSN=MY.MERGED.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(10,5)),DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//SYSIN    DD *
  MERGE FIELDS=(1,6,CH,A)
/*

MERGE FIELDS=(1,6,CH,A) matches the SORT FIELDS= used in step 1. The merge step only combines the two streams; it does not re-sort.

Format and Direction Matter

Format: If one input was sorted with format CH (character) and another with PD (packed decimal) for the same byte positions, the logical order can differ. CH compares the bytes as character codes, while PD and ZD compare numeric values, so a negative PD value sorts before a positive one numerically but not necessarily byte by byte. Likewise, left-justified character digits put "10" before "2" in CH order, whereas a numeric comparison puts 2 before 10. So when you prepare inputs, use the same format in every SORT as in MERGE FIELDS=.

Direction: Ascending (A) means smallest key first; descending (D) means largest key first. MERGE assumes all streams are in the same direction. If one input is A and another D, the “next” record from the D stream is actually moving the wrong way relative to the A stream, and the merged sequence will be incorrect. Always use the same direction when sorting each input.

Verifying Order (When in Doubt)

If you are not sure whether a dataset is sorted correctly, the safest approach is to run a SORT step on it with the same key you will code in MERGE FIELDS=. That guarantees correct order for the MERGE. Some shops add a separate validation step that re-sorts the input and compares the result with the original (for example, by checksum) to detect whether the input was already in order. In practice, simply re-sorting with the merge key is a reliable way to prepare data for MERGE.
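A cheaper check than a full re-sort is sometimes possible. The sketch below assumes that your DFSORT level sequence-checks merge input and reports a record that breaks key order with message ICE068A (OUT OF SEQUENCE); confirm this against the DFSORT messages manual for your release before relying on it. The dataset name MY.INPUT.FILE and the key 1,6,CH,A are hypothetical.

```jcl
//* Sketch: run a one-input MERGE purely to check key order.
//* SORTOUT is DUMMY, so no output dataset is kept.
//CHKSEQ   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=MY.INPUT.FILE,DISP=SHR
//SORTOUT  DD DUMMY
//SYSIN    DD *
  MERGE FIELDS=(1,6,CH,A)
/*
```

If the step ends with a nonzero return code and an out-of-sequence message in SYSOUT, sort the dataset with the merge key before using it in the real MERGE step.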

Multiple Keys

If you merge by a multi-part key (e.g. primary bytes 1–10 CH A, secondary bytes 11–14 PD D), then every input must be sorted by that same multi-part key. When you run prior SORT steps, use SORT FIELDS=(1,10,CH,A,11,4,PD,D) (or the same positions/formats you use in MERGE FIELDS=). The entire key, including all positions, lengths, formats, and directions, must match across all inputs.
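As a sketch of the multi-part case, the two SYSIN fragments below belong to different steps: the first to each preparatory SORT step, the second to the MERGE step. The key values are the illustrative ones from the paragraph above, not a recommendation for any particular record layout.

```jcl
//* SYSIN for each preparatory SORT step
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A,11,4,PD,D)
/*
//* SYSIN for the MERGE step: identical field list
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A,11,4,PD,D)
/*
```

Each field is written as start, length, format, direction, so the secondary key here starts at byte 11 and is 4 bytes long. Keeping the two field lists character-for-character identical is the simplest way to avoid a mismatch.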

Explain It Like I'm Five

You have two piles of cards. Each pile is supposed to be in order from A to Z. Before you can merge them into one big A-to-Z pile, you have to make sure both piles are really in A-to-Z order. If one pile was sorted by first name and the other by last name, when you merge you mix two different orderings and get a mess. So “pre-sorted for merge” means: both piles are sorted the same way (same rule, same direction). If you are not sure about a pile, sort it again with that same rule. Then you can merge them into one correct pile.

Exercises

You will merge by bytes 20–25 PD ascending. One input was sorted by 20–25 CH ascending. Is it safe to use as-is? What should you do?
Write the SORT FIELDS= you would use to prepare an unsorted file for MERGE FIELDS=(1,8,CH,A,9,4,ZD,D).
Why must all MERGE inputs use the same key direction (all A or all D)?
You have three inputs: two from a prior job that used SORT FIELDS=(1,10,CH,A), and one from an external system with unknown order. What steps do you take before the MERGE?

Quiz

Test Your Knowledge

1. What must match across all inputs before you MERGE them?

Only record length
The sort key (position, length, format, and direction) that each input is sorted by
Only the key position
Dataset names

2. Your MERGE inputs came from different jobs. How do you ensure they are sorted correctly for MERGE?

DFSORT checks automatically
Run a SORT step on each input using the same SORT FIELDS= as you will use in MERGE FIELDS=, then MERGE those outputs
Only the first input needs to be sorted
Use INCLUDE to fix order

3. If one input is sorted ascending and another descending by the same key, what happens when you MERGE?

DFSORT merges correctly by using the first input order
The merged output will not be correctly ordered; one stream is in the wrong direction
DFSORT reverses the descending one automatically
Only the ascending input is used

4. What is a safe way to prepare two unsorted files for a MERGE step?

Concatenate them and run one SORT
Run SORT on each file with the same SORT FIELDS= as the planned MERGE FIELDS=, then MERGE the two sorted outputs
Use MERGE with SORTIN01 and SORTIN02 and hope for the best
Use INREC to fix the key

5. Why is it important that key format (e.g. CH, PD) match across MERGE inputs?

It does not matter; DFSORT converts automatically
Different formats compare differently; if one input was sorted as CH and another as PD, the key order may not match and merged order can be wrong
Only length must match
Format only affects performance