A stable sort preserves the relative order of records that have equal sort keys: if record A precedes record B in the input and both have the same key value, A still precedes B in the output. In DFSORT you get this behavior by specifying OPTION EQUALS in your control statements. Without it, the default is often NOEQUALS, which means the order of records with duplicate keys is not guaranteed to match the input; the sort may write them in any order. You need a stable sort when downstream logic depends on the order of records within a group of equal keys (for example, the input is already in time order and you now sort by department but want to keep time order within each department). This page explains what stability means, how to use OPTION EQUALS, and when it matters.
When two records have the same value in the sort key (or in every key of a multi-key sort), the sort still has to put one before the other. A stable sort decides that order from the input: whichever record appeared first in the input stays first in the output. Only the order between records with different key values changes; among records that are equal on the key, the original order is preserved. An unstable sort makes no such promise, so within equal keys the output order can be arbitrary (or implementation-dependent). Many applications do not care, but when you do (e.g. multi-pass processing that relies on a secondary order), you need stability.
In DFSORT, you request stable sort behavior with OPTION EQUALS. You add it to your SYSIN control statements, typically on the OPTION statement along with any other OPTION keywords you use. EQUALS applies to sort and merge applications; a copy (OPTION COPY) already writes records in input order, so EQUALS makes no difference there. When EQUALS is in effect, records that compare equal on the sort key(s) are written in the same order they were read from the input, which makes the sort stable.
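As a minimal sketch of an alternative placement: DFSORT also accepts EQUALS as an operand of the SORT statement itself, so the request can sit next to the key definition. The key position and length here are illustrative assumptions, not taken from a real record layout.

```text
* Request stable behavior on the SORT statement (illustrative key: bytes 1-4)
  SORT FIELDS=(1,4,CH,A),EQUALS
```

Both placements ask for the same behavior; which one you use is largely a matter of site convention.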
If you do not specify EQUALS, the default is often NOEQUALS (site-dependent). With NOEQUALS, DFSORT does not guarantee that records with equal keys keep their input order. The sort still orders correctly by the key (e.g. all department 10 together, then department 20), but within department 10 the order of the records may not match the input. If your job does not care about order within equal keys, NOEQUALS is fine and may allow the product to use a faster internal strategy. If you need predictable order within equal keys, specify OPTION EQUALS explicitly rather than relying on the site default.
Use OPTION EQUALS when: (1) You are sorting by one key but the input was already ordered by another (e.g. time), and you want to keep that secondary order within each key group. (2) A later step or program assumes that records with the same key are in a specific order (e.g. first-in-first-out within customer). (3) You are doing a multi-key sort and want to rely on input order as a tie-breaker when all specified keys are equal. If you do not have any such requirement, you can omit EQUALS and accept the default.
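The first case can be sketched with a small, made-up example. It assumes a hypothetical fixed-length layout with the department code in bytes 1-4 and a time (HHMM) in bytes 5-8, and an input file that is already in time order. The sample records are invented for illustration.

```text
* Hypothetical layout: department in bytes 1-4, time (HHMM) in bytes 5-8.
* Input is already in time order; sort by department, keep time order.
  SORT FIELDS=(1,4,CH,A)
  OPTION EQUALS
*
* Input records      Output with EQUALS in effect
* D0200900           D0100915
* D0100915           D0101000
* D0200930           D0200900
* D0101000           D0200930
```

Within D010 and within D020, the records keep the order they had in the input. If the time order must hold no matter how the input arrives, naming the time field as a second key, for example SORT FIELDS=(1,4,CH,A,5,4,CH,A), states the requirement explicitly instead of relying on input order.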
The extra cost of OPTION EQUALS is typically minimal. The sort has to remember and use the input sequence when writing records with equal keys, but this is a small overhead. Do not avoid EQUALS for performance reasons unless you have measured a problem; use it when you need stable behavior.
  SORT FIELDS=(1,4,CH,A)
  OPTION EQUALS
This sorts on bytes 1–4 as character data in ascending order and preserves input order for records that have the same value in those bytes. Within each group of equal keys, the records appear in the same order as in the input file.
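For context, these control statements would normally sit in the SYSIN of a sort step. The JCL below is a minimal sketch: the step name, data set names, unit, and space values are placeholders (assumptions, not from any real job), and your site may invoke DFSORT under a different program name or require additional DD statements.

```jcl
//* Sketch only: data set names, UNIT, and SPACE values are placeholders.
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=YOUR.OUTPUT.DATA,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  SORT FIELDS=(1,4,CH,A)
  OPTION EQUALS
/*
```

The control statements start after column 1, since DFSORT treats column 1 as reserved for labels and comment markers.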
Imagine lining up by favorite color. With a "stable" sort, when two people have the same favorite color, the one who was standing in line first stays first. With an "unstable" sort, the teacher might put them in any order. If you need "first in line stays first when colors are the same," you ask for the stable sort (OPTION EQUALS).
1. What does OPTION EQUALS do in DFSORT?
2. If you do not specify EQUALS, what happens to records with duplicate keys?
3. When would you need a stable sort (OPTION EQUALS)?
4. What is the performance impact of EQUALS?
5. How do you request stable sort behavior in DFSORT?