In DFSORT you can either sort (reorder records by a key) or copy (pass records through without reordering). OPTION COPY tells DFSORT to skip the sort (and merge) phase: records are read from SORTIN, optionally filtered by INCLUDE/OMIT and reformatted by INREC/OUTREC, and written to SORTOUT in the same order they were read. When you do not need output in a specific order, COPY is faster and uses less CPU and sortwork. This page explains the difference between COPY and SORT, when to use each, and how COPY interacts with INCLUDE, OMIT, INREC, OUTREC, and SUM.
When you specify SORT FIELDS= (or MERGE FIELDS= for multiple inputs that are already in key order), DFSORT arranges records so that the output is in ascending or descending order by the key(s) you define. That requires reading all relevant input, comparing keys, and writing records in the new order, often using sortwork datasets and extra CPU. The result is a file ordered by the key fields, each defined by the position, length, and format you specified. You use SORT when the application or downstream process needs records in a specific order (e.g. by customer ID, date, or a composite key).
OPTION COPY means "do not sort and do not merge." DFSORT still reads from SORTIN, still applies INCLUDE/OMIT and INREC if you code them, and still writes to SORTOUT (and to any OUTFIL data sets), applying OUTREC and OUTFIL reformatting along the way. The only difference is that records are not reordered: the first record read that passes the filter is the first record written, and so on, so the output order is the input order after filtering. You do not code sort keys on SORT FIELDS= or MERGE FIELDS=. Because there is no key comparison or reordering, COPY typically uses much less CPU and little or no sortwork. It is the right choice when you only need to filter, reformat, or copy data and the sequence of records does not matter (or should stay as in the input).
| Aspect | OPTION COPY | SORT FIELDS= |
|---|---|---|
| Reorder records | No; output order = input order | Yes; output order = sort key order |
| SORT FIELDS= / MERGE FIELDS= | Not used | Required (or MERGE) |
| INCLUDE / OMIT / INREC / OUTREC | Yes | Yes |
| SUM | Only correct if input already in key order | Yes; sort by key first |
| CPU / sortwork | Lower; no sort phase | Higher; full sort/merge |
Code OPTION COPY in SYSIN. Do not code SORT FIELDS= or MERGE FIELDS=. Example: copy with filter and reformat.

    OPTION COPY
    INCLUDE COND=(1,5,CH,EQ,C'ACTIV')
    OUTREC FIELDS=(1,80)

Example: copy only (no filter, minimal reformat).

    OPTION COPY
    OUTREC FIELDS=(1,100)

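For context, a complete copy step might look like the sketch below. It uses OMIT rather than INCLUDE to show the other filter direction. The step name, data set names, space allocation, and the 'CLOSE' value are all hypothetical, and the allocation parameters your site requires may differ.

```jcl
//* Sketch only: data set names and SPACE values are placeholders
//COPYSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=YOUR.OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
* Copy in input order, dropping records that start with 'CLOSE'
  OPTION COPY
  OMIT COND=(1,5,CH,EQ,C'CLOSE')
/*
```

Because no sort is requested, the records that survive the OMIT test reach SORTOUT in the same relative order they had in SORTIN.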
Omit OPTION COPY and specify SORT FIELDS= (or MERGE FIELDS=). Example:

    OPTION EQUALS
    SORT FIELDS=(1,10,CH,A)
    OUTREC FIELDS=(1,80)

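Here records are sorted in ascending order on the character (CH) field in bytes 1–10, then written with OUTREC. OPTION EQUALS keeps records that have equal keys in their original input order. Do not code OPTION COPY when you need this ordering.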
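A composite key is coded by listing more than one field on SORT FIELDS=. The sketch below assumes a hypothetical record layout (customer ID in bytes 1-10, a YYYYMMDD date in bytes 11-18); adjust positions and formats to your own data.

```jcl
* Hypothetical layout: customer ID in 1-10, date (YYYYMMDD) in 11-18
  SORT FIELDS=(1,10,CH,A,11,8,CH,D)
```

With these assumptions, records come out grouped by customer ID in ascending order, and within each customer the most recent date appears first.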
SUM collapses records that have the same key into one record per key (and optionally sums numeric fields). It works by processing the file in order and comparing the key of the current record to the previous; when the key changes, it writes the previous group. So the input to SUM must be in key order. If you use OPTION COPY, the input order is whatever SORTIN had—unless that file was already sorted by the SUM key, SUM will not group correctly (only consecutive records with the same key would be collapsed). So when you need SUM, you normally use SORT FIELDS= with the same key as the SUM control fields. Use COPY only when you do not need SUM, or when your input is already sorted by the SUM key and you are doing a copy for another reason (e.g. reformat only).
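The usual pairing of SORT and SUM can be sketched as follows. The field positions are assumptions for illustration: a 10-byte character key in bytes 1-10 and a 5-byte zoned-decimal amount in bytes 11-15.

```jcl
* Hypothetical layout: key in bytes 1-10, ZD amount in bytes 11-15
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=(11,5,ZD)
```

The sort brings records with the same key together, and SUM then leaves one record per key with the amounts added. If you only want to drop duplicate keys without adding anything, SUM FIELDS=NONE can be coded in place of the summary field.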
A full sort reads all input, builds internal structures (or uses sortwork), compares keys, and writes output in sorted order. That can use significant CPU and I/O. COPY reads input and writes output in sequence; filtering and reformatting add some CPU but there is no key comparison or reordering. So for the same dataset size, COPY typically completes faster and uses less sortwork (often none). If your job today does a sort "just to copy" (e.g. SORT FIELDS=(1,1,CH,A) with no real requirement for order), replacing it with OPTION COPY is a simple performance improvement.
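As a sketch of that replacement, a job that sorts only as a way of copying might contain control statements like these (the OUTREC is illustrative):

```jcl
* Before: sorts on a 1-byte key that nothing depends on
  SORT FIELDS=(1,1,CH,A)
  OUTREC FIELDS=(1,80)
```

and could be changed to:

```jcl
* After: same reformatting, no sort phase
  OPTION COPY
  OUTREC FIELDS=(1,80)
```

The output order does change, from ordered by byte 1 to input order, so confirm that no downstream step relies on the old sequence before switching.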
Imagine a stack of cards. With SORT, we rearrange them into order (e.g. by name A–Z). With COPY, we do not rearrange anything: we just take the top card, then the next, then the next, and put them in a new stack in the same order. So SORT = "reorder the cards"; COPY = "same order, just move them." If you do not care about order, COPY is quicker because we never have to sort.
1. What does OPTION COPY do?
2. When should you use OPTION COPY instead of SORT?
3. Can you use SUM with OPTION COPY?
4. Does OPTION COPY still allow INCLUDE and OMIT?
5. Why might OPTION COPY be faster than a full SORT?