MainframeMaster

COPY vs SORT

In DFSORT you can either sort (reorder records by a key) or copy (pass records through without reordering). OPTION COPY tells DFSORT to skip the sort (and merge) phase: records are read from SORTIN, optionally filtered by INCLUDE/OMIT and reformatted by INREC/OUTREC, and written to SORTOUT in the same order they were read. When you do not need output in a specific order, COPY is faster and uses less CPU and sortwork. This page explains the difference between COPY and SORT, when to use each, and how COPY interacts with INCLUDE, OMIT, INREC, OUTREC, and SUM.

What SORT Does

When you specify SORT FIELDS=, DFSORT reorders records so that the output is in ascending or descending order by the key(s) you define. (MERGE FIELDS= produces the same kind of ordered output by combining inputs that are each already sorted.) A sort requires reading all input, comparing keys, and writing records in the new order, often using sortwork datasets and extra CPU. The result is a file ordered by the key fields, each defined by the position, length, and format you specified. You use SORT when the application or downstream process needs records in a specific order (e.g. by customer ID, date, or a composite key).
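
As a minimal sketch of a composite key, assume a hypothetical layout with a customer ID in bytes 1-10 and a date (YYYYMMDD) in bytes 11-18; neither field comes from this page. The statement below would sort by customer ID ascending and, within each customer, by date descending:

```text
  SORT FIELDS=(1,10,CH,A,11,8,CH,D)
```

Each key is a group of position, length, format, and order; the first group is the major key and later groups break ties.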

What COPY Does

OPTION COPY means "do not sort and do not merge." DFSORT still reads from SORTIN, still applies INCLUDE/OMIT and INREC if you code them, and still applies OUTREC and OUTFIL when it writes to SORTOUT or the OUTFIL datasets. The only difference is that records are not reordered: the first record read (that passes any filter) is the first record written, and so on. So the output order is the input order (after filtering). No sort or merge key is coded. Because there is no key comparison or reordering, COPY typically uses much less CPU and needs no sortwork. It is the right choice when you only need to filter, reformat, or copy data and the sequence of records does not matter (or should stay as in the input).

COPY vs SORT at a glance
| Aspect | OPTION COPY | SORT FIELDS= |
| --- | --- | --- |
| Reorder records | No; output order = input order | Yes; output order = sort key order |
| SORT FIELDS= / MERGE FIELDS= key | Not used | Required (or MERGE) |
| INCLUDE / OMIT / INREC / OUTREC | Yes | Yes |
| SUM | Not available; SUM applies to sort and merge only | Yes; records are grouped on the sort key |
| CPU / sortwork | Lower; no sort phase | Higher; full sort/merge |

When to Use COPY

  • Filter only. You want to keep or drop records based on a condition (INCLUDE or OMIT) and write the result to another dataset. Order does not matter. Use OPTION COPY and INCLUDE or OMIT; no SORT FIELDS=.
  • Reformat only. You want to change the layout (INREC or OUTREC)—reorder fields, add constants, convert formats—without changing the sequence of records. Use OPTION COPY and INREC/OUTREC.
  • Simple copy. You want to copy from one dataset to another (maybe with a different RECFM or LRECL). Use OPTION COPY; optionally INREC/OUTREC to adjust layout.
  • Performance. Your job currently does a SORT only to "pass through" data (e.g. SORT FIELDS=(1,1,CH,A) with no real key). If order does not matter, switching to OPTION COPY removes the sort phase, which speeds up the job and reduces resource use.
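
The performance case in the last bullet might look like the sketch below. The OMIT condition (dropping records with an asterisk in byte 1) is a made-up example of the kind of filtering such a job could carry; it is not taken from this page.

```text
* Before (sort coded only to pass records through):
*   SORT FIELDS=(1,1,CH,A)
*   OMIT COND=(1,1,CH,EQ,C'*')
* After (same filter, no sort phase, input order kept):
  OPTION COPY
  OMIT COND=(1,1,CH,EQ,C'*')
```

The output is no longer ordered by byte 1, so make this change only after confirming that nothing downstream depends on that order.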

When to Use SORT

  • Output must be ordered. Downstream process or reporting needs records in key order (e.g. by ID, date, or region). Use SORT FIELDS= (or MERGE FIELDS= for pre-sorted inputs).
  • You use SUM. SUM collapses records with the same key into one, and the key it groups on is the set of control fields on the SORT (or MERGE) statement. Use SORT FIELDS= with the grouping key, then SUM. COPY is not appropriate: SUM applies only to sort and merge applications, so for input that is already in key order, use MERGE FIELDS= with that input as SORTIN01 rather than OPTION COPY.
  • Merge. You have multiple pre-sorted inputs to combine in order. Use MERGE FIELDS= (not COPY).
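
For the merge case in the last bullet, a step might be sketched as follows. The step name, dataset names, space allocation, and the 10-byte character key are all placeholders chosen for illustration; both inputs are assumed to be already sorted ascending on that key.

```text
//MERGESTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=HLQ.INPUT.FILE1,DISP=SHR
//SORTIN02 DD DSN=HLQ.INPUT.FILE2,DISP=SHR
//SORTOUT  DD DSN=HLQ.MERGED.FILE,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
```

A merge reads its inputs from SORTIN01, SORTIN02, and so on rather than from SORTIN, and the MERGE key must match the order the inputs are already in.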

Syntax: COPY

Code OPTION COPY in SYSIN. Do not code a keyed SORT FIELDS= or MERGE FIELDS= statement (SORT FIELDS=COPY is an equivalent way to request a copy). Example: copy with filter and reformat.

  OPTION COPY
  INCLUDE COND=(1,5,CH,EQ,C'ACTIV')
  OUTREC FIELDS=(1,80)

Example: copy only (no filter, minimal reformat).

  OPTION COPY
  OUTREC FIELDS=(1,100)
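
To show where these statements sit, here is a sketch of a complete copy step using the filter example above. The step name, dataset names, and space allocation are placeholders, not values from this page.

```text
//COPYSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=HLQ.INPUT.FILE,DISP=SHR
//SORTOUT  DD DSN=HLQ.OUTPUT.FILE,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  OPTION COPY
  INCLUDE COND=(1,5,CH,EQ,C'ACTIV')
  OUTREC FIELDS=(1,80)
/*
```

Note that the step has no SORTWKnn DD statements; a copy does not need work datasets.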

Syntax: SORT

Omit OPTION COPY and specify SORT FIELDS= (or MERGE FIELDS=). Example:

  OPTION EQUALS
  SORT FIELDS=(1,10,CH,A)
  OUTREC FIELDS=(1,80)

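Here records are sorted in ascending order on the character field in bytes 1–10, OPTION EQUALS keeps records with equal keys in their original input order, and OUTREC then reformats each record. Do not code OPTION COPY when you need this ordering.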

COPY and SUM

SUM collapses records that have the same key into one record per key (and optionally sums numeric fields). The key is the set of control fields on the SORT or MERGE statement; the fields named on SUM are the numeric fields to total, or NONE to simply drop duplicates. SUM relies on the sort or merge to bring records with equal keys together, and DFSORT supports it only for sort and merge applications, not for a copy. So when you need SUM, code SORT FIELDS= with the grouping key. If the input is already in key order and you want to avoid a full sort, use MERGE FIELDS= with that dataset as the single input (SORTIN01) instead of OPTION COPY. Use COPY only when you do not need SUM.
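
A minimal sketch of the usual SORT plus SUM pairing, assuming a hypothetical layout with a 10-byte character key in bytes 1-10 and a 5-byte zoned-decimal amount in bytes 11-15:

```text
* Group on bytes 1-10 and total the zoned-decimal amount in bytes 11-15
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=(11,5,ZD)
```

Coding SUM FIELDS=NONE instead keeps one record per key without totalling anything.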

Performance Comparison

A full sort reads all input, builds internal structures (or uses sortwork), compares keys, and writes output in sorted order. That can use significant CPU and I/O. COPY reads input and writes output in sequence; filtering and reformatting add some CPU but there is no key comparison or reordering. So for the same dataset size, COPY typically completes faster and uses less sortwork (often none). If your job today does a sort "just to copy" (e.g. SORT FIELDS=(1,1,CH,A) with no real requirement for order), replacing it with OPTION COPY is a simple performance improvement.

Explain It Like I'm Five

Imagine a stack of cards. With SORT, we rearrange them into order (e.g. by name A–Z). With COPY, we do not rearrange anything: we just take the top card, then the next, then the next, and put them in a new stack in the same order. So SORT = "reorder the cards"; COPY = "same order, just move them." If you do not care about order, COPY is quicker because we never have to sort.

Exercises

  1. Your job uses SORT FIELDS=(1,1,CH,A) and no SUM or order-dependent logic. What change could reduce CPU and sortwork?
  2. You need to keep only records where byte 20 is 'Y' and write them to a new file. Order does not matter. Write the control statements using OPTION COPY.
  3. Why is SUM normally used with SORT FIELDS= rather than OPTION COPY?
  4. You have two pre-sorted files to combine in order. Do you use COPY or MERGE? Why?

Quiz

Test Your Knowledge

1. What does OPTION COPY do?

  • Copies the first record only
  • Skips the sort/merge phase; records pass through in input order (after INCLUDE/OMIT and INREC)
  • Copies SYSIN to SYSOUT
  • Copies SORTOUT to SORTIN

2. When should you use OPTION COPY instead of SORT?

  • When you need records sorted by a key
  • When you only need to filter (INCLUDE/OMIT), reformat (INREC/OUTREC), or copy data without changing order
  • When you have two inputs to merge
  • When SORTOUT is full

3. Can you use SUM with OPTION COPY?

  • Yes; SUM runs after the copy phase
  • No; SUM groups on the sort or merge control fields, so it needs a SORT or MERGE rather than a copy
  • Only with INREC
  • SUM and COPY are the same

4. Does OPTION COPY still allow INCLUDE and OMIT?

  • No; COPY means no filtering
  • Yes; INCLUDE/OMIT and INREC/OUTREC all apply; only the sort/merge phase is skipped
  • Only OMIT
  • Only INREC

5. Why might OPTION COPY be faster than a full SORT?

  • It writes fewer records
  • It skips the sort/merge phase—no key comparison, no sortwork I/O, no reordering—so less CPU and I/O
  • It only reads the first record
  • COPY is never faster