MainframeMaster

Tape Sorting

DFSORT can read input from tape and write output to tape. You assign tape datasets to the SORTIN and SORTOUT DD names in JCL; DFSORT treats them as sequential datasets and does not require different control statements. The two main considerations are performance, because tape is slower than DASD, and keeping sortwork (temporary work datasets) on disk rather than tape so that the sort completes in reasonable time. This page covers using tape for SORTIN and SORTOUT, where to put sortwork, and practical considerations for tape sort jobs.

Tape as SORTIN

You can assign a tape dataset to SORTIN. In JCL you allocate a tape drive (e.g. UNIT=TAPE or a specific unit name) and point the DD to the tape dataset, either by dataset name through the catalog or by volume serial for an uncataloged tape. DFSORT reads the dataset sequentially, just as it would a DASD sequential dataset. There are no special SORT or OPTION statements for tape input. The main difference is performance: tape has higher latency and lower throughput than disk, so reading a large input from tape can take longer. Use a blocksize that is efficient for your tape drive (consult your installation’s tape standards) so that I/O is as fast as possible.
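As a rough illustration of the two ways to point SORTIN at a tape, the fragment below shows a cataloged dataset and, commented out, an uncataloged one. The dataset name PROD.SALES.HISTORY and the volume serial T00123 are hypothetical, and UNIT=TAPE assumes your site defines a unit name called TAPE.

```jcl
//* Hypothetical names: PROD.SALES.HISTORY, volume serial T00123.
//* UNIT=TAPE is a site-defined unit name; yours may differ.
//*
//* Cataloged tape dataset: the catalog supplies unit and volume.
//SORTIN   DD DSN=PROD.SALES.HISTORY,DISP=OLD
//*
//* Uncataloged standard-label tape: name the unit and the reel.
//* (Alternative to the DD above; code only one SORTIN per step.)
//*SORTIN  DD DSN=PROD.SALES.HISTORY,DISP=OLD,
//*           UNIT=TAPE,VOL=SER=T00123,LABEL=(1,SL)
```

Either form leaves the DFSORT control statements unchanged; only the DD statement knows that the input is on tape.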

Tape as SORTOUT

You can write the sorted output to a tape dataset by assigning it to SORTOUT. DFSORT writes the records sequentially. Ensure the tape is mounted and has enough capacity for the full output; if the dataset is large, you may need a multi-reel (multi-volume) tape dataset so that when one reel is full, the system continues on the next. The write phase can be slow compared to DASD, so elapsed time may be longer than for a disk-only sort.
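A minimal sketch of a multi-volume tape SORTOUT follows. The dataset name and the volume count of 10 are assumptions chosen for illustration; z/OS normally allows five volumes when no count is coded, so a larger count matters only for outputs that could exceed that.

```jcl
//* Hypothetical dataset name; UNIT=TAPE is a site-defined unit name.
//* VOL=(,,,10) sets the volume count: up to 10 reels may be used.
//SORTOUT  DD DSN=PROD.SALES.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=TAPE,
//            VOL=(,,,10),
//            LABEL=(1,SL)
```

No DCB attributes are coded here on the assumption that the output should take its record format and length from SORTIN.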

Sortwork Must Be on DASD

DFSORT uses sortwork datasets to hold and merge intermediate data. These datasets are read and written many times during the sort, and the access pattern is not purely sequential. Tape is a sequential medium with long positioning and rewind times, so it is not suitable for sortwork. Always allocate sortwork (SORTWK01, SORTWK02, ... or dynamic allocation) on DASD. If you put sortwork on tape, where that is supported at all, the job will be extremely slow or may fail. So for a tape sort job: SORTIN and/or SORTOUT can be tape, but sortwork must be on disk.
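The two usual ways of keeping sortwork on disk can be sketched as follows. The unit name SYSDA, the space amounts, the sort key, and the record estimate are all hypothetical values to adjust for your data. The first fragment codes the work datasets explicitly:

```jcl
//* Explicit work datasets on disk (hypothetical sizes).
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
```

The alternative is to omit the SORTWKnn DD statements and let DFSORT allocate work datasets dynamically:

```jcl
//SYSIN    DD *
* Four dynamically allocated work datasets on SYSDA.
* FILSZ=E5000000 is an assumed estimate of the input record count.
  OPTION DYNALLOC=(SYSDA,4),FILSZ=E5000000
  SORT FIELDS=(1,10,CH,A)
/*
```

The FILSZ estimate is included because DFSORT may not be able to determine the size of a tape input on its own, and a reasonable estimate helps it size the dynamically allocated work space.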

Performance

When SORTIN is on tape, the time to read the entire input is often the largest part of the job’s elapsed time. Use a blocksize that your tape drive and installation support (often 32K, 64K, or larger) to minimize the number of read operations. When SORTOUT is on tape, the write phase is similarly I/O-bound. Having sortwork on fast DASD helps the merge phase complete quickly so that the job is not waiting on tape more than necessary. If possible, consider copying tape input to DASD first, sorting on DASD, and then copying the result to tape if tape output is required. That can reduce elapsed time when tape I/O is the bottleneck.
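To make the blocksize advice concrete, here is a hedged sketch for a new tape SORTOUT with fixed-length records. The record length of 100 bytes and the dataset name are assumptions; the block size is simply the largest multiple of the record length that fits under the conventional 32760-byte limit. Larger blocks are possible on tape only where the large block interface is available, so check your installation's standards before going beyond this.

```jcl
//* LRECL=100 and the dataset name are hypothetical.
//* 32760 is the largest BLKSIZE without the large block interface;
//* the largest multiple of 100 that fits is 327 * 100 = 32700.
//SORTOUT  DD DSN=PROD.SALES.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=TAPE,
//            DCB=(RECFM=FB,LRECL=100,BLKSIZE=32700)
```

With 327 records per block, the write phase issues far fewer I/O operations than it would with a small block size such as 1000 bytes (10 records per block).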

JCL for Tape

For tape datasets you typically specify UNIT=TAPE (or another tape unit name defined at your site) and either VOL=SER=volser to request a specific reel or no volume at all, in which case the system assigns a tape. For multi-reel output you may need a volume count, coded as VOL=(,,,n) where n is the maximum number of volumes, so that the dataset can extend across multiple reels. Dataset name and DISP= (e.g. DISP=(NEW,CATLG) for output) follow your site’s conventions. The key point is that the DD name (SORTIN or SORTOUT) is the same as for DASD; only the unit and volume (and possibly DCB) differ.
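Putting the pieces together, a complete tape-to-tape sort job might look like the sketch below. The job card, dataset names, space amounts, and the sort key (a 10-byte character field at position 1, ascending) are all hypothetical, and UNIT=TAPE and UNIT=SYSDA assume the unit names defined at your site.

```jcl
//TAPESORT JOB (ACCT),'TAPE SORT',CLASS=A,MSGCLASS=X
//* Hypothetical job card, dataset names, and sort key.
//SORT1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Input: cataloged tape dataset.
//SORTIN   DD DSN=PROD.SALES.HISTORY,DISP=OLD
//* Output: new tape dataset, up to 10 volumes.
//SORTOUT  DD DSN=PROD.SALES.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=TAPE,VOL=(,,,10)
//* Work datasets: disk only.
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```

Because SORTIN and SORTOUT are both on tape, this step needs two tape drives at once. The SYSIN control statement is exactly what it would be for a disk-only sort.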

Explain It Like I'm Five

The sorter can read cards from a long tape (slow) and write the sorted cards to another tape (slow). But while it is sorting, it needs lots of tables to put cards on—and those tables must be fast (disk). If we used a tape for those tables, the sorter would be waiting forever. So: tape in, tape out is okay; the tables in the middle have to be on the fast kind of storage.

Exercises

  1. Can SORTIN be a tape dataset? What is the main performance implication?
  2. Why should sortwork be on DASD and not on tape?
  3. Do you need different DFSORT control statements when SORTOUT is tape?
  4. What can you do in JCL to make tape I/O more efficient for a sort job?

Quiz

Test Your Knowledge

1. Can DFSORT read input from a tape dataset?

  • No
  • Yes; you can assign a tape dataset to SORTIN (or the equivalent DD). DFSORT reads it sequentially; tape is slower than DASD but works the same logically
  • Only with MERGE
  • Only with OPTION COPY

2. Where should sortwork be when sorting tape data?

  • On the same tape
  • On DASD (disk); sortwork requires random I/O and multiple passes. Tape is sequential and much slower; sortwork on tape would severely degrade performance or may not be supported
  • Only in memory
  • Sortwork is not used with tape

3. What is a main performance consideration when SORTIN is on tape?

  • Record length only
  • Tape is slower than DASD and sequential; the initial read of SORTIN can dominate elapsed time. Using a larger blocksize and ensuring enough sortwork on disk helps
  • Only OPTION FILSZ
  • Only the number of reels

4. Can SORTOUT be written to tape?

  • No
  • Yes; you can assign a tape dataset to SORTOUT. DFSORT writes the sorted output sequentially to the tape. Ensure the tape has enough capacity and is mounted
  • Only for MERGE
  • Only with INREC

5. Why might a site still use tape for sort input or output today?

  • Tape is always faster
  • Legacy data may be on tape; backup or archival data is often on tape; or data exchange with other sites may use tape. So even though DASD is faster, tape remains in use for some workflows
  • DFSORT requires it
  • Only for small files