DFSORT can read input from tape and write output to tape. You assign tape datasets to the SORTIN and SORTOUT DD names in JCL; DFSORT treats them as sequential datasets and needs no different control statements. The main considerations are performance (tape is slower than DASD) and keeping sortwork (the temporary work datasets) on disk rather than tape, so that the sort completes in reasonable time. This page covers using tape for SORTIN and SORTOUT, where to put sortwork, and practical considerations for tape sort jobs.
You can assign a tape dataset to SORTIN. In JCL you allocate a tape drive (e.g. UNIT=TAPE or a specific unit name) and point the DD at the tape dataset: by dataset name alone if it is cataloged, or by dataset name plus volume serial if it is not. DFSORT reads the dataset sequentially, record by record, just as it would a DASD sequential dataset. There are no special SORT or OPTION statements for tape input. The main difference is performance: tape has higher latency and lower throughput than disk, so reading a large input from tape takes longer. Use a blocksize that is efficient for your tape drive (consult your installation’s tape standards) so that I/O is as fast as possible.
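As a rough sketch of the two ways to point SORTIN at tape, the fragment below shows a cataloged and an uncataloged dataset. The dataset name PROD.SALES.HISTORY, the volume serial T00123, and the unit name TAPE are hypothetical placeholders; your site's names will differ. Code only one of the two SORTIN statements in a real step.

```jcl
//* Alternative 1: cataloged tape data set.
//* The catalog supplies the unit and volume information.
//SORTIN   DD DSN=PROD.SALES.HISTORY,DISP=OLD
//*
//* Alternative 2: uncataloged tape data set.
//* Name the unit and the volume serial explicitly (values are examples).
//SORTIN   DD DSN=PROD.SALES.HISTORY,DISP=OLD,
//            UNIT=TAPE,VOL=SER=T00123,
//            LABEL=(1,SL)
```

LABEL=(1,SL) means the first dataset on a standard-labeled tape. With standard labels, DFSORT can normally pick up the record format, record length, and blocksize from the tape label, so no DCB parameters are shown here.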
You can write the sorted output to a tape dataset by assigning it to SORTOUT. DFSORT writes the records sequentially. Ensure the tape is mounted and has enough capacity for the full output; if the dataset is large, you may need a multi-reel (multi-volume) tape dataset so that when one reel is full, the system continues on the next. The write phase can be slow compared to DASD, so elapsed time may be longer than for a disk-only sort.
DFSORT uses sortwork datasets to hold and merge intermediate data. These datasets are read and written many times during the sort, and the access pattern is not purely sequential. Tape is a sequential medium with high positioning and rewind times, so it is a poor fit for sortwork. Always allocate sortwork (SORTWK01, SORTWK02, ... or dynamic allocation) on DASD. DFSORT does accept tape work datasets, but they rule out its most efficient sorting techniques, so the job would be extremely slow or could fail. So for a tape sort job: SORTIN and/or SORTOUT can be tape, but sortwork should always be on disk.
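The two usual ways to keep sortwork on DASD can be sketched as follows. The unit name SYSDA, the space amounts, the sort key, and the record-count estimate are all illustrative assumptions, not recommendations; size them for your own data.

```jcl
//* Option A: explicit work data sets on DASD (sizes are examples)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//*
//* Option B: code no SORTWKnn DDs and let DFSORT allocate
//* four work data sets dynamically on SYSDA.
//SYSIN    DD *
  OPTION DYNALLOC=(SYSDA,4),FILSZ=E5000000
  SORT FIELDS=(1,10,CH,A)
/*
```

FILSZ=E5000000 gives DFSORT an estimated record count. It is included here because DFSORT generally cannot work out the size of a tape input in advance, and an estimate helps it size dynamically allocated work space. Replace the number with a realistic estimate for your input.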
When SORTIN is on tape, the time to read the entire input is often the largest part of the job’s elapsed time. Use a blocksize that your tape drive and installation support (often 32K, 64K, or larger) to minimize the number of read operations. When SORTOUT is on tape, the write phase is similarly I/O-bound. Having sortwork on fast DASD helps the merge phase complete quickly so that the job is not waiting on tape more than necessary. If possible, consider copying tape input to DASD first, sorting on DASD, then copying the result to tape if tape output is required—that can reduce elapsed time when tape I/O is the bottleneck.
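The stage-to-DASD approach could look like the three steps below, shown without a JOB statement. This is a sketch only: the dataset names, temporary names (&&STAGED, &&SORTED), unit names, space amounts, and sort key are hypothetical, and whether staging actually saves elapsed time depends on your drives and workload.

```jcl
//* Step 1: copy the tape input to a temporary DASD data set
//STAGE    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=PROD.SALES.HISTORY,DISP=OLD
//SORTOUT  DD DSN=&&STAGED,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(500,100),RLSE)
//SYSIN    DD *
  OPTION COPY
/*
//* Step 2: sort DASD to DASD, with work space on DASD
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&STAGED,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=&&SORTED,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(500,100),RLSE)
//SYSIN    DD *
  OPTION DYNALLOC=(SYSDA,4)
  SORT FIELDS=(1,10,CH,A)
/*
//* Step 3: copy the sorted result back to tape
//UNLOAD   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&SORTED,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=PROD.SALES.SORTED,DISP=(NEW,CATLG,DELETE),
//            UNIT=TAPE
//SYSIN    DD *
  OPTION COPY
/*
```

Each step holds at most one tape drive, and the sort step itself touches only DASD. The trade-off is the extra DASD space for two full copies of the data and the additional copy passes.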
For tape datasets you typically specify UNIT=TAPE (or a site-specific tape unit name) and either code VOL=SER= to name a specific reel or omit the volume serial so that the system requests a scratch tape. For multi-reel output, raise the volume count with VOL=(,,,n), where n is the maximum number of volumes the dataset may span; without it, a tape dataset is limited to five volumes by default. Dataset name and DISP= (e.g. DISP=(NEW,CATLG) for output) follow your site’s conventions. The key is that the DD name (SORTIN or SORTOUT) is the same as for DASD; only the unit and volume (and possibly DCB) differ.
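Putting these parameters together, a tape-in, tape-out sort job might look like the sketch below. The job statement, dataset names, volume serial, unit names, volume count, space amounts, and sort key are all made-up examples to adapt to your installation.

```jcl
//TAPESORT JOB (ACCT),'TAPE SORT',CLASS=A,MSGCLASS=X
//SORT1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Input: one specific reel with standard labels
//SORTIN   DD DSN=PROD.SALES.HISTORY,DISP=OLD,
//            UNIT=TAPE,VOL=SER=T00123,LABEL=(1,SL)
//* Output: new tape data set that may span up to 10 volumes
//SORTOUT  DD DSN=PROD.SALES.SORTED,DISP=(NEW,CATLG,DELETE),
//            UNIT=TAPE,VOL=(,,,10),LABEL=(1,SL)
//* Work data sets stay on DASD
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(200,50))
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```

Because SORTIN and SORTOUT are both on tape in the same step, this job needs two tape drives at once. The SORT statement is the same one you would use for a DASD-only sort.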
The sorter can read cards from a long tape (slow) and write the sorted cards to another tape (slow). But while it is sorting, it needs lots of tables to put cards on—and those tables must be fast (disk). If we used a tape for those tables, the sorter would be waiting forever. So: tape in, tape out is okay; the tables in the middle have to be on the fast kind of storage.
1. Can DFSORT read input from a tape dataset?
2. Where should sortwork be when sorting tape data?
3. What is a main performance consideration when SORTIN is on tape?
4. Can SORTOUT be written to tape?
5. Why might a site still use tape for sort input or output today?