MainframeMaster

Temporary Datasets

When DFSORT runs a sort, it may need more space than is available in memory. It then uses work (temporary) datasets to hold intermediate data. You can let DFSORT dynamically allocate these, or you can supply them explicitly with DD names SORTWK01, SORTWK02, and so on. This page explains what temporary datasets are for, when they are used, how to allocate them in JCL, and how explicit allocation compares to dynamic allocation.

Why DFSORT Needs Work Space

During a sort, DFSORT holds records in memory and writes intermediate results to disk when the data does not fit in the storage available to it. The disk space used for this is provided by the sort work datasets. They are temporary: used only during the step and then discarded. Using disk allows DFSORT to sort very large files that would not fit in central storage. Merge and copy applications do not need work datasets, because a merge combines inputs that are already in sequence. The number and size of work datasets depend on the volume of data and the options in effect (e.g. main storage size and dynamic allocation limits).

Dynamic Allocation vs Explicit SORTWKnn

You have two ways to provide work space:

  • Dynamic allocation: If you do not code SORTWK01, SORTWK02, … in your JCL, DFSORT can ask the system (MVS) to allocate work datasets for it automatically. This is controlled by DFSORT options (e.g. DYNALLOC) and by your installation's setup. When it is allowed, you do not need to code any SORTWKnn DDs; the step runs and work space is allocated behind the scenes. Advantage: there is no work-dataset JCL to maintain. Disadvantage: you have little control over volume placement or the exact space used.
  • Explicit SORTWKnn: You code DD statements for SORTWK01, SORTWK02, and optionally more (e.g. through SORTWK32). You choose the DSN (often a temporary name), DISP, SPACE, and optionally VOL=SER=. DFSORT uses these datasets for work. Advantage: you control where work goes (e.g. fast DASD, a specific pool). Disadvantage: you must estimate space and maintain the JCL; if you provide too little space, the step can fail.

Many shops use dynamic allocation for most jobs and reserve explicit SORTWKnn for large or performance-critical sorts where placement matters.
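
As a rough illustration of the dynamic-allocation case, the step below codes no SORTWKnn DDs and instead requests work datasets through the DYNALLOC option. The step name, dataset names, sort key, and the values in DYNALLOC=(SYSDA,4) are assumptions chosen for this sketch; whether SYSDA is a valid unit name and whether dynamic allocation is permitted depend on your installation.

```jcl
//* Hypothetical step: no SORTWKnn DDs are coded.
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT.DATA,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(100,20),RLSE),UNIT=SYSDA
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
* Ask for up to 4 dynamically allocated work datasets on SYSDA.
  OPTION DYNALLOC=(SYSDA,4)
/*
```

When SORTWKnn DDs are coded in a step, DFSORT normally uses them instead of allocating work space dynamically, so the two approaches are usually alternatives rather than something you combine.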

Allocating Temporary Work Datasets in JCL

When you allocate SORTWKnn explicitly, you typically:

  • Use a temporary name or no DSN. You can use DSN=&&TEMP1 (job temporary) or a name that is unique per run. If you omit DSN altogether, the system generates a temporary name for you. Some sites use a single high-level qualifier for sort work (e.g. SORTWORK.TEMP). The dataset should not be cataloged.
  • Use DISP=(NEW,DELETE,DELETE). NEW creates the dataset. The first DELETE means delete on normal step completion; the second means delete on abnormal completion. So the work dataset is removed when the step ends. You do not want to keep these datasets.
  • Specify SPACE=. Allocate enough space for the work. How much depends on record count, record length, and sort complexity. A common approach is to use SPACE=(CYL,(n,m)) with n primary and m secondary cylinders; the exact values are site- and job-dependent. Under-allocation can cause out-of-space abends.
  • Optionally specify VOL=SER=. If your shop wants sort work on specific volumes (e.g. faster storage), you can point each SORTWKnn to a volume. Otherwise the system chooses a volume from the default pool.

Example

```jcl
//SORTWK01 DD DSN=&&SORTWK1,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(50,10)),UNIT=SYSDA
//SORTWK02 DD DSN=&&SORTWK2,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(50,10)),UNIT=SYSDA
//SORTWK03 DD DSN=&&SORTWK3,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(50,10)),UNIT=SYSDA
```

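Here three work datasets are created as job temporaries (&&). Each gets 50 cylinders primary and 10 cylinders for each secondary extent. UNIT=SYSDA is an esoteric unit name that most installations define for general-purpose direct-access storage. Because the normal and abnormal dispositions are both DELETE, the datasets are deleted when the step ends.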
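
Two common variations on the example above, sketched under assumptions. The first omits DSN and DISP, so the system generates a temporary name and applies the default disposition for a new dataset. The second adds VOL=SER= to request a particular volume; the serial WORK01 is hypothetical, and on SMS-managed systems the storage management rules may override a volume request.

```jcl
//* Variation 1: no DSN or DISP. The system generates a temporary
//* name; a new dataset with no DISP defaults to being deleted.
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(50,10))
//*
//* Variation 2: explicit volume. WORK01 is a hypothetical serial.
//SORTWK02 DD DSN=&&SORTWK2,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(50,10)),UNIT=SYSDA,VOL=SER=WORK01
```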

How Many Work Datasets and How Much Space?

The number of SORTWKnn datasets and the total space required depend on the data size and DFSORT's internal algorithm. There is no single formula that fits all jobs. In practice:

  • Start with a few (e.g. 2–4) SORTWKnn and a generous SPACE= (e.g. CYL,(50,20) each). If the step still fails with space errors, add more DDs or increase SPACE. Your site may have guidelines (e.g. "use 4 work datasets, 100 cylinders each for large sorts").
  • If you use dynamic allocation and the step fails with a work allocation error, consider switching to explicit SORTWKnn with enough space, or ask your support team to relax dynamic allocation limits or provide a larger sort region.
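
To make the first guideline concrete, here is a sketch of what "add more DDs or increase SPACE" might look like after a space failure: four work datasets, each with a larger primary and secondary allocation than in the earlier example. The numbers are illustrative assumptions, not a sizing recommendation.

```jcl
//* Four work datasets, 100 cylinders primary each:
//* 4 x 100 = 400 cylinders before any secondary extents are taken.
//SORTWK01 DD DSN=&&SORTWK1,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(100,50)),UNIT=SYSDA
//SORTWK02 DD DSN=&&SORTWK2,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(100,50)),UNIT=SYSDA
//SORTWK03 DD DSN=&&SORTWK3,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(100,50)),UNIT=SYSDA
//SORTWK04 DD DSN=&&SORTWK4,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(100,50)),UNIT=SYSDA
```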

The OPTION statement can influence memory and work usage (e.g. MAINSIZE for main storage, FILSZ for the estimated number of records, DYNALLOC for dynamic work allocation). See the OPTION and performance-tuning tutorials for more detail.
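
As a hedged illustration of how such options are coded, the control statements below give DFSORT an estimated record count and a storage request. MAINSIZE=MAX and FILSZ=E10000000 are example values only, and the sort key is hypothetical; your installation's defaults and limits determine what DFSORT actually does with them.

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
* FILSZ=En gives DFSORT an estimate of the number of input records.
* MAINSIZE=MAX asks for as much main storage as the site allows.
  OPTION MAINSIZE=MAX,FILSZ=E10000000
/*
```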

Job Temporaries (&&) vs Specific Names

DSN=&&name creates a job temporary dataset. It exists only for the duration of the job and is not cataloged. The system generates a unique dataset name for it, so the name you code after && only needs to be unique within the job. This is ideal for sort work: you do not need to invent a unique name each run, and the dataset is automatically cleaned up. Alternatively, you can use a fixed name (e.g. USERID.SORT.WORK01) with DISP=(NEW,DELETE,DELETE); the dataset is deleted at step end, so it does not persist. Using a cataloged permanent name for sort work is unusual and generally not recommended.
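
The two naming styles look like this side by side. USERID.SORT.WORK01 is the placeholder name used in the paragraph above, and the space values are illustrative.

```jcl
//* Job temporary: the system builds a unique name from &&SORTWK1.
//SORTWK01 DD DSN=&&SORTWK1,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(50,10)),UNIT=SYSDA
//*
//* Fixed name: deleted at step end, but the same name is reused on
//* every run, so concurrent jobs using it could conflict.
//SORTWK02 DD DSN=USERID.SORT.WORK01,DISP=(NEW,DELETE,DELETE),
//            SPACE=(CYL,(50,10)),UNIT=SYSDA
```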

Explain It Like I'm Five

When you sort a huge pile of cards, sometimes your desk (memory) is not big enough to hold all of them at once. So you use extra tables (work datasets) to put some of the cards down while you work on the rest. Those tables are temporary—you don't keep them when you're done. DFSORT does the same: it uses temporary work datasets when the data is too big for memory. You can either let DFSORT ask for those tables itself (dynamic allocation) or set up the tables yourself in the JCL (SORTWK01, SORTWK02, …). Either way, when the sort is finished, the work tables are cleared away.

Exercises

  1. What is the main advantage of using explicit SORTWK01, SORTWK02 instead of dynamic allocation?
  2. What DISP would you use for a temporary sort work dataset that should be deleted when the step ends?
  3. Your DFSORT step fails with an out-of-space error on a work dataset. What are two things you can try?
  4. Why are sort work datasets usually not cataloged?

Quiz

Test Your Knowledge

1. What are SORTWK01, SORTWK02, ... used for?

  • Input data
  • Control statements
  • Work space when sort data does not fit in memory
  • Message output

2. Are SORTWKnn DD statements required for every DFSORT step?

  • Yes, always
  • No; DFSORT can dynamically allocate work datasets
  • Only for MERGE
  • Only when using INCLUDE

3. What DISP is typically used for temporary sort work datasets?

  • DISP=SHR
  • DISP=(NEW,DELETE,DELETE) or (NEW,PASS,DELETE)
  • DISP=OLD
  • DISP=(OLD,CATLG,KEEP)

4. Why might a shop use explicit SORTWKnn instead of dynamic allocation?

  • To make the job slower
  • To control which volumes or storage pool work files use
  • To avoid sorting
  • To reduce SYSIN size

5. What happens to temporary work datasets when the step completes normally?

  • They are cataloged
  • They are typically deleted (when the normal disposition is DELETE)
  • They become the new SORTOUT
  • They are passed to the next step