Batch Processing Automation

DFSORT is often used inside larger batch jobs: one step might filter and normalize data, the next might sort and deduplicate, and another might produce a report or load file. Automating these flows means chaining steps so the output of one step becomes the input of the next, using temporary datasets so you do not leave intermediates in the catalog, and using JCL features like COND and RESTART so the job behaves correctly when a step fails or when you need to rerun from a given point. This page explains how to use DFSORT in automated batch: passing data between steps, using temporary datasets (&&), conditional step execution (COND), and designing for restart. It also outlines a typical multi-step pipeline (filter → sort/sum → report) so you can adapt it to your shop.

Chaining DFSORT Steps

In a single job you can run multiple DFSORT steps: the output of step 1 (SORTOUT or an OUTFIL DD) becomes the input of step 2 (SORTIN). To do that, step 2's SORTIN DD must point to the dataset that step 1 wrote. There are two common ways: (1) Use a temporary dataset name in step 1 (e.g. DSN=&&WORK1 on SORTOUT), then code DSN=&&WORK1 on SORTIN in step 2; the system passes the same temporary dataset from step 1 to step 2. (2) Use a backward reference: in step 2, code DSN=*.STEP1.SORTOUT on the SORTIN DD so that SORTIN resolves to the same dataset as STEP1's SORTOUT. Either way, step 2 reads exactly what step 1 wrote.
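A sketch of the DD statements involved in method (2); the UNIT and SPACE values here are illustrative only:

//* STEP1: create the intermediate and pass it to later steps
//SORTOUT DD DSN=&&WORK1,DISP=(NEW,PASS),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//* STEP2: refer back to STEP1's SORTOUT instead of repeating the name
//SORTIN  DD DSN=*.STEP1.SORTOUT,DISP=(OLD,DELETE)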

JCL and automation patterns
Pattern               | Technique                                             | Purpose
Chain steps           | SORTOUT → SORTIN via DSN=*.stepname.ddname or &&TEMP  | Pass data from one DFSORT step to the next
Temporary datasets    | &&WORK1, &&WORK2                                      | Avoid permanent intermediates; auto-deleted at job end
Conditional execution | COND on EXEC                                          | Skip a step if a prior step fails
Restart               | RESTART=stepname, persistent datasets                 | Resume the job from a step without redoing completed work

Temporary Datasets (&&)

A dataset name that starts with && is a temporary (job) dataset. It is created when the step that defines it runs and is deleted at the end of the job (normal or abnormal termination). You do not need to specify DISP=(,DELETE) for cleanup—the system handles it. Temporary datasets are ideal for intermediate results between steps: e.g. step 1 writes to &&WORK1, step 2 reads &&WORK1 and writes to &&WORK2, step 3 reads &&WORK2. No permanent datasets are left behind, and you can run the job again without clearing old intermediates. The same &&name used in more than one step in the same job refers to the same dataset (passed forward). Do not use the same &&name for two different datasets in the same job.
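A sketch of the DD statements for such a three-step chain (SYSIN DDs omitted for brevity; UNIT, SPACE, and the final dataset name are illustrative):

//STEP1   EXEC PGM=SORT
//SORTIN  DD DSN=INPUT.FILE,DISP=SHR
//SORTOUT DD DSN=&&WORK1,DISP=(NEW,PASS),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//STEP2   EXEC PGM=SORT
//SORTIN  DD DSN=&&WORK1,DISP=(OLD,DELETE)
//SORTOUT DD DSN=&&WORK2,DISP=(NEW,PASS),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//STEP3   EXEC PGM=SORT
//SORTIN  DD DSN=&&WORK2,DISP=(OLD,DELETE)
//SORTOUT DD DSN=FINAL.REPORT,DISP=(NEW,CATLG,DELETE),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)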

Example: Two-Step Chain

//STEP1   EXEC PGM=SORT
//SORTIN  DD DSN=INPUT.FILE,DISP=SHR
//SORTOUT DD DSN=&&WORK1,DISP=(NEW,PASS),...
//SYSIN   DD *
  OMIT COND=(1,10,CH,EQ,C' ')
  SORT FIELDS=(1,10,CH,A)
/*
//STEP2   EXEC PGM=SORT
//SORTIN  DD DSN=&&WORK1,DISP=(OLD,DELETE)
//SORTOUT DD DSN=FINAL.OUTPUT,DISP=(NEW,CATLG),...
//SYSIN   DD *
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE
/*

Step 1 reads INPUT.FILE, omits records whose key (columns 1-10) is blank, sorts on that key, and writes &&WORK1. DISP=(NEW,PASS) creates the temporary dataset and passes it to a later step. Step 2 reads &&WORK1 with DISP=(OLD,DELETE), sorts on the same key with SUM FIELDS=NONE so that only one record per key is kept, and writes FINAL.OUTPUT. After the job ends, &&WORK1 is deleted. You could code DSN=*.STEP1.SORTOUT on STEP2's SORTIN instead of naming &&WORK1 again.

Conditional Execution (COND)

Sometimes you want to skip a step when a previous step failed. The COND parameter on the EXEC statement controls that. For example, COND=(4,LT,STEP1) means: bypass this step if 4 is less than the return code from STEP1, that is, skip it when STEP1 ended with a return code greater than 4 (such as 8 or 16). A return code of 0 or 4 from STEP1 lets the step run. You can combine tests, e.g. COND=((4,LT,STEP1),(8,LT,STEP2)), and the EVEN and ONLY keywords control whether a step runs after an abend; by default, a step is not executed if an earlier step abended. Check your JCL reference for the exact syntax. Using COND on a DFSORT step that depends on a prior step avoids running it when the input from that prior step is missing or invalid.
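A sketch of how that looks on the EXEC statements (DD statements omitted; step names are placeholders):

//STEP2   EXEC PGM=SORT,COND=(4,LT,STEP1)
//* STEP2 is bypassed when 4 < RC(STEP1), i.e. STEP1 ended above RC 4
//STEP3   EXEC PGM=SORT,COND=((4,LT,STEP1),(4,LT,STEP2))
//* STEP3 is bypassed when either earlier step ended above RC 4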

Restartability

A job is restartable if you can run it again from a point after a failure without redoing all work. DFSORT does not checkpoint inside a step; if a step fails, that step must be rerun. To make the job restartable: (1) Write outputs that later steps need to datasets that persist (or that you can recreate). (2) Use RESTART=stepname on the JOB statement so that when you resubmit the job, it starts from that step; earlier steps are not executed. (3) Design so that the step named in RESTART can read its input from the previous run (e.g. the previous step's output was cataloged or kept). Some shops use a convention where critical intermediates are written to permanent datasets with a run identifier so that restart can point to them. Others use temporary datasets and accept that on restart they must rerun from step 1 unless they have saved intermediates elsewhere.
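A sketch of the restart pieces, assuming the previous run wrote STEP1's output to a cataloged dataset (job parameters and dataset names are made up; STEP1 would still be in the deck but is not shown or executed here):

//MYJOB   JOB (ACCT),'NIGHTLY PIPELINE',CLASS=A,MSGCLASS=X,
//         RESTART=STEP2
//* STEP1 completed in the previous run and cataloged MY.STAGE1.DATA,
//* so STEP2 can read it now; a && dataset would not survive the rerun.
//STEP2   EXEC PGM=SORT
//SORTIN  DD DSN=MY.STAGE1.DATA,DISP=SHR
//SORTOUT DD DSN=MY.FINAL.DATA,DISP=(NEW,CATLG,DELETE),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//SYSIN   DD *
  SORT FIELDS=(1,10,CH,A)
/*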

Typical Multi-Step Pipeline

A common pattern is: (1) Extract/filter step: read source, INCLUDE/OMIT and INREC to filter and normalize, write to &&WORK1. (2) Sort/sum step: read &&WORK1, SORT FIELDS=, optionally SUM to deduplicate or aggregate, write to &&WORK2 or to final dataset. (3) Report or copy step: read the sorted/deduped data, OUTFIL to produce a report or copy to a final format. Each step is a single DFSORT (or ICETOOL) invocation. Data flows forward via temporary or permanent datasets. You can add COND so that if step 1 fails, steps 2 and 3 are skipped, and you can use RESTART=STEP2 (for example) to resume from step 2 if step 1 already completed in a previous run and you have &&WORK1 or a copy.
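As a sketch, the SYSIN control statements for the three steps might look like this (field positions, the status-code filter, and the report layout are assumptions for illustration; the steps are chained via &&WORK1 and &&WORK2 as described above):

* --- STEP1 SYSIN: keep active records, keep only key and amount ---
  INCLUDE COND=(21,2,CH,EQ,C'AC')
  INREC BUILD=(1,10,11,8)
  SORT FIELDS=COPY
* --- STEP2 SYSIN: sort by key and sum the amount field ---
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=(11,8,ZD)
* --- STEP3 SYSIN: copy with a page header and an edited amount ---
  SORT FIELDS=COPY
  OUTFIL HEADER2=('KEY        AMOUNT'),
    BUILD=(1,10,X,11,8,ZD,EDIT=(IIIIIIIT))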

Symbolic Parameters and Generics

Many shops use JCL symbolic parameters (e.g. &INPUT, &RUNID) so the same procedure or job can be used with different inputs or run identifiers. The DFSORT step does not change; only the DD statements reference the symbol. For example SORTIN DD DSN=&INPUT..FILE,DISP=SHR and SORTOUT DD DSN=&OUTPUT..FILE,DISP=(NEW,CATLG). The procedure is then invoked with INPUT=PROD and OUTPUT=PROD.RESULT. This keeps the control statements (SYSIN) generic and the data names parameterized for automation and reuse.
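A sketch of an in-stream procedure using symbolic parameters (the procedure name, control-statement library, and member are made up):

//SORTPIPE PROC INPUT=TEST,OUTPUT=TEST
//SORT1    EXEC PGM=SORT
//SORTIN   DD DSN=&INPUT..FILE,DISP=SHR
//SORTOUT  DD DSN=&OUTPUT..FILE,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//SYSIN    DD DSN=PROD.CNTLLIB(SORTKEYS),DISP=SHR
//         PEND
//*
//* Invocation: &INPUT..FILE becomes PROD.FILE and
//* &OUTPUT..FILE becomes PROD.RESULT.FILE
//RUN1     EXEC SORTPIPE,INPUT=PROD,OUTPUT=PROD.RESULT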

Best Practices

  • Use temporary datasets (&&) for intermediates between steps to avoid catalog buildup and to allow reruns.
  • Use a backward reference (DSN=*.stepname.ddname) when the next step reads the previous step's output so you do not duplicate DSN names.
  • Use COND on steps that depend on a prior step so they are skipped when the prior step fails.
  • Document the pipeline (which step produces what) and the restart strategy so operators know how to resume after a failure.
  • Keep SYSIN (control statements) in a library or inline and use symbols for dataset names so the same job can run in different environments.

Explain It Like I'm Five

Imagine a factory line: the first machine takes raw material and cleans it (step 1), then puts it on a conveyor (&&WORK1). The second machine takes from the conveyor and sorts and packs it (step 2), then puts the box on another conveyor (&&WORK2). The third machine takes the box and prints a label (step 3). The conveyors are like temporary datasets—they only exist while the factory is running, and at the end they are put away. If the first machine breaks, you do not run the second and third (COND). If the factory stops and starts again tomorrow, you might start from the second machine if the first already finished (RESTART).

Exercises

  1. Write JCL for two steps: step 1 SORT copies input to &&TEMP; step 2 SORT reads &&TEMP and writes to OUT.FILE. Use the correct DISP for &&TEMP in both steps.
  2. What does COND=(4,LT,STEP1) mean for the step that contains it? When is that step skipped?
  3. Why might you use DSN=*.STEP1.SORTOUT on step 2's SORTIN instead of DSN=&&WORK1?
  4. Describe a three-step pipeline: filter, sort/sum, report. What does each step read and write?

Quiz

Test Your Knowledge

1. How do you pass the output of one DFSORT step to the next step in the same job?

  • Use SORTOUT in step 1 and SORTIN in step 2; in JCL, step 2's SORTIN DD references the step 1 SORTOUT dataset (e.g. DSN=&&TEMP or a backward reference DSN=*.STEP1.SORTOUT)
  • You cannot chain steps
  • Use the same DD name for both
  • Only with ICETOOL

2. Why use temporary datasets (&&name) between DFSORT steps?

  • They are faster
  • They are automatically deleted at job end, so you do not leave intermediate datasets in the catalog; they also avoid name clashes when the job is run multiple times
  • DFSORT requires them
  • Only for SORTIN

3. What is the purpose of COND on a JCL EXEC statement when running DFSORT?

  • To conditionally execute the SORT control statements
  • To skip the step if a previous step abended or returned a certain return code—so you can avoid running a dependent step when an earlier step failed
  • To sort by condition
  • COND is only for COBOL

4. How can you make a DFSORT step restartable?

  • DFSORT is always restartable
  • Write intermediate results to datasets that persist; check for the existence of the intermediate and run only the steps that have not yet completed; or use job restart (RESTART=stepname on the JOB statement) so the job resumes from a given step
  • Only with OPTION RESTART
  • You cannot

5. What is a common pattern for a multi-step DFSORT pipeline in one job?

  • One step only
  • Step 1: filter/normalize with DFSORT, write to &&WORK1. Step 2: read &&WORK1, sort/merge/sum, write to &&WORK2 or final. Step 3: optional report or copy. Each step uses SORTIN from the previous SORTOUT or a DD reference
  • Only JOINKEYS
  • Only OUTFIL