Batch Processing Automation

DFSORT is often used inside larger batch jobs: one step might filter and normalize data, the next might sort and deduplicate, and another might produce a report or load file. Automating these flows means chaining steps so the output of one step becomes the input of the next, using temporary datasets so you do not leave intermediates in the catalog, and using JCL features like COND and RESTART so the job behaves correctly when a step fails or when you need to rerun from a given point. This page explains how to use DFSORT in automated batch: passing data between steps, using temporary datasets (&&), conditional step execution (COND), and designing for restart. It also outlines a typical multi-step pipeline (filter → sort/sum → report) so you can adapt it to your shop.

Chaining DFSORT Steps

In a single job you can run multiple DFSORT steps: the output of step 1 (SORTOUT or an OUTFIL DD) becomes the input of step 2 (SORTIN). To do that, step 2's SORTIN DD must point to the dataset that step 1 wrote. There are two common ways: (1) Use a temporary dataset name in step 1 (e.g. DSN=&&WORK1 on SORTOUT), then code DSN=&&WORK1 on SORTIN in step 2; the system passes the same temporary dataset from step 1 to step 2. (2) Use a backward reference: in step 2, code DSN=*.STEP1.SORTOUT on the SORTIN DD so that SORTIN resolves to the same dataset as STEP1's SORTOUT. Either way, step 2 reads exactly what step 1 wrote.
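A sketch of the DD statements involved in method (2); the UNIT and SPACE values here are illustrative only:

//* STEP1: create the intermediate and pass it to later steps
//SORTOUT DD DSN=&&WORK1,DISP=(NEW,PASS),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//* STEP2: refer back to STEP1's SORTOUT instead of repeating the name
//SORTIN  DD DSN=*.STEP1.SORTOUT,DISP=(OLD,DELETE)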

JCL and automation patterns
Pattern               | Technique                                             | Purpose
Chain steps           | SORTOUT → SORTIN via DSN=*.stepname.ddname or &&TEMP  | Pass data from one DFSORT step to the next
Temporary datasets    | &&WORK1, &&WORK2                                      | Avoid permanent intermediates; auto-deleted at job end
Conditional execution | COND on EXEC                                          | Skip a step if a prior step fails
Restart               | RESTART=stepname, persistent datasets                 | Resume the job from a step without redoing completed work

Temporary Datasets (&&)

A dataset name that starts with && is a temporary (job) dataset. It is created when the step that defines it runs and is deleted at the end of the job (normal or abnormal termination). You do not need to specify DISP=(,DELETE) for cleanup—the system handles it. Temporary datasets are ideal for intermediate results between steps: e.g. step 1 writes to &&WORK1, step 2 reads &&WORK1 and writes to &&WORK2, step 3 reads &&WORK2. No permanent datasets are left behind, and you can run the job again without clearing old intermediates. The same &&name used in more than one step in the same job refers to the same dataset (passed forward). Do not use the same &&name for two different datasets in the same job.
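A sketch of the DD statements for such a three-step chain (SYSIN DDs omitted for brevity; UNIT, SPACE, and the final dataset name are illustrative):

//STEP1   EXEC PGM=SORT
//SORTIN  DD DSN=INPUT.FILE,DISP=SHR
//SORTOUT DD DSN=&&WORK1,DISP=(NEW,PASS),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//STEP2   EXEC PGM=SORT
//SORTIN  DD DSN=&&WORK1,DISP=(OLD,DELETE)
//SORTOUT DD DSN=&&WORK2,DISP=(NEW,PASS),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//STEP3   EXEC PGM=SORT
//SORTIN  DD DSN=&&WORK2,DISP=(OLD,DELETE)
//SORTOUT DD DSN=FINAL.REPORT,DISP=(NEW,CATLG,DELETE),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)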

Example: Two-Step Chain

//STEP1   EXEC PGM=SORT
//SORTIN  DD DSN=INPUT.FILE,DISP=SHR
//SORTOUT DD DSN=&&WORK1,DISP=(NEW,PASS),...
//SYSIN   DD *
  OMIT COND=(1,10,CH,EQ,C' ')
  SORT FIELDS=(1,10,CH,A)
/*
//STEP2   EXEC PGM=SORT
//SORTIN  DD DSN=&&WORK1,DISP=(OLD,DELETE)
//SORTOUT DD DSN=FINAL.OUTPUT,DISP=(NEW,CATLG),...
//SYSIN   DD *
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE
/*

Step 1 reads INPUT.FILE, omits records whose key (columns 1-10) is blank, sorts on that key, and writes &&WORK1. DISP=(NEW,PASS) creates the temporary dataset and passes it to a later step. Step 2 reads &&WORK1 with DISP=(OLD,DELETE), sorts on the same key with SUM FIELDS=NONE so that only one record per key is kept, and writes FINAL.OUTPUT. After the job ends, &&WORK1 is deleted. You could code DSN=*.STEP1.SORTOUT on STEP2's SORTIN instead of naming &&WORK1 again.

Conditional Execution (COND)

Sometimes you want to skip a step when a previous step failed. The COND parameter on the EXEC statement controls that. For example, COND=(4,LT,STEP1) means: bypass this step if 4 is less than the return code from STEP1, that is, skip it when STEP1 ended with a return code greater than 4 (such as 8 or 16). A return code of 0 or 4 from STEP1 lets the step run. You can combine tests, e.g. COND=((4,LT,STEP1),(8,LT,STEP2)), and the EVEN and ONLY keywords control whether a step runs after an abend; by default, a step is not executed if an earlier step abended. Check your JCL reference for the exact syntax. Using COND on a DFSORT step that depends on a prior step avoids running it when the input from that prior step is missing or invalid.
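A sketch of how that looks on the EXEC statements (DD statements omitted; step names are placeholders):

//STEP2   EXEC PGM=SORT,COND=(4,LT,STEP1)
//* STEP2 is bypassed when 4 < RC(STEP1), i.e. STEP1 ended above RC 4
//STEP3   EXEC PGM=SORT,COND=((4,LT,STEP1),(4,LT,STEP2))
//* STEP3 is bypassed when either earlier step ended above RC 4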

Restartability

A job is restartable if you can run it again from a point after a failure without redoing all work. DFSORT does not checkpoint inside a step; if a step fails, that step must be rerun. To make the job restartable: (1) Write outputs that later steps need to datasets that persist (or that you can recreate). (2) Use RESTART=stepname on the JOB statement so that when you resubmit the job, it starts from that step; earlier steps are not executed. (3) Design so that the step named in RESTART can read its input from the previous run (e.g. the previous step's output was cataloged or kept). Some shops use a convention where critical intermediates are written to permanent datasets with a run identifier so that restart can point to them. Others use temporary datasets and accept that on restart they must rerun from step 1 unless they have saved intermediates elsewhere.
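A sketch of the restart pieces, assuming the previous run wrote STEP1's output to a cataloged dataset (job parameters and dataset names are made up; STEP1 would still be in the deck but is not shown or executed here):

//MYJOB   JOB (ACCT),'NIGHTLY PIPELINE',CLASS=A,MSGCLASS=X,
//         RESTART=STEP2
//* STEP1 completed in the previous run and cataloged MY.STAGE1.DATA,
//* so STEP2 can read it now; a && dataset would not survive the rerun.
//STEP2   EXEC PGM=SORT
//SORTIN  DD DSN=MY.STAGE1.DATA,DISP=SHR
//SORTOUT DD DSN=MY.FINAL.DATA,DISP=(NEW,CATLG,DELETE),
//           UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//SYSIN   DD *
  SORT FIELDS=(1,10,CH,A)
/*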

Typical Multi-Step Pipeline

A common pattern is: (1) Extract/filter step: read source, INCLUDE/OMIT and INREC to filter and normalize, write to &&WORK1. (2) Sort/sum step: read &&WORK1, SORT FIELDS=, optionally SUM to deduplicate or aggregate, write to &&WORK2 or to final dataset. (3) Report or copy step: read the sorted/deduped data, OUTFIL to produce a report or copy to a final format. Each step is a single DFSORT (or ICETOOL) invocation. Data flows forward via temporary or permanent datasets. You can add COND so that if step 1 fails, steps 2 and 3 are skipped, and you can use RESTART=STEP2 (for example) to resume from step 2 if step 1 already completed in a previous run and you have &&WORK1 or a copy.
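As a sketch, the SYSIN control statements for the three steps might look like this (field positions, the status-code filter, and the report layout are assumptions for illustration; the steps are chained via &&WORK1 and &&WORK2 as described above):

* --- STEP1 SYSIN: keep active records, keep only key and amount ---
  INCLUDE COND=(21,2,CH,EQ,C'AC')
  INREC BUILD=(1,10,11,8)
  SORT FIELDS=COPY
* --- STEP2 SYSIN: sort by key and sum the amount field ---
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=(11,8,ZD)
* --- STEP3 SYSIN: copy with a page header and an edited amount ---
  SORT FIELDS=COPY
  OUTFIL HEADER2=('KEY        AMOUNT'),
    BUILD=(1,10,X,11,8,ZD,EDIT=(IIIIIIIT))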

Symbolic Parameters and Generics

Many shops use JCL symbolic parameters (e.g. &INPUT, &RUNID) so the same procedure or job can be used with different inputs or run identifiers. The DFSORT step does not change; only the DD statements reference the symbol. For example SORTIN DD DSN=&INPUT..FILE,DISP=SHR and SORTOUT DD DSN=&OUTPUT..FILE,DISP=(NEW,CATLG). The procedure is then invoked with INPUT=PROD and OUTPUT=PROD.RESULT. This keeps the control statements (SYSIN) generic and the data names parameterized for automation and reuse.
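A sketch of an in-stream procedure using symbolic parameters (the procedure name, control-statement library, and member are made up):

//SORTPIPE PROC INPUT=TEST,OUTPUT=TEST
//SORT1    EXEC PGM=SORT
//SORTIN   DD DSN=&INPUT..FILE,DISP=SHR
//SORTOUT  DD DSN=&OUTPUT..FILE,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//SYSIN    DD DSN=PROD.CNTLLIB(SORTKEYS),DISP=SHR
//         PEND
//*
//* Invocation: &INPUT..FILE becomes PROD.FILE and
//* &OUTPUT..FILE becomes PROD.RESULT.FILE
//RUN1     EXEC SORTPIPE,INPUT=PROD,OUTPUT=PROD.RESULT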

Best Practices

  • Use temporary datasets (&&) for intermediates between steps to avoid catalog buildup and to allow reruns.
  • Use a backward reference (DSN=*.stepname.ddname) when the next step reads the previous step's output so you do not duplicate DSN names.
  • Use COND on steps that depend on a prior step so they are skipped when the prior step fails.
  • Document the pipeline (which step produces what) and the restart strategy so operators know how to resume after a failure.
  • Keep SYSIN (control statements) in a library or inline and use symbols for dataset names so the same job can run in different environments.

Explain It Like I'm Five

Imagine a factory line: the first machine takes raw material and cleans it (step 1), then puts it on a conveyor (&&WORK1). The second machine takes from the conveyor and sorts and packs it (step 2), then puts the box on another conveyor (&&WORK2). The third machine takes the box and prints a label (step 3). The conveyors are like temporary datasets—they only exist while the factory is running, and at the end they are put away. If the first machine breaks, you do not run the second and third (COND). If the factory stops and starts again tomorrow, you might start from the second machine if the first already finished (RESTART).

Exercises

  1. Write JCL for two steps: step 1 SORT copies input to &&TEMP; step 2 SORT reads &&TEMP and writes to OUT.FILE. Use the correct DISP for &&TEMP in both steps.
  2. What does COND=(4,LT,STEP1) mean for the step that contains it? When is that step skipped?
  3. Why might you use DSN=*.STEP1.SORTOUT on step 2's SORTIN instead of DSN=&&WORK1?
  4. Describe a three-step pipeline: filter, sort/sum, report. What does each step read and write?

Quiz

Test Your Knowledge

1. How do you pass the output of one DFSORT step to the next step in the same job?

  • Use SORTOUT in step 1 and SORTIN in step 2; in JCL, step 2's SORTIN DD references the step 1 SORTOUT dataset (e.g. DSN=&&TEMP or a backward reference DSN=*.STEP1.SORTOUT)
  • You cannot chain steps
  • Use the same DD name for both
  • Only with ICETOOL

2. Why use temporary datasets (&&name) between DFSORT steps?

  • They are faster
  • They are automatically deleted at job end, so you do not leave intermediate datasets in the catalog; they also avoid name clashes when the job is run multiple times
  • DFSORT requires them
  • Only for SORTIN

3. What is the purpose of COND on a JCL EXEC statement when running DFSORT?

  • To conditionally execute the SORT control statements
  • To skip the step if a previous step abended or returned a certain return code—so you can avoid running a dependent step when an earlier step failed
  • To sort by condition
  • COND is only for COBOL

4. How can you make a DFSORT step restartable?

  • DFSORT is always restartable
  • Write intermediate results to datasets that persist; check for the existence of the intermediate and run only the steps that have not yet completed; or use job restart (RESTART=stepname on the JOB statement) so the job resumes from a given step
  • Only with OPTION RESTART
  • You cannot

5. What is a common pattern for a multi-step DFSORT pipeline in one job?

  • One step only
  • Step 1: filter/normalize with DFSORT, write to &&WORK1. Step 2: read &&WORK1, sort/merge/sum, write to &&WORK2 or final. Step 3: optional report or copy. Each step uses SORTIN from the previous SORTOUT or a DD reference
  • Only JOINKEYS
  • Only OUTFIL