DFSORT gets its instructions from the dataset (or stream) you allocate to the SYSIN DD. The contents of SYSIN are called control statements. They tell DFSORT whether to sort or merge, which keys to use, how to filter records, how to reformat input and output, and what options to apply. This page explains how SYSIN is used, how to supply control statements (in-stream vs dataset), and the main statement types and the order in which they typically appear.
SYSIN is the DD name in your JCL that points to the source of control statements. It does not hold your data records—it holds the commands that tell DFSORT what to do with the data in SORTIN (or SORTIN01, SORTIN02, …). DFSORT reads SYSIN once at the beginning of the step, before it opens SORTIN or SORTOUT. It parses each line (or record) as control statement text. Anything that is not a valid control statement may be ignored or cause an error, depending on the content. After SYSIN is read, DFSORT knows the operation (SORT or MERGE), the keys, the filters (INCLUDE/OMIT), the reformatting (INREC, OUTREC), and the options. It then runs the input, sort/merge, and output phases using that information.
You can put the control statements directly in the JCL using in-stream data. You code //SYSIN DD * (or DD DATA, which differs in that lines beginning with // are also treated as data). The lines that follow are treated as the content of the SYSIN dataset until the system reaches a delimiter: a line with /* in columns 1 and 2 or, for DD *, the next JCL statement. So you type your control statements line by line, then a line with /* to stop. This is convenient for small jobs or examples. The statements are in the same member as your JCL, so there is no separate dataset to create or maintain.
//SYSIN DD *
  SORT FIELDS=(1,20,CH,A)
  INCLUDE COND=(5,2,CH,EQ,C'NY')
  OUTREC FIELDS=(1,30,35,20)
/*
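In the JCL statement, // must be in columns 1 and 2 and the DD name (SYSIN) starts in column 3. The control statements are indented because DFSORT reserves column 1 for an optional label: the operation (SORT, INCLUDE, OUTREC, and so on) must begin in column 2 or later, and the statement text must end by column 71. Extra leading blanks are allowed and help readability. The /* delimiter must start in column 1; it ends the in-stream data.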
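For context, a complete job step around such a SYSIN might look like the following minimal sketch. The step name SORTSTEP and the dataset names MY.INPUT.DATA and MY.OUTPUT.DATA are hypothetical placeholders, and the DISP and SPACE values are assumptions to adjust for your site. The example also assumes fixed-length input records of at least 54 bytes, since OUTREC copies positions 35 through 54.

```jcl
//* Hypothetical step: names and space values are placeholders
//SORTSTEP EXEC PGM=SORT
//* DFSORT writes its messages to SYSOUT
//SYSOUT   DD SYSOUT=*
//* Data records to be sorted
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//* Sorted, filtered, reformatted output
//SORTOUT  DD DSN=MY.OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//* Control statements, supplied in-stream
//SYSIN    DD *
  SORT FIELDS=(1,20,CH,A)
  INCLUDE COND=(5,2,CH,EQ,C'NY')
  OUTREC FIELDS=(1,30,35,20)
/*
```

SORTIN and SORTOUT carry the data; SYSIN carries only the instructions.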
Instead of in-stream data, you can point SYSIN to a cataloged dataset. For example, a sequential dataset or a member of a PDS:
//SYSIN DD DSN=MY.PROCS(SORT1),DISP=SHR
The dataset (or member) must exist and contain the same kind of control statement text you would have put after DD *. One advantage is reuse: many jobs can reference the same member. Another is length: long or complex control statement sets are easier to maintain in a separate member. The dataset is expected to hold 80-byte fixed-length records (RECFM=F or FB, LRECL=80), the same card-image layout as in-stream data; check the documentation for your DFSORT level before using any other record format. DFSORT reads it record by record and parses each record as control statement input.
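As a rough sketch, a step that takes its control statements from a member could look like this. The step name and the data dataset names are hypothetical, and the member MY.PROCS(SORT1) is assumed to already contain valid statements such as SORT FIELDS=(1,20,CH,A), each beginning in column 2 or later.

```jcl
//* Hypothetical step: names and space values are placeholders
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//* Control statements are read from the member; no /* is needed
//SYSIN    DD DSN=MY.PROCS(SORT1),DISP=SHR
```

Because the statements live outside the job, changing the member changes the behavior of every job that references it, which is worth keeping in mind before editing a shared member.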
DFSORT reads SYSIN at the start of its processing, before any data is read from SORTIN. The order of operations is roughly: (1) Step starts; (2) SYSIN is opened and read to the end (for in-stream data, up to the /* delimiter); (3) All control statements are parsed and validated; (4) SORTIN (or SORTIN01/02/…) and SORTOUT are opened; (5) Input phase runs (read, INCLUDE/OMIT, then INREC); (6) Sort or merge phase; (7) Output phase (OUTREC, write). Because INCLUDE/OMIT is applied before INREC, its positions refer to the original input record, not the reformatted one. By the time the first record is read from SORTIN, DFSORT has already processed the entire SYSIN. If there is a syntax error in a control statement, DFSORT issues an error message and ends the step during this initial parse, typically with return code 16, before any data is processed.
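The phases above can be illustrated with a small annotated set of control statements. This is a sketch, not a definitive layout: the positions are hypothetical and assume fixed-length input records of at least 54 bytes. Lines with an asterisk in column 1 are DFSORT comment statements.

```jcl
* INCLUDE is applied first, against the original input record.
  INCLUDE COND=(5,2,CH,EQ,C'NY')
* INREC then reformats the surviving records (keeps bytes 1-54).
  INREC FIELDS=(1,54)
* SORT keys refer to the record as reformatted by INREC.
  SORT FIELDS=(1,20,CH,A)
* OUTREC reformats each sorted record just before it is written.
  OUTREC FIELDS=(1,30,35,20)
```

Filtering and trimming records before the sort phase means fewer and shorter records are sorted, which is the usual reason for using INCLUDE and INREC rather than doing everything in OUTREC.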
The following are the main kinds of control statements you put in SYSIN. They are summarized here; each has its own tutorial page for full syntax and examples.
Control statements are usually written in a logical order. A typical sequence is: OPTION (if any); SORT FIELDS= or MERGE FIELDS=; INCLUDE or OMIT (if any); INREC (if any); SUM (if any); OUTREC (if any); OUTFIL (if any). Some statements can be in different orders; the documentation for each statement describes any restrictions. A long statement can be continued by ending a line with a comma followed by a blank and resuming on the next line in column 2 or later. Blanks and case (uppercase/lowercase) for keywords are often flexible, but keywords and positional values must be correct.
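A minimal sketch of that sequence, with two statements continued across lines, might look like this. The positions are hypothetical, and OPTION EQUALS (which asks DFSORT to keep records with equal keys in their original order) is included only to show where an OPTION statement would go.

```jcl
  OPTION EQUALS
  SORT FIELDS=(10,20,CH,A,
               1,9,CH,A)
  INCLUDE COND=(30,1,CH,EQ,C'A')
  OUTREC FIELDS=(1,39,
                 40,20,
                 C'|')
```

Each continued line ends with a comma, and the next line resumes with the first nonblank character. Aligning the continued operands under the opening parenthesis is only a readability choice.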
INCLUDE and OMIT are alternatives: a DFSORT step can have an INCLUDE statement or an OMIT statement, but not both. INCLUDE keeps the records that satisfy the condition; OMIT drops them. Any filter can be written either way by inverting the condition, so choose whichever form reads more naturally.
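For example, the same filter can be written in either form. In this sketch the positions and the value NY are hypothetical, and the second form is shown as a comment (asterisk in column 1) so that only one of the two statements is active.

```jcl
* Form 1: keep only records with NY in positions 5-6
  INCLUDE COND=(5,2,CH,EQ,C'NY')
* Form 2: same result; use instead of form 1, not with it
* OMIT COND=(5,2,CH,NE,C'NY')
```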
DFSORT allows comment statements in SYSIN. A line with an asterisk (*) in column 1 is treated as a comment. Do not use /* or // for comments in in-stream data: /* in columns 1 and 2 is the JCL delimiter that ends the data, and a line starting with // is read as the next JCL statement. Blank lines are generally ignored. Lines that do not look like any known control statement may be ignored or may cause an error message. To avoid errors, keep SYSIN to valid control statements and comments as documented for your DFSORT level.
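A short sketch of in-stream control statements with comments follows; the wording of the comments and the field positions are illustrative only.

```jcl
//SYSIN    DD *
* Comment statement: asterisk in column 1
  SORT FIELDS=(1,20,CH,A)
* Another comment, skipped by DFSORT
  INCLUDE COND=(5,2,CH,EQ,C'NY')
/*
```

Only the final /* line ends the in-stream data; the lines beginning with * are passed to DFSORT and skipped as comments.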
  SORT FIELDS=(10,20,CH,A,1,9,CH,A)
  INCLUDE COND=(30,1,CH,EQ,C'A')
  OUTREC FIELDS=(1,39,40,20,C'|')
This sorts by positions 10–29 (character, ascending) then by positions 1–9 (character, ascending). It keeps only records where position 30 is 'A'. The output record is built from positions 1–39, then 40–59, then a literal '|'. So SYSIN defines the whole operation in a few lines.
SYSIN is like a recipe card you give to DFSORT. The card says: "Sort the pile by the first 20 letters," "Throw away any card that doesn't have an A in spot 30," and "When you write the answer, only copy these parts of each card and add a line." DFSORT reads the recipe card once at the start, then follows it step by step. You can write the recipe on a piece of paper and hand it over (in-stream DD *) or point to a file where the recipe is stored (SYSIN as a dataset). Either way, the recipe is the control statements, and SYSIN is where DFSORT finds them.
1. Where does DFSORT get its control statements from?
2. When does DFSORT read SYSIN?
3. How do you supply control statements directly in the JCL?
4. Which control statement tells DFSORT how to sort?
5. Can the order of control statements in SYSIN matter?