When a DFSORT step is slow or fails, you need a systematic way to find the cause. Performance debugging involves reading the product messages (ICE messages) in SYSOUT, understanding what each phase of the sort did, and identifying whether the bottleneck is CPU, memory, or I/O (especially sortwork). This page explains how to analyze DFSORT job output, interpret common messages, use DEBUG when needed, and apply fixes so your sorts run faster or complete successfully.
The first place to look when debugging a DFSORT step is the step SYSOUT (job log). DFSORT writes ICE messages there: product-specific messages that report what the sort did. Typical informational messages (those whose IDs end in I) include the number of records read from SORTIN, the number of records written to SORTOUT, and sometimes which phases ran (e.g. sort phase, merge phase). In IBM DFSORT, for example, ICE054I reports the records in and out, while ICE000I heads the listing of your control statements. These numbers help you verify that the step processed the expected volume of data and that INCLUDE/OMIT did not drop more (or fewer) records than you intended. When the step fails, the last ICE message before the abend or error usually indicates the reason, for example ICE046A (sort capacity exceeded) or a control statement syntax error. Keep your DFSORT Messages and Codes (or equivalent) manual handy so you can look up any message you do not recognize.
Different DFSORT versions may use slightly different message numbers; the following are representative. Messages whose IDs end in I are informational: in IBM DFSORT, ICE000I introduces the listing of control statements, ICE054I reports the records read and written, and ICE052I marks the end of DFSORT processing. ICE046A means the sort exceeded its capacity: it ran out of sortwork space or could not complete with the allocated memory. The fix is to increase sortwork (more or larger SORTWK datasets, or a higher DYNALLOC count), to provide a larger FILSZ so DFSORT allocates more work space, or to increase SIZE and REGION so more data is sorted in memory. ICE083A and similar often indicate a resource or allocation failure (e.g. could not allocate a dataset). Syntax errors (e.g. ICE007A or messages in the ICE1xx range) point to a problem in SYSIN: an invalid keyword, a wrong field position, or conflicting options. Correct the control statements and rerun.
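If you save the step SYSOUT to a text file, a short script can pull out the ICE messages for review. The following is a minimal sketch, not a DFSORT facility: the function names and the sample excerpt are hypothetical, and the text that follows each message ID varies by product and release.

```python
import re

# An ICE message ID: "ICE", three digits, then a type letter
# (for example I for informational, A for action/error).
ICE_PATTERN = re.compile(r"\b(ICE\d{3}[A-Z])\b\s*(.*)")

def extract_ice_messages(sysout_text):
    """Return (message_id, rest_of_line) pairs in the order they appear."""
    messages = []
    for line in sysout_text.splitlines():
        match = ICE_PATTERN.search(line)
        if match:
            messages.append((match.group(1), match.group(2).strip()))
    return messages

def last_action_message(messages):
    """Return the last message whose ID ends in 'A', or None if there is none."""
    for message_id, text in reversed(messages):
        if message_id.endswith("A"):
            return message_id, text
    return None

# Hypothetical SYSOUT excerpt, for illustration only.
SAMPLE_SYSOUT = """\
ICE000I 1 - CONTROL STATEMENTS FOR ILLUSTRATION
          SORT FIELDS=(1,10,CH,A)
ICE046A 0 SORT CAPACITY EXCEEDED - RECORD COUNT 1234567
"""

messages = extract_ice_messages(SAMPLE_SYSOUT)
print(messages)
print(last_action_message(messages))
```

Lines that carry no ICE message ID, such as the echoed control statement, are skipped, so the result is a compact list to check against the manual.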
| Message | Meaning |
|---|---|
| ICE000I | Informational; in IBM DFSORT it heads the control statement listing. Record counts appear in related informational messages such as ICE054I. |
| ICE046A | Sort capacity exceeded; insufficient sortwork or memory—increase FILSZ, sortwork, or SIZE/REGION. |
| ICE083A | Resource or allocation failure; may indicate sortwork or system resource shortage. |
| Syntax / ICE1xx | Control statement error; check SYSIN for invalid syntax or conflicting options. |
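The table can also be kept as a small lookup so that a script reviewing job logs suggests a first action. This is a sketch based only on the descriptions in the table; the function name `suggest_action` and the fallback wording are hypothetical.

```python
# First-response hints taken from the table above; extend for your product.
MESSAGE_HINTS = {
    "ICE046A": "Sort capacity exceeded: increase FILSZ, sortwork, or SIZE/REGION.",
    "ICE083A": "Resource or allocation failure: check sortwork and system resources.",
}

def suggest_action(message_id):
    """Map an ICE message ID to a hint, falling back on the type letter."""
    if message_id in MESSAGE_HINTS:
        return MESSAGE_HINTS[message_id]
    if message_id.endswith("I"):
        return "Informational: no action needed, but note any record counts."
    if message_id.endswith("A"):
        return "Action message: look it up in the Messages and Codes manual."
    return "Unrecognized message type: look it up in the manual."

for message_id in ("ICE000I", "ICE046A", "ICE999A"):
    print(message_id, "->", suggest_action(message_id))
```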
To understand why a sort is slow, you need to know whether the step is limited by CPU or by I/O. The job report (SMF or the job log) usually shows CPU time and elapsed time for the step. If elapsed time is much larger than CPU time, the step spent a lot of time waiting—often for I/O. That suggests the sort is I/O-bound: reading input, writing to sortwork, or reading from sortwork during merge passes. In that case, increasing memory (SIZE, REGION) can reduce how much data is written to sortwork, and tuning sortwork (number of datasets, blocksize) can speed up that I/O. If CPU time is close to elapsed time and both are high, the step is likely CPU-bound: a lot of key comparison or INREC/OUTREC processing. Then you might reduce record count (INCLUDE/OMIT), simplify the sort key or reformat logic, or avoid an unnecessary sort (use COPY or MERGE where possible). DFSORT messages that mention merge passes or sortwork activity also indicate that a significant amount of I/O is happening; reducing spill to sortwork usually helps.
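The CPU-versus-elapsed comparison can be expressed as a simple ratio. The sketch below is illustrative only: the cut-off values `io_ratio` and `cpu_ratio` are arbitrary assumptions, not DFSORT or system values, and a low CPU share can also reflect waiting for reasons other than I/O.

```python
def classify_bottleneck(cpu_seconds, elapsed_seconds, io_ratio=0.3, cpu_ratio=0.8):
    """Classify a step from its CPU and elapsed times.

    io_ratio and cpu_ratio are illustrative cut-offs chosen for this sketch.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    ratio = cpu_seconds / elapsed_seconds
    if ratio <= io_ratio:
        return "io-bound"   # mostly waiting, often on sortwork or input I/O
    if ratio >= cpu_ratio:
        return "cpu-bound"  # mostly computing: comparisons, INREC/OUTREC
    return "mixed"

print(classify_bottleneck(cpu_seconds=40, elapsed_seconds=600))   # low CPU share
print(classify_bottleneck(cpu_seconds=550, elapsed_seconds=600))  # high CPU share
```

Treat the result as a starting hypothesis and confirm it against the sortwork and merge-pass messages in SYSOUT.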
The DEBUG control statement tells DFSORT to produce extra diagnostic output. Depending on your product and options, DEBUG may print how control statements were interpreted, how many records passed through each phase (e.g. after INCLUDE, after sort, after OUTREC), or sample record content. That is useful when you suspect wrong results: for example, you expect 100,000 records but get 80,000—DEBUG can show whether INCLUDE/OMIT dropped 20,000 or whether the sort phase or output phase is losing records. It can also help you confirm that field positions and formats in SYSIN match the actual record layout. DEBUG does not improve performance; it adds overhead and output. Use it during problem determination, then remove it for production. See the DEBUG statement tutorial for syntax and options specific to your installation.
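Once you have record counts for each stage, whether from DEBUG output or from the regular messages, finding where records disappeared is a matter of comparing neighbouring stages. The sketch below assumes you have transcribed the counts by hand; the stage names and numbers are hypothetical and mirror the 100,000 versus 80,000 example above.

```python
def find_record_drops(stage_counts):
    """Given ordered (stage_name, record_count) pairs, list each drop between stages."""
    drops = []
    for (prev_name, prev_count), (name, count) in zip(stage_counts, stage_counts[1:]):
        if count < prev_count:
            drops.append((prev_name, name, prev_count - count))
    return drops

# Hypothetical counts transcribed from a job log.
counts = [
    ("input", 100_000),
    ("after INCLUDE", 80_000),
    ("after sort", 80_000),
    ("output", 80_000),
]
print(find_record_drops(counts))
```

Here the only drop is between input and the INCLUDE stage, which points at the filter condition and not at the sort or output phase.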
When a sort fails with capacity exceeded (ICE046A) or runs slowly with heavy sortwork use, check three areas. First, FILSZ: this is the estimated size of the data to be sorted. If FILSZ is too low, DFSORT may allocate too little sortwork and then exceed that allocation. Provide a realistic or slightly high estimate (in the units your product expects). Second, sortwork itself: ensure you have enough SORTWK datasets (or a sufficient DYNALLOC limit) and that each has enough space. If DFSORT dynamically allocates sortwork, a higher FILSZ often leads to more or larger work datasets. Third, memory: the storage limit this page calls SIZE (MAINSIZE on the OPTION statement in IBM DFSORT), MOSIZE for memory objects, and the step REGION in JCL together limit how much memory the sort can use. If memory is too small, more data spills to sortwork, which increases I/O and can cause capacity problems. Increase SIZE and REGION within your system's guidelines so that the sort can hold more data in memory. See the tutorials on FILSZ estimation, sortwork datasets, dynamic allocation, and memory usage for detailed tuning.
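A "realistic or slightly high" FILSZ estimate can be produced with simple arithmetic. The sketch below pads an expected record count by a margin and prints it as an OPTION statement. The 10% margin, the function names, and the 200-byte average record length are assumptions for illustration, and the `E` prefix follows IBM DFSORT's form for an estimated record count, so check your product's manual before using it.

```python
def estimate_filsz(expected_records, margin_percent=10):
    """Return expected_records padded by margin_percent, rounded up (integer math)."""
    if expected_records < 0 or margin_percent < 0:
        raise ValueError("inputs must not be negative")
    return (expected_records * (100 + margin_percent) + 99) // 100

def estimate_work_bytes(records, average_record_length):
    """Rough size of the data to be sorted, in bytes (records x average length)."""
    return records * average_record_length

records = estimate_filsz(5_000_000)
# Assumed IBM DFSORT syntax for an estimated record count.
print(f"  OPTION FILSZ=E{records}")
print(estimate_work_bytes(records, average_record_length=200), "bytes to sort (rough)")
```

The byte figure is only a rough guide to how much sortwork the step may need if little of the data fits in memory.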
When you are assigned to improve a slow or failing sort, follow a sequence:

1. Capture SYSOUT and read all ICE messages; note record counts and any error or capacity messages.
2. Compare CPU and elapsed time to see if the step is I/O-bound or CPU-bound.
3. If it failed, look up the ICE code and address the cause (e.g. increase sortwork or FILSZ for ICE046A).
4. If it is slow and I/O-bound, review FILSZ, SIZE, REGION, and sortwork; consider INREC to shorten records and INCLUDE/OMIT to reduce count.
5. If the step does not need to sort (order does not matter or data is already ordered), switch to COPY or MERGE.
6. Optionally add DEBUG for one run to verify record counts and control statement behavior, then remove it.
7. Re-run and compare SYSOUT and timing to confirm improvement.

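The final step, comparing a tuned run against the baseline, is easy to get wrong if you look only at timing and forget the record counts. A minimal sketch follows, assuming you have copied the counts and times from each run's job output by hand; the `RunStats` class and all figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RunStats:
    records_in: int
    records_out: int
    cpu_seconds: float
    elapsed_seconds: float

def percent_change(before, after):
    """Percentage change from before to after (negative means a reduction)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0

def compare_runs(before, after):
    """Report whether record counts are unchanged and how the timings moved."""
    return {
        "counts_match": (before.records_in, before.records_out)
        == (after.records_in, after.records_out),
        "cpu_change_pct": percent_change(before.cpu_seconds, after.cpu_seconds),
        "elapsed_change_pct": percent_change(before.elapsed_seconds, after.elapsed_seconds),
    }

# Hypothetical figures for a baseline run and a tuned run.
baseline = RunStats(100_000, 100_000, cpu_seconds=40.0, elapsed_seconds=600.0)
tuned = RunStats(100_000, 100_000, cpu_seconds=38.0, elapsed_seconds=300.0)
print(compare_runs(baseline, tuned))
```

A timing improvement only counts if `counts_match` is true; a faster run that writes fewer records has changed the result, not just the performance.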
When your sort is slow or breaks, it’s like when a game freezes: you have to find out why. First you look at the “scoreboard” (the messages in the job log)—that tells you how many cards were handled and if something went wrong. If the game is waiting a long time for the disk (I/O), you give it more “desk space” (memory) so it doesn’t have to put cards in drawers so often. If the game is doing too much work (CPU), you try to give it fewer cards or simpler rules. And if you didn’t really need to sort the cards at all—you only wanted to take out the red ones—you skip the sorting step entirely. So: read the messages, see if it’s waiting on I/O or busy with CPU, fix the thing that’s wrong (more space, fewer records, or no sort), and check the scoreboard again to see if it’s better.
1. Where do you look first when a DFSORT step is slower than expected?
2. What does ICE046A usually indicate?
3. How can you tell if a sort is I/O-bound vs CPU-bound?
4. When is the DEBUG statement useful for performance?
5. Your sort runs but uses far more sortwork I/O than a similar job. What might you check?