CICS system recovery is the process of bringing a CICS region back online after it has stopped, whether through a planned shutdown, an abend, or a crash. How CICS starts is determined by the restart type. A warm restart uses the system log and global catalog to recover in-flight work and reinstall resource state. A cold restart ignores the log and builds all tables from resource definitions (for example, the CSD). An emergency restart is used after a failure and, like a warm restart, uses the log and catalog, but with an emphasis on speed and minimal validation. This page explains these restart types and when each is used.
When the CICS "office" closes (planned or because of a problem), we need to open it again. If we wrote down what everyone was doing (the log), we can use that list to fix things that were not finished and put the office back the way it was. That is warm (or emergency) restart. If we lost the list or the list is wrong, we have to start from the main rulebook (the definitions) and set up the office from scratch. That is cold restart. Cold is slower and we cannot fix half-done work, but we know we start clean.
| Type | When used | What happens |
|---|---|---|
| Warm restart | Planned shutdown or orderly stop; log intact | Recover in-flight work from log; reinstall groups; restore state |
| Cold restart | Log missing/corrupted; major config change; warm failed | No log recovery; build all tables from definitions; clean state |
| Emergency restart | After abend or crash; automatic | Like warm but faster; use log and catalog; minimal validation |
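
The table can be read as a small decision rule. The sketch below encodes it in Python purely for illustration: `choose_restart` and its parameter names are hypothetical and are not part of any CICS interface. In a real region the start type is governed by CICS system initialization, not by application code.

```python
from enum import Enum


class RestartType(Enum):
    WARM = "warm"
    COLD = "cold"
    EMERGENCY = "emergency"


def choose_restart(log_intact: bool, orderly_shutdown: bool,
                   major_config_change: bool = False,
                   warm_failed: bool = False) -> RestartType:
    """Pick a restart type from the conditions listed in the table (illustrative only)."""
    # No usable log, a major definition change, or a failed warm restart: cold.
    if not log_intact or major_config_change or warm_failed:
        return RestartType.COLD
    # The log is usable: an orderly stop leads to warm, a failure to emergency.
    return RestartType.WARM if orderly_shutdown else RestartType.EMERGENCY


print(choose_restart(log_intact=True, orderly_shutdown=True).value)    # warm
print(choose_restart(log_intact=True, orderly_shutdown=False).value)   # emergency
print(choose_restart(log_intact=False, orderly_shutdown=False).value)  # cold
```

The rule checks the cold-restart conditions first because they override everything else: without a usable log, neither of the other two paths is available.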
Warm restart is used when the region was stopped in an orderly way (e.g. planned shutdown) and the system log is intact, so the region can recover from it. CICS reads the system log and global catalog to re-create the in-memory state: it identifies units of work that were in progress and backs them out, and it reinstalls the resource groups that were installed at the end of the previous run. It does not re-read the CSD (the CICS system definition data set) for that recovered state; the log and catalog are the source. When the log is available, warm restart reaches a consistent state faster than cold restart. If the log is missing or corrupted, warm restart cannot complete and you must use cold restart.
Cold restart does not use the system log to recover in-flight work. Instead, CICS builds all control blocks and tables from scratch from resource definitions (e.g. the CSD groups named in the lists given by the GRPLIST system initialization parameter). No prior runtime state is restored; the region starts as if it had never run before. Cold restart is required when the log is missing or corrupted or when warm restart has failed, and it is also a convenient way to apply major definition changes with a full reload. It takes longer than warm restart, and any work that was in progress when the region stopped is lost (not recovered), but you get a clean, known state.
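
To make the contrast with warm restart concrete, here is a minimal Python sketch. Everything in it is a simplified assumption: `cold_start`, `warm_start`, the group names, and the dictionary layout are hypothetical stand-ins for the CSD, the group list, and the catalog, not real CICS structures.

```python
from copy import deepcopy


def cold_start(csd_groups: dict, group_list: list) -> dict:
    """Build the resource table from definitions only; prior state is never read."""
    table = {}
    for group in group_list:                  # install groups in list order
        for name, attrs in csd_groups[group].items():
            table[name] = dict(attrs)         # a repeated name overwrites the earlier entry
    return table


def warm_start(catalog_state: dict) -> dict:
    """Restore the resource table as the previous run recorded it."""
    return deepcopy(catalog_state)


csd_groups = {
    "BASE": {"PAYR": {"program": "PAYROLL1"}},
    "APPS": {"INQY": {"program": "INQUIRE1"}},
}
# State recorded by the previous run, including a resource installed while it was running.
catalog_state = {
    "PAYR": {"program": "PAYROLL1"},
    "INQY": {"program": "INQUIRE1"},
    "TEMP": {"program": "TEMPPGM1"},
}

print(sorted(cold_start(csd_groups, ["BASE", "APPS"])))  # ['INQY', 'PAYR']
print(sorted(warm_start(catalog_state)))                 # ['INQY', 'PAYR', 'TEMP']
```

In this toy model the `TEMP` entry exists only in the recorded state, so it survives the warm path and disappears on the cold path, which is the "clean, known state" trade-off described above.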
Emergency restart is typically selected automatically after an abend or system failure (for example, when the region is started with START=AUTO, CICS decides how to start based on how the previous run ended). Like warm restart, it uses the system log and global catalog to re-create state and does not use the CSD for that purpose. It is optimized for speed: CICS may skip some validation or optional steps to get the region back online as quickly as possible. In short, emergency restart is the "we just crashed, get back up" path, and warm restart is the "we shut down normally, come back and recover" path. Both rely on the log; if the log is not available, neither can succeed and cold restart is the only option.
The system log is essential for warm and emergency restart. It records transaction and resource updates so that the recovery manager can determine which units of work had committed and which were still in progress when the region ended. During restart, CICS backs out in-flight units of work so that only committed work is reflected in recoverable resources. The log is also used to reinstall resource and group state. Protecting the system log (e.g. on durable storage, with backups) is part of recovery planning. If the log is lost or damaged, warm and emergency restart cannot recover correctly and you must fall back to cold restart.
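
The backout idea described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the CICS recovery manager: the `LogRecord` layout, the `back_out_inflight` function, and the use of a plain dictionary as the recoverable resource are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    uow: str                       # unit-of-work identifier
    kind: str                      # "update" or "commit"
    key: Optional[str] = None      # resource key changed by an update
    before: Optional[str] = None   # value before the update (None = key did not exist)


def back_out_inflight(log: list, data: dict) -> set:
    """Undo updates made by UOWs that have no commit record; return those UOW ids."""
    committed = {rec.uow for rec in log if rec.kind == "commit"}
    backed_out = set()
    # Walk the log newest-first so each key ends up at its oldest in-flight before-image.
    for rec in reversed(log):
        if rec.kind != "update" or rec.uow in committed:
            continue
        if rec.before is None:
            data.pop(rec.key, None)    # the update created the key, so remove it
        else:
            data[rec.key] = rec.before
        backed_out.add(rec.uow)
    return backed_out


log = [
    LogRecord("U1", "update", "A", before="1"),
    LogRecord("U1", "commit"),
    LogRecord("U2", "update", "B", before="5"),
    LogRecord("U2", "update", "C", before=None),
]
data = {"A": "2", "B": "6", "C": "9"}   # resource contents when the region ended

print(back_out_inflight(log, data))  # {'U2'}
print(data)                          # {'A': '2', 'B': '5'}
```

Here U1 committed, so its change to `A` stays; U2 never committed, so `B` returns to its before-image and `C` is removed. The sketch also shows why a lost or damaged log rules out this kind of recovery: without the commit records and before-images there is nothing to decide from or restore to.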
1. Which restart type uses the system log to recover in-flight work?
2. When is cold restart typically used?
3. What is the main difference between warm and emergency restart?