MainframeMaster

CICS Short Circuit Management

Short circuit management in CICS means handling failures on the communication links between systems, particularly over LU6.2 (APPC) connections. It is an application-level approach rather than a single CICS facility: the application detects session failures, reconnects automatically, and applies the circuit breaker pattern so that inter-system communication stays robust.

What is Short Circuit Management?

Short circuit management provides mechanisms for:

  • Detecting communication failures quickly
  • Preventing cascading failures across systems
  • Implementing automatic recovery procedures
  • Managing resource allocation during failures
  • Providing graceful degradation when systems are unavailable

Circuit Concepts

Session Failure Detection

CICS monitors the health of LU6.2 sessions and can detect various failure conditions:

| Failure Type | Detection Method | Response |
|---|---|---|
| Network loss | Timeout on request | Circuit opened |
| Partner down | Connection reset | Automatic retry |
| Resource exhaustion | Threshold exceeded | Graceful degradation |
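The mapping in the table above can be sketched as a simple dispatch. This is a minimal illustration in plain COBOL with no CICS calls: the one-character failure codes, the sample input, and the fallback action for unknown codes are all assumptions made for the example, not part of any CICS interface.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FAILMAP.
      * Sketch only: the codes 'N', 'P' and 'R' are hypothetical
      * stand-ins for whatever the application derives from RESP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SAMPLE-FAILURES     PIC X(4) VALUE 'NPRX'.
       01  WS-IDX                 PIC 9(2) VALUE ZERO.
       01  WS-FAILURE-TYPE        PIC X(1).
           88  NETWORK-LOSS       VALUE 'N'.
           88  PARTNER-DOWN       VALUE 'P'.
           88  RESOURCE-EXHAUSTED VALUE 'R'.
       01  WS-ACTION              PIC X(20).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Walk through the sample codes and show the chosen action.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 4
               MOVE WS-SAMPLE-FAILURES(WS-IDX:1) TO WS-FAILURE-TYPE
               PERFORM CHOOSE-ACTION
               DISPLAY WS-FAILURE-TYPE ' -> ' WS-ACTION
           END-PERFORM
           STOP RUN.
       CHOOSE-ACTION.
           EVALUATE TRUE
               WHEN NETWORK-LOSS
                   MOVE 'OPEN CIRCUIT' TO WS-ACTION
               WHEN PARTNER-DOWN
                   MOVE 'AUTOMATIC RETRY' TO WS-ACTION
               WHEN RESOURCE-EXHAUSTED
                   MOVE 'DEGRADE GRACEFULLY' TO WS-ACTION
      *        Assumption: unknown codes are logged, not retried.
               WHEN OTHER
                   MOVE 'LOG AND INVESTIGATE' TO WS-ACTION
           END-EVALUATE.
```

Keeping the classification in one paragraph makes it easier to review which failures count toward opening the circuit.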

Circuit Breaker Pattern

Implement a circuit breaker to prevent requests to failing systems:

```cobol
       WORKING-STORAGE SECTION.
      * Working storage is private to each task; keep this
      * structure in shared storage to share state across tasks.
       01  CIRCUIT-BREAKER.
           10  CB-STATE              PIC X(1) VALUE 'C'.
               88  CB-CLOSED         VALUE 'C'.
               88  CB-OPEN           VALUE 'O'.
               88  CB-HALF-OPEN      VALUE 'H'.
           10  CB-FAILURE-COUNT      PIC 9(3) VALUE ZERO.
           10  CB-LAST-FAILURE-TIME  PIC X(26).
           10  CB-TIMEOUT            PIC S9(8) COMP VALUE 300.

       PROCEDURE DIVISION.
           PERFORM CHECK-CIRCUIT-STATE
           IF CB-OPEN
               PERFORM OPEN-CIRCUIT-HANDLER
           ELSE
               PERFORM ATTEMPT-CIRCUIT-TRANSACTION
           END-IF.
```

Communication Circuit Management

Session State Management

Manage LU6.2 session states:

```cobol
      * Establish session. ALLOCATE has no retry or timeout
      * options, so retries belong in application logic.
           EXEC CICS ALLOCATE
                SYSID(SYSTEM-ID)
                NOQUEUE
                RESP(WS-RESPONSE)
           END-EXEC.
      * The conversation identifier is returned in EIBRSRCE.
           MOVE EIBRSRCE TO SESSION-NAME.

      * Check session status
           EXEC CICS EXTRACT ATTRIBUTES
                CONVID(SESSION-NAME)
                STATE(WS-COMC-STATUS)
                RESP(WS-RESPONSE)
           END-EXEC.
```

Automatic Reconnection

Implement automatic reconnection logic:

```cobol
       PROGRAM-ID. COMMUNICATION-MANAGER.

       PROCEDURE DIVISION.
           PERFORM RESET-CIRCUIT-PARAMETERS
           PERFORM ATTEMPT-COMMUNICATION

           EVALUATE WS-RESPONSE
               WHEN DFHRESP(NORMAL)
                   PERFORM RESET-FAILURE-COUNT
               WHEN DFHRESP(SYSIDERR)
               WHEN DFHRESP(SESSIONERR)
               WHEN DFHRESP(TERMERR)
                   PERFORM HANDLE-COMMUNICATION-FAILURE
           END-EVALUATE.

           EXEC CICS RETURN END-EXEC.

       HANDLE-COMMUNICATION-FAILURE.
           ADD 1 TO WS-FAILURE-COUNT
           IF WS-FAILURE-COUNT > MAX-FAILURES
      *        Open circuit
               MOVE 'O' TO WS-CIRCUIT-STATE
               MOVE FUNCTION CURRENT-DATE TO WS-LAST-FAILURE-TIME
               PERFORM SET-CIRCUIT-ALARMS
           ELSE
               PERFORM WAIT-AND-RETRY
           END-IF.
```

Health Check Mechanisms

Implement regular health checks:

```cobol
           PERFORM VARYING WS-CHECK-COUNTER FROM 1 BY 1
                   UNTIL WS-CHECK-COUNTER > HEALTH-CHECK-CYCLES
                      OR WS-HEALTHY-SYSTEMS = TARGET-SYSTEM-COUNT

               PERFORM CHECK-SYSTEM-ROUTE

               IF WS-CHECK-RESPONSE = DFHRESP(NORMAL)
                   PERFORM MARK-SYSTEM-HEALTHY
                   PERFORM SET-CIRCUIT-HALF-OPEN
               ELSE
                   PERFORM HANDLE-CHECK-FAILURE
               END-IF

               PERFORM WAIT-BETWEEN-CHECKS
           END-PERFORM.
```

Circuit Configuration

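Configuration Parameters

CICS provides no command for setting circuit breaker parameters, so hold them in application storage and initialize them at startup:

```cobol
      * Initial state and limits; field names are illustrative.
           SET CB-CLOSED TO TRUE
           MOVE 10 TO CB-FAILURE-THRESHOLD-MAX
           MOVE 60 TO CB-RESET-TIME.
```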

Threshold Management

Set appropriate thresholds for circuit management:

| Parameter | Typical Value | Purpose |
|---|---|---|
| Failure threshold | 5-10 failures | Open circuit trigger |
| Reset timeout | 60-300 seconds | Automatic reset delay |
| Health check interval | 30-60 seconds | Regular status verification |
| Connection timeout | 30-60 seconds | Default connect wait |
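One way to keep these settings honest is to validate them against the typical ranges when the application starts. The sketch below is plain COBOL with no CICS calls; the field names and the sample values are hypothetical, and the ranges only restate the table above rather than any limit enforced by CICS.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CBPARMS.
      * Sketch only: field names are hypothetical, and the ranges
      * restate the typical values from the table above.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CB-PARAMETERS.
           05  CB-FAILURE-THRESHOLD  PIC 9(3) VALUE 5.
               88  THRESHOLD-TYPICAL VALUE 5 THRU 10.
           05  CB-RESET-TIMEOUT      PIC 9(4) VALUE 120.
               88  RESET-TYPICAL     VALUE 60 THRU 300.
           05  CB-CHECK-INTERVAL     PIC 9(4) VALUE 30.
               88  INTERVAL-TYPICAL  VALUE 30 THRU 60.
           05  CB-CONNECT-TIMEOUT    PIC 9(4) VALUE 90.
               88  CONNECT-TYPICAL   VALUE 30 THRU 60.
       01  WS-WARNINGS               PIC 9(2) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Warn about each value that falls outside its range.
           IF NOT THRESHOLD-TYPICAL
               DISPLAY 'FAILURE THRESHOLD OUTSIDE 5-10'
               ADD 1 TO WS-WARNINGS
           END-IF
           IF NOT RESET-TYPICAL
               DISPLAY 'RESET TIMEOUT OUTSIDE 60-300 SECONDS'
               ADD 1 TO WS-WARNINGS
           END-IF
           IF NOT INTERVAL-TYPICAL
               DISPLAY 'CHECK INTERVAL OUTSIDE 30-60 SECONDS'
               ADD 1 TO WS-WARNINGS
           END-IF
           IF NOT CONNECT-TYPICAL
               DISPLAY 'CONNECT TIMEOUT OUTSIDE 30-60 SECONDS'
               ADD 1 TO WS-WARNINGS
           END-IF
           DISPLAY 'WARNINGS: ' WS-WARNINGS
           STOP RUN.
```

With the sample values, only the connection timeout of 90 seconds falls outside its range, so the program reports one warning. Values outside the typical ranges are not necessarily wrong, which is why the sketch warns instead of failing.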

Circuit State Handlers

Closed Circuit Handler

Normal operation when the circuit is closed:

```cobol
       CLOSED-CIRCUIT-HANDLER.
           PERFORM PLACE-SYSTEM-ROUTE-AND-CHECK

           EVALUATE WS-RESPONSE
               WHEN DFHRESP(NORMAL)
                   MOVE ZERO TO WS-CONSECUTIVE-FAILURES

               WHEN DFHRESP(SYSIDERR)
               WHEN DFHRESP(SESSIONERR)
                   PERFORM HANDLE-CONNECTIVITY-ERROR
                   ADD 1 TO WS-CONSECUTIVE-FAILURES

                   IF WS-CONSECUTIVE-FAILURES > CIRCUIT-THRESHOLD
                       PERFORM OPEN-CIRCUIT-BREAKER
                   END-IF
           END-EVALUATE.
```

Open Circuit Handler

Handle requests when the circuit is open:

```cobol
       OPEN-CIRCUIT-HANDLER.
           PERFORM CHECK-RESET-TIMEOUT

           IF RESET-TIME-EXPIRED
               SET CB-HALF-OPEN TO TRUE
               PERFORM HALF-OPEN-HANDLER
           ELSE
               PERFORM REJECT-REQUEST
           END-IF.

       REJECT-REQUEST.
      * Fail fast: end the task without calling the partner.
           EXEC CICS RETURN END-EXEC.
```

Half-Open Circuit Handler

Test circuit recovery:

```cobol
       HALF-OPEN-HANDLER.
           PERFORM PLACE-SYSTEM-ROUTE-AND-CHECK

           EVALUATE WS-RESPONSE
               WHEN DFHRESP(NORMAL)
                   PERFORM TEST-SYSTEM-ROUTE-AND-CHECK
                   IF WS-TEST-SUCCESSFUL
                       PERFORM CLOSE-CIRCUIT-BREAKER
                       PERFORM PROCESS-SYSTEM-ROUTE-AND-CHECK
                   END-IF

               WHEN DFHRESP(SYSIDERR)
               WHEN DFHRESP(SESSIONERR)
                   PERFORM REOPEN-CIRCUIT-BREAKER
           END-EVALUATE.

      * Perform multiple health checks
           MOVE ZERO TO WS-SUCCESSFUL-CHECKS
           PERFORM VARYING WS-CHECK FROM 1 BY 1
                   UNTIL WS-CHECK > 10
               PERFORM CHECK-SYSTEM-ROUTE
               IF WS-CHECK-RESPONSE = DFHRESP(NORMAL)
                   ADD 1 TO WS-SUCCESSFUL-CHECKS
               END-IF
           END-PERFORM.

           IF WS-SUCCESSFUL-CHECKS > MINIMUM-SUCCESSFUL-CHECKS
               PERFORM CLOSE-CIRCUIT-BREAKER
           END-IF.
```

Override Procedures

Emergency Override

Provide emergency override capabilities:

```cobol
       EMERGENCY-OVERRIDE.
           PERFORM PLACE-ROUTE-AFTER-TIMEOUT

           EVALUATE WS-RESPONSE
               WHEN DFHRESP(NORMAL)
                   PERFORM LOG-OVERRIDE-SUCCESS
               WHEN DFHRESP(TIMEDOUT)
                   PERFORM LOG-OVERRIDE-TIMEOUT
               WHEN DFHRESP(SYSIDERR)
                   PERFORM LOG-OVERRIDE-FAILURE
               WHEN DFHRESP(SESSIONERR)
                   PERFORM LOG-SESSION-ERROR
           END-EVALUATE.
```

Manual Circuit Control

Allow manual circuit breaker control:

```cobol
       MANUAL-CIRCUIT-CONTROL.
      * WS-MANUAL-ACTION is assumed to be set by the caller, for
      * example from terminal input. COBOL ACCEPT from the
      * console is not supported in CICS programs.
           EVALUATE WS-MANUAL-ACTION
               WHEN 'OPEN'
                   PERFORM OPEN-CIRCUIT-BREAKER
               WHEN 'CLOSE'
                   PERFORM CLOSE-CIRCUIT-BREAKER
               WHEN 'RESET'
                   PERFORM RESET-CIRCUIT-PARAMETERS
               WHEN 'STATUS'
                   PERFORM DISPLAY-CIRCUIT-STATUS
               WHEN OTHER
                   DISPLAY 'Invalid manual action: '
                           WS-MANUAL-ACTION
           END-EVALUATE.
```

Monitoring and Alerts

Circuit Status Monitoring

Monitor circuit breaker status:

```cobol
       MONITOR-CIRCUIT-STATUS.
           PERFORM CHECK-CIRCUIT-COUNT

           IF WS-CIRCUIT-COUNT GREATER THAN ZERO
               DISPLAY 'Circuits currently open: ' WS-CIRCUIT-COUNT
               PERFORM LOG-CIRCUIT-STATUS

               IF WS-CIRCUIT-COUNT >= CIRCUIT-COUNT-HIGH-THRESHOLD
                   PERFORM ALERT-CIRCUIT-MANAGER
               END-IF
           END-IF.
```

Performance Metrics

Track circuit breaker performance:

  • Number of circuit transitions per time period
  • Average time circuits remain open
  • Success rate of half-open validation checks
  • Response times for circuit operations
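The first three metrics can be derived from a handful of counters. The following is a minimal sketch in plain COBOL: the counter names and the sample figures are invented for illustration, and a real program would accumulate them as circuits change state. Response times are left out because they depend on how the application timestamps each operation.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CBMETRIC.
      * Sketch only: the counters hold sample figures, not
      * measurements; a real program would accumulate them.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TRANSITIONS         PIC 9(5) VALUE 12.
       01  WS-PERIOD-HOURS        PIC 9(3) VALUE 4.
       01  WS-OPEN-EVENTS         PIC 9(5) VALUE 4.
       01  WS-TOTAL-OPEN-SECS     PIC 9(7) VALUE 600.
       01  WS-HALF-OPEN-CHECKS    PIC 9(5) VALUE 20.
       01  WS-HALF-OPEN-PASSED    PIC 9(5) VALUE 15.
       01  WS-TRANSITIONS-PER-HR  PIC 9(5)V99.
       01  WS-AVG-OPEN-SECS       PIC 9(7)V99.
       01  WS-SUCCESS-PCT         PIC 9(3)V99.
       01  WS-EDITED              PIC Z(6)9.99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Transitions per hour over the reporting period.
           COMPUTE WS-TRANSITIONS-PER-HR ROUNDED =
               WS-TRANSITIONS / WS-PERIOD-HOURS
      * Average number of seconds a circuit stayed open.
           COMPUTE WS-AVG-OPEN-SECS ROUNDED =
               WS-TOTAL-OPEN-SECS / WS-OPEN-EVENTS
      * Share of half-open validation checks that passed.
           COMPUTE WS-SUCCESS-PCT ROUNDED =
               WS-HALF-OPEN-PASSED * 100 / WS-HALF-OPEN-CHECKS
           MOVE WS-TRANSITIONS-PER-HR TO WS-EDITED
           DISPLAY 'TRANSITIONS PER HOUR: ' WS-EDITED
           MOVE WS-AVG-OPEN-SECS TO WS-EDITED
           DISPLAY 'AVG OPEN SECONDS:     ' WS-EDITED
           MOVE WS-SUCCESS-PCT TO WS-EDITED
           DISPLAY 'HALF-OPEN SUCCESS %:  ' WS-EDITED
           STOP RUN.
```

With the sample figures this reports 3.00 transitions per hour, an average open time of 150.00 seconds, and a 75.00 percent half-open success rate. Production code would also guard the divisions against zero counters.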

Implementation Best Practices

1. Start Simple

Begin with basic circuit breaker functionality and add complexity gradually:

  • Use binary circuit states initially
  • Implement basic failure counting
  • Add timeout-based reset
  • Gradually add advanced features
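The first three steps fit in a small program. The sketch below is plain COBOL with a simulated clock and simulated call outcomes, so it runs without CICS. The outcome string, the threshold of 3, and the 20-second timeout are assumptions chosen to keep the trace short.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CBSIMPLE.
      * Sketch only: outcomes and the clock are simulated, and
      * the threshold and timeout values are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-OUTCOMES            PIC X(8) VALUE 'SFFFSSSS'.
       01  WS-IDX                 PIC 9(2) VALUE ZERO.
       01  WS-OUTCOME             PIC X(1).
           88  CALL-FAILED        VALUE 'F'.
       01  WS-STATE               PIC X(1) VALUE 'C'.
           88  CB-CLOSED          VALUE 'C'.
           88  CB-OPEN            VALUE 'O'.
       01  WS-FAILURE-COUNT       PIC 9(3) VALUE ZERO.
       01  WS-MAX-FAILURES        PIC 9(3) VALUE 3.
       01  WS-CLOCK               PIC 9(5) VALUE ZERO.
       01  WS-OPENED-AT           PIC 9(5) VALUE ZERO.
       01  WS-RESET-TIMEOUT       PIC 9(5) VALUE 20.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * One request every 10 simulated seconds.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 8
               ADD 10 TO WS-CLOCK
               MOVE WS-OUTCOMES(WS-IDX:1) TO WS-OUTCOME
               PERFORM HANDLE-REQUEST
           END-PERFORM
           STOP RUN.
       HANDLE-REQUEST.
      * Timeout-based reset: close again once the wait is over.
           IF CB-OPEN
               IF WS-CLOCK - WS-OPENED-AT >= WS-RESET-TIMEOUT
                   SET CB-CLOSED TO TRUE
                   MOVE ZERO TO WS-FAILURE-COUNT
               END-IF
           END-IF
           IF CB-OPEN
               DISPLAY 'T=' WS-CLOCK ' REJECTED (CIRCUIT OPEN)'
           ELSE
               PERFORM ATTEMPT-CALL
           END-IF.
       ATTEMPT-CALL.
      * Basic failure counting; a success clears the count.
           IF CALL-FAILED
               ADD 1 TO WS-FAILURE-COUNT
               DISPLAY 'T=' WS-CLOCK ' CALL FAILED'
               IF WS-FAILURE-COUNT >= WS-MAX-FAILURES
                   SET CB-OPEN TO TRUE
                   MOVE WS-CLOCK TO WS-OPENED-AT
               END-IF
           ELSE
               MOVE ZERO TO WS-FAILURE-COUNT
               DISPLAY 'T=' WS-CLOCK ' CALL OK'
           END-IF.
```

In this trace the third consecutive failure opens the circuit, the next request is rejected, and requests flow again once 20 simulated seconds have passed. A half-open state can be added later without changing the counting logic.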

2. Monitor Circuit Performance

Continuously monitor circuit breaker effectiveness:

```cobol
           PERFORM VARYING WS-CHECK-COUNTER FROM 1 BY 1
                   UNTIL WS-CHECK-COUNTER > HEALTH-CHECK-CYCLES

               PERFORM CHECK-CIRCUIT-COUNT

               IF WS-CIRCUIT-COUNT GREATER THAN ZERO
                   PERFORM LOG-CIRCUIT-STATUS
               END-IF

               PERFORM WAIT-BETWEEN-CHECKS
           END-PERFORM.
```

3. Design for Recovery

Plan for graceful recovery from circuit failures:

```cobol
       RECOVERY-PROCEDURE.
           PERFORM ATTEMPT-ROUTE-RECOVERY

           IF WS-RECOVERY-SUCCESS-FLAG = 'Y'
               PERFORM RESET-CIRCUIT-PARAMETERS
               PERFORM DISPLAY-RECOVERY-SUCCESS
           ELSE
               PERFORM TRIGGER-AUTOMATED-RESTART
           END-IF.
```

4. Circuit Breaker Configuration

Implement flexible configuration:

```cobol
           EVALUATE WS-COGNITUS-CB-RESPONSE-STATE
               WHEN 'INITIALIZING'
                   PERFORM INITIALIZE-RETRY-PARAMETERS
               WHEN 'RETRYING'
                   PERFORM INCREMENT-RETRY-COUNTER
               WHEN 'FAILING'
                   PERFORM OPEN-CIRCUIT-BREAKER
               WHEN 'RECOVERING'
                   PERFORM HALF-OPEN-HANDLER
               WHEN 'RECOVERED'
                   PERFORM CLOSE-CIRCUIT-BREAKER
           END-EVALUATE.
```

Troubleshooting

Common Circuit Issues

Circuit Stuck Open

When a circuit remains open indefinitely:

  • Check health check implementation
  • Verify reset timeout settings
  • Review partner system status
  • Consider manual circuit reset

Frequent Circuit Switching

When a circuit opens and closes repeatedly:

  • Increase failure threshold
  • Extend reset timeout
  • Review target system load
  • Check for intermittent network issues

Circuit Never Opens

When a circuit should open but doesn't:

  • Verify failure detection logic
  • Check threshold configuration
  • Review error classification
  • Test manual failure injection
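Manual failure injection can be tried in isolation before it is wired into a real transaction. The sketch below is plain COBOL with no CICS calls: it feeds consecutive failures into the same "greater than threshold" comparison used by the closed circuit handler and reports when the circuit opens. The threshold of 3 and the cap of 10 injected failures are assumptions for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CBINJECT.
      * Sketch only: failures are injected by the loop itself;
      * no real partner system or CICS command is involved.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-THRESHOLD           PIC 9(3) VALUE 3.
       01  WS-CONSEC-FAILURES     PIC 9(3) VALUE ZERO.
       01  WS-INJECTED            PIC 9(3) VALUE ZERO.
       01  WS-STATE               PIC X(1) VALUE 'C'.
           88  CB-CLOSED          VALUE 'C'.
           88  CB-OPEN            VALUE 'O'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Inject failures until the circuit opens or we give up.
           PERFORM UNTIL CB-OPEN OR WS-INJECTED >= 10
               ADD 1 TO WS-INJECTED
               PERFORM COUNT-FAILURE
           END-PERFORM
           IF CB-OPEN
               DISPLAY 'CIRCUIT OPENED AFTER ' WS-INJECTED
                       ' INJECTED FAILURES'
           ELSE
               DISPLAY 'CIRCUIT NEVER OPENED: CHECK THRESHOLD LOGIC'
           END-IF
           STOP RUN.
       COUNT-FAILURE.
      * Strictly greater than: opens on failure threshold + 1.
           ADD 1 TO WS-CONSEC-FAILURES
           IF WS-CONSEC-FAILURES > WS-THRESHOLD
               SET CB-OPEN TO TRUE
           END-IF.
```

With a threshold of 3, the circuit opens on the fourth injected failure because the comparison is strictly greater than. If the intent is to open on the third failure, the comparison or the configured threshold needs adjusting.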