A channel retry loop is what operations sees when DISPLAY CHSTATUS shows RETRY hour after hour, the transmission queue depth climbs, and AMQERR fills with the same channel error every few minutes. IBM MQ is doing what it was configured to do, scheduling another connect after each SHORTTMR or LONGTMR interval, but the underlying fault never cleared. Beginners increase retry counts or restart the queue manager; experienced teams read LASTCHLERR once, fix TLS or CONNAME, and the loop ends on the next successful bind. This tutorial explains how retry loops form, how the short and long retry phases interact, why XMITQ backlog is the business impact, how to distinguish a flapping network from a wrong certificate, when RESET CHANNEL helps, and how to prevent loops through monitoring and change control.
A sender channel (CHLTYPE SDR) reads messages from its XMITQ and opens a session to the partner receiver. If TCP fails, TLS handshake fails, or MQ channel negotiation fails, the instance moves to RETRY. The queue manager increments retry counters and waits. When the timer expires, MQ tries again. If nothing changed on the network or configuration, the same failure occurs—another RETRY. This is a loop: same channel name, same error family, predictable interval. Loops differ from a single retry after a brief blip; loops last beyond your incident threshold and correlate with monotonic XMITQ depth increase.
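The distinction between a brief blip and a loop can be sketched as a check over monitoring samples. This is a minimal illustration, not an IBM MQ API: the `Sample` record, the `is_retry_loop` function, and the 30-minute default threshold are all assumptions chosen for the example, and the samples would in practice come from whatever tool polls channel status and XMITQ depth.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring observation; field names are illustrative."""
    minute: int        # minutes since monitoring began
    status: str        # channel STATUS as reported, e.g. "RETRY"
    xmitq_depth: int   # current depth of the transmission queue


def is_retry_loop(samples, threshold_minutes=30):
    """Return True if the channel stayed in RETRY for at least
    threshold_minutes while XMITQ depth never dropped and ended higher."""
    if len(samples) < 2:
        return False
    if any(s.status != "RETRY" for s in samples):
        return False  # the channel left RETRY at some point: not a loop
    depths = [s.xmitq_depth for s in samples]
    never_drops = all(a <= b for a, b in zip(depths, depths[1:]))
    grew = depths[-1] > depths[0]
    long_enough = samples[-1].minute - samples[0].minute >= threshold_minutes
    return never_drops and grew and long_enough


# A single retry that recovers, versus 40 minutes of RETRY with rising depth.
blip = [Sample(0, "RETRY", 10), Sample(1, "RUNNING", 0)]
loop = [Sample(m, "RETRY", 100 * m) for m in range(0, 45, 5)]
print(is_retry_loop(blip), is_retry_loop(loop))
```

The threshold stands in for the incident threshold mentioned above; set it to whatever your operations policy defines.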
| Attribute | Meaning | Effect on loop |
|---|---|---|
| SHORTRTY | Short retry count | How many quick attempts before the long phase |
| SHORTTMR | Short retry interval | Seconds between early retries; a fast loop if low |
| LONGRTY | Long retry count | Additional attempts after the short phase is exhausted |
| LONGTMR | Long retry interval | Slower loop; with a large LONGRTY it is effectively endless while the fault remains |
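How the four attributes combine into a timeline can be sketched with a small generator. This is a simplified model, not MQ's internal scheduler: it assumes each failed attempt takes no time and that the channel waits one interval before every retry. The function name `retry_schedule` is hypothetical.

```python
from itertools import islice


def retry_schedule(shortrty, shorttmr, longrty, longtmr):
    """Yield (attempt, seconds_since_first_failure, phase) for each retry.

    Simplified: short-phase retries are spaced shorttmr seconds apart,
    then long-phase retries are spaced longtmr seconds apart.
    """
    elapsed = 0
    for attempt in range(1, shortrty + longrty + 1):
        phase = "short" if attempt <= shortrty else "long"
        elapsed += shorttmr if phase == "short" else longtmr
        yield attempt, elapsed, phase


# Small illustrative values so the whole schedule fits on screen.
for row in retry_schedule(shortrty=3, shorttmr=60, longrty=2, longtmr=1200):
    print(row)

# With a very large LONGRTY the generator is effectively endless,
# so take only the first few entries.
first_eleven = list(islice(retry_schedule(10, 60, 999_999_999, 1200), 11))
print(first_eleven[-1])
```

With the second set of values, ten short retries take ten minutes, after which the channel retries only every twenty minutes. Check the values actually defined on your channel rather than relying on the numbers used here.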
Network causes include a removed firewall rule, a wrong port in CONNAME, a listener that is STOPPED, or DNS pointing to a decommissioned host. A TCP timeout produces retry loops that look like network outages. During the incident, verify connectivity to the partner port with telnet or nc from the sending host, not from your laptop, unless the laptop follows the same network path as the channel.
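Where telnet or nc is not installed on the sending host, the same TCP reachability check can be sketched in Python. The helper name `can_connect` is hypothetical, and the check only proves that a TCP connection opens; it says nothing about TLS or MQ channel negotiation.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return (True, "") if a TCP connection opens, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:  # covers DNS failure, refusal, and timeout
        return False, f"{type(exc).__name__}: {exc}"


# Self-check against a throwaway local listener, so the demo needs no network.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
print(can_connect("127.0.0.1", port, timeout=2.0))

listener.close()                     # simulate a stopped listener
print(can_connect("127.0.0.1", port, timeout=2.0)[0])
```

In an incident, call `can_connect` with the host and port taken from the channel's CONNAME, and run it on the sending host so the test follows the channel's network path.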
TLS causes include an expired personal certificate, a missing intermediate CA, a cipher mismatch on SSLCIPH, or SSLCAUTH(REQUIRED) when the partner presents no certificate. LASTCHLERR and AMQ9638-class messages point here. Fixing retry timers does not renew a certificate.
Authorization causes are CHLAUTH rules that block the partner IP address, queue manager name (QMNAME), or SSLPEER DN. AMQERR often names the rule. The loop continues until the rule is corrected or the partner presents the expected identity.
Sequence number causes appear after a restore or DR failover, when a sequence number mismatch prevents the channel from reaching RUNNING. RESET CHANNEL on both sides may be required per the runbook once the backup state is confirmed consistent. It is not a first action.
```
DISPLAY CHSTATUS('PARIS.TO.LONDON') ALL
DISPLAY QLOCAL('QM_LONDON.XMIT') CURDEPTH MAXDEPTH
* Run from the operating system shell, not inside runmqsc:
tail -30 /var/mqm/qmgrs/QM_PARIS/errors/AMQERR01.log
* After fix (RESET only when sequence numbers are out of step):
RESET CHANNEL('PARIS.TO.LONDON')
START CHANNEL('PARIS.TO.LONDON')
```
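When the error log is long, counting message identifiers shows quickly whether one error dominates, which is the signature of a loop. The sketch below is an assumption-laden illustration: the `dominant_errors` helper is hypothetical, the regular expression assumes identifiers of the form `AMQ` plus four digits with an optional trailing severity letter, and the sample text is synthetic placeholder input rather than real AMQERR output.

```python
import re
from collections import Counter

# Assumed identifier shape: "AMQ" + 4 digits, optionally one severity letter.
# The letter is dropped so AMQ9638 and AMQ9638E count as the same message.
AMQ_ID = re.compile(r"\b(AMQ\d{4})[A-Z]?\b")


def dominant_errors(log_text, top=3):
    """Return the most frequent AMQ message identifiers in log_text."""
    return Counter(AMQ_ID.findall(log_text)).most_common(top)


# Synthetic input: only the identifiers matter, the rest is placeholder text.
sample = "\n".join([
    "AMQ9638E: <placeholder>",
    "AMQ9638E: <placeholder>",
    "AMQ9208E: <placeholder>",
    "AMQ9638E: <placeholder>",
])
print(dominant_errors(sample))
```

To use it on a real system, read the file named in the commands above with `open(path, errors="replace").read()` and pass the text to `dominant_errors`. One identifier repeating at the retry interval suggests a single root cause rather than several independent faults.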
Raising SHORTRTY or LONGRTY tolerates longer partner outages during planned maintenance; it does not fix a wrong configuration. Use higher retry counts when a business-approved outage window exceeds the total wait currently configured, which is roughly SHORTRTY times SHORTTMR plus LONGRTY times LONGTMR. Document the maximum acceptable XMITQ depth for that wait. If loops run indefinitely in production without a planned outage, treat them as misconfiguration, not capacity tuning.
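The sizing decision described here is simple arithmetic, sketched below. All function names and the example figures (a 6-hour window and 50 msg/sec) are illustrative assumptions, and the model ignores the time each failed attempt takes.

```python
def total_retry_window_seconds(shortrty, shorttmr, longrty, longtmr):
    """Approximate time the channel keeps retrying before it gives up."""
    return shortrty * shorttmr + longrty * longtmr


def min_longrty(outage_seconds, shortrty, shorttmr, longtmr):
    """Smallest LONGRTY whose retry window covers the planned outage."""
    remaining = max(0, outage_seconds - shortrty * shorttmr)
    return -(-remaining // longtmr)  # integer ceiling division


def backlog_messages(put_rate_per_sec, outage_seconds):
    """Messages that accumulate on the XMITQ while nothing is delivered."""
    return put_rate_per_sec * outage_seconds


# Illustrative channel settings and a 6-hour approved maintenance window.
outage = 6 * 3600
window = total_retry_window_seconds(shortrty=10, shorttmr=60,
                                    longrty=12, longtmr=1200)
print("retry window (s):", window)
print("covers outage:", window >= outage)
print("LONGRTY needed:", min_longrty(outage, 10, 60, 1200))
print("backlog at 50 msg/sec:", backlog_messages(50, outage))
```

Compare the backlog figure with the MAXDEPTH of the transmission queue: if the queue fills before the window ends, raising retry counts alone does not make the outage safe.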
A retry loop is a hamster wheel: the channel keeps running in place—RETRY—without delivering mail. Mail piles up on the cart (XMITQ) beside the wheel. Stop fixing the wheel speed; open the door (fix TLS, network, or auth) so the hamster can exit to RUNNING.
Your toy car tries to drive to a friend's house but the bridge is out. Every few minutes it tries again and cannot cross. Toys pile up in the car trunk because they cannot be delivered. Fix the bridge—not the timer on how often the car tries.
Lab: break CONNAME, observe RETRY and XMITQ depth over three SHORTTMR cycles. Record LASTCHLERR.
Write a runbook entry that gives different actions for a retry loop reporting AMQ9638 and one reporting AMQ9208.
Calculate how many messages accumulate on the XMITQ during 4 hours of retry at a put rate of 200 msg/sec.
1. Retry loop means channel keeps:
2. Fix root cause before:
3. XMITQ grows during retry loop because:
4. LASTCHLERR helps identify: