What causes sequence number errors in IBM MQ?

The sender and receiver channel partners disagree on the logical batch sequence state—often after restoring one queue manager from backup, replacing a queue manager, prolonged split-brain, or RESET CHANNEL on only one side. The message channel protocol refuses to proceed until state is aligned or administratively reset.

What is the symptom of a sequence number error?

Channel fails to enter RUNNING or cycles through RETRY with errors referencing sequence, batch, or protocol in LASTCHLERR and the error log. Transmission queue depth may remain high while the partner shows a corresponding error.

Is RESET CHANNEL safe?

RESET CHANNEL resets sequence state on the queue manager where you run it. If the partner still holds different state, you risk duplicate delivery or gaps. Always coordinate with the remote administrator, quiesce the channel pair, and follow your runbook—often RESET on both sides in the same change window after backing up or draining XMITQ per policy.

Can I avoid sequence errors during DR?

Practice DR with paired recovery: restore both QMs to consistent points, or stop channels before restore, or plan coordinated RESET after failover tests. Document which channels are critical and require message drain before reset.

Are sequence errors related to message IDs?

No. Message IDs in the MQMD identify individual messages for applications. Channel sequence numbers track batches between queue managers on a channel pair—a different layer of the stack.

MainframeMaster

Sequence Number Errors

Sequence number errors are among the most stressful IBM MQ incidents because the queue manager is protecting message integrity on purpose. The channel protocol will not enter RUNNING when the sender and receiver disagree about which batch numbers were committed—a situation that often follows disaster recovery, a restored backup on one side only, or an operator running RESET CHANNEL on one queue manager without telling the partner. Beginners see RETRY loops and rising XMITQ depth and may raise retry timers, which never fixes stale sequence state. This tutorial focuses on errors and recovery: recognizing log messages, coordinated RESET CHANNEL procedures, draining versus risking duplication, differences from ordinary network retry, and post-incident validation so payroll and audit traffic resume safely.

Why the Protocol Refuses to Continue

Message channels transfer messages in batches for efficiency. Each batch has a logical sequence position on the pair. When TCP drops mid-batch, the partners reconcile on reconnect: what was acknowledged, and what must be resent from XMITQ. That only works if both sides hold compatible counters. If QM_A is restored from Friday's backup while QM_B ran through Saturday, the counters diverge, and the next bind presents numbers QM_B does not expect. IBM MQ stops rather than silently duplicating or dropping financial payloads. Treat the error as data safety, not as a bug to bypass.
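
The divergence described above can be illustrated with a small Python toy model. This is a sketch under simplifying assumptions, not the real channel protocol: each side is reduced to a single "last committed batch" counter, and the names `ChannelEnd`, `send_batch`, and `can_bind` are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class ChannelEnd:
    """One side's view of the last committed batch on a channel pair (toy model)."""
    name: str
    last_committed: int = 0


def send_batch(sender: ChannelEnd, receiver: ChannelEnd) -> None:
    """Transfer one batch: both sides advance together once it is committed."""
    sender.last_committed += 1
    receiver.last_committed += 1


def can_bind(sender: ChannelEnd, receiver: ChannelEnd) -> bool:
    """The pair may resume only if both sides agree on the committed position."""
    return sender.last_committed == receiver.last_committed


qm_a = ChannelEnd("QM_A")
qm_b = ChannelEnd("QM_B")

for _ in range(40):           # traffic up to Friday's backup
    send_batch(qm_a, qm_b)
friday_backup = qm_a.last_committed

for _ in range(10):           # traffic that continued through Saturday
    send_batch(qm_a, qm_b)

qm_a.last_committed = friday_backup    # restore QM_A only
print(can_bind(qm_a, qm_b))            # False

# Retrying changes neither counter, so every retry gets the same answer.
retry_results = [can_bind(qm_a, qm_b) for _ in range(5)]

# A coordinated reset puts both sides back on the same position.
qm_a.last_committed = qm_b.last_committed = 0
print(can_bind(qm_a, qm_b))            # True
```

The retry loop in the middle is the point of the model: no amount of reconnecting changes stored state, which is why longer or more frequent retries do not clear a sequence error.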

Situations that commonly cause sequence errors
Scenario	Risk	First action
RESET on one QM only	Duplicates or gaps	Stop channel on both sides; align plan
Single-sided backup restore	State skew	Compare restore dates; coordinate
QM replaced under a new name	New instance starts with zero state	New channel pair or RESET on both sides
Long outage on both sides	In-flight ambiguity	Review XMITQ and logs

Symptoms in CHSTATUS and Logs

DISPLAY CHSTATUS shows RETRYING or INACTIVE with LASTCHLERR referencing sequence or channel protocol terms, depending on release wording. Search both queue managers' error logs around the same second, because the sender and receiver each log their own view. Note whether the channel ever reached RUNNING after the last change window or failed immediately on BINDING. Check CURDEPTH on the transmission queue and the age of the oldest message: stale messages may need business approval before destructive recovery. If multiple channels between the same pair fail together, suspect a common restore event rather than individual typos.
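
Comparing both sides is easier when the display output is reduced to a dictionary per queue manager. The Python sketch below assumes the output has been saved as text and relies only on the general `NAME(VALUE)` shape MQSC uses when displaying attributes. The sample text is hand-written for illustration, not captured from a real system, and the helper names and the queue manager labels `PARIS` and `LONDON` are hypothetical.

```python
import re

# Matches NAME(VALUE) pairs, the general shape MQSC uses when displaying attributes.
ATTR = re.compile(r"([A-Z]+)\(([^)]*)\)")


def parse_attrs(text: str) -> dict:
    """Collect NAME(VALUE) pairs from display output into a dictionary."""
    return {name: value.strip() for name, value in ATTR.findall(text)}


def summarize(qmgr: str, text: str) -> str:
    """One-line summary of a channel's state as seen from one queue manager."""
    attrs = parse_attrs(text)
    status = attrs.get("STATUS", "UNKNOWN")
    flag = "" if status == "RUNNING" else "  <-- investigate"
    return f"{qmgr}: {attrs.get('CHANNEL', '?')} is {status}{flag}"


# Hand-written sample text, not captured from a real system.
sender_view = """
   CHANNEL(PARIS.TO.LONDON)                CHLTYPE(SDR)
   STATUS(RETRYING)                        XMITQ(LONDON)
"""
receiver_view = """
   CHANNEL(PARIS.TO.LONDON)                CHLTYPE(RCVR)
   STATUS(BINDING)
"""

print(summarize("PARIS", sender_view))
print(summarize("LONDON", receiver_view))
```

A summary like this does not replace reading the full error text; it only makes it quicker to see that both partners are unhealthy at the same time.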

Coordinated Recovery Runbook

* Agree maintenance window with partner ops
DISPLAY CHSTATUS('PARIS.TO.LONDON')
STOP CHANNEL('PARIS.TO.LONDON')
* Partner stops matching channel on their QM
RESET CHANNEL('PARIS.TO.LONDON')
* Partner runs RESET on same channel name
START CHANNEL('PARIS.TO.LONDON')
DISPLAY CHSTATUS('PARIS.TO.LONDON')

STOP quiesces the channel so that no new batches start. RESET CHANNEL resets the sequence state held on the local queue manager; confirm its exact behavior for your release in the IBM MQ documentation. Starting before the partner's RESET completes can reproduce the error immediately. Some runbooks drain XMITQ to a holding queue before RESET when duplication risk is unacceptable; others accept at-least-once delivery with idempotent consumers. Legal and audit requirements choose the path, not the MQ administrator alone.
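
The ordering rules in this runbook can be checked mechanically before a change window. The following Python sketch is one possible way to do that, assuming a plan is written as a list of `(side, action)` steps. The function `check_plan` and the side labels `local` and `partner` are invented for the example; it validates a written plan only and does not talk to a queue manager.

```python
SIDES = {"local", "partner"}


def check_plan(steps):
    """Return a list of ordering problems found in a RESET plan.

    Rules taken from the runbook: both sides stop before any RESET,
    and both sides RESET before any START.
    """
    stopped, reset, problems = set(), set(), []
    for side, action in steps:
        if action == "STOP":
            stopped.add(side)
        elif action == "RESET":
            if stopped != SIDES:
                problems.append(f"{side} RESET before both sides stopped")
            reset.add(side)
        elif action == "START":
            if reset != SIDES:
                problems.append(f"{side} START before both sides reset")
    return problems


coordinated = [
    ("local", "STOP"), ("partner", "STOP"),
    ("local", "RESET"), ("partner", "RESET"),
    ("local", "START"),
]
one_sided = [("local", "STOP"), ("local", "RESET"), ("local", "START")]

print(check_plan(coordinated))   # []
print(check_plan(one_sided))
```

The one-sided plan is reported twice: once for resetting while the partner is still active, and once for starting before the partner has reset.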

RESET CHANNEL Risk: Duplication Versus Loss

If the sender believes batch 100 was not received but the receiver actually committed it, blind RESET and resend may duplicate. If the receiver never got batch 100 but counters jump forward after RESET, loss is possible when messages were non-persistent or already removed from XMITQ by administrative action. Persistent messages on XMITQ generally remain until successfully transferred—RESET does not delete them automatically—but protocol state after RESET may allow retransmission. Application idempotency keys and duplicate detection tables are enterprise defenses when MQ recovery cannot guarantee exactly-once across a reset boundary.
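
As an illustration of the application-side defense, here is a minimal Python sketch of an idempotent consumer. It assumes every message carries a stable business key; the class name, the keys, and the payloads are made up for the example, and a production version would keep the seen-keys table in durable storage rather than in memory.

```python
class IdempotentConsumer:
    """Applies each business key at most once, however often it is delivered."""

    def __init__(self) -> None:
        self.seen = set()      # stand-in for a duplicate detection table
        self.applied = []      # payloads that actually took effect

    def handle(self, key: str, payload: str) -> bool:
        """Apply the payload unless this key was already processed."""
        if key in self.seen:
            return False       # duplicate delivery: ignore it
        self.seen.add(key)
        self.applied.append(payload)
        return True


consumer = IdempotentConsumer()
batch_100 = [("PAY-0001", "pay Alice"), ("PAY-0002", "pay Bob")]

# First delivery: the receiver committed the batch.
first = [consumer.handle(key, payload) for key, payload in batch_100]

# After a reset the sender retransmits the batch it believes was never received.
second = [consumer.handle(key, payload) for key, payload in batch_100]

print(first)              # [True, True]
print(second)             # [False, False]
print(consumer.applied)   # ['pay Alice', 'pay Bob']
```

This covers the duplication side only. A message that was lost never reaches the consumer, so gaps still have to be found by reconciliation against the source system.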

DR and Backup Scenarios

Active/passive failover with shared disk may preserve sequence state if channels and logs fail over together—still test annually. Active/active with independent disks is vulnerable: never start channels between QMs restored from different timestamps without a written procedure. Multi-instance queue managers reduce listener outages but do not remove sequence discipline after split-brain. Document for each channel whether messages in flight during failover are reconciled by MQ automatically or require business replay from source systems.
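
The rule about mismatched restore timestamps can be turned into a pre-start check. The Python sketch below assumes you keep a record of each queue manager's restore point (`None` meaning it was never restored) and a list of channels with their two endpoints. All names and dates are hypothetical, and matching restore points are treated as a necessary condition only, not as proof that channel state is consistent.

```python
from datetime import datetime


def channels_to_hold(restore_points, channels):
    """List channels whose two queue managers have different restore points."""
    held = []
    for channel, qm_a, qm_b in channels:
        if restore_points.get(qm_a) != restore_points.get(qm_b):
            held.append(channel)
    return held


restore_points = {
    "QM_A": datetime(2024, 3, 1, 22, 0),   # restored from backup
    "QM_B": None,                          # kept running, never restored
    "QM_C": datetime(2024, 3, 1, 22, 0),   # restored to the same point as QM_A
}
channels = [
    ("A.TO.B", "QM_A", "QM_B"),
    ("A.TO.C", "QM_A", "QM_C"),
]

print(channels_to_hold(restore_points, channels))   # ['A.TO.B']
```

Channels on the hold list should stay stopped until the written procedure for that pair has been followed.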

Distinguishing Sequence Errors From Other Failures

Connection refused—usually CONNAME or listener; no sequence yet.
TLS errors—handshake fails before sequence negotiation.
CHLAUTH block—security policy; fix rules not RESET.
Sequence error—often after DR or RESET; partners were recently RUNNING.

Misdiagnosis wastes hours: a team renewing certificates while sequence state is wrong will not restore RUNNING. Read the full LASTCHLERR text, not only the status color on the console.
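
The triage order above can be sketched as a small Python helper for a first-pass sort of incidents. The keyword matching is an assumption made for illustration: real error wording varies by release, the sample strings are invented, and the function is no substitute for reading the full error text.

```python
def classify(error_text: str) -> str:
    """Map error text to a likely failure category using simple keywords."""
    text = error_text.lower()
    if "sequence" in text:
        return "sequence state: plan a coordinated RESET with the partner"
    if "chlauth" in text:
        return "security policy: fix CHLAUTH rules, do not RESET"
    if "tls" in text or "ssl" in text or "certificate" in text:
        return "TLS: handshake fails before sequence negotiation"
    if "refused" in text:
        return "connectivity: check CONNAME and the listener"
    return "unclassified: read the full error text and both error logs"


# Invented sample strings, not real IBM MQ messages.
samples = [
    "connection refused by remote host",
    "certificate expired during handshake",
    "blocked by CHLAUTH rule",
    "message sequence number mismatch",
]
for sample in samples:
    print(f"{sample!r} -> {classify(sample)}")
```

Sequence is tested first on purpose: a log excerpt may mention several terms, and a sequence mismatch is the one case where fixing certificates or listeners will not bring the channel back.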

Prevention for New Environments

Pair channel names and document both queue managers in the same CMDB entry.
Include sequence recovery in DR playbooks with named approvers.
Test RESET in a lab with test messages, and compare the number of puts at the sender with the number of messages received at the consumer.
Monitor XMITQ age and channel NOT RUNNING alerts.
Version-control MQSC for channel definitions separately from QM data backups.
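
For the lab test in the list above, counting is easier when every test message carries a unique identifier. The Python sketch below shows one way to reconcile what was sent against what the consumer received. The identifier format and the `reconcile` helper are assumptions made for the example.

```python
from collections import Counter


def reconcile(sent_ids, received_ids):
    """Compare test messages sent with those received across a RESET."""
    received = Counter(received_ids)
    sent = set(sent_ids)
    return {
        "missing": sorted(sent - set(received)),
        "duplicated": sorted(i for i, n in received.items() if n > 1),
        "unexpected": sorted(set(received) - sent),
    }


sent = [f"T{n:03d}" for n in range(1, 11)]    # T001 .. T010
received = sent[:4] + sent[3:9]               # T004 arrives twice, T010 never arrives

print(reconcile(sent, received))
# {'missing': ['T010'], 'duplicated': ['T004'], 'unexpected': []}
```

An empty result in all three lists is the outcome to aim for before signing off the procedure; anything else shows which kind of risk the tested RESET sequence actually carries.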

Explainer: Page Numbers in a Shared Notebook

Two offices share a notebook counting which package batches they exchanged. If one office rips out pages and restarts at page 1 while the other still expects page 50, they stop shipping until both agree to start a fresh chapter together—that is coordinated RESET.

Explain Like I'm Five: Sequence Number Errors

You and your friend were counting puzzle pieces you sent each other. One of you forgot the count and started at one again while the other still remembers the old number—so you stop until both agree to start counting the same way again.

Practice Exercises

Exercise 1

Write a two-sided maintenance procedure for RESET CHANNEL including stop order and validation puts.

Exercise 2

List three DR scenarios and whether you would drain XMITQ first.

Exercise 3

Given LASTCHLERR mentioning sequence, explain why increasing SHORTRTY will not help.

Frequently Asked Questions

Test Your Knowledge

1. Sequence errors mean:

Sender and receiver batch state mismatch
Invalid COBOL
LDAP down
JCL class wrong

2. RESET CHANNEL should be:

Coordinated with partner
Secret on one side only
Run hourly
On DLQ

3. Common after DR:

Sequence mismatch
Higher HBINT only
New topic
Namelist change

4. XMITQ depth high with sequence error suggests:

Messages waiting for fixed channel
All delivered
No messages
Client only issue
