IBM MQ clusters depend on a shared catalog: which queue managers belong, which cluster queue names exist on which members, and which cluster channels connect them. That catalog lives primarily on full repository queue managers and is copied to partial repositories and local caches. A repository inconsistency is any durable disagreement between members about those facts. For example, QM_PARIS still believes ORDERS.IN exists only on QM_LONDON while QM_LONDON and the full repository show instances on both sites, or a cluster sender channel appears on one member but not another after a rushed weekend change.
Symptoms reach applications as cluster puts that fail with an unknown object name, messages arriving on an unexpected queue manager, auto-defined channels that never materialize, or workload algorithms that see only one instance when two are defined.
This troubleshooting tutorial explains how inconsistencies form, how to compare repository views with DISPLAY commands, when REFRESH CLUSTER helps and when it masks a deeper fault, how split-brain scenarios between two full repositories arise, which repair playbooks operations teams can follow, and which prevention habits keep beginners from treating the cluster repository as magic that self-heals without monitoring.
When you DEFINE QLOCAL with CLUSTER(name), the defining queue manager publishes the definition to the full repository members. Repository manager processes on full and partial hosts exchange updates over cluster channels: CLUSRCVR channels, which are always defined manually, and CLUSSDR channels, many of which are auto-defined from the remote CLUSRCVR definition. Partial members retain a cache subset sufficient for routing decisions; they are not authoritative. Two full repositories should hold matching authoritative data; IBM recommends two on independent infrastructure. If one full repository is down, the survivor continues publishing, but prolonged single-repository operation increases risk during the next partition or failed sync event.
| Pattern | Symptom | First check |
|---|---|---|
| Missing CLUSQMGR on partial member | Remote puts fail or route wrong | DISPLAY CLUSQMGR on partial vs full repo |
| Queue on one member only in repo | Workload sees single instance | DISPLAY QCLUSTER(name) on all members |
| Stale channel after DELETE on one node | BINDING to dead CONNAME | DISPLAY CLUSQMGR(*) CHANNEL CONNAME on both full repos |
| Split full repositories | Different object lists per full repo | Compare both REPOS hosts immediately |
| Member left cluster but cache remains | Ghost routes to old QM | CLUSNL and REPOSNL membership |
```
* On putting queue manager:
DISPLAY QMGR REPOS REPOSNL
DISPLAY CLUSQMGR(*) CLUSTER('SALES')
DISPLAY QCLUSTER('ORDERS.IN') CLUSTER('SALES') CLUSQMGR
DISPLAY CLUSQMGR(*) CLUSTER('SALES') CHANNEL CONNAME STATUS
* On each full repository host - compare output:
DISPLAY CLUSQMGR('QM_PARIS') CLUSTER('SALES') ALL
DISPLAY QCLUSTER('ORDERS.IN') CLUSTER('SALES') ALL
* Channel path to repository:
DISPLAY CHSTATUS('QM_LON.QM_REPO') WHERE(CHLTYPE EQ CLUSSDR)
```
Capture output from at least one full repository and from the symptomatic partial member in the same minute during an incident. Diff the CLUSQMGR list first, because missing members explain many routing failures. Then diff the DISPLAY QCLUSTER records for the affected queue name: attribute differences such as CLWLPRTY matter for workload, but missing rows mean catalog drift. Differences in the CHANNEL and CONNAME values reported by DISPLAY CLUSQMGR show channel auto-definition failures or manual deletes that did not propagate.
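The first diff described above, the CLUSQMGR list, can be sketched in a few lines of Python. This is a minimal sketch, assuming the runmqsc output from each host was saved as text; the function names, the sample captures, and the channel names in them are illustrative and not taken from any IBM tool.

```python
import re

# Queue manager names may contain letters, digits, '.', '/', '_' and '%'.
# Matching only those characters skips the echoed command text,
# e.g. CLUSQMGR(*) or CLUSQMGR('QM_PARIS').
_MEMBER = re.compile(r"\bCLUSQMGR\(([A-Za-z0-9._/%]+)\)")


def cluster_members(display_output: str) -> set[str]:
    """Return the queue manager names found in captured DISPLAY CLUSQMGR output."""
    return set(_MEMBER.findall(display_output))


def diff_members(full_repo_output: str, partial_output: str) -> dict[str, list[str]]:
    """Compare the member list on a full repository with a partial member's view."""
    full = cluster_members(full_repo_output)
    partial = cluster_members(partial_output)
    return {
        "missing_on_partial": sorted(full - partial),
        "only_on_partial": sorted(partial - full),
    }


# Illustrative captures (hypothetical); real output carries more attributes.
FULL_REPO_CAPTURE = """
     1 : DISPLAY CLUSQMGR(*) CLUSTER('SALES')
   CLUSQMGR(QM_LONDON)   CHANNEL(SALES.QM_LONDON)   CLUSTER(SALES)
   CLUSQMGR(QM_PARIS)    CHANNEL(SALES.QM_PARIS)    CLUSTER(SALES)
   CLUSQMGR(QM_REPO)     CHANNEL(SALES.QM_REPO)     CLUSTER(SALES)
"""
PARTIAL_CAPTURE = """
     1 : DISPLAY CLUSQMGR(*) CLUSTER('SALES')
   CLUSQMGR(QM_LONDON)   CHANNEL(SALES.QM_LONDON)   CLUSTER(SALES)
   CLUSQMGR(QM_REPO)     CHANNEL(SALES.QM_REPO)     CLUSTER(SALES)
"""

print(diff_members(FULL_REPO_CAPTURE, PARTIAL_CAPTURE))
```

A non-empty `missing_on_partial` list points at members the partial repository has never learned about or has lost, while `only_on_partial` suggests cached entries for members that have since left the cluster.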
REFRESH CLUSTER on a queue manager discards the cluster information held locally and rebuilds it from the full repositories, republishing the queue manager's own cluster definitions along the way. The REPOS option sets the scope: REPOS(NO) keeps what the queue manager knows about the full repositories themselves, while REPOS(YES) discards that as well and cannot be issued on a queue manager that is itself a full repository. REFRESH SECURITY is a separate command with a different scope, so read the command help before typing in production. Use refresh after correcting CLUSTER attributes, re-joining a member, or restoring a full repository from backup, not as the first action when you have not compared catalogs. Never assume z/OS syntax matches distributed without checking. Schedule refresh during low traffic when possible because routing may fluctuate briefly while caches update.
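Operations scripts can make the refresh scope an explicit decision rather than something typed from memory. The helper below is a hypothetical sketch: it only builds the MQSC text, and it assumes the documented REFRESH CLUSTER behaviour that REPOS(YES) is not accepted on a full repository queue manager. Confirm that rule against the command reference for your release.

```python
def refresh_cluster_command(cluster: str, *, is_full_repository: bool,
                            refresh_repos: bool = False) -> str:
    """Build a REFRESH CLUSTER command string for review before it is run.

    Assumption: REPOS(YES) is rejected on a full repository queue manager,
    so the helper refuses to build that combination.
    """
    if refresh_repos and is_full_repository:
        raise ValueError(
            "REPOS(YES) is not valid on a full repository; "
            "demote the queue manager first or use REPOS(NO)"
        )
    option = "YES" if refresh_repos else "NO"
    return f"REFRESH CLUSTER('{cluster}') REPOS({option})"


# Partial member of SALES, default scope.
print(refresh_cluster_command("SALES", is_full_repository=False))
```

Keeping the command as a string that a second person reviews fits the advice above: compare catalogs first, then refresh deliberately.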
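```
* Example - confirm the REPOS option for your release and platform:
REFRESH CLUSTER('SALES') REPOS(NO)
* After planned rejoin of partial member (channel names are placeholders):
ALTER CHANNEL('SALES.QM_PARIS') CHLTYPE(CLUSRCVR) CLUSTER('SALES')
ALTER CHANNEL('SALES.QM_REPO') CHLTYPE(CLUSSDR) CLUSTER('SALES')
* If needed per site standards:
START CHANNEL('SALES.QM_REPO')
```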
If DISPLAY output on REPOS host A and REPOS host B shows different CLUSQMGR counts or conflicting attributes for the same object, treat the situation as severity 1. Stop uncontrolled DEFINE and ALTER on cluster objects until a designated lead picks the authoritative source, which is often the survivor that stayed connected to the majority of application members. Document every object difference. IBM has advanced recovery guidance for repository queue contents on SYSTEM.CLUSTER.REPOSITORY.QUEUE; involve experienced administrators before any manual message manipulation on repository queues. Prevention beats cure: monitor both full repositories and alert on channel loss between them.
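Documenting every object difference is easier when the two captures are compared attribute by attribute. The sketch below assumes DISPLAY CLUSQMGR output for a single cluster was saved from each full repository; the function names and sample records are hypothetical, and the compared attributes (CHANNEL, CONNAME, QMTYPE) are just one reasonable default.

```python
import re

# ATTR(value) tokens; the inner alternative allows one nested pair of
# parentheses so CONNAME(host(1414)) is read as a single value.
_TOKEN = re.compile(r"\b([A-Z]+)\(((?:[^()]|\([^()]*\))*)\)")


def parse_clusqmgr(display_output: str) -> dict[str, dict[str, str]]:
    """Group attributes by CLUSQMGR name. Assumes a capture for one cluster."""
    records: dict[str, dict[str, str]] = {}
    current = None
    for attr, value in _TOKEN.findall(display_output):
        value = value.strip()
        if attr == "CLUSQMGR":
            # Skip the echoed command, e.g. CLUSQMGR(*) or CLUSQMGR('QM_PARIS').
            if "*" in value or "'" in value:
                current = None
            else:
                current = records.setdefault(value, {})
        elif current is not None:
            current[attr] = value
    return records


def repo_differences(repo_a_output, repo_b_output,
                     attrs=("CHANNEL", "CONNAME", "QMTYPE")):
    """List (name, attribute, value on A, value on B) for every disagreement."""
    a, b = parse_clusqmgr(repo_a_output), parse_clusqmgr(repo_b_output)
    report = []
    for name in sorted(a.keys() | b.keys()):
        if name not in a or name not in b:
            report.append((name, "presence",
                           "present" if name in a else "missing",
                           "present" if name in b else "missing"))
            continue
        for attr in attrs:
            if a[name].get(attr) != b[name].get(attr):
                report.append((name, attr, a[name].get(attr), b[name].get(attr)))
    return report


# Illustrative captures (hypothetical host and channel names).
REPO_A = """
   CLUSQMGR(QM_LONDON)   CHANNEL(SALES.QM_LONDON)
   CLUSTER(SALES)        CONNAME(lon.example.com(1414))
   QMTYPE(NORMAL)
   CLUSQMGR(QM_PARIS)    CHANNEL(SALES.QM_PARIS)
   CLUSTER(SALES)        CONNAME(par.example.com(1414))
   QMTYPE(NORMAL)
"""
REPO_B = """
   CLUSQMGR(QM_LONDON)   CHANNEL(SALES.QM_LONDON)
   CLUSTER(SALES)        CONNAME(lon-old.example.com(1414))
   QMTYPE(NORMAL)
"""

for row in repo_differences(REPO_A, REPO_B):
    print(row)
```

The resulting rows can be pasted into the incident record as the list of differences the designated lead has to resolve.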
The cluster repository is a shared address book every office copies. Inconsistency means London's book lists a warehouse Paris tore down last week—mail still goes to the wrong dock until someone reprints every copy the same way.
Everyone in class is supposed to have the same list of who is in the club, but one kid has an old list with someone who already moved away—so invitations go to the wrong house until the teacher prints new lists for everyone.
Write a five-command DISPLAY script to compare two full repositories for cluster SALES.
List three actions that cause inconsistency and three that only fix symptoms.
ORDERS.IN is defined on QM2 in MQSC, but DISPLAY QCLUSTER on QM1 shows only one instance. What do you check next?
1. Repository inconsistency means:
2. Full repository role is set with:
3. REFRESH CLUSTER should be used:
4. Compare catalogs with: