Backup Queue Managers

A backup queue manager is the messaging hub you stand up—or promote—when the primary hub cannot serve the business. Unlike the standby instance in active/passive high availability, which shares or replicates the same logical service within one resilience story, a backup queue manager often lives in another building, region, or cloud account. It may have been idle for months. Its object definitions, channels, and logs must match what applications expect, or recovery becomes a multi-day reconfiguration project. Beginners confuse backup queue managers with secondary instances in multi-instance or RDQM; those are HA partners in the same design. Backup queue managers are the DR layer when the entire primary design is gone or untrusted. This page explains warm and cold models, naming, object synchronization, activation steps, channel repointing, z/OS considerations, and how backup queue managers connect to log shipping and cross-region recovery.

Warm, Cold, and Hot Standby

Backup queue manager readiness levels
Level | Typical state | RTO tendency | Cost
Hot | Receiving replication or shared role in HA/DR product | Minutes | Highest
Warm | QM defined, objects loaded, stopped or listening only | Tens of minutes to hours | Medium
Cold | Media restore required before strmqm | Hours to days | Lower

Hot backup blurs into continuous replication: RDQM stretch clusters or vendor replication with automatic role change. Warm backup is the classic DR queue manager: MQ is installed, the DEFINE QLOCAL and DEFINE CHANNEL scripts have been applied from Git, and the TLS keystores are copied, but dspmq reports the queue manager as ended until a disaster is declared. Cold backup might be VM images or tape backups of the queue manager data paths plus documented MQSC, which is acceptable only when the business accepts a long RTO. Choose the level per tier from your DR plan; do not run hot backup for every test queue manager in the estate.

Naming and Client Impact

Two dominant patterns exist. Same name, different host: queue manager QM1 runs in East; after DR, QM1 runs in West with DNS or load balancer pointing clients to new listeners. Applications reconnect to the same QM name—ideal for CCDT with connection name lists. Distinct DR name: primary QM1, backup QM1_DR; applications need CCDT update or dual connection definitions. The first pattern simplifies applications; the second clarifies operations and prevents accidental dual attachment. Either works if documented and tested. Never run QM1 in East and West simultaneously with the same data paths—that is split brain.
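
For the same-name pattern, the connection name list is what lets a client try East first and fall back to West. The sketch below only renders the MQSC text for a client-connection channel; the host names, the channel name APP.SVRCONN, and the helper names are hypothetical and not taken from any real estate.

```python
def conname_list(endpoints):
    """Join (host, port) pairs into a comma-separated connection name list."""
    return ",".join(f"{host}({port})" for host, port in endpoints)


def clntconn_mqsc(channel, qmgr, endpoints):
    """Render a DEFINE CHANNEL statement for a client-connection channel."""
    return (
        f"DEFINE CHANNEL('{channel}') CHLTYPE(CLNTCONN) TRPTYPE(TCP) +\n"
        f"  CONNAME('{conname_list(endpoints)}') QMNAME('{qmgr}') REPLACE"
    )


if __name__ == "__main__":
    # Hypothetical listeners: East is tried first, West is the DR site.
    sites = [("qm1-east.example", 1414), ("qm1-west.example", 1414)]
    print(clntconn_mqsc("APP.SVRCONN", "QM1", sites))
```

With the distinct-name pattern the same helper could be called twice, once per queue manager name, which makes the extra client-side definition explicit.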

Object Synchronization

Backup queue managers need current queue, channel, authinfo, and listener definitions. Common practices include a nightly export of object definitions from the primary (for example with dmpmqcfg, or a runmqsc-based script) stored in a secure artifact repository; infrastructure-as-code tooling such as Terraform or Ansible that applies the DEFINE statements to the DR site weekly; and replication tools that copy object repositories where supported. Object definitions do not contain messages; message data requires logs or replication. After a major change on the primary (a new payment channel, for example), verify that the DR object repository has been updated before the weekend. z/OS sites export with CSQUTIL or automated SAVEMQOBJ procedures, per shop standards.
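
One way to verify that the DR object repository is current is to compare the exported MQSC from both sites. The following is a minimal sketch, assuming both exports are plain MQSC text made of DEFINE statements; it handles only `+` continuations and `*` comment lines, and the function names are illustrative.

```python
import re

# Captures the object type and name from statements like DEFINE QLOCAL('PAYMENT.IN') ...
DEFINE_RE = re.compile(r"^DEFINE\s+(\w+)\s*\(\s*'?([^')]+)'?\s*\)", re.IGNORECASE)


def mqsc_statements(text):
    """Return logical statements, joining '+' continuations and skipping '*' comments."""
    statements, buf = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue
        if line.endswith("+"):
            buf += line[:-1].rstrip() + " "
            continue
        statements.append(buf + line)
        buf = ""
    if buf:
        statements.append(buf.strip())
    return statements


def defined_objects(text):
    """Map (object type, object name) to the whitespace-normalized DEFINE statement."""
    objects = {}
    for stmt in mqsc_statements(text):
        match = DEFINE_RE.match(stmt)
        if match:
            objects[(match.group(1).upper(), match.group(2))] = " ".join(stmt.split())
    return objects


def drift(primary_text, dr_text):
    """Report objects missing on DR, extra on DR, or defined differently."""
    p, d = defined_objects(primary_text), defined_objects(dr_text)
    return {
        "missing_on_dr": sorted(set(p) - set(d)),
        "extra_on_dr": sorted(set(d) - set(p)),
        "different": sorted(k for k in set(p) & set(d) if p[k] != d[k]),
    }
```

Entries under "different" are for review rather than automatic failure: a channel whose CONNAME deliberately points at a partner's DR gateway will always differ from the primary.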

Activation Runbook (Distributed)

  1. Confirm primary is truly failed—isolate network to prevent zombie primary.
  2. Restore latest consistent log and queue data to backup data path per log-shipping procedure.
  3. Verify file permissions and uid/gid for mqm match installation standards.
  4. Start queue manager: strmqm QM1_DR or promoted name.
  5. DISPLAY QMSTATUS; resolve any media recovery prompts.
  6. START LISTENER; verify TLS handshakes from jump host.
  7. START critical channels; DISPLAY CHSTATUS for BINDING/RUNNING.
  8. Update DNS or load balancer; notify application teams to reconnect.
  9. Monitor queue depths and error logs (AMQ errors) for 24 hours.
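
Parts of the runbook above can be scripted so that nobody types commands from memory during an incident. The sketch below assumes dspmq prints one `QMNAME(...) STATUS(...)` line per queue manager; it parses that text and renders the MQSC for steps 5 to 7. The function names are illustrative, and nothing here calls MQ directly.

```python
import re

# Matches lines such as: QMNAME(QM1_DR)    STATUS(Running)
DSPMQ_LINE = re.compile(r"QMNAME\((?P<name>[^)]+)\)\s+STATUS\((?P<status>[^)]+)\)")


def parse_dspmq(output):
    """Map queue manager name to the status text reported by dspmq."""
    return {m["name"]: m["status"] for m in DSPMQ_LINE.finditer(output)}


def post_start_mqsc(listener, channels):
    """Render MQSC for runbook steps 5 to 7, in order."""
    lines = ["DISPLAY QMSTATUS ALL", f"START LISTENER('{listener}')"]
    for channel in channels:
        lines.append(f"START CHANNEL('{channel}')")
        lines.append(f"DISPLAY CHSTATUS('{channel}') STATUS")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample text standing in for real dspmq output.
    sample = "QMNAME(QM1_DR)    STATUS(Running)\n"
    if parse_dspmq(sample).get("QM1_DR") == "Running":
        print(post_start_mqsc("LISTENER.TCP", ["PARTNER.BANK"]))
```

In a real run the sample text would come from something like `subprocess.run(["dspmq"], capture_output=True, text=True).stdout`, and the rendered MQSC would be piped into runmqsc for the backup queue manager. A channel may still show BINDING immediately after START, so the status check is worth repeating.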

Channels and Remote Partners

Remote queue managers hold your hostname or IP address in their own channel definitions. DR changes IP addresses and possibly channel names. Maintain an appendix listing every partner: contact, channel name, CONNAME to use in DR, and firewall ticket template. Some partners need days to update their side, so pre-negotiate the DR CONNAME and test annually. Cluster repositories complicate DR: full repository queue managers need a cluster-wide DR strategy, not single-hub fixes. For hub-and-spoke, spokes may need only a DNS update if the hub name is unchanged; verify the XMITQ and CLNTCONN definitions on the spokes.
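
The partner appendix is easier to keep honest when it is data rather than a document. This is a minimal sketch, assuming a hypothetical Partner record with the fields named above plus the date of the last DR connectivity test; the one-year threshold mirrors the annual test, and all names are illustrative.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Partner:
    name: str
    contact: str
    channel: str
    dr_conname: Optional[str]     # CONNAME agreed for DR, None if not negotiated
    last_dr_test: Optional[date]  # None if the DR path was never tested


def partners_needing_attention(partners, today, max_age=timedelta(days=365)):
    """Return (partner name, reason) pairs for entries that are not DR-ready."""
    findings = []
    for p in partners:
        if not p.dr_conname:
            findings.append((p.name, "no DR CONNAME agreed"))
        elif p.last_dr_test is None:
            findings.append((p.name, "DR CONNAME never tested"))
        elif today - p.last_dr_test > max_age:
            findings.append((p.name, "DR test is overdue"))
    return findings
```

Running this check on a schedule turns the annual test into something that is reported when it lapses, instead of something discovered during the disaster.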

z/OS Backup Queue Managers

On z/OS, backup may mean another LPAR or sysplex with restored page sets and BSDS, or a queue sharing group member in an alternate site if the coupling facility and connectivity support a stretched sysplex, which is expensive and rare. More common is a separate queue manager on a DR LPAR with batch restore of logs and page sets, or the IBM-documented DR procedures for your release. Coordinate RACF profiles as well: the DR LPAR must have equivalent MQADMIN and channel identities. CKTI and CICS bridges need region definitions on the DR side before activation.

Capacity on the Backup Site

Size CPU, memory, and disk on the backup site for peak primary load plus backlog drain. DR activation often coincides with full transmission queues replaying—disk IOPS become the bottleneck. Undersized backup queue managers meet RTO for strmqm but fail RTO for clearing backlog. Model worst-case backlog from DR planning exercises.
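
The gap between starting the queue manager and clearing the backlog can be estimated with simple arithmetic. The sketch below assumes steady message rates, which real traffic will not have; the rates, the start time, and the function names are illustrative planning inputs, not measurements.

```python
import math


def drain_seconds(backlog_msgs, drain_rate, arrival_rate):
    """Seconds to clear a backlog while new messages keep arriving.

    Rates are in messages per second. Returns math.inf when the site
    cannot drain faster than messages arrive.
    """
    net_rate = drain_rate - arrival_rate
    if net_rate <= 0:
        return math.inf
    return backlog_msgs / net_rate


def meets_rto(start_seconds, backlog_msgs, drain_rate, arrival_rate, rto_seconds):
    """True if activation time plus backlog drain fits inside the RTO."""
    total = start_seconds + drain_seconds(backlog_msgs, drain_rate, arrival_rate)
    return total <= rto_seconds


if __name__ == "__main__":
    # 500,000 queued messages, 2,000 msg/s drain, 1,500 msg/s still arriving.
    print(drain_seconds(500_000, 2_000, 1_500))          # 1000.0
    print(meets_rto(600, 500_000, 2_000, 1_500, 1_800))  # True
```

A backup site that only matches the primary's normal throughput has a net drain rate near zero, which is why the sizing target is peak load plus headroom for the backlog.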

Tutorial: Warm Backup MQSC Baseline

* On DR site: objects only. Apply with runmqsc while the queue manager is up for maintenance, before any traffic.
* Export from primary: dmpmqcfg -m QM1 -a > dr-baseline.mqsc
DEFINE QLOCAL('PAYMENT.IN') REPLACE MAXDEPTH(500000) DEFPSIST(YES)
DEFINE QLOCAL('PAYMENT.ACK') REPLACE DEFPSIST(YES)
DEFINE CHANNEL('PARTNER.BANK') CHLTYPE(SDR) TRPTYPE(TCP) +
  CONNAME('dr-gw.bank.example(1414)') XMITQ('BANK.XMIT') REPLACE
DEFINE LISTENER('LISTENER.TCP') TRPTYPE(TCP) PORT(1414) CONTROL(QMGR) REPLACE
* After log restore and strmqm QM1_DR:
* START LISTENER('LISTENER.TCP')
* START CHANNEL('PARTNER.BANK')

Explainer: Spare Fire Station

The backup queue manager is a second fire station across town with trucks fueled and maps on the wall, but firefighters sleep at home until the main station floods. When called, they drive known routes—not inventing streets during the emergency.

Explain Like I'm Five

If your school mailbox breaks, another mailbox at the backup school is ready with the same class slots written on it. Teachers are told the backup address before the first mailbox breaks, so homework still has a place to land.

Practice Exercises

Exercise 1

Compare same-name versus distinct-name DR for a Java client using CCDT.

Exercise 2

Write five activation runbook steps specific to your fictional QM_HUB backup.

Exercise 3

List channel partners that must change CONNAME when DR site activates.

Frequently Asked Questions

Test Your Knowledge

1. A backup queue manager exists to:

  • Replace primary after site disaster
  • Replace DLQ
  • Host only topics
  • Eliminate channels

2. Warm backup means:

  • Prepared objects, faster activation
  • No MQ installed
  • Only JMS
  • Random queue names

3. Running two actives on same data:

  • Risks corruption
  • Improves RPO
  • Required for TLS
  • Only on topics

4. Clients after DR often need:

  • Updated connection target or DNS
  • New message format only
  • Disable persistence
  • Delete all queues

Read time: 20 min
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation