Active/Passive

Active/passive high availability is the pattern enterprises choose when one logical messaging hub must survive server loss without running two writers against the same data at once. The active node accepts connections, runs channels, and writes logs. The passive node—standby—stays ready but does not serve application traffic until failover promotes it. IBM MQ implements this on distributed platforms primarily through multi-instance queue managers with shared networked storage, and through Native HA and RDQM with replicated data instead of shared disks. On z/OS, queue sharing groups and sysplex design provide different flavors of resilience. Beginners confuse active/passive with clustering; clustering spreads work across many queue managers, while active/passive protects one name. This tutorial defines the model, maps it to IBM MQ products, explains RTO and RPO, covers fencing and split-brain risk, and compares mainframe and distributed deployments so architects can label diagrams correctly in design reviews.

Roles in Active/Passive

Node roles

Role                  | What it does                          | On partner failure
Active                | MQCONN, channels, puts, gets, logging | Standby may promote
Passive (standby)     | Monitors active, holds ready state    | Becomes active candidate
Witness / coordinator | Votes in some HA stacks               | Failover policy dependent

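Passive does not mean powered off. On a multi-instance queue manager, the standby instance is started with strmqm -x, waits on the file locks that the active instance holds on shared storage, and takes over when those locks are released. Replicated options keep running replicas that coordinate access to the data and participate in health checks. Cold standby (spare hardware started manually with strmqm) is cheaper but gives a worse RTO.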

Why Not Two Actives on One Data Set?

Queue manager logs and queue files assume a single writer, because recovery replays the log in the order that one writer produced it. Two active instances writing the same log without coordination would corrupt persistent messages and destroy auditability. Active/passive enforces single-writer semantics. Active/active requires either partitioned data (separate queue managers) or replication technology that merges streams safely, which means different products and different operations.
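The single-writer rule can be illustrated with an exclusive file lock. The sketch below is a toy model in Python, not IBM MQ code: the Instance class, its method names, and the lock file are hypothetical, and it assumes POSIX flock semantics (Linux or similar). Multi-instance queue managers also depend on file-system locks on shared storage, but their real locking protocol is more involved than this.

```python
import fcntl
import os


class Instance:
    """Hypothetical queue manager instance competing for one writer lock."""

    def __init__(self, name, lock_path):
        self.name = name
        self.lock_path = lock_path
        self.fd = None
        self.role = "stopped"

    def _try_lock(self):
        # Non-blocking exclusive lock: succeeds for at most one holder.
        try:
            fcntl.flock(self.fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except BlockingIOError:
            return False

    def start(self):
        # Each instance opens its own descriptor, as separate servers would.
        self.fd = os.open(self.lock_path, os.O_RDWR | os.O_CREAT, 0o600)
        self.role = "active" if self._try_lock() else "standby"
        return self.role

    def retry(self):
        # A standby polls the lock and is promoted only once it is free.
        if self.role == "standby" and self._try_lock():
            self.role = "active"
        return self.role

    def stop(self):
        # Closing the descriptor releases the lock held through it.
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None
        self.role = "stopped"
```

Starting two instances against the same lock file yields one active and one standby; the standby's retry() only returns "active" after the first instance stops. At no point can both hold the lock, which is the property that protects the log.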

IBM MQ Implementations

  1. Multi-instance queue managers: shared SAN or NAS, active and standby instances, controlled with strmqm -x and endmqm -s.
  2. Native HA: container-friendly replicated HA, as described in the IBM Native HA documentation for supported platforms.
  3. RDQM: Replicated Data Queue Managers for Linux, with quorum and replica nodes.
  4. z/OS queue sharing groups: multiple members share queues in the coupling facility (CF). Restarting a member is not identical to MIQM failover, but it is an operational cousin.

Active/passive options compared

Option              | Storage model                | Typical site
Multi-instance      | Shared disk                  | Traditional enterprise Linux/Windows
RDQM                | Replicated volumes           | Cloud and Linux HA without SAN
Native HA           | Product-specific replication | Modern container deployments
Manual cold standby | Copy or restore              | DR only, poor RTO

RTO and RPO

Recovery time objective (RTO) is how long applications may be down during failover; it includes detection, promotion, log replay, and client reconnect. Recovery point objective (RPO) is how much data loss is acceptable: persistent messages with synced logs target zero, while non-persistent traffic may be lost by design. Active/passive with good storage usually protects the RPO for persistent messages; RTO depends on whether failover is automated or driven by manual runbooks.
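The RTO components listed above add up to a simple budget. The following Python sketch shows the arithmetic; the FailoverBudget class and every number in it are invented for illustration and are not measurements of any IBM MQ product.

```python
from dataclasses import dataclass


@dataclass
class FailoverBudget:
    """Seconds spent in each phase of a failover (illustrative values only)."""
    detection_s: float
    promotion_s: float
    log_replay_s: float
    client_reconnect_s: float

    def total_s(self):
        # Outage seen by applications is the sum of all four phases.
        return (self.detection_s + self.promotion_s
                + self.log_replay_s + self.client_reconnect_s)

    def meets(self, rto_target_s):
        return self.total_s() <= rto_target_s


# Hypothetical comparison: automated failover versus a manual runbook.
automated = FailoverBudget(detection_s=10, promotion_s=15,
                           log_replay_s=20, client_reconnect_s=15)
manual = FailoverBudget(detection_s=300, promotion_s=600,
                        log_replay_s=20, client_reconnect_s=15)
```

With a 120-second target, automated.meets(120) is true (60 seconds in total) while manual.meets(120) is false (935 seconds). In this made-up case log replay and reconnect are identical, so the gap comes entirely from detection and promotion.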

Application and Client Design

Applications should use automatic reconnect options and connect by queue manager name, not by the hard-coded hostname of the old active server. A CCDT or DNS alias helps swing traffic to the new active instance. Idempotent consumers tolerate redelivery after failover, when the outcome of an in-doubt transaction may be ambiguous. Long-running XA transactions may hold up recovery after failover until their coordinators decide the outcome, so design shorter units of work where possible.
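Both client-side ideas can be sketched without any MQ library. In the Python below, connect_any, IdempotentConsumer, the endpoint strings, and the connect callable are all hypothetical stand-ins; a real client would normally rely on the reconnect support of its MQ client library and a CCDT rather than hand-written loops, and would keep the set of processed message IDs in durable storage.

```python
import time


def connect_any(endpoints, connect, rounds=3, delay_s=0.0):
    """Try each endpoint in order and return the first that accepts."""
    last_error = None
    for _ in range(rounds):
        for endpoint in endpoints:
            try:
                return endpoint, connect(endpoint)
            except ConnectionError as exc:
                last_error = exc  # this instance is down or still standby
        time.sleep(delay_s)       # wait before the next sweep of the list
    raise ConnectionError(f"no instance reachable: {last_error}")


class IdempotentConsumer:
    """Skips messages whose ID was already processed (in-memory only)."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()

    def on_message(self, message_id, body):
        if message_id in self.seen:
            return False          # redelivered after failover, ignore
        self.handler(body)
        self.seen.add(message_id)
        return True
```

The consumer records the ID only after the handler succeeds, so a crash in between leads to a repeat delivery rather than a lost message. That is the trade-off idempotent processing is meant to absorb.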

Fencing and Split Brain

Fencing stops the failed active from writing after the standby promotes; common mechanisms are STONITH, SCSI reservations, and cluster manager integration. Without fencing, a network partition can leave two instances that both believe they are active, and their competing writes corrupt the logs. Operations drills should include partitioned-network scenarios, not only clean power loss.
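Quorum voting is one way HA stacks decide which side of a partition may stay active. The Python sketch below shows only the majority rule; the function name and node names are hypothetical, and it does not model STONITH or SCSI reservations, which fence a node by other means.

```python
def may_be_active(partition, cluster_size):
    """A partition may host the active only with a strict majority of votes."""
    return len(partition) > cluster_size // 2


# Three-node cluster split by a network fault into {a} and {b, c}.
three_node_split = [{"a"}, {"b", "c"}]
three_node_result = [may_be_active(p, 3) for p in three_node_split]

# Two-node cluster split in half: neither side has a majority.
two_node_split = [{"a"}, {"b"}]
two_node_result = [may_be_active(p, 2) for p in two_node_split]
```

Because two disjoint partitions cannot both hold a strict majority, at most one side is ever allowed to be active. The two-node case shows why a witness or third voter is often added: without it, a clean split leaves no side with a majority and the service stops rather than risking two writers.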

Tutorial: Document Active/Passive for One Hub

  1. State queue manager name and HA product (MIQM, RDQM, Native HA).
  2. Draw active, standby, storage, and network paths.
  3. Write target RTO and RPO with business sign-off.
  4. List client reconnect settings and CCDT update process.
  5. Schedule quarterly failover test with application verification.
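The five steps above can be captured as a structured record so that gaps are visible before a design review. This is a minimal Python sketch: the HaDesignRecord class, its field names, and the example values are assumptions made for illustration, not an IBM template.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Product labels taken from step 1 of the tutorial.
ALLOWED_PRODUCTS = {"MIQM", "RDQM", "Native HA"}


@dataclass
class HaDesignRecord:
    queue_manager: str = ""
    ha_product: str = ""
    diagram: str = ""                    # active, standby, storage, network
    rto_seconds: Optional[int] = None
    rpo_messages: Optional[int] = None   # 0 is a valid target
    business_signoff: str = ""
    client_reconnect: str = ""
    ccdt_update_process: str = ""
    failover_test_schedule: str = ""

    def gaps(self):
        """Return the names of fields that are still empty or invalid."""
        missing = [f.name for f in fields(self)
                   if getattr(self, f.name) in ("", None)]
        if self.ha_product and self.ha_product not in ALLOWED_PRODUCTS:
            missing.append("ha_product (unrecognised)")
        return missing


# Hypothetical draft with only step 1 filled in.
draft = HaDesignRecord(queue_manager="HUB1", ha_product="RDQM")
```

Calling draft.gaps() lists every item from steps 2 to 5 that still needs an owner. An empty list means the record is ready for sign-off.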

Explainer: Spare Driver in the Car

Active/passive is one driver at the wheel while a trained relief driver waits in the passenger seat. If the driver becomes ill, the relief takes the wheel—only one steers at a time so the car does not crash.

Explain Like I'm Five

One toy cash register is open. Another is closed but ready. If the first breaks, the second opens—only one register takes money so the toy books stay correct.

Practice Exercises

Exercise 1

Compare active/passive multi-instance to active/active cluster for a hub-and-spoke bank integration.

Exercise 2

Define RTO and RPO for a payment queue with persistent messages and consumers that are sensitive to reason code 2035 (MQRC_NOT_AUTHORIZED).

Exercise 3

List three split-brain prevention mechanisms and which HA products use them.

Frequently Asked Questions

Test Your Knowledge

1. Active/passive means:

  • One serves, one waits
  • Both serve equally always
  • No standby
  • Only z/OS

2. Multi-instance is:

  • Active/passive on shared disk
  • Only pub/sub
  • Only JMS
  • No logs

3. RTO measures:

  • Time to restore service
  • Message size
  • Cipher strength
  • Topic depth

4. Split brain is dangerous because:

  • Two actives corrupt shared data
  • TLS expires
  • Queues rename
  • Channels multiply

Read time: 24 min
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation