Replicated Data Queue Managers

Replicated data queue managers are a class of IBM MQ high availability in which each server keeps its own copy of the queue manager data, kept in sync through replication, instead of two servers sharing one disk. When the primary node fails, a replica is promoted and runs the same queue manager, under the same name, with data that should match what the primary held at the last successful sync. IBM's main implementation of this model is RDQM on Linux; Native HA applies related ideas on container platforms. In architecture documents the phrase usually appears as the alternative to “shared filesystem multi-instance.” Before procurement, beginners should understand replication versus shared disk, quorum, replication lag, and the limits of active/passive operation. This tutorial defines replicated data queue managers generically, maps the concepts to RDQM, contrasts them with multi-instance, explains the consistency and failover implications, and lists planning questions for network, storage, and operations teams.

Shared Disk Versus Replicated Data

Two HA storage models

Model           | Data copies          | Failover action            | IBM MQ example
Shared disk     | One on SAN/NAS       | Standby mounts same path   | Multi-instance
Replicated data | One per node, synced | Standby uses local replica | RDQM

Shared disk simplifies consistency because there is only one copy of the data. Replication removes the dependency on shared storage at the cost of synchronization complexity and extra network load between nodes.

Replication Flow

  1. Primary queue manager writes log and data files locally.
  2. Replication layer copies blocks or files to secondary nodes.
  3. Replicas acknowledge according to policy: synchronously, before the write completes, or asynchronously, after it.
  4. On failure, the nodes that still hold quorum elect a new primary from those with a valid replica.
  5. New primary runs log recovery and opens listeners.
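
The five steps above can be sketched as a toy simulation. This is a minimal model, not how RDQM or any IBM MQ component is implemented: the `Node` class, `replicated_write`, and `promote` are hypothetical names invented for illustration, replication is modeled as appending to an in-memory list, and "valid replica" is approximated as "the live node with the longest log".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node holding its own copy of the queue manager log."""
    name: str
    log: list = field(default_factory=list)
    alive: bool = True


def replicated_write(primary, replicas, record):
    """Steps 1-3: write locally, copy to each live replica, count acks."""
    primary.log.append(record)
    acks = 0
    for replica in replicas:
        if replica.alive:
            replica.log.append(record)
            acks += 1
    return acks


def promote(nodes):
    """Steps 4-5: if a majority is alive, pick the live node with the longest log."""
    live = [n for n in nodes if n.alive]
    if len(live) <= len(nodes) // 2:
        raise RuntimeError("no quorum: automatic promotion blocked")
    return max(live, key=lambda n: len(n.log))


a, b, c = Node("a"), Node("b"), Node("c")
for msg in ("m1", "m2", "m3"):
    replicated_write(a, [b, c], msg)

a.alive = False  # the primary fails
new_primary = promote([a, b, c])
print(new_primary.name, new_primary.log)  # b ['m1', 'm2', 'm3']
```

If a second node were also lost, `promote` would raise instead of choosing a primary, which mirrors the behavior of blocking automatic failover when quorum is lost.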

Consistency and RPO

Synchronous replication waits for replica acknowledgment before acknowledging persistent puts to applications—tighter RPO, higher latency. Asynchronous replication is faster but may lose the last milliseconds of writes if the primary vanishes. Architects document acceptable loss with business stakeholders; do not assume zero RPO without measuring lag under peak load.
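
A back-of-the-envelope calculation can make the tradeoff concrete when talking to stakeholders. The sketch below is an assumption-laden estimate, not a measurement method from IBM: the function names and the simple serial latency model are hypothetical, and real figures should come from lag measured under peak load.

```python
def writes_at_risk(persistent_msgs_per_sec, replication_lag_ms):
    """Estimate messages not yet on the replica if the primary vanishes now (async)."""
    return persistent_msgs_per_sec * replication_lag_ms / 1000.0


def sync_put_latency_ms(local_write_ms, network_rtt_ms, replica_write_ms):
    """Rough upper bound on persistent put latency when the replica must acknowledge.

    Assumes the local write, the network round trip, and the replica write
    happen one after another; overlapping them would lower the figure.
    """
    return local_write_ms + network_rtt_ms + replica_write_ms


# Async: 2000 persistent messages/s with 50 ms of replication lag
print(writes_at_risk(2000, 50))            # 100.0 messages exposed

# Sync: 1 ms local write, 0.5 ms round trip, 1 ms replica write
print(sync_put_latency_ms(1.0, 0.5, 1.0))  # 2.5 ms per persistent put
```

The first number is the exposure to document as the asynchronous RPO; the second is the latency cost paid to shrink that exposure toward zero.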

Quorum and Node Count

Replicated data clusters need odd node counts or explicit witnesses so promotion always reflects majority agreement. Losing quorum may block automatic failover to prevent two divergent primaries—operations then follow manual disaster procedures. Three data centers with two replicas and one witness is a common pattern when cost allows.
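
The majority rule behind this advice is simple arithmetic, sketched below. The function names are hypothetical, and "votes" stands for data nodes plus any witness; how a specific product counts votes is not covered here.

```python
def majority(total_votes):
    """Smallest number of votes that is more than half of the total."""
    return total_votes // 2 + 1


def has_quorum(reachable_votes, total_votes):
    """True if the reachable side of a partition may promote a primary."""
    return reachable_votes >= majority(total_votes)


def tolerated_failures(total_votes):
    """How many votes can be lost while a majority still remains."""
    return total_votes - majority(total_votes)


for n in (2, 3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Two votes tolerate no failure, and four tolerate no more than three do, which is why designs favor odd counts or add a witness to a two-node layout. In a two-node split, `has_quorum(1, 2)` is false on both sides, so neither node promotes itself.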

RDQM as Implementation

RDQM packages replicated data queue manager operations for Linux: installing RDQM nodes, creating a replicated queue manager, monitoring replication status, and failing over with documented commands. Read the RDQM tutorial for product-specific procedures; this page holds the portable concepts for architecture reviews.

Architecture review questions:

  • How many replicas and where (AZ/rack)?
  • Sync or async replication default?
  • Quorum layout if one site lost?
  • Client reconnect and DNS during promotion?
  • Backup of replica volumes for corruption cases?

Limits of Replicated Data HA

  • One logical primary writer at a time—active/passive, not magic active/active on one data set.
  • Does not replace cross-datacenter messaging patterns—channels still needed between hubs.
  • Corruption on primary may replicate—snapshots and backups still required.
  • Not a substitute for application idempotency and DLQ design.
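
The last point deserves a concrete picture: after a failover, a message may be delivered again, so consumers should tolerate duplicates. The sketch below is a generic illustration and uses no IBM MQ API. The `IdempotentConsumer` class is hypothetical, and the in-memory set is an assumption made for brevity; a real consumer would need durable storage for seen IDs.

```python
class IdempotentConsumer:
    """Hypothetical wrapper that skips messages whose ID was already processed."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in-memory only; production needs durable storage

    def on_message(self, message_id, body):
        """Run the handler once per message ID; return True if it ran."""
        if message_id in self._seen:
            return False
        self._handler(body)
        self._seen.add(message_id)
        return True


processed = []
consumer = IdempotentConsumer(processed.append)

consumer.on_message("id-1", "debit 10")
consumer.on_message("id-2", "debit 20")
consumer.on_message("id-1", "debit 10")  # redelivered after failover, ignored

print(processed)  # ['debit 10', 'debit 20']
```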

Comparison Table for Architects

HA option summary

Option                 | Storage          | Scale-out       | Ops profile
Replicated data (RDQM) | Per-node replica | Hub HA          | Linux quorum cluster
Multi-instance         | Shared           | Hub HA          | SAN + two VMs
Native HA              | PVC / cloud      | Hub HA in K8s   | Operator + platform
Cluster                | Per QM           | Many QMs active | Repository + CLWL
QSG                    | CF + page sets   | Sysplex members | z/OS systems prog

Explainer: Photocopiers That Stay Updated

Shared disk is one master document everyone must visit. Replicated data is every office having a copier that automatically updates when the master changes—failover means working from the branch copy when headquarters goes dark.

Explain Like I'm Five

Every friend has the same storybook copied. When the leader friend stops reading, another friend who has the same page continues— but only if enough friends agree which book is the newest.

Practice Exercises

Exercise 1

Draw shared-disk versus replicated-data diagrams for two-node HA.

Exercise 2

Define RPO for async replication when primary datacenter floods.

Exercise 3

Match business requirements to RDQM, MIQM, or cluster in a table.

Frequently Asked Questions

Test Your Knowledge

1. Replicated data means:

  • Copies on multiple nodes
  • No logs
  • Only RAM
  • Only z/OS

2. Shared-disk MIQM has:

  • One data copy on SAN
  • Three active writers
  • No failover
  • No persistence

3. Replication lag affects:

  • RPO at failover
  • TLS cipher
  • Topic wildcards
  • JCL

4. RDQM implements:

  • Replicated data queue managers
  • Only pub/sub
  • Only MQTT
  • Only CICS
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation