Native HA

Native HA is IBM MQ’s cloud-native high availability story: queue managers packaged for containers, lifecycle managed by operators, storage and networking declared in YAML instead of hand-edited fstab entries on two Linux VMs. Teams moving messaging to Kubernetes or Red Hat OpenShift do not want 1990s runbooks for strmqm on shared NFS; they want a Custom Resource that says “this queue manager is HA” and lets the platform restart pods after node drains. Native HA still delivers active/passive semantics at the messaging layer (one active queue manager instance serves the queue manager name at a time), but orchestration handles placement, health probes, and persistent volume claims. This tutorial explains what Native HA adds over classic installs, how it relates to the IBM MQ operator, storage and networking considerations, client connectivity in dynamic environments, how it compares to RDQM and multi-instance queue managers, and the operational habits that differ from LPAR operations.

Why Native HA Exists

Virtual machines with a shared SAN remain a valid design. Container platforms instead rely on storage classes, replicated volumes, and pod anti-affinity to spread risk across nodes. Native HA aligns MQ with that model so platform engineers do not bolt multi-instance queue manager (MIQM) scripts onto Kubernetes as an afterthought.

Operator-Managed Lifecycle

The IBM MQ operator watches Custom Resources that define queue managers, versions, storage, and HA enablement. When a node fails, Kubernetes reschedules pods; the operator coordinates MQ-specific startup order and recovery. Beginners should learn to read operator logs alongside queue manager logs, because failures often show up in operator events before the AMQ messages in the queue manager error logs make sense.
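As a rough illustration of what such a Custom Resource can look like, the sketch below builds a QueueManager manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The field names follow the operator's `mq.ibm.com/v1beta1` schema as best recalled and should be checked against the CRD installed in your cluster. The resource name, version, license ID, and storage class are placeholders (assumptions), not values from this tutorial.

```python
import json


def build_queue_manager(name, qm_name, version, license_id, storage_class):
    """Return a QueueManager custom resource as a plain dict.

    Assumption: field names match the operator's v1beta1 schema; verify
    them against the CRD in your cluster before applying anything.
    """
    return {
        "apiVersion": "mq.ibm.com/v1beta1",
        "kind": "QueueManager",
        "metadata": {"name": name},
        "spec": {
            "license": {
                "accept": True,
                "license": license_id,  # placeholder, use your entitlement
                "use": "Production",
            },
            "version": version,  # placeholder, pick a supported version
            "queueManager": {
                "name": qm_name,
                # The line that asks the operator for a Native HA deployment.
                "availability": {"type": "NativeHA"},
                "storage": {
                    "defaultClass": storage_class,
                    "queueManager": {"type": "persistent-claim"},
                },
            },
        },
    }


# Hypothetical values for illustration only.
example = build_queue_manager(
    name="qm1-ha",
    qm_name="QM1",
    version="VERSION-PLACEHOLDER",
    license_id="LICENSE-ID-PLACEHOLDER",
    storage_class="STORAGE-CLASS-PLACEHOLDER",
)
print(json.dumps(example, indent=2))
```

Keeping the manifest generator (or the manifest itself) in version control gives you the GitOps history that the building-blocks table below recommends.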

Native HA building blocks
Block | Purpose | Ops focus
QueueManager CR | Declares desired MQ state | GitOps version control
Persistent volumes | Hold logs and queue manager data | Storage class, backup
Pod anti-affinity | Spread replicas across nodes | Zone spread in cloud
Services / routes | Stable client endpoints | DNS and TLS certs
Health probes | Restart unhealthy pods | Tune timeouts vs MQ start time

Storage in Kubernetes

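Persistent volume claims (PVCs) hold queue manager data and logs. In a Native HA deployment each queue manager instance has its own PVC and MQ replicates log data between the instances, so no shared file system is required. Storage class choice still affects failover and restart time: network-attached storage in the cloud behaves differently from local SSD. For backup, snapshot the PVCs or use IBM backup tools as the documentation describes. Test a restore into a fresh namespace quarterly.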

Networking and Clients

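ClusterIP, NodePort, or LoadBalancer services, or OpenShift routes, expose the listeners. Clients outside the cluster need a route (or an equivalent external endpoint) and a TLS trust store that matches the certificates the queue manager pods present. Client reconnect settings and the client channel definition table (CCDT) should target the service DNS name or route hostname, not a pod IP that changes on every reschedule. Rotating mutual TLS certificates requires coordinated cert-manager updates on both the client and queue manager side.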
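To make the "hostname, not pod IP" advice concrete, here is a minimal sketch that generates a JSON-format CCDT with one client-connection channel and reconnect enabled. The structure follows the JSON CCDT layout as best recalled (`channel`, `clientConnection`, `connectionManagement`, `transmissionSecurity`), so validate the output against the schema for your MQ client version. The channel name, hostname, port, and timeout are hypothetical.

```python
import ipaddress
import json


def _reject_ip_literal(host):
    """Raise ValueError if host is an IP literal rather than a DNS name."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return  # not an IP address, so treat it as a hostname
    raise ValueError(f"use a service or route hostname, not an IP: {host}")


def build_ccdt(channel, queue_manager, host, port=443,
               cipher="ANY_TLS12_OR_HIGHER", reconnect_timeout=1800):
    """Build a one-channel JSON CCDT as a dict (assumed field layout)."""
    _reject_ip_literal(host)
    return {
        "channel": [
            {
                "name": channel,
                "type": "clientConnection",
                "clientConnection": {
                    "connection": [{"host": host, "port": port}],
                    "queueManager": queue_manager,
                },
                # Let the client library reconnect after a failover.
                "connectionManagement": {
                    "reconnect": {"enabled": True,
                                  "timeout": reconnect_timeout},
                },
                "transmissionSecurity": {"cipherSpecification": cipher},
            }
        ]
    }


# Hypothetical route hostname and channel name, for illustration only.
ccdt = build_ccdt("QM1.SVRCONN", "QM1", "qm1-route.apps.example.com")
print(json.dumps(ccdt, indent=2))
```

The IP-literal guard is a design choice of this sketch, not an MQ requirement: it simply fails fast when someone pastes a pod IP into the client configuration.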

/* Client design checklist
   - MQCONNX with reconnect options enabled
   - CCDT points at service hostname / route
   - TLS trust store includes current cert issuer
   - Idempotent consumers for redelivery after failover */
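The last checklist item, idempotent consumers, can be sketched without any MQ library. The class below is a hypothetical helper, not part of any IBM client API: it remembers recently processed message IDs and skips a message that is redelivered after a failover. It keeps state in memory only; a production consumer would more likely record the ID in the same transaction as the business update.

```python
from collections import OrderedDict


class IdempotentConsumer:
    """Apply a handler at most once per message ID (in-memory sketch)."""

    def __init__(self, handler, max_remembered=10000):
        self._handler = handler
        self._seen = OrderedDict()  # insertion-ordered set of message IDs
        self._max = max_remembered

    def on_message(self, message_id, body):
        """Process body unless message_id was already handled.

        Returns True if the handler ran, False for a duplicate.
        """
        if message_id in self._seen:
            return False
        self._handler(body)
        self._seen[message_id] = None
        # Bound memory use by forgetting the oldest remembered ID.
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)
        return True


# Usage: the second delivery of "id-1" simulates redelivery after failover.
processed = []
consumer = IdempotentConsumer(processed.append)
consumer.on_message("id-1", "order A")
consumer.on_message("id-2", "order B")
consumer.on_message("id-1", "order A")
print(processed)
```

The handler runs before the ID is recorded, so a crash in between leads to a retry rather than a lost message; that ordering favours at-least-once delivery with deduplication over silent loss.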

Native HA Versus Other HA Options

When to use which
Option | Best fit | Weak fit
Native HA | Kubernetes standard | Bare metal without orchestration
Multi-instance | Two VMs + SAN | Cloud without shared disk
RDQM | Linux VMs, no SAN | Pure z/OS
IBM MQ on Cloud | Managed service preference | On-prem only policy

Upgrades and Day-2 Operations

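Operator-driven upgrades roll queue manager versions within declared maintenance windows. Probes must allow enough time for log recovery on restart: if a liveness or startup probe gives up before the queue manager has finished replaying its logs, Kubernetes restarts the container in the middle of recovery. The Horizontal Pod Autoscaler does not replace queue manager HA; do not confuse pod count with MQ active/passive roles unless the architecture explicitly supports it.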
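The probe-timing point comes down to simple arithmetic. The sketch below assumes the standard Kubernetes probe fields `initialDelaySeconds`, `periodSeconds`, and `failureThreshold`, and approximates the time a container gets before a failing probe triggers a restart. It ignores `timeoutSeconds` and scheduling jitter, and the recovery time and safety factor are illustrative numbers, not measured values.

```python
def restart_budget_seconds(initial_delay, period, failure_threshold):
    """Approximate seconds before a continuously failing probe restarts the pod."""
    return initial_delay + period * failure_threshold


def probe_allows_recovery(initial_delay, period, failure_threshold,
                          expected_recovery_seconds, safety_factor=1.5):
    """Check the probe budget against expected log recovery time.

    safety_factor is an arbitrary margin chosen for this sketch.
    """
    budget = restart_budget_seconds(initial_delay, period, failure_threshold)
    return budget >= expected_recovery_seconds * safety_factor


# Hypothetical numbers: log recovery measured at about 180 seconds.
generous = probe_allows_recovery(10, 10, 30, expected_recovery_seconds=180)
too_tight = probe_allows_recovery(10, 10, 3, expected_recovery_seconds=180)
print(generous, too_tight)
```

Measure recovery time on your own storage class with a realistic log volume before settling on thresholds; the right numbers depend on log size and disk throughput.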

Explainer: Self-Healing Restaurant Kitchen

Native HA is a kitchen where if one cook station shuts down, the manager automatically moves the recipe book to another station that already has a copy of ingredients—Kubernetes is the manager, Native HA is the recipe for MQ.

Explain Like I'm Five

The toy factory has backup rooms. If one room breaks, robots move the same toy list to another room that already has the same toys copied.

Practice Exercises

Exercise 1

List five differences between operating MIQM on VMs versus Native HA on OpenShift.

Exercise 2

Design CCDT for clients connecting to an OpenShift route with reconnect.

Exercise 3

Write failover verification steps using kubectl and MQSC commands.

Frequently Asked Questions

Test Your Knowledge

1. Native HA targets:

  • Cloud-native and Kubernetes estates
  • Only z/OS batch
  • Only FTP
  • Only IMS

2. Native HA differs from MIQM by:

  • Container operator integration
  • No persistence
  • No channels
  • No security

3. After pod failover clients need:

  • Reconnect options
  • New queue names
  • Disable logs
  • Delete CCDT

4. Native HA is active/passive in that:

  • One primary serves QM at a time
  • All pods write logs freely
  • No quorum
  • No storage
Published
Read time: 23 min
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation