StatefulSets

A Kubernetes StatefulSet is the workload controller most teams use when they run IBM MQ queue managers as pods without the full MQ Operator—because MQ is stateful in every sense: disk files, process identity, listener ports, and channel sequence state must line up after every restart. StatefulSets assign predictable pod names, create persistent volume claims per replica from templates, and roll out updates in order so you do not get two pods fighting for one queue manager directory. Beginners who treat StatefulSets like Deployments with storage attached once for all replicas learn painful lessons about corrupted repositories. This tutorial explains ordinal naming, headless Services, volumeClaimTemplates, podManagementPolicy choices, rolling update partitions for canaries, pairing StatefulSets with anti-affinity for node failure, differences from MQ Operator-managed QueueManagers, Native HA replica layouts at a high level, and operational runbooks for cordon, drain, and pod delete scenarios.

Stable Network Identity

Pod mq-qm1-0 keeps the hostname mq-qm1-0 across reschedules when the StatefulSet and PVC still exist. The headless Service mq-qm1 publishes DNS records so mq-qm1-0.mq-qm1.mq-prod.svc.cluster.local resolves to the pod IP. Channel definitions and client connection tables should reference this DNS or an external load balancer in front of the Service—not ephemeral pod IPs. When pod-0 moves to another node, the same PVC reattaches; MQ continues with the same /mnt/mqm data. Certificate SANs may need to include Service DNS names if TLS validates hostnames strictly.
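For clients that should reach one specific queue manager pod through a stable virtual IP or load balancer rather than the headless DNS records, one option is a second Service that selects a single pod. The following is a minimal sketch, not a definitive layout: the Service name mq-qm1-0-client is hypothetical, and the selector relies on the statefulset.kubernetes.io/pod-name label that the StatefulSet controller adds to each pod it creates.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mq-qm1-0-client        # hypothetical name, not from the original manifest
  namespace: mq-prod
spec:
  type: ClusterIP              # change to LoadBalancer for access from outside the cluster
  selector:
    # Label set automatically by the StatefulSet controller on each pod
    statefulset.kubernetes.io/pod-name: mq-qm1-0
  ports:
    - name: mq-tcp
      port: 1414
      targetPort: 1414
```

Because the selector names the pod rather than an IP, the Service keeps pointing at mq-qm1-0 after a reschedule.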

volumeClaimTemplates and Storage Binding

volumeClaimTemplates in the StatefulSet spec auto-create PVC qmdata-mq-qm1-0 for pod ordinal 0, qmdata-mq-qm1-1 for ordinal 1, and so on: the claim name is the template name, then the StatefulSet name, then the ordinal. Each claim is independent, which is correct for three queue managers in one StatefulSet (an uncommon pattern) or for Native HA layouts where IBM assigns storage per role. Wrong pattern: one manual PVC referenced by all pods in a Deployment. ReadWriteOnce permits mounting from a single node, so a second pod scheduled on another node typically sticks in ContainerCreating with a Multi-Attach error, while two pods that land on the same node can both mount the volume and corrupt queue manager data. The storage class in each template should match the IOPS requirements discussed in the persistent volumes tutorial.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mq-qm1
spec:
  clusterIP: None
  selector:
    app: mq-qm1
  ports:
    - port: 1414
      name: mq-tcp
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mq-qm1
spec:
  serviceName: mq-qm1
  replicas: 1
  podManagementPolicy: OrderedReady
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: mq-qm1
  template:
    metadata:
      labels:
        app: mq-qm1
    spec:
      containers:
        - name: qmgr
          image: icr.io/ibm-messaging/mq:9.4.0.0-r1
          env:
            - name: LICENSE
              value: accept
            - name: MQ_QMGR_NAME
              value: QM1
          volumeMounts:
            - mountPath: /mnt/mqm
              name: qmdata
  volumeClaimTemplates:
    - metadata:
        name: qmdata
      spec:
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 100Gi
```
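The manifest above leaves out storageClassName, so the claim falls back to the cluster's default storage class. A minimal sketch of pinning the class follows, assuming a hypothetical class named fast-ssd exists in your cluster; the name is illustrative only and does not come from IBM documentation.

```yaml
# Fragment of the mq-qm1 StatefulSet spec; only volumeClaimTemplates is shown.
spec:
  volumeClaimTemplates:
    - metadata:
        name: qmdata
      spec:
        accessModes: [ReadWriteOnce]
        storageClassName: fast-ssd   # hypothetical class name; substitute your own
        resources:
          requests:
            storage: 100Gi
```

Kubernetes rejects changes to volumeClaimTemplates on an existing StatefulSet, so it is worth settling the class before the first deployment.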

Ordered Startup and Shutdown

OrderedReady creates pod-0 before pod-1 and waits for each pod to be Running and Ready before starting the next. For a single-replica QM this is simple. For multi-QM StatefulSets, ordinal order may matter if pod-1 has channels to pod-0. On scale-down, Kubernetes terminates the highest ordinal first, so understand the impact on HA pairs. The Parallel policy starts or terminates every pod at once; use it only when pods are independent queue managers with separate data, not cooperative instances sharing one directory.
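As an illustration of the Parallel policy described above, here is a hedged sketch of the relevant fields. It assumes a StatefulSet whose three pods are independent queue managers, each with its own claim; the replica count is an example value.

```yaml
# Fragment of a StatefulSet spec for independent queue managers (assumed layout).
spec:
  replicas: 3                       # example value
  podManagementPolicy: Parallel     # create and terminate pods without waiting on ordinal order
```

podManagementPolicy affects scaling operations only; rolling updates still follow the updateStrategy. The field cannot be changed after the StatefulSet is created.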

Rolling Updates and Partitions

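RollingUpdate replaces pods one at a time, from the highest ordinal down. Set partition for a canary: only pods with an ordinal greater than or equal to the partition value update, so partition 1 leaves mq-qm1-0 on the old image while you test mq-qm1-1, if your architecture has multiple QMs. For a single QM, test upgrades in a lower environment first. StatefulSets have no progressDeadlineSeconds (that field belongs to Deployments); the rollout simply waits for the new pod to become Ready, and if log replay outlasts the liveness or startup probe thresholds, the kubelet restarts the container and the update stalls. Lengthen the probe thresholds or use the OnDelete strategy during maintenance windows. OnDelete requires manual pod deletion to pick up the new spec, which is slower but controlled.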

Update strategies
Strategy      | Behavior                          | MQ use case
RollingUpdate | Automatic pod recreation          | Patch image with tested probes
OnDelete      | Manual pod delete triggers update | Maintenance window control
Partition     | Canary subset of ordinals         | Multi-QM StatefulSet tests
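The strategies in this section map onto the updateStrategy block of the StatefulSet spec. The two fragments below are a sketch under the assumption of a two-pod, multi-QM StatefulSet named mq-qm1; the partition value is an example.

```yaml
# Canary: with partition 1, only pods with ordinal >= 1 move to the new template.
# mq-qm1-0 stays on the old revision until partition is lowered to 0.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 1
---
# Maintenance-window alternative: pods pick up the new template only
# after an operator deletes them by hand.
spec:
  updateStrategy:
    type: OnDelete
```

Partition is a setting of the RollingUpdate strategy, not a third strategy type, so it always appears under rollingUpdate.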

Scheduling: Affinity and Topology

podAntiAffinity spreads MQ pods across nodes so one hypervisor failure does not take all queue managers in a namespace. topologySpreadConstraints balance zones in cloud regions. Taints and tolerations dedicate node pools to MQ for noisy-neighbor isolation. CPU manager static policy is advanced tuning for low latency. Document which node lost pod-0 during drain—PVC must reattach on the new node; multi-attach errors mean old node has not released volume yet.
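The scheduling controls named above could be combined in the pod template roughly as follows. This is a sketch, not a recommended production policy: the taint key and value (dedicated=mq) are hypothetical, and the anti-affinity rule only has an effect when more than one pod carries the app: mq-qm1 label.

```yaml
# Fragment of the StatefulSet pod template.
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never place two matching pods on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: mq-qm1
              topologyKey: kubernetes.io/hostname
      topologySpreadConstraints:
        # Soft rule: keep matching pods balanced across zones
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: mq-qm1
      tolerations:
        # Allows scheduling onto a node pool tainted for MQ
        - key: dedicated      # hypothetical taint key
          operator: Equal
          value: mq           # hypothetical taint value
          effect: NoSchedule
```

A toleration only permits placement on tainted nodes; pair it with a nodeSelector or node affinity if the pods must land on that pool.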

Explainer: Numbered Lockers

A StatefulSet is a school hallway with numbered lockers. Student mq-qm1-0 always uses locker zero and the same notebook inside. Even if the student moves classrooms (node), locker zero moves with them (PVC). Deployment-style random lockers would swap notebooks between students.

StatefulSet Versus MQ Operator

Hand-written StatefulSets teach fundamentals. The MQ Operator creates and owns StatefulSets or similar constructs from QueueManager custom resources, adding certificate management, upgrade hooks, and status conditions. Greenfield OpenShift deployments often install the Operator once; hand-written YAML suits learning clusters. Migrating from a StatefulSet to the Operator may require a data migration runbook: do not assume you can create an Operator-managed QueueManager on top of an orphaned PVC without following IBM's migration guide.

Native HA and Multiple Pods

IBM MQ Native HA uses multiple pods cooperating with replication, not three independent StatefulSet replicas mounting one PVC. Follow IBM architecture diagrams for active and replica roles, storage per pod, and failover automation. Traditional multi-instance QMs on VMs differ; do not conflate multi-instance with Kubernetes replica count.
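With the MQ Operator, Native HA is requested declaratively on the QueueManager resource rather than by raising a replica count. The sketch below is written from memory of IBM's published examples and should be checked against the CRD version installed in your cluster. The resource name and license ID are placeholders, and the version string is simply borrowed from the image tag used earlier in this tutorial.

```yaml
apiVersion: mq.ibm.com/v1beta1
kind: QueueManager
metadata:
  name: qm1-nativeha                 # hypothetical resource name
spec:
  license:
    accept: true                     # set only after reviewing the license terms
    license: REPLACE-WITH-LICENSE-ID # placeholder; take the ID from IBM's licensing reference
    use: NonProduction
  queueManager:
    name: QM1
    availability:
      type: NativeHA                 # the Operator creates the cooperating pods and their storage
  version: 9.4.0.0-r1                # assumed to match an Operator-supported version
```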

Operations Runbook Snippets

  1. Planned upgrade: snapshot PVC, update image, watch pod-0 logs until MQ ready, run smoke put/get.
  2. Node drain: cordon the node, then delete or evict pod-0 gracefully; confirm the volume has detached before any force delete.
  3. Stuck Terminating: check finalizers, volume attachment on cloud console.
  4. Scale replicas up: only when adding new independent QMs with new ordinals and claims.

Troubleshooting

  • Pod Pending — PVC binding or volume zone mismatch with node.
  • Wrong QM after restart — empty new claim created; check volumeClaimTemplates name.
  • DNS fails — headless Service selector does not match pod labels.
  • Split brain — never run two pods on one RWO PVC; investigate stuck old pod.

Explain Like I'm Five: StatefulSets

StatefulSets give each MQ box a permanent name tag and its own drawer. The building always knows which box is box zero, and box zero always keeps the same drawer even when the box is moved.

Practice Exercises

Exercise 1

Deploy StatefulSet replicas: 1; delete pod-0; verify same QMNAME and message count via DNS name.

Exercise 2

Describe difference between ClusterIP Service and headless Service for your MQ pod.

Exercise 3

Simulate rolling update with OnDelete strategy; document manual steps.

Frequently Asked Questions

Test Your Knowledge

1. StatefulSet pod names include:

  • Stable ordinal suffix
  • Random UUID only
  • Date stamp
  • JCL job name

2. volumeClaimTemplates create:

  • PVC per pod
  • One shared RAM disk
  • No storage
  • Ingress only

3. Headless Service provides:

  • Per-pod DNS
  • Free TLS cert
  • Automatic HA
  • COBOL compile

4. Three replicas one QM PVC:

  • Unsupported/wrong
  • Best practice
  • Doubles safety
  • IBM required
Read time: 20 min
Author: MainframeMaster
Verified: IBM MQ 9.4 Kubernetes documentation