Production standards for IBM MQ are how you keep Friday night deploys from becoming Saturday morning outages. The queue manager technology is stable; most production pain comes from untracked MQSC changes, missing dead-letter queues (DLQs), alerts nobody owns, and developers pointing a UAT CCDT at production by typo. Production standards tie together naming, security hardening, performance thresholds, and HA runbooks into daily operations: who may change what, how changes flow through pipelines, what must be monitored, and how incidents escalate. Beginners who have only run DEFINE QLOCAL in a lab need this governance layer before touching a paying customer's queue manager.
Use separate queue managers (or, at minimum, separate clusters and strict naming prefixes) for DEV, TST, UAT, and PRD. Never share SVRCONN passwords across environments. CCDT files are labeled and distributed per environment; CI tests fail if production hostnames appear in dev branches. Data in production queues is real, and copying production messages to dev may violate privacy law; use masked synthetic data instead.
| Rule | Why | How to verify |
|---|---|---|
| Unique QM per environment | Blast radius containment | dspmq list matches CMDB |
| No prod credentials in dev | Prevent accidental prod access | Secret scan in repos |
| Distinct channel names or prefixes | Avoid cross-connect | DISPLAY CHANNEL audit |
| Separate TLS certificates | Trust boundary | CERTLABL per environment |
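The CI rule that production hostnames must never appear in dev branches could be sketched as a small shell guard. This is a minimal sketch under stated assumptions: the function name `check_no_prod_hosts` and the hostname pattern `mqprd[0-9]*\.example\.com` are hypothetical, so substitute your own naming convention.

```bash
# Hypothetical CI guard: fail when a production MQ hostname shows up in a
# non-production branch. PROD_HOST_PATTERN is an assumed naming convention.
PROD_HOST_PATTERN="${PROD_HOST_PATTERN:-mqprd[0-9]*\.example\.com}"

# Print every match as file:line:text under the given directory.
# Returns 0 when the tree is clean, 1 when a production hostname is found.
check_no_prod_hosts() {
  local dir="$1"
  if grep -rnE -- "$PROD_HOST_PATTERN" "$dir"; then
    echo "ERROR: production hostname found under $dir" >&2
    return 1
  fi
  return 0
}
```

A pipeline step might call `check_no_prod_hosts . || exit 1` on every non-production branch. The check works on text files such as JSON CCDTs and MQSC scripts; for a binary CCDT, grep reports only that the file matches, without showing the line.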
All production object changes flow from a ticket to an MQSC file in Git to an automated runmqsc run with captured output. Interactive changes are emergency-only and require a retrospective ticket. Pipelines grep the captured output for AMQ8xxxE error messages and fail the job if any are found. Peer review checks naming standard compliance, MAXDEPTH, DEFPSIST, and authority grants. An ALTER on a shared channel requires partner notification when CONNAME or TLS settings change.
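```bash
# Pre-change snapshot (operations standard)
dmpmqcfg -m QM1 -a > /backup/QM1-$(date +%Y%m%d-%H%M).mqsc
# Apply the reviewed change and capture all output
runmqsc QM1 < change-12345.mqsc > change-12345.out 2>&1
# Fail if any AMQ8xxxE error message appears (pattern quoted and anchored to the message ID)
grep -E 'AMQ8[0-9]{3}E' change-12345.out && exit 1 || exit 0
```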
Every production queue in the standards catalog has an owner team, depth alert thresholds, and an escalation path. Channel RETRY alerts route to middleware operations, not only to application L1 support. AMQERR rate spikes feed SIEM rules. Dashboards show XMITQ depth alongside application queues. On-call runbooks link to the every-mq-channel-error and every-mq-amq-message pages for triage vocabulary.
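A depth alert threshold can be illustrated with a small filter over runmqsc output. This is a rough sketch, not a monitoring product: the function name `depth_alerts` and the default of 80 percent are assumptions, and the parser assumes the usual `QUEUE(...)`, `CURDEPTH(...)`, `MAXDEPTH(...)` layout of `DISPLAY QLOCAL` output.

```bash
# Read "DISPLAY QLOCAL(*) CURDEPTH MAXDEPTH" output on stdin and print
# "<queue> <curdepth>/<maxdepth>" for each queue at or above the given
# percentage of MAXDEPTH. The 80 percent default is an illustrative assumption.
depth_alerts() {
  local pct="${1:-80}"
  awk -v pct="$pct" '
    # Return the value inside NAME(...) on the current line, or "" if absent.
    function attr(name,    s, i, rest, j) {
      s = name "("
      i = index($0, s)
      if (i == 0) return ""
      rest = substr($0, i + length(s))
      j = index(rest, ")")
      return substr(rest, 1, j - 1)
    }
    {
      v = attr("QUEUE");    if (v != "") { q = v; cur = ""; max = "" }
      v = attr("CURDEPTH"); if (v != "") cur = v
      v = attr("MAXDEPTH"); if (v != "") max = v
      # Evaluate once all three attributes of a queue have been seen.
      if (q != "" && cur != "" && max != "") {
        if (max + 0 > 0 && cur * 100 >= pct * max)
          printf "%s %d/%d\n", q, cur, max
        q = ""
      }
    }
  '
}
```

It might be used as `echo "DISPLAY QLOCAL(*) CURDEPTH MAXDEPTH" | runmqsc QM1 | depth_alerts 80`, with any non-empty output routed to the owning team named in the catalog.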
Business queues define BOQNAME or use the queue manager's DLQ where appropriate. Operators review DLQ depth daily in high-volume systems. Messages on the DLQ require a ticket before they are replayed or discarded, with business approval. Applications set a reasonable backout threshold (BOTHRESH); infinite retry loops violate production standards.
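The backout settings above could be generated as reviewable MQSC rather than typed by hand. The sketch below is illustrative only: the helper name `backout_mqsc`, the `.BACKOUT` suffix, and the default threshold of 3 are assumptions, not values from these standards. Note that BOTHRESH and BOQNAME are advisory attributes; the application or JMS client is what actually moves a poison message to the backout queue.

```bash
# Emit MQSC that pairs an existing business queue with a backout queue.
# The ".BACKOUT" suffix and the default threshold of 3 are illustrative
# assumptions; adjust them to your own naming standard.
backout_mqsc() {
  local queue="$1"
  local threshold="${2:-3}"
  cat <<EOF
DEFINE QLOCAL('${queue}.BACKOUT') DEFPSIST(YES)
ALTER QLOCAL('${queue}') BOTHRESH(${threshold}) BOQNAME('${queue}.BACKOUT')
EOF
}
```

For example, `backout_mqsc APP.ORDERS 5 > change-12345.mqsc` would produce a file that can go through the same ticket, Git, and peer review path as any other change.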
Backup frequency matches the RPO of the HA tier. Restore drills prove that a dmpmqcfg dump plus a data restore recreates objects. Document how channel sequence numbers are reset after a restore. On z/OS, include a CF structure backup strategy. Cloud deployments document persistent volume snapshots and operator behavior.
Incidents log the queue manager name, object names, MQRC, AMQ message IDs, CHSTATUS, and timestamps. First actions follow the layer model (dspmq, listener, channel, TLS, OAM) rather than a random RESET. Severity maps to payment impact and depth growth rate. The post-incident review updates the standards when gaps are found.
Lab flying is solo in a field. Production is controlled airspace: filed flight plans, tower clearance, radar, and rules when weather fails. Standards are the rulebook controllers and pilots share.
The real toy store has rules: only grown-ups with a name tag change prices, the alarm rings if too many toys pile up, and there is a plan if the lights go out.
Complete go-live checklist for a new queue pair in lab.
Audit one production QM for interactive changes outside pipeline in logs.
Write incident template fields for MQ tickets.
1. Production MQSC should be:
2. Before change, capture config with:
3. DLQ in production is:
4. DEV and PRD queue managers should: