Prometheus

Prometheus monitoring for IBM MQ brings queue depth, channel health, and queue manager status into the same observability stack many cloud-native teams already use for Kubernetes and microservices. Instead of operators manually running DISPLAY QLOCAL during every incident, Prometheus scrapes metrics every fifteen or thirty seconds into a time-series database, where Grafana charts the trends and Alertmanager pages on-call engineers. A common beginner mistake is to install an exporter without a labeling strategy; six months later the Prometheus disk is full because every metric carries a high-cardinality queue-name label. This tutorial explains the scrape model, the role of the IBM MQ metrics exporter, useful metric families, PromQL alert examples at a conceptual level, cardinality discipline, pairing with Grafana, and how Prometheus complements, rather than replaces, MQ Console, Explorer, and runmqsc.

Pull Model Architecture

  1. MQ queue manager runs messaging workload.
  2. Metrics exporter connects to QM via PCF or supported API, caches counters.
  3. Exporter serves HTTP /metrics in Prometheus text format.
  4. Prometheus server scrapes exporter on interval (scrape_interval).
  5. Grafana queries Prometheus; Alertmanager fires rules.

Pull means Prometheus reaches out to exporters, so firewalls must allow connections from Prometheus hosts to exporter ports. Basic setups need no path in the reverse direction.

Metric families (typical exporter)
| Metric concept | Operational use | Caution |
|---|---|---|
| Queue depth | Backlog alerts | Per-queue labels multiply cardinality |
| Channel status | Not RUNNING pages | Map numeric codes clearly |
| QM status | Instance down detection | One target per QM |
| Put/get counts | Rate derivatives in PromQL | Counter resets on restart |
| Listener up | Client connectivity risk | False positives during maintenance |

Installing and Scrape Config

Deploy the IBM-supported MQ metrics exporter per your platform documentation—container sidecar, standalone VM, or operator-managed pod. Configure queue manager connection: hostname, port, channel, credentials. In prometheus.yml add a job scraping the exporter port. Use TLS on scrape targets in production. Store secrets in vaults, not plain text in Git.

```yaml
# prometheus.yml fragment (illustrative)
scrape_configs:
  - job_name: 'ibmmq'
    scrape_interval: 30s
    static_configs:
      - targets: ['mq-exporter.ops.svc:9157']
    # tls_config: ...   # production TLS
```

PromQL and Rates

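Counters such as messages put only ever increase, until a restart resets them to zero. Use rate() over a five-minute window to get puts per second: rate(mq_queue_puts_total[5m]). The rate() function detects counter resets and adjusts for them, so a restart does not produce a negative rate. Compare put rate to get rate to spot imbalance. Histograms for latency require exporter support, so verify that your version exposes them.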
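As a hedged illustration of the rate calculations described above, the following rules file fragment precomputes put and get rates and their difference. Only mq_queue_puts_total comes from the text. The metric mq_queue_gets_total and the recorded series names are assumptions for this sketch, so substitute the names your exporter version actually exposes.

```yaml
# mq-rates.rules.yml (illustrative; metric names are assumptions)
groups:
  - name: ibmmq-rates
    interval: 30s
    rules:
      # Per-second put rate, averaged over the last five minutes
      - record: queue:mq_queue_puts:rate5m
        expr: rate(mq_queue_puts_total[5m])
      # Per-second get rate (mq_queue_gets_total is a hypothetical name)
      - record: queue:mq_queue_gets:rate5m
        expr: rate(mq_queue_gets_total[5m])
      # Positive values mean producers are outpacing consumers
      - record: queue:mq_queue_put_get_imbalance:rate5m
        expr: rate(mq_queue_puts_total[5m]) - rate(mq_queue_gets_total[5m])
```

The subtraction only matches series whose label sets are identical on both sides, which holds if the exporter attaches the same labels to put and get counters.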

Alerting Examples (Conceptual)

  • Queue depth greater than eighty percent of MAXDEPTH for ten minutes.
  • Channel status not equal to RUNNING for five minutes on critical SDR names.
  • Queue manager up metric zero for two minutes.
  • Exporter scrape failure: monitor the monitoring itself.
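The four conditions above could be sketched as a Prometheus rules file along these lines. The metric names (mq_queue_depth, mq_queue_max_depth, mq_channel_status, mq_qmgr_up), the queue, qmgr, and channel labels, the PAYMENTS_ channel prefix, and the severity values are all hypothetical. The channel rule also assumes the exporter reports the raw MQ status code, where 3 means RUNNING. Only the built-in up metric and the ibmmq job name are taken as given.

```yaml
# mq-alerts.rules.yml (illustrative; metric and label names are assumptions)
groups:
  - name: ibmmq-alerts
    rules:
      - alert: MQQueueDepthHigh
        # Depth above 80% of MAXDEPTH, sustained for ten minutes
        expr: mq_queue_depth / mq_queue_max_depth > 0.80
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "Queue {{ $labels.queue }} on {{ $labels.qmgr }} is above 80% of MAXDEPTH"
      - alert: MQChannelNotRunning
        # Assumes status value 3 means RUNNING; check your exporter's mapping
        expr: 'mq_channel_status{channel=~"PAYMENTS_.*"} != 3'
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Channel {{ $labels.channel }} is not RUNNING"
      - alert: MQQueueManagerDown
        expr: mq_qmgr_up == 0
        for: 2m
        labels:
          severity: page
        annotations:
          summary: "Queue manager {{ $labels.qmgr }} reports down"
      - alert: MQExporterScrapeFailing
        # 'up' is set by Prometheus itself: 0 when a scrape fails
        expr: 'up{job="ibmmq"} == 0'
        for: 5m
        labels:
          severity: ticket
        annotations:
          summary: "Prometheus cannot scrape {{ $labels.instance }}"
```

The for: clause keeps an alert pending until the condition has held for the whole duration, which filters out brief flaps such as a channel restarting.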

Cardinality Discipline

Avoid labeling every queue name on cluster-wide aggregates—thousands of queues create millions of series. Prefer: aggregate metrics per queue manager; label only tier-1 queue names; use recording rules to pre-aggregate; drop high-cardinality labels in relabel configs where safe. Review series count after onboarding new applications.
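One way to keep only tier-1 queue series, sketched under assumptions, is to drop the rest at scrape time with metric_relabel_configs. The queue label, the mq_queue_ metric prefix, and the two tier-1 name patterns are hypothetical. The temporary label works around the lack of negative matching in Prometheus regular expressions.

```yaml
# prometheus.yml fragment (illustrative; label and queue names are assumptions)
scrape_configs:
  - job_name: 'ibmmq'
    scrape_interval: 30s
    static_configs:
      - targets: ['mq-exporter.ops.svc:9157']
    metric_relabel_configs:
      # Step 1: tag series whose queue label is on the tier-1 list
      - source_labels: [queue]
        regex: 'PAYMENTS\..+|SETTLEMENT\.IN'
        target_label: __tmp_tier1
        replacement: 'yes'
      # Step 2: drop per-queue series that did not receive the tag;
      # queue manager and channel metrics do not match and are kept
      - source_labels: [__name__, __tmp_tier1]
        regex: 'mq_queue_.+;'
        action: drop
      # Step 3: remove the temporary label from the surviving series
      - regex: '__tmp_tier1'
        action: labeldrop
```

Dropping whole series is usually safer than removing the queue label with labeldrop, because removing the label would leave many series with identical label sets that collide on ingestion. The trade-off is that non-tier-1 queues become invisible in Prometheus and remain observable only through runmqsc or MQ Console.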

z/OS and Hybrid

Run the exporter on a Linux gateway with a client connection to the z/OS queue manager. The network path and the SVRCONN channel must be reliable. Latency between the exporter and the queue manager adds scrape skew, which is acceptable for operational metrics but not for microsecond-level latency analysis.

Maintenance Windows

Silence alerts in Alertmanager during planned maintenance that stops and restarts the queue manager (endmqm, then strmqm). Document which metrics flap when channels restart, and avoid alert fatigue with sensible for: durations and severity levels.
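For maintenance windows that recur on a fixed schedule, one alternative to creating a silence by hand each time is a mute time interval in the Alertmanager configuration. The sketch below assumes an Alertmanager version that supports the top-level time_intervals key. The receiver name, the interval name, and the Sunday 02:00 to 04:00 window are hypothetical.

```yaml
# alertmanager.yml fragment (illustrative; names and times are assumptions)
route:
  receiver: mq-oncall
  routes:
    # Alerts from the ibmmq scrape job are muted during the window below
    - matchers:
        - 'job = "ibmmq"'
      receiver: mq-oncall
      mute_time_intervals:
        - mq-maintenance

receivers:
  - name: mq-oncall   # notification integrations omitted

time_intervals:
  - name: mq-maintenance
    time_intervals:
      - weekdays: ['sunday']
        times:
          - start_time: '02:00'   # UTC unless a location is configured
            end_time: '04:00'
```

Muted alerts still evaluate and appear in Prometheus; only their notifications are suppressed. One-off maintenance outside the schedule still needs an explicit silence.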

Explainer: Thermometer on the Wall

Prometheus is a thermometer that checks itself every thirty seconds and writes the temperature in a notebook—Grafana draws the graph; Alertmanager calls you when it gets too hot.

Explain Like I'm Five

Prometheus is a robot that keeps asking the marble jar how full it is and writes the answer in a book so grown-ups can see a picture of fullness over time.

Practice Exercises

Exercise 1

Write three alert rules for a payment queue tier-1 list of five names only.

Exercise 2

Explain why labeling all 10,000 queue names hurts Prometheus.

Exercise 3

Draw scrape path from QM to Grafana in five boxes.

Frequently Asked Questions

Test Your Knowledge

1. Prometheus collects metrics by:

  • Scraping HTTP endpoints
  • Only reading JCL
  • Browsing DLQ manually
  • 3270 screens

2. MQ metrics in Prometheus are used for:

  • Trends and alerting
  • Replacing all MQSC
  • Deleting queues
  • Issuing SYNCPOINT

3. High cardinality labels can:

  • Overwhelm Prometheus storage
  • Speed all queries
  • Remove TLS
  • Increase MAXDEPTH

4. Exporter connects to MQ via:

  • PCF or statistics APIs
  • Only FTP
  • CICS LINK
  • JES spool
Published
Read time: 20 min
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation