What is Prometheus in IBM MQ monitoring?

Prometheus is an open-source monitoring system and time-series database that scrapes HTTP endpoints on a schedule. IBM MQ metrics exporters expose queue depth, channel status, and other counters as Prometheus time series for alerting and dashboards.

How does MQ expose metrics to Prometheus?

Typically through the IBM MQ metrics exporter (or platform operator) that connects to queue managers via PCF or published statistics and serves /metrics for Prometheus to scrape.

Is Prometheus a replacement for MQ Console?

No. Prometheus stores numeric time series for trends and alerts. Console and Explorer are for interactive administration and configuration.

What MQ metrics should I alert on first?

High queue depth, channel not RUNNING, queue manager status, disk-related errors, and sustained put-get imbalance. Start small and add cardinality carefully.

Does Prometheus work on z/OS MQ?

Exporters usually run on distributed agents or gateways that connect to z/OS queue managers remotely. Architecture varies by site, so confirm the supported collector for your platform.

MainframeMaster

Prometheus

Prometheus monitoring for IBM MQ brings queue depth, channel health, and queue manager status into the same observability stack many cloud-native teams already use for Kubernetes and microservices. Instead of operators manually running DISPLAY QLOCAL during every incident, Prometheus scrapes metrics every fifteen or thirty seconds into a time-series database, where Grafana charts the trend lines and Alertmanager pages on-call engineers. A common beginner mistake is to install an exporter without a labeling strategy; six months later the Prometheus disk is full because every metric carries a high-cardinality queue-name label. This tutorial explains the scrape model, the role of the IBM MQ metrics exporter, useful metric families, PromQL alert examples at a conceptual level, cardinality discipline, pairing with Grafana, and how Prometheus complements, rather than replaces, MQ Console, Explorer, and runmqsc.

Pull Model Architecture

MQ queue manager runs messaging workload.
Metrics exporter connects to QM via PCF or supported API, caches counters.
Exporter serves HTTP /metrics in Prometheus text format.
Prometheus server scrapes exporter on interval (scrape_interval).
Grafana queries Prometheus; Alertmanager fires rules.

Pull means Prometheus initiates the connection to each exporter. In a basic setup, firewalls must allow traffic from the Prometheus hosts to the exporter ports, not the reverse.

Metric families (typical exporter)
Metric concept	Operational use	Caution
Queue depth	Backlog alerts	Per-queue labels multiply cardinality
Channel status	Page when not RUNNING	Map numeric status codes to names clearly
QM status	Instance down detection	One target per QM
Put/get counts	Rate derivatives in PromQL	Counter resets on restart
Listener up	Client connectivity risk	False positives during maintenance

Installing and Scrape Config

Deploy the IBM-supported MQ metrics exporter per your platform documentation—container sidecar, standalone VM, or operator-managed pod. Configure queue manager connection: hostname, port, channel, credentials. In prometheus.yml add a job scraping the exporter port. Use TLS on scrape targets in production. Store secrets in vaults, not plain text in Git.

yaml

# prometheus.yml fragment (illustrative)
scrape_configs:
  - job_name: 'ibmmq'
    scrape_interval: 30s
    static_configs:
      - targets: ['mq-exporter.ops.svc:9157']
    # tls_config: ...  # production TLS
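The commented tls_config line above could be expanded roughly as follows. This is a sketch that assumes the exporter endpoint serves HTTPS and requires basic authentication; the file paths and the user name are hypothetical, and your exporter may support a different authentication scheme.

```yaml
# prometheus.yml fragment (sketch; paths and user name are hypothetical)
scrape_configs:
  - job_name: 'ibmmq'
    scrape_interval: 30s
    scheme: https                 # scrape over TLS instead of plain HTTP
    tls_config:
      # CA that signed the exporter certificate
      ca_file: /etc/prometheus/certs/mq-ca.pem
      # Name expected in the exporter certificate
      server_name: mq-exporter.ops.svc
    basic_auth:
      username: prom-scraper
      # Read the password from a file so the secret stays out of Git
      password_file: /etc/prometheus/secrets/mq-exporter-password
    static_configs:
      - targets: ['mq-exporter.ops.svc:9157']
```

Reading the password from a file lets a vault agent or secret mount supply it at runtime, which keeps the scrape configuration itself safe to commit.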

PromQL and Rates

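Counters such as messages put only increase until they reset on restart. Use rate() over a five-minute window to get the average puts per second: rate(mq_queue_puts_total[5m]). rate() compensates for counter resets, so a restart does not appear as a negative spike. Metric names vary by exporter and version, so treat mq_queue_puts_total as illustrative. Compare the put rate with the get rate to detect imbalance. Latency histograms require exporter support; verify that your version exposes them.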
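The put and get comparison can be precomputed with recording rules so that dashboards and alerts query a ready-made series. The sketch below assumes a get counter named mq_queue_gets_total alongside the put counter, and that both carry identical labels; check the names your exporter actually exposes.

```yaml
# rules/mq-rates.yml (sketch; metric names are assumptions)
groups:
  - name: mq_rates
    interval: 30s
    rules:
      # Per-second put rate for each queue, averaged over five minutes.
      - record: queue:mq_queue_puts:rate5m
        expr: rate(mq_queue_puts_total[5m])
      # Per-second get rate for each queue, averaged over five minutes.
      - record: queue:mq_queue_gets:rate5m
        expr: rate(mq_queue_gets_total[5m])
      # Positive values mean messages arrive faster than they are consumed.
      - record: queue:mq_queue_put_get_imbalance:rate5m
        expr: queue:mq_queue_puts:rate5m - queue:mq_queue_gets:rate5m
```

The file takes effect only when it is listed under rule_files in prometheus.yml.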

Alerting Examples (Conceptual)

Queue depth greater than eighty percent of MAXDEPTH for ten minutes.
Channel status not equal to RUNNING for five minutes on critical SDR names.
Queue manager up metric zero for two minutes.
Exporter scrape failure: monitor the monitoring itself.
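The four conditions above might translate into Prometheus alerting rules along these lines. This is a sketch only: every mq_* metric name, the queue, qmgr, and channel labels, and the PAY. channel prefix are assumptions, and the channel rule assumes the exporter reports the MQCHS status code, in which 3 means RUNNING.

```yaml
# rules/mq-alerts.yml (sketch; mq_* names, labels, and thresholds are assumptions)
groups:
  - name: mq_alerts
    rules:
      - alert: MQQueueDepthHigh
        expr: mq_queue_depth / mq_queue_max_depth > 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'Queue {{ $labels.queue }} on {{ $labels.qmgr }} is above 80% of MAXDEPTH'
      - alert: MQChannelNotRunning
        # Assumes the metric value is the MQCHS_* code, where 3 means RUNNING.
        expr: mq_channel_status{channel=~"PAY\\..*"} != 3
        for: 5m
        labels:
          severity: critical
      - alert: MQQueueManagerDown
        expr: mq_qmgr_up == 0
        for: 2m
        labels:
          severity: critical
      - alert: MQExporterScrapeFailing
        # "up" is generated by Prometheus itself for every scrape target.
        expr: up{job="ibmmq"} == 0
        for: 5m
        labels:
          severity: warning
```

The last rule relies only on the built-in up series, so it keeps working even when the exporter returns no MQ metrics at all.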

Cardinality Discipline

Avoid attaching every queue name as a label to cluster-wide aggregates. Each label value multiplies every per-queue metric, so thousands of queues across several queue managers can produce hundreds of thousands or even millions of series. Prefer to aggregate metrics per queue manager, label only tier-1 queue names, use recording rules to pre-aggregate, and drop high-cardinality labels in relabel configs where safe. Review the series count after onboarding new applications.
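One way to apply the relabel advice is to drop unwanted per-queue series at scrape time, before they reach storage. The sketch below assumes the exporter attaches a label named queue and that system and dynamic queues start with SYSTEM. or AMQ.; adjust the pattern to your naming standard.

```yaml
# prometheus.yml fragment (sketch; the "queue" label name is an assumption)
scrape_configs:
  - job_name: 'ibmmq'
    scrape_interval: 30s
    static_configs:
      - targets: ['mq-exporter.ops.svc:9157']
    metric_relabel_configs:
      # Drop series for system and dynamic queues before they are stored.
      # Series without a queue label do not match and are kept.
      - source_labels: [queue]
        regex: '(SYSTEM|AMQ)\..*'
        action: drop
```

Prometheus anchors the regex at both ends, so only queue names that begin with one of the two prefixes are dropped. Tighten the pattern if you still want a particular system queue, such as a dead-letter queue, to be monitored.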

z/OS and Hybrid

Run the exporter on a Linux gateway with a client connection to the z/OS queue manager. The network path and the SVRCONN channel must be reliable. Latency between the exporter and the queue manager adds scrape skew, which is acceptable for operational metrics but not for microsecond-level latency measurement.
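On the Prometheus side, a gateway-hosted exporter is just another scrape target. The sketch below gives it its own job with a longer interval and timeout to tolerate the remote hop; the host name, job name, and platform label are hypothetical.

```yaml
# prometheus.yml fragment (sketch; host name and label values are hypothetical)
scrape_configs:
  - job_name: 'ibmmq-zos'
    scrape_interval: 60s     # longer interval tolerates a slower remote path
    scrape_timeout: 20s      # must not exceed scrape_interval
    static_configs:
      - targets: ['mq-gateway-1.ops.example:9157']
        labels:
          platform: 'zos'    # lets dashboards separate z/OS from distributed QMs
```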

Maintenance Windows

Silence alerts in Alertmanager during planned maintenance, such as queue manager restarts with endmqm and strmqm. Document which metrics flap when channels restart, and avoid alert fatigue with sensible for: durations and severity levels.
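For a recurring maintenance window, Alertmanager can mute notifications through a named time interval; one-off windows are usually handled with an ad hoc silence created in the Alertmanager UI or with amtool. The sketch below assumes a recent Alertmanager version that supports top-level time_intervals, a weekly Sunday window, and a hypothetical receiver name.

```yaml
# alertmanager.yml fragment (sketch; receiver name and window are hypothetical)
route:
  receiver: mq-oncall
  routes:
    - matchers:
        - 'job="ibmmq"'
      receiver: mq-oncall
      # Notifications for MQ alerts are muted while this interval is active.
      mute_time_intervals:
        - mq-maintenance

time_intervals:
  - name: mq-maintenance
    time_intervals:
      - weekdays: ['sunday']
        times:
          - start_time: '02:00'   # UTC unless a location is configured
            end_time: '04:00'

receivers:
  - name: mq-oncall
```

Muting suppresses notifications only; the alerts still fire and stay visible in Prometheus, which helps when reviewing which metrics flapped during the window.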

Explainer: Thermometer on the Wall

Prometheus is a thermometer that checks itself every thirty seconds and writes the temperature in a notebook—Grafana draws the graph; Alertmanager calls you when it gets too hot.

Explain Like I'm Five

Prometheus is a robot that keeps asking the marble jar how full it is and writes the answer in a book so grown-ups can see a picture of fullness over time.

Practice Exercises

Exercise 1

Write three alert rules that apply only to a tier-1 list of five payment queue names.

Exercise 2

Explain why labeling all 10,000 queue names hurts Prometheus.

Exercise 3

Draw the scrape path from the queue manager to Grafana in five boxes.

Frequently Asked Questions

Test Your Knowledge

1. Prometheus collects metrics by:

Scraping HTTP endpoints
Only reading JCL
Browsing DLQ manually
3270 screens

2. MQ metrics in Prometheus are used for:

Trends and alerting
Replacing all MQSC
Deleting queues
Issuing SYNCPOINT

3. High cardinality labels can:

Overwhelm Prometheus storage
Speed all queries
Remove TLS
Increase MAXDEPTH

4. Exporter connects to MQ via:

PCF or statistics APIs
Only FTP
CICS LINK
JES spool
