Log utilization tells operators how close an IBM MQ queue manager is to a logging crisis. Unlike queue depth, which measures application backlog, log utilization measures pressure on the circular recovery log and the archive infrastructure that keeps that log reusable. When utilization stays high, persistent puts slow down or stop, channels may stall on committed messages, and disaster recovery windows shrink because archives are not moving to safe storage fast enough. Beginners often watch queue depth dashboards while the real fire is a log filesystem that is ninety-nine percent full during a Friday night batch. This tutorial explains what log utilization means on distributed and z/OS MQ, which attributes and commands expose it, and how to build metrics and alerts in Prometheus and Grafana. It also covers the relationship between primary log wrap, archive lag, and media recovery, tuning responses that do not sacrifice DR policy, and troubleshooting playbooks for utilization spikes that have no obvious traffic increase behind them.
The primary log is a circular set of files (LOGFIL files under LOGPATH on distributed platforms). As transactions commit, MQ writes log records. When a log file fills, it may become eligible for archiving; the archiver copies it to the archive location and marks the primary space reusable. Log utilization in the operational sense has two layers: how much of the active circular log is consumed before wrap or archive free-up, and how much free space remains on the filesystem holding archives. Either layer can hurt you, and they fail differently: a full circular log with a slow archiver is not the same problem as plenty of circular room with an archive disk so full that archiving cannot complete.
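The two layers described above can be kept apart in tooling as well as in conversation. The following is a minimal sketch, not an IBM MQ API: the `LogSnapshot` type, the `classify` function, and the threshold defaults (80 percent used, 15 percent free) are all hypothetical and should be replaced with values that fit your site.

```python
from dataclasses import dataclass


@dataclass
class LogSnapshot:
    """One observation of both utilization layers (hypothetical type)."""
    primary_used_pct: float      # share of the active log in use, 0-100
    archive_fs_free_pct: float   # free space on the archive filesystem, 0-100


def classify(snapshot, primary_high=80.0, archive_low=15.0):
    """Name which layer is under pressure; thresholds are illustrative only."""
    primary_hot = snapshot.primary_used_pct >= primary_high
    archive_tight = snapshot.archive_fs_free_pct <= archive_low
    if primary_hot and archive_tight:
        return "both"
    if primary_hot:
        return "active_log"   # log is filling faster than space is freed
    if archive_tight:
        return "archive"      # archive filesystem is close to full
    return "ok"


print(classify(LogSnapshot(primary_used_pct=91.0, archive_fs_free_pct=40.0)))
```

Returning a distinct label for each layer lets an alert route to the right team: storage for the archive filesystem, MQ administrators for the active log.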
| Signal | Typical source | Risk if ignored |
|---|---|---|
| Primary log percent used | DISPLAY QMSTATUS, exporter metric | Wrap stall; put inhibition |
| Archive filesystem free % | OS disk monitor on LOGARCH path | Archive failure; QM stop |
| Archive lag time | Oldest unarchived log vs now | Extended recovery window; full log |
| Log write latency | Host iowait, storage metrics | Slow puts; apparent high CPU |
| Persistent put rate | Statistics, queue metrics | Predictable utilization spikes |
    DISPLAY QMSTATUS ALL
    * Review log-related fields for your MQ version, for example:
    * LOG path, log extent in use, archive path status

    DISPLAY QMGR
    * Confirm LOGPATH, LOGARCHMETH, LOGARCHPATH, LOGFIL, LOGFILE size attributes

    df -h /var/mqm/log /var/mqm/archive
    * OS-level check on distributed Linux: paths must match your site
Exact attribute names vary by MQ version and platform; always compare DISPLAY output to your version documentation before automating parsers. On z/OS, log data sets and archive volumes use different commands—pair with RMF and storage management reports for utilization. The beginner habit to build: every time CURDEPTH spikes on persistent queues, glance at log utilization on the same timeline.
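Because attribute names differ between versions, a parser is safer when it collects every `NAME(VALUE)` pair that MQSC prints and leaves the choice of field to the caller. The sketch below assumes that output shape; the sample text is hand-written for illustration rather than captured from a real queue manager, and the attribute names in it (`LOGINUSE`, `LOGUTIL`, `LOGPATH`) are assumptions to verify against your own DISPLAY output.

```python
import re

# Matches NAME(VALUE) pairs as MQSC prints them. Values that themselves
# contain parentheses are not handled by this simple pattern.
_PAIR = re.compile(r"([A-Z][A-Z0-9]*)\(([^)]*)\)")


def parse_mqsc_attributes(text):
    """Return a dict of attribute name -> raw string value."""
    return {name: value.strip() for name, value in _PAIR.findall(text)}


# Hand-written sample; attribute names are assumptions, not a version guarantee.
SAMPLE = """
   QMNAME(QM1)                             STATUS(RUNNING)
   LOGINUSE(42)                            LOGUTIL(57)
   LOGPATH(/var/mqm/log/QM1/active/)
"""

attrs = parse_mqsc_attributes(SAMPLE)
log_util = int(attrs["LOGUTIL"])
print(log_util, attrs["LOGPATH"])
```

Looking fields up by name after parsing means a missing attribute raises a `KeyError` instead of silently reporting a wrong number, which is the failure mode you want from an automated check.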
Exporters may expose ibmmq_log_utilization or similar—validate against DISPLAY during a controlled test. Combine with node_exporter filesystem metrics on mount points that hold LOGPATH and archives. Dynatrace and Instana host agents surface disk saturation that explains utilization growth when logging I/O waits on slow storage.
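A controlled test of an exporter can be as simple as comparing its reading with the value read from DISPLAY at the same moment, and checking free space on the relevant mount points alongside it. This sketch uses only the Python standard library; the function names and the five-point tolerance are hypothetical, and the commented paths are placeholders for your own LOGPATH and archive mounts.

```python
import shutil


def fs_free_pct(path):
    """Percentage of free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def readings_agree(exporter_pct, display_pct, tolerance=5.0):
    """True when the exporter metric is within `tolerance` points of DISPLAY."""
    return abs(exporter_pct - display_pct) <= tolerance


# Example comparison with made-up readings taken at the same moment.
print(readings_agree(exporter_pct=57.0, display_pct=55.0))

# Placeholder paths; substitute the mount points used at your site.
# print(fs_free_pct("/var/mqm/log"), fs_free_pct("/var/mqm/archive"))
```

If the two readings disagree by more than the tolerance, check what the exporter metric actually measures before building alerts on it.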
Log tuning (LOGFIL size, disk tier, sync policy) sets capacity. Log utilization monitoring tells you when capacity is exhausted in production. Capacity planning should use peak batch persistent rate times commit size, not average midday traffic. DR requirements may mandate minimum archive retention on disk—utilization alerts must account for retention consuming space even when current put rate is low.
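The peak-rate arithmetic can be worked through in a few lines. This is a rough sketch under stated assumptions: log growth is modelled as peak commits per second times average bytes logged per commit, and any archiving or log reuse is reduced to a single constant drain rate. All function names and example figures are hypothetical.

```python
def log_bytes_per_second(peak_commits_per_sec, avg_bytes_logged_per_commit):
    """Approximate log write rate during the peak batch window."""
    return peak_commits_per_sec * avg_bytes_logged_per_commit


def seconds_until_full(log_capacity_bytes, used_pct, write_rate_bps,
                       drain_rate_bps=0.0):
    """Time until the log fills, or None if the drain keeps up."""
    free_bytes = log_capacity_bytes * (1 - used_pct / 100.0)
    net_rate = write_rate_bps - drain_rate_bps
    if net_rate <= 0:
        return None
    return free_bytes / net_rate


# Hypothetical figures: 4 GiB of log, half used, 2000 commits/s of 4 KiB each.
rate = log_bytes_per_second(2000, 4096)
print(seconds_until_full(4 * 1024**3, 50.0, rate))
```

With these made-up figures and no drain, the log fills in a little over four minutes. That is why the calculation should use the peak batch rate: an average midday rate would report far more headroom than the batch window actually has.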
On z/OS, log and archive data sets sit on different volumes; monitor free space and catalog constraints on each volume. Coupling facility structures and queue sharing groups add shared recovery context, so a utilization incident may affect multiple queue managers in a group. Coordinate with storage teams for SMS-managed volumes and automatic expansion where policy allows.
The circular log is a bathtub filling from the faucet (persistent puts). The archive is the drain. Utilization is how full the tub is when the drain clogs or the faucet opens fully during batch hour.
The computer keeps a notebook of every important message so it can remember after a restart. Log utilization is how full that notebook is. If the notebook fills and nobody copies old pages to a filing cabinet (archive), the computer stops writing new important notes.
Design two Grafana alerts: log percent high and archive disk low. Write threshold values and durations.
During a simulated batch, list commands and OS checks you run every five minutes while utilization rises.
Explain to an application team why reducing persistent puts helps log utilization but non-persistent puts do not.
1. High log utilization primarily threatens:
2. Archive disk full can cause:
3. Log utilization should be monitored with:
4. Reducing persistent put rate during log crisis: