IBM MQ trace is the microscope operators use when AMQERR messages and DISPLAY CHSTATUS are not enough. Trace records internal decisions (channel state transitions, security checks, API flows) at a level of detail that would overwhelm normal error logs. That power comes at a cost: disk consumption, CPU overhead, and sensitive data in trace files. Beginners enable trace on production on a Friday afternoon with no end time; experts schedule a fifteen-minute window, reproduce one failure, run endmqtrc, and attach the formatted output to a ticket. This tutorial explains what MQ trace is, how it differs from AMQERR and FDC, trace level concepts, the strmqtrc and endmqtrc command pair, formatting and handling trace files, coordination with client-side trace, and governance so that tracing helps instead of becoming its own incident.
Start every investigation with AMQERR and object/display commands. Escalate to FDC when AMQERR cites internal faults. Enable MQ trace when you need a time-ordered narrative of what the queue manager did leading to failure—especially for race conditions, short-lived BINDING states, or CONNAUTH sequences. Add GSKit or SSL trace separately when handshake bytes matter; MQ trace shows MQ layer decisions, not every TLS byte. Client applications may have their own trace (Java, .NET, C)—correlate timestamps across client and server trace in UTC.
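As a first step before any trace, it can help to see which AMQ message IDs dominate the error log. The following is a minimal sketch, not an IBM-supplied tool: the function name `summarize_amqerr` is hypothetical, and the path in the usage comment assumes a Linux queue manager named QM1 with the default data directory.

```shell
# Summarize which AMQ message IDs appear most often in an error log.
# Usage: summarize_amqerr /var/mqm/qmgrs/QM1/errors/AMQERR01.LOG
# (the queue manager name QM1 and the path are illustrative).
summarize_amqerr() {
  log_file=$1
  [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 1; }
  # Match IDs such as AMQ9208 or AMQ9208E, then count and rank them.
  grep -o 'AMQ[0-9][0-9][0-9][0-9][A-Z]\{0,1\}' "$log_file" |
    sort | uniq -c | sort -rn
}
```

If one ID accounts for most entries, look up that message first; trace is only worth enabling when the log does not explain the sequence of events.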
| Level (example) | Detail | When appropriate |
|---|---|---|
| 1 | Minimal extra detail | Lightweight sanity check in lab |
| 4 | Moderate channel/API flow | Many channel bind issues |
| 9+ | Very verbose | Short lab only; IBM Support directed |
| User attribute filters | Limit to one channel or function | Reduce noise on busy QM |
Depending on the options passed to strmqtrc, trace can focus on the entire queue manager, specific channels, or functional areas (names vary by platform; consult the IBM MQ trace parameters for Linux, Windows, and z/OS). Channel trace shows state machine movement from INACTIVE through BINDING to RUNNING, or back to RETRYING. API-related trace helps when a server-connected application receives unexpected reason codes without channel involvement. Trace does not replace accounting statistics or Prometheus metrics: it is a forensic tool, not a dashboard.
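To make the idea of scoping concrete, here is a small sketch that assembles a strmqtrc command line from a queue manager name and a list of trace types. The helper `build_strmqtrc` is hypothetical. The type names used in the usage comment (api, comms) are taken from IBM's strmqtrc documentation for AIX, Linux, and Windows as I understand it; verify them against your version before running anything.

```shell
# Print (do not run) a strmqtrc command scoped to the given trace types.
# Usage: build_strmqtrc QM1 api comms
build_strmqtrc() {
  qmgr=$1
  shift
  cmd="strmqtrc -m $qmgr"
  # Each requested type becomes its own -t option.
  for trace_type in "$@"; do
    cmd="$cmd -t $trace_type"
  done
  printf '%s\n' "$cmd"
}
```

Printing the command instead of executing it lets the exact scope be reviewed and pasted into the change ticket before anyone starts tracing.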
Trace output location is platform-specific; on AIX and Linux it is typically /var/mqm/trace, and other paths are referenced in IBM documentation. On those platforms the files are binary (.TRC) until formatted. The dspmqtrc command (check the IBM docs for your version) converts them to text (.FMT) suitable for grep. Search formatted trace for channel names, reason codes, and USERID strings. Redact traces before sending them outside the company: they may contain message payload fragments if applications put sensitive data. For regulated industries, trace handling falls under the same policy as application logs.
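The search and redaction steps described above can be sketched with standard tools. Both helper names (`search_trace`, `redact_literal`) are hypothetical, the sketch assumes formatted files end in .FMT, and literal-string redaction is only a starting point: it will not catch payload fragments you have not listed.

```shell
# List lines in formatted trace files that mention any of the given strings.
# Usage: search_trace /var/mqm/trace TO.QM2 2035
search_trace() {
  trace_dir=$1
  shift
  for needle in "$@"; do
    # -F matches the string literally; -n adds line numbers;
    # /dev/null forces grep to print the file name even for one file.
    grep -n -F -- "$needle" "$trace_dir"/*.FMT /dev/null
  done
}

# Replace every literal occurrence of STRING with [REDACTED].
# Usage: redact_literal FILE STRING > FILE.redacted
redact_literal() {
  [ -n "$2" ] || { echo "nothing to redact" >&2; return 1; }
  # Pass the string through the environment so awk does not
  # interpret backslashes or regex characters in it.
  SECRET=$2 awk '
    BEGIN { secret = ENVIRON["SECRET"] }
    {
      line = $0; out = ""
      while ((pos = index(line, secret)) > 0) {
        out = out substr(line, 1, pos - 1) "[REDACTED]"
        line = substr(line, pos + length(secret))
      }
      print out line
    }' "$1"
}
```

Redacting into a new file keeps the original available for internal analysis while only the cleaned copy leaves the company.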
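```shell
# Example pattern; see the strmqtrc and endmqtrc documentation for exact syntax on your OS.
# -t selects trace types by name (for example all, api, comms, detail), not a numeric level.
# -l caps each trace file at the given size in MB; files go to the platform trace directory.
strmqtrc -m QM1 -t all -t detail -l 100
# reproduce the problem once
endmqtrc -a
# On AIX and Linux, format the binary trace files per IBM documentation:
cd /var/mqm/trace && dspmqtrc *.TRC
```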
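The start, reproduce, stop pattern is easy to leave half-finished, so one option is a wrapper that time-boxes the window and runs the stop command even when the operator interrupts it. This is a sketch under assumptions: `run_trace_window` is a hypothetical helper, and the commands in the usage comment are illustrative and should be replaced with the syntax confirmed for your platform.

```shell
# Run START_CMD, wait WINDOW_SECONDS, then run STOP_CMD.
# Usage (commands are illustrative):
#   run_trace_window 900 "strmqtrc -m QM1 -t api -t comms" "endmqtrc -m QM1"
# The body is a subshell, so the trap and exit do not affect the caller.
run_trace_window() (
  window_seconds=$1
  start_cmd=$2
  stop_cmd=$3
  # Do not wait or stop if tracing never started.
  sh -c "$start_cmd" || exit 1
  # On Ctrl-C or SIGTERM during the window, still stop tracing.
  trap 'sh -c "$stop_cmd"; exit 130' INT TERM
  sleep "$window_seconds"
  trap - INT TERM
  sh -c "$stop_cmd"
)
```

A fixed window of 900 seconds matches the fifteen-minute practice described earlier; the wrapper cannot protect against `kill -9` or a host reboot, so the trace plan should still name who confirms that tracing has ended.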
AMQERR is a written incident report. Trace is security camera footage showing every step before the door alarm. You only roll the camera when you need to see the steps—not all year, or the storage room fills up.
A 2009 (MQRC_CONNECTION_BROKEN) or 2278 (MQRC_CLIENT_CONN_ERROR) client error may require client trace from the application host plus server trace from the queue manager. Align clocks with NTP. Mark the same connect attempt in both files using a unique test message or a comment in the change ticket. If only server trace is collected, you may see the queue manager reject a connection without seeing which IP or cipher the client offered.
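One way to line up the two sides is to record a UTC marker before the test connect and then merge time-stamped extracts from each trace. The sketch below assumes you have already reduced each trace to lines that begin with an ISO 8601 UTC timestamp followed by a space; real formatted trace uses its own timestamp layout, so that normalization step is an assumption and is not shown. Both helper names are hypothetical.

```shell
# Print a UTC marker to paste into the change ticket before the test connect.
# Usage: utc_mark "connect attempt 1 from apphost01"
utc_mark() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1"
}

# Merge two extracts whose lines start with an ISO 8601 UTC timestamp.
# Each output line is tagged CLIENT or SERVER after the timestamp.
# Usage: merge_timelines client_extract.txt server_extract.txt
merge_timelines() {
  client_file=$1
  server_file=$2
  {
    sed 's/^\([^ ]*\) /\1 CLIENT /' "$client_file"
    sed 's/^\([^ ]*\) /\1 SERVER /' "$server_file"
  } | LC_ALL=C sort -k1,1   # ISO 8601 UTC sorts correctly as plain text
}
```

Because every timestamp is in UTC and in the same fixed-width format, a byte-wise sort is enough to interleave the events; mixed time zones or formats would need converting first.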
Trace is a very detailed diary that records every step the mail machine takes. It is huge, so you only turn the diary on for a few minutes while you watch one letter get stuck.
Write a one-page trace plan template: symptom, level, duration, who runs endmqtrc.
List three problems where AMQERR alone is enough and three where trace is justified.
Identify your platform's trace directory and current free disk space.
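For the disk space exercise, a pre-flight check along these lines can be run before starting trace. It is a minimal sketch: the helper names are hypothetical, the directory in the usage comments assumes the common Linux default, and the 2048 MB threshold is an arbitrary example rather than an IBM recommendation.

```shell
# Print free space, in MB, on the filesystem that holds DIR.
# Usage: free_mb /var/mqm/trace
free_mb() {
  # -P gives the portable one-line-per-filesystem layout; -k reports KB.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeed only if DIR has at least MIN_MB free.
# Usage: enough_space /var/mqm/trace 2048 || echo "not enough room for trace"
enough_space() {
  available=$(free_mb "$1")
  [ -n "$available" ] && [ "$available" -ge "$2" ]
}
```

Recording the free space figure in the trace plan gives whoever stops the trace a baseline for spotting runaway growth.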
1. MQ trace should be enabled:
2. Higher trace levels generally mean:
3. After trace collection you should:
4. First diagnostic step before trace: