A poison message is the one message on a busy queue that never succeeds: every consumer gets it, throws an error, and backs out, while thousands of healthy messages wait behind it or compete with it for the same consumers. IBM MQ does not delete the message automatically because at-least-once semantics assume transient failures might heal on retry. Without BOTHRESH and a backout queue, one bad XML file can stall an entire payment region. This page defines poison messages and explains BackoutCount in the MQMD, the BOTHRESH and BOQNAME queue attributes, syncpoint backout loops, the relationship to dead letter queues, monitoring and support playbooks, and how idempotent design changes but does not eliminate poison handling.
A consumer issues MQGET with MQGMO_SYNCPOINT and retrieves a message. Validation fails: unknown message type, missing customer id, divide by zero in mapping. The application calls MQBACK. The message returns to the queue, and the BackoutCount in its descriptor increments. Another consumer, or the same instance, gets it again. Once BackoutCount reaches BOTHRESH, the message should be moved to BOQNAME so the primary queue moves forward. The queue manager only stores these two attributes; the move itself is performed by the consumer, or by a layer such as the IBM MQ classes for JMS that checks them on its behalf. Beginners see high CPU and flat throughput with CURDEPTH barely moving: one message is cycling.
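The backout loop described above can be modelled without a queue manager. The following is a minimal sketch in plain Python, not MQ client code: `Message`, `drain`, and `reject_bad` are hypothetical names, a `deque` stands in for the queue, and the comparison `backout_count >= bothresh` is an assumed consumer-side rule for when to requeue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    body: str
    backout_count: int = 0  # stands in for BackoutCount in the MQMD


def drain(queue, backout_queue, bothresh, handler):
    """Consume until the queue is empty; return (processed bodies, handler calls)."""
    processed, attempts = [], 0
    while queue:
        msg = queue.popleft()  # models MQGET under syncpoint
        if bothresh > 0 and msg.backout_count >= bothresh:
            backout_queue.append(msg)  # models a put to BOQNAME, then commit
            continue
        attempts += 1
        try:
            handler(msg.body)
            processed.append(msg.body)  # models a commit
        except ValueError:
            msg.backout_count += 1  # models MQBACK incrementing BackoutCount
            queue.appendleft(msg)   # the message becomes available again
    return processed, attempts


def reject_bad(body):
    """Hypothetical handler: one body always fails validation."""
    if body == "bad":
        raise ValueError("unknown message type")


orders_in = deque([Message("ok-1"), Message("bad"), Message("ok-2")])
orders_backout = deque()
processed, attempts = drain(orders_in, orders_backout, bothresh=3, handler=reject_bad)
```

With a threshold of 3, the bad message costs three wasted handler calls before it is set aside and the healthy messages behind it are processed. In this model a threshold of 0 with a message that always fails would never terminate, which is the stall the paragraph describes.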
| Attribute | Purpose | Beginner tip |
|---|---|---|
| BOTHRESH | Max backouts before special handling | Set low enough to quarantine quickly, high enough for transient glitches |
| BOQNAME | Backout queue for exceeded threshold | Monitor depth; assign owners for repair |
| MAXDEPTH | Queue capacity | Poison plus backlog can fill queue; alert early |
| DEFPSIST | Default persistence | Poison on persistent queues survives restarts—plan repair |
BackoutCount appears in the MQMD when you browse or get a message. Support compares it to BOTHRESH to see whether the next failure will send the message to the backout queue. On every failure, log BackoutCount together with the MsgId in hex so that repeated failures of the same message can be correlated. Sudden spikes after a deployment often mean a schema mismatch: all messages look poison until rollback. Gradual growth on a single message means one bad record in a batch feed.
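A log line for this purpose might be built as in the sketch below. It is illustrative only: `describe_failure` is a hypothetical helper, the field names in the output are made up, and it assumes the consumer quarantines a message when `BackoutCount >= BOTHRESH`.

```python
def describe_failure(msg_id: bytes, backout_count: int, bothresh: int) -> str:
    """Format one log line for a failed message.

    msg_id is the raw MsgId from the MQMD; backout_count is the value
    seen on the get that just failed, before the current backout.
    """
    # After this failure is backed out the count rises by one, so the
    # next get will see backout_count + 1.
    next_get_hits_threshold = bothresh > 0 and backout_count + 1 >= bothresh
    return (
        f"MsgId={msg_id.hex().upper()} "
        f"BackoutCount={backout_count} BOTHRESH={bothresh} "
        f"next_get_hits_threshold={next_get_hits_threshold}"
    )


sample_id = b"AMQ " + bytes(20)  # 24-byte placeholder, not a real MsgId
line = describe_failure(sample_id, backout_count=2, bothresh=3)
```

Logging the hex MsgId rather than the raw bytes keeps the line searchable, so an operator can match it against the message found later on the backout queue.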
Transient failures (database down for thirty seconds) may succeed on retry without hitting BOTHRESH. Poison failures repeat identically until code or data changes. Design consumers to distinguish the two: retry with backoff for a SQL timeout; route to the backout queue immediately for an unknown record type. Do not retry forever on logic errors, because that burns operations goodwill.
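One way to express that distinction is to classify exceptions and return a decision for the surrounding syncpoint logic. The sketch below is an assumption-laden illustration: `TransientError`, `PoisonError`, `handle`, and the returned strings are all hypothetical, and the retry count and delays are arbitrary.

```python
import time


class TransientError(Exception):
    """Failure that may heal on retry, e.g. a SQL timeout."""


class PoisonError(Exception):
    """Failure that will repeat identically, e.g. an unknown record type."""


def handle(body, process, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Return 'commit', 'backout', or 'backout_queue' for one message."""
    for attempt in range(max_retries + 1):
        try:
            process(body)
            return "commit"
        except PoisonError:
            # Retrying cannot help: quarantine straight away.
            return "backout_queue"
        except TransientError:
            if attempt == max_retries:
                # Give up for now; back out and let BackoutCount climb.
                return "backout"
            sleep(base_delay * 2 ** attempt)  # exponential backoff
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real delays. The caller would map `backout` to MQBACK and `backout_queue` to a put on the backout queue followed by a commit.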
Some estates forward from the backout queue to SYSTEM.DEAD.LETTER.QUEUE for a single monitoring pane. Report messages can notify applications when a message cannot be delivered. Poison handling is primarily about application backout, while the DLQ covers broader routing failures such as a full or missing destination queue; the two often appear together in incident reviews.
Ten competing consumers might all hit the same poison message in rotation, wasting capacity. BOTHRESH limits damage by moving it aside. Partitioning by message type into separate queues isolates risky feeds from core orders. Schema validation at put time (API gateway) prevents some poison from entering MQ at all.
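Validation at put time can be as simple as a check that runs before the producer sends anything. The sketch below assumes a made-up XML layout (a `type` attribute on the root element and a `customerId` child) and a hypothetical set of known types; a real feed would validate against its own schema.

```python
import xml.etree.ElementTree as ET

KNOWN_TYPES = {"ORDER", "REFUND"}  # hypothetical message types


def validation_errors(payload: str) -> list[str]:
    """Return the reasons a payload should be rejected before it is put."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError as exc:
        return [f"malformed XML: {exc}"]
    errors = []
    msg_type = root.get("type")
    if msg_type not in KNOWN_TYPES:
        errors.append(f"unknown message type: {msg_type!r}")
    if not (root.findtext("customerId") or "").strip():
        errors.append("missing customer id")
    return errors


good = '<msg type="ORDER"><customerId>42</customerId></msg>'
bad = '<msg type="XYZ"></msg>'
```

An empty list means the payload passes these checks and can be put; anything else is rejected at the gateway, so it never reaches the consumers as a poison candidate.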
```
DEFINE QLOCAL('ORDERS.IN') REPLACE +
       BOTHRESH(3) BOQNAME('ORDERS.BACKOUT') MAXDEPTH(500000)
DEFINE QLOCAL('ORDERS.BACKOUT') REPLACE MAXDEPTH(10000)
* After repeated consumer MQBACK, message may appear on ORDERS.BACKOUT
DISPLAY QLOCAL('ORDERS.BACKOUT') CURDEPTH
DISPLAY QLOCAL('ORDERS.IN') BOTHRESH BOQNAME
```
Everyone in line for lunch has a tray that works, except for one broken tray that crashes every time someone picks it up. The line stops moving. The teacher moves the broken tray to a side table (the backout queue) so everyone else can eat. The poison message is that broken tray; fixing it means repairing it or throwing it away with permission.
With BOTHRESH=3, a message fails twice and then succeeds. On which queue is it processed? What happens if it fails four times?
After a release, every message backs out. Is this poison data or a bad release? List the checks you would run.
Write five steps for an operator to follow when the depth of ORDERS.BACKOUT goes from 0 to 50.
1. BackoutCount increases when:
2. BOTHRESH is used to:
3. A poison message often has:
4. BOQNAME specifies: