Poison Messages

A poison message is a message that can never be processed successfully: every consumer that gets it throws an error and backs it out, while thousands of healthy messages wait behind it or compete for the same readers. IBM MQ does not delete the message automatically, because its at-least-once delivery model assumes that a transient failure might clear on retry. Without a BOTHRESH value, a backout queue, and a consumer that honors them, one bad XML file can stall an entire payment region. This page defines poison messages and covers the BackoutCount field in the MQMD, the BOTHRESH and BOQNAME queue attributes, syncpoint backout loops, the relationship to dead-letter queues, monitoring and support playbooks, and why idempotent design changes poison handling but does not eliminate it.

How the Backout Loop Forms

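A consumer issues MQGET with MQGMO_SYNCPOINT and retrieves a message. Validation fails: an unknown message type, a missing customer id, a divide by zero in mapping. The application calls MQBACK, the message returns to the queue, and BackoutCount in the message descriptor increments. Another consumer, or the same instance, gets it again. The queue manager maintains the count but takes no action on BOTHRESH itself. Once BackoutCount reaches BOTHRESH, the consumer (application code, or a client such as IBM MQ classes for JMS that does this automatically) moves the message to the queue named in BOQNAME so the primary queue can move forward. When nothing performs that move, beginners see high CPU and flat throughput with CURDEPTH barely moving: one message is cycling.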
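The loop can be reproduced with a toy, in-memory model. This is only a sketch: it does not call the MQI, the names Message, process, and drain are hypothetical, and it assumes the consumer is the component that compares the count with the threshold.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: str
    body: str
    backout_count: int = 0  # stands in for MQMD.BackoutCount


def process(msg: Message) -> None:
    """Hypothetical business logic: rejects any body that is not an integer."""
    int(msg.body)


def drain(queue: deque, backout_queue: list, threshold: int) -> int:
    """Consume until the queue is empty; return the number of get attempts."""
    attempts = 0
    while queue:
        msg = queue.popleft()              # models MQGET under syncpoint
        attempts += 1
        # Sketch convention: a threshold of 0 means "never quarantine".
        if threshold > 0 and msg.backout_count >= threshold:
            backout_queue.append(msg)      # move aside, then commit
            continue
        try:
            process(msg)                   # success: commit
        except ValueError:
            msg.backout_count += 1         # models MQBACK: count goes up
            queue.appendleft(msg)          # message is available again
    return attempts


main_queue = deque([Message("A1", "10"), Message("A2", "oops"), Message("A3", "30")])
backout = []
attempts = drain(main_queue, backout, threshold=3)
print(attempts, [m.msg_id for m in backout])
```

With a threshold of 3, the bad message costs three failed attempts plus one get that quarantines it. With a threshold of 0 this model would never terminate, which is the stalled-queue symptom described above.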

Queue attributes for poison handling
| Attribute | Purpose | Beginner tip |
|-----------|---------|--------------|
| BOTHRESH | Max backouts before special handling | Set low enough to quarantine quickly, high enough for transient glitches |
| BOQNAME | Backout queue for exceeded threshold | Monitor depth; assign owners for repair |
| MAXDEPTH | Queue capacity | Poison plus backlog can fill queue; alert early |
| DEFPSIST | Default persistence | Poison on persistent queues survives restarts; plan repair |

Reading BackoutCount

BackoutCount appears in the MQMD when you browse or get a message. Support compares it to BOTHRESH to see whether the next failure will send the message to the backout queue. On every failure, log BackoutCount together with the MsgId in hex. A sudden spike across many messages after a deployment often means a schema mismatch: every message looks like poison until the release is rolled back. Gradual growth on a single message means one bad record in a batch feed.
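A log line along these lines makes the comparison easy to read during an incident. This is a minimal sketch: describe_backout is a hypothetical helper, the values are assumed to have been read from the MQMD and the queue attributes already, and the MsgId in the usage line is shortened for readability (a real MsgId is 24 bytes).

```python
def describe_backout(msg_id: bytes, backout_count: int, bothresh: int) -> str:
    """Format one failure log line from values the caller has already fetched."""
    msg_id_hex = msg_id.hex().upper()
    if bothresh <= 0:
        status = "no threshold set"
    elif backout_count >= bothresh:
        status = "at or over threshold"
    else:
        status = f"{bothresh - backout_count} backout(s) left before threshold"
    return (
        f"MsgId={msg_id_hex} BackoutCount={backout_count} "
        f"BOTHRESH={bothresh} ({status})"
    )


print(describe_backout(bytes.fromhex("414d5120514d31"), 2, 3))
```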

Poison vs Transient Failure

Transient failures (a database down for thirty seconds) may succeed on retry without ever reaching BOTHRESH. Poison failures repeat identically until the code or the data changes. Design consumers to tell the two apart: retry with backoff on a SQL timeout; route to the backout queue immediately on an unknown record type. Do not retry forever on logic errors; endless retries waste capacity and burn operations goodwill.
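One way to encode that distinction is to give the two failure kinds separate exception types. The sketch below is an assumption about how a consumer might be structured, not an MQ API: TransientError, PoisonError, and handle are hypothetical names, and the returned strings stand for the action the caller would then take (commit, move to the backout queue, or back out).

```python
import time


class TransientError(Exception):
    """Hypothetical: a dependency was briefly unavailable (e.g. SQL timeout)."""


class PoisonError(Exception):
    """Hypothetical: the message itself can never be processed."""


def handle(message, process, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Return 'committed', 'backout_queue' or 'backed_out' for one message."""
    for attempt in range(max_retries):
        try:
            process(message)
            return "committed"
        except PoisonError:
            return "backout_queue"            # no retry: same input, same failure
        except TransientError:
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    return "backed_out"  # still failing: back out and let BackoutCount grow
```

The sleep parameter is injectable so the backoff can be tested without waiting. Whether to wait inside an open unit of work or back out first is a design choice this sketch leaves open.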

Backout Queue Operations

  1. Alert on backout queue depth above zero in production.
  2. Browse message; capture MsgId, CorrelId, payload sample (mask PII).
  3. Identify root cause: producer bug, consumer version, reference data.
  4. Fix and requeue to main queue, or discard with business approval.
  5. Document incident; add validation to reject bad format at the edge.
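Step 2 can be partly automated. The following sketch builds a triage record from a browsed message; mask_pii and triage_record are hypothetical helpers, and the masking rules (email addresses and runs of eight or more digits) are assumptions that a real data policy would replace.

```python
import re

# Assumption: PII here means email addresses and digit runs of 8 or more
# (card or account numbers). Real rules come from your data policy.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
_LONG_DIGITS = re.compile(r"\d{8,}")


def mask_pii(text: str) -> str:
    """Replace emails with a tag and keep only the last 4 of long digit runs."""
    text = _EMAIL.sub("<email>", text)
    return _LONG_DIGITS.sub(
        lambda m: "*" * (len(m.group()) - 4) + m.group()[-4:], text
    )


def triage_record(msg_id: bytes, correl_id: bytes, payload: str,
                  sample_len: int = 80) -> dict:
    """Collect the identifiers and a masked payload sample for the incident."""
    return {
        "msg_id": msg_id.hex().upper(),
        "correl_id": correl_id.hex().upper(),
        # Mask first, then truncate, so a cut cannot expose a partial match.
        "payload_sample": mask_pii(payload)[:sample_len],
    }


print(mask_pii("card 4111111111111111 for bob@example.com"))
```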

DLQ and Report Messages

Some estates forward messages from the backout queue to SYSTEM.DEAD.LETTER.QUEUE so that operators watch a single queue. Report messages can notify the sending application when a message cannot be delivered. Poison handling is primarily about application backout, whereas the dead-letter queue (DLQ) catches broader delivery failures, such as a missing or full target queue; the two often appear together in incident reviews.

Multiple Consumers

Ten competing consumers might all hit the same poison message in rotation, wasting capacity. Honoring BOTHRESH limits the damage by moving the message aside after a bounded number of attempts. Partitioning by message type into separate queues isolates risky feeds from core orders. Schema validation at put time (for example, in an API gateway) prevents some poison from entering MQ at all.
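Edge validation and partitioning by type can be combined in one routing step before the put. This is a minimal sketch under stated assumptions: route is a hypothetical function, the required fields are illustrative, and REFUNDS.IN is an invented queue name used only to show a second partition.

```python
# Hypothetical mapping from message type to target queue.
KNOWN_TYPES = {"ORDER": "ORDERS.IN", "REFUND": "REFUNDS.IN"}
REQUIRED_FIELDS = ("type", "customer_id")


def route(message: dict) -> str:
    """Return the target queue name, or raise ValueError to reject at the edge."""
    missing = [f for f in REQUIRED_FIELDS if not message.get(f)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    try:
        return KNOWN_TYPES[message["type"]]
    except KeyError:
        raise ValueError(f"unknown message type: {message['type']}") from None


print(route({"type": "ORDER", "customer_id": "C1"}))
```

A rejected message never reaches a queue, so it cannot enter a backout loop; the producer gets the error immediately instead.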

Tutorial: MQSC for Backout

```
DEFINE QLOCAL('ORDERS.IN') REPLACE +
       BOTHRESH(3) BOQNAME('ORDERS.BACKOUT') MAXDEPTH(500000)
DEFINE QLOCAL('ORDERS.BACKOUT') REPLACE MAXDEPTH(10000)
* After repeated consumer MQBACK, message may appear on ORDERS.BACKOUT
DISPLAY QLOCAL('ORDERS.BACKOUT') CURDEPTH
DISPLAY QLOCAL('ORDERS.IN') BOTHRESH BOQNAME
```

Explain Like I'm Five: Poison Message

Everyone in line for lunch has a tray that works, except for one broken tray that crashes every time someone picks it up. The line stops moving. The teacher moves the broken tray to a side table (the backout queue) so everyone else can eat. The poison message is that broken tray; dealing with it means repairing it or throwing it away with permission.

Practice Exercises

Exercise 1: Threshold

BOTHRESH=3 and a message fails twice, then succeeds. Where is it processed? What if it fails four times?

Exercise 2: Deployment

After release, all messages back out. Poison or bad release? List checks.

Exercise 3: Runbook

Write five steps for an operator to follow when ORDERS.BACKOUT depth goes from 0 to 50.

Frequently Asked Questions

Test Your Knowledge

1. BackoutCount increases when:

  • Message is committed
  • Get is backed out without commit
  • Channel starts
  • Queue is empty

2. BOTHRESH is used to:

  • Set message priority
  • Trigger move to backout queue after repeated backouts
  • Start the listener
  • Define TLS cipher

3. A poison message often has:

  • Payload the consumer cannot process
  • Always priority 9
  • No MsgId
  • Only non-persistent

4. BOQNAME specifies:

  • Backout queue name
  • Dead letter queue manager
  • Transmission queue
  • Model queue
Read time: 14 min
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation