The scatter-gather pattern solves problems that one slow monolith cannot: a customer order needs a credit check, an inventory hold, and a tax calculation, each owned by a different team and runtime, and the web page should wait only as long as the slowest leg, not the sum of three sequential calls. Scatter-gather on IBM MQ splits the incoming request message into sub-tasks on separate queues, lets multiple consumers work in parallel, then gathers the replies into a single correlated response. MQ provides durable queues, transactional gets and puts, and the MsgId and CorrelId fields in the message descriptor (MQMD), so the aggregator knows which replies belong to which parent request. Beginners often confuse scatter-gather with simple load balancing on one queue. The difference is that scatter explicitly creates different message types or routes to different services, whereas load balancing hands identical work to interchangeable consumers. This tutorial walks through the scatter and gather roles, correlation design, expected reply counts, timeout and partial failure, idempotency when messages are redelivered, integration with the aggregator pattern, monitoring of scatter fan-out queues, and anti-patterns such as losing correlation when a worker forgets to copy CorrelId onto the reply.
| Role | Typical queue | Action |
|---|---|---|
| Scatter | ORDER.SCATTER.IN | GET request; PUT N sub-messages |
| Worker | CREDIT.WORK, STOCK.WORK, TAX.WORK | GET sub-message; process; PUT reply to ORDER.GATHER.IN |
| Gather | ORDER.GATHER.IN | Collect replies; build response |
| Client reply | ORDER.RESPONSE | Single answer to caller |
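When the scatter service receives a request, it reads or generates a correlation token. A common pattern is to use the request MsgId as the CorrelId on every sub-message PUT. Each worker copies CorrelId from the input message to the reply MQMD so that gather can MQGET by CorrelId (the MQMO_MATCH_CORREL_ID match option in MQI, or a selector on JMSCorrelationID in JMS). For multiple scatter rounds, nest tokens in a user property such as scatterGroupId. Document whether workers must preserve the MsgId of the sub-message or only the CorrelId; mixing the two rules breaks gather. JMS clients use JMSCorrelationID, while MQI uses the CorrelId field, which is 24 bytes of binary data rather than text. The queue manager does not character-convert it, so a token written as text on one platform will not match the same text encoded on another, including an EBCDIC mainframe. Treat the value as opaque bytes wherever possible.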
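The encoding pitfall is easy to see outside MQ. The following is a minimal sketch, not MQ client code: `to_correl_id` is a hypothetical helper, the token `ORDER-R1` and the sample MsgId bytes are invented for illustration, and code page 037 stands in for whichever EBCDIC code page the mainframe actually uses.

```python
CORREL_ID_LEN = 24  # MQMD MsgId and CorrelId are 24-byte binary fields


def to_correl_id(token: bytes) -> bytes:
    """Right-pad a token with null bytes to the 24-byte CorrelId width."""
    if len(token) > CORREL_ID_LEN:
        raise ValueError(
            f"token is {len(token)} bytes; CorrelId holds {CORREL_ID_LEN}"
        )
    return token.ljust(CORREL_ID_LEN, b"\x00")


# The same text token yields different bytes under ASCII and EBCDIC (cp037),
# so a gather that matches on bytes would never find the reply.
ascii_id = to_correl_id("ORDER-R1".encode("ascii"))
ebcdic_id = to_correl_id("ORDER-R1".encode("cp037"))
tokens_match = ascii_id == ebcdic_id

# Safer: copy the request MsgId unchanged, as opaque bytes.
request_msg_id = bytes(range(1, 25))  # made-up 24-byte value
sub_message_correl_id = to_correl_id(request_msg_id)
```

Here `tokens_match` is `False`, while `sub_message_correl_id` is byte-for-byte identical to `request_msg_id`. That is the argument for passing binary identifiers through untouched rather than rebuilding them from text on each platform.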
1. Request arrives on ORDER.SCATTER.IN (MsgId = R1)
2. Scatter PUT to CREDIT.WORK, CorrelId = R1, body = credit slice
3. Scatter PUT to STOCK.WORK, CorrelId = R1, body = stock slice
4. Scatter PUT to TAX.WORK, CorrelId = R1, body = tax slice
5. Scatter records expectedCount = 3 in a state store or header
6. Worker on CREDIT.WORK: GET, process, PUT to ORDER.GATHER.IN with CorrelId = R1 (the STOCK.WORK and TAX.WORK workers do the same)
7. Gather: when 3 messages with CorrelId R1 are received → merge → PUT to ORDER.RESPONSE
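The flow above can be walked through with plain Python data structures. This is an in-memory sketch under loud assumptions: the `queues` dictionary stands in for MQ queues, and `put`, `scatter`, `work`, and `gather` are hypothetical function names, not MQ API calls. Only the queue names and the CorrelId = MsgId rule come from the flow itself.

```python
from collections import defaultdict, deque

queues = defaultdict(deque)  # queue name -> FIFO of message dicts (stand-in for MQ)


def put(queue, correl_id, body):
    queues[queue].append({"correl_id": correl_id, "body": body})


def scatter(request):
    """Fan one request out to three work queues, reusing MsgId as CorrelId."""
    legs = {"CREDIT.WORK": "credit", "STOCK.WORK": "stock", "TAX.WORK": "tax"}
    for queue, part in legs.items():
        put(queue, request["msg_id"], {part: request["body"][part]})
    return len(legs)  # this is expectedCount


def work(queue):
    """Worker: take one sub-message, process it, reply with the same CorrelId."""
    msg = queues[queue].popleft()
    result = {leg: f"{value}:ok" for leg, value in msg["body"].items()}
    put("ORDER.GATHER.IN", msg["correl_id"], result)


def gather(correl_id, expected_count):
    """Merge the replies for one CorrelId once every expected leg has arrived."""
    matching = [m for m in queues["ORDER.GATHER.IN"] if m["correl_id"] == correl_id]
    if len(matching) < expected_count:
        return None  # keep waiting; replies stay on the queue
    merged = {}
    for m in matching:
        queues["ORDER.GATHER.IN"].remove(m)
        merged.update(m["body"])
    put("ORDER.RESPONSE", correl_id, merged)
    return merged


request = {
    "msg_id": "R1",
    "body": {"credit": "c-slice", "stock": "s-slice", "tax": "t-slice"},
}
expected = scatter(request)
work("CREDIT.WORK")
work("STOCK.WORK")
partial = gather("R1", expected)   # None: the tax reply has not arrived yet
work("TAX.WORK")
response = gather("R1", expected)  # all three legs merged into one answer
```

The first `gather` call returns `None` because only two of the three replies carry CorrelId R1 at that point. The second call merges all three and leaves exactly one message on ORDER.RESPONSE.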
Scatter-gather is asking three friends to each bring one ingredient while you stay at the table with the recipe card number in your hand. When all three return with the matching card number, you make the cake. MQ is the delivery service for the ingredient lists and returned bags.
Static scatter always expects three replies. Dynamic scatter may query a database to decide how many legs to fire, so gather must read expectedCount from a message property or a side table rather than assume three forever. A mismatch causes gather to wait until timeout when a leg was never sent, or to complete early when the recorded count is too low.
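One way to keep gather honest about the count is to store it per CorrelId together with a deadline. The sketch below assumes that each reply carries an `expected_count` value and a leg name as message properties. Both are design choices for illustration, as are the `Aggregator` and `GatherState` names, and time is passed in explicitly so the behaviour is easy to test.

```python
from dataclasses import dataclass, field


@dataclass
class GatherState:
    expected_count: int   # taken from the message, never hard-coded
    deadline: float       # absolute time after which gather stops waiting
    replies: dict = field(default_factory=dict)  # leg name -> reply body


class Aggregator:
    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.pending = {}  # correl_id -> GatherState

    def on_reply(self, correl_id, leg, body, expected_count, now):
        state = self.pending.setdefault(
            correl_id, GatherState(expected_count, now + self.timeout_seconds)
        )
        # A redelivered reply for the same leg must not be counted twice.
        state.replies.setdefault(leg, body)
        if len(state.replies) >= state.expected_count:
            del self.pending[correl_id]
            return ("complete", state.replies)
        return ("waiting", None)

    def expire(self, now):
        """Hand back partial results for groups whose deadline has passed."""
        expired = {c: s for c, s in self.pending.items() if now >= s.deadline}
        for correl_id in expired:
            del self.pending[correl_id]
        return {c: ("partial", s.replies) for c, s in expired.items()}


agg = Aggregator(timeout_seconds=5.0)
agg.on_reply("R1", "credit", "ok", expected_count=2, now=0.0)
duplicate = agg.on_reply("R1", "credit", "ok", expected_count=2, now=1.0)
done = agg.on_reply("R1", "stock", "ok", expected_count=2, now=2.0)
```

Keying replies by leg name gives idempotency for free: the duplicate credit reply leaves the group in the waiting state, and only the stock reply completes it. What to do with a partial result returned by `expire` (fail the order, answer with defaults, or retry the leg) remains a business decision.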
The scatter GET and its multiple PUTs can share one syncpoint if all sub-messages must appear together. If one PUT fails, the entire scatter backs out, the request returns to the input queue, and it is retried. The alternative is for scatter to commit as it goes and let the workers run independently, with gather handling any missing leg through its timeout. Which way to trade atomicity against availability depends on the business case.
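The all-or-nothing behaviour can be modelled without a queue manager. In this sketch `UnitOfWork` is a hypothetical stand-in for an MQ syncpoint, a PUT failure is simulated as "queue full" with an invented `max_depth`, and Python lists play the role of queues. A real queue manager also tracks how many times a message has been backed out, which this model ignores.

```python
class PutFailed(Exception):
    """A target queue rejected the PUT (modelled here as 'queue full')."""


class UnitOfWork:
    """Stages PUTs and makes them visible only on commit, like a syncpoint."""

    def __init__(self, queues, max_depth):
        self.queues = queues
        self.max_depth = max_depth
        self.staged = []

    def put(self, queue, message):
        staged_here = sum(1 for name, _ in self.staged if name == queue)
        if len(self.queues[queue]) + staged_here >= self.max_depth:
            raise PutFailed(queue)
        self.staged.append((queue, message))

    def commit(self):
        for queue, message in self.staged:
            self.queues[queue].append(message)
        self.staged.clear()

    def backout(self):
        self.staged.clear()


def scatter_atomically(queues, legs, max_depth=2):
    """All legs are PUT and the request is consumed, or nothing changes."""
    request = queues["ORDER.SCATTER.IN"].pop(0)  # GET inside the unit of work
    uow = UnitOfWork(queues, max_depth)
    try:
        for queue in legs:
            uow.put(queue, {"correl_id": request["msg_id"]})
        uow.commit()
        return True
    except PutFailed:
        uow.backout()
        queues["ORDER.SCATTER.IN"].insert(0, request)  # available for retry
        return False


legs = ["CREDIT.WORK", "STOCK.WORK", "TAX.WORK"]
queues = {
    "ORDER.SCATTER.IN": [{"msg_id": "R1"}],
    "CREDIT.WORK": [],
    "STOCK.WORK": [],
    "TAX.WORK": ["older-1", "older-2"],  # already at max_depth, so the PUT fails
}
first_try = scatter_atomically(queues, legs)   # backs out; no leg becomes visible
queues["TAX.WORK"].clear()                     # the tax consumer catches up
second_try = scatter_atomically(queues, legs)  # commits all three legs
```

On the first attempt the credit and stock sub-messages are staged but never become visible, so no worker starts on an order whose tax leg could not be sent. The cost is that a single full queue blocks the whole request, which is the availability side of the trade.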
Each work queue uses competing consumers, so the credit service can scale its pods independently of the tax service. Monitor the depth of each work queue to find the slow leg. Avoid one giant queue for all leg types, because that loses clear ownership and routing.
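Alert when the depth of ORDER.GATHER.IN keeps growing: either the aggregator is slow or replies are missing. Alert on the age of the oldest message on each WORK queue, which points to a stuck consumer. Track the scatter rate against the p99 latency of gather completion. When investigating, trace one CorrelId through the logs end to end.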
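These checks reduce to a few lines once the numbers are collected. The sketch below assumes a hypothetical `metrics` dictionary that some collector has already filled with depth samples, oldest-message ages, and scatter-to-gather latencies. The thresholds are invented placeholders, not recommendations.

```python
import math


def p99(latencies_ms):
    """Nearest-rank 99th percentile of scatter-to-gather latencies."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]


def depth_is_growing(depth_samples, window=5):
    """True when the last `window` depth samples rise at every step."""
    recent = depth_samples[-window:]
    return len(recent) == window and all(a < b for a, b in zip(recent, recent[1:]))


def check(metrics, max_p99_ms=2000, max_age_s=60):
    alerts = []
    if depth_is_growing(metrics["gather_in_depth"]):
        alerts.append("ORDER.GATHER.IN depth growing")
    for queue, age in metrics["oldest_message_age_s"].items():
        if age > max_age_s:
            alerts.append(f"{queue}: oldest message is {age}s old")
    if p99(metrics["latencies_ms"]) > max_p99_ms:
        alerts.append("gather completion p99 above threshold")
    return alerts


metrics = {
    "gather_in_depth": [10, 12, 15, 21, 30],
    "oldest_message_age_s": {"CREDIT.WORK": 3, "STOCK.WORK": 2, "TAX.WORK": 95},
    "latencies_ms": list(range(1, 101)),
}
alerts = check(metrics)
```

With these sample values two alerts fire: the gather queue depth rises on every sample, and TAX.WORK holds a message older than the threshold. The latency check stays quiet because the p99 of the samples is well below the limit.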
Scatter-gather is sending three friends to find different puzzle pieces at the same time, then putting the pieces together when they all come back with the same puzzle number written on their bags.
Draw a queue diagram for one request, three workers, and the gather step.
Define a timeout policy for a missing tax reply.
List the MQMD fields each worker must copy to its reply.
1. Scatter step sends:
2. CorrelId links:
3. Gather waits for:
4. Pub/sub differs because: