What is latency in IBM MQ?

Latency is the elapsed time from when a producer sends a message until a consumer receives and processes it, or across any segment of that path, such as MQPUT completion to MQGET completion. It is measured in milliseconds or seconds.

How is latency different from throughput?

Throughput is how many messages move per second. Latency is how long one message takes. Heavy batching can raise throughput while increasing latency for individual messages.

What increases MQ latency?

Deep queues (waiting behind other messages), syncpoint and forced log writes (disk fsync), TLS handshakes, WAN round trips, large messages, channel batches held open by BATCHSZ and BATCHINT, and slow consumer applications.

Does persistence affect latency?

Yes. Persistent puts wait for log records to reach durable media before MQPUT completes (with default sync options). Non-persistent puts complete faster with weaker durability.

How do I measure end-to-end latency?

Embed timestamps in the message payload, or use MQMD fields such as PutDate and PutTime; correlate them at the consumer, or use distributed tracing. Compare put time, get time, and business processing time separately.

MainframeMaster

Latency

Latency is the waiting time a message spends between creation and consumption, or between any two checkpoints you measure along the IBM MQ path. Trading systems chase sub-millisecond LAN latency; batch warehouses accept minutes. Asynchronous messaging deliberately decouples producers from consumer speed, which means latency becomes a design choice, not only a bug.

Beginners measure MQPUT return time and assume the business saw the message, ignoring queue depth, channel batching, and COBOL processing that happens seconds later. Operations see a channel in RUNNING state and assume low latency, ignoring fifty thousand messages ahead in the queue.

This tutorial decomposes latency into queue wait, queue manager processing, log I/O, channel transit, and application time; contrasts persistent and non-persistent behavior; explains the throughput tradeoff; covers syncpoint and XA; and gives practical measurement patterns for distributed and z/OS MQ without requiring exotic tooling.

Segments of End-to-End Latency

Latency segments on a typical path
Segment	Typical range	Main drivers
Producer MQPUT	Sub-ms to tens of ms	Persistence, syncpoint, network to QM
Queue wait	Zero to unbounded	Depth, consumer speed, priority
Channel transit	WAN ms to seconds	Distance, TLS, BATCHSZ, size
Consumer MQGET	Sub-ms to ms	Match options, browse vs get
Business logic	Varies	Application code, DB calls

Document which segments your SLA covers. A middleware team might own through MQGET; application teams own processing after. Blame games end when timestamps exist in a shared format—ISO-8601 in RFH2 usr folder or corporate correlation ID in MQMD.

Queue Wait and Little's Law Intuition

Average queue wait approximates average depth divided by dequeue rate when the system is stable (Little's Law). If ten thousand messages sit on a queue and consumers remove one thousand per second, new arrivals wait roughly ten seconds on average before their turn—plus service time. Latency spikes during catch-up after outages. Priority queues (where supported) and separate queues for express traffic reduce head-of-line blocking. Do not point low-latency flows at the same queue as bulk batch unless consumers can always keep depth near zero.
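The arithmetic above can be sketched in a few lines of Python. This is a rough estimate only: it assumes a stable queue and a steady dequeue rate, and the function name is made up for illustration.

```python
def average_queue_wait_seconds(avg_depth: float, dequeue_rate_per_sec: float) -> float:
    """Little's Law rearranged as W = L / lambda, valid only for a stable queue."""
    if dequeue_rate_per_sec <= 0:
        raise ValueError("dequeue rate must be positive")
    return avg_depth / dequeue_rate_per_sec


if __name__ == "__main__":
    # Ten thousand messages queued, consumers removing one thousand per second.
    print(average_queue_wait_seconds(10_000, 1_000))  # 10.0
```

The result excludes service time, so treat it as a floor on what a new arrival experiences during a backlog.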

Persistence, Logging, and Syncpoint

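A persistent MQPUT outside syncpoint waits for its log record to reach durable storage before returning success. That forced write (an fsync or equivalent) dominates LAN latency budgets. Under syncpoint the forced write moves to MQCMIT, so several persistent messages can share one log force, but consumers cannot see any of them until the commit completes. Choosing MQPMO_NO_SYNCPOINT therefore changes where the wait happens rather than removing it. Weaken logging or durability settings only where documentation and risk acceptance allow; many payment flows cannot. MQGET under syncpoint holds the message until commit, which adds latency for competing consumers when the unit of work is slow or backs out. XA two-phase commit coordinates with databases and favors correctness over speed. Non-persistent messages reduce put latency dramatically; use them only when loss is acceptable.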
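To get a feel for why forced writes dominate, the sketch below times file appends with and without `os.fsync`. It is an analogy only and does not reproduce the IBM MQ logger; the function name, record count, and payload size are arbitrary assumptions, and results depend heavily on the disk and filesystem.

```python
import os
import tempfile
import time


def time_appends(records: int, payload: bytes, force: bool) -> float:
    """Append `records` payloads to a temp file and return elapsed seconds.

    With force=True every append is followed by os.fsync, loosely imitating
    one log force per persistent message. Analogy only, not MQ's logger.
    """
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(records):
            os.write(fd, payload)
            if force:
                os.fsync(fd)  # wait until the OS reports the data as durable
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    payload = b"x" * 1024
    count = 200
    forced = time_appends(count, payload, force=True)
    buffered = time_appends(count, payload, force=False)
    print(f"forced:   {forced / count * 1e3:.3f} ms per record")
    print(f"buffered: {buffered / count * 1e3:.3f} ms per record")
```

On most physical disks the forced figure is noticeably higher; on a RAM-backed temp directory the two may be close, which is itself a reminder to measure on the storage that actually holds the MQ logs.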

Channels and Batching

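Sender channels group messages into batches of up to BATCHSZ messages, and the receiving end makes them available only when the batch is committed. A batch normally ends when BATCHSZ is reached or when the transmission queue is empty; a non-zero BATCHINT holds the batch open for at least that many milliseconds. With a batch of fifty and a long BATCHINT, the first small message can wait until the fiftieth arrives or the interval expires: throughput wins, and the earliest message in the batch waits longest. For low latency over WAN, a smaller BATCHSZ, a low BATCHINT, and HBINT tuning reduce the wait; for bulk replication, large batches help. TLS session reuse avoids repeated handshake latency on new connections, so prefer connection pooling on clients.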
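A toy model can make the batching delay concrete. The sketch below assumes a batch commits either when it reaches a size limit or when an interval measured from its first message expires, and that messages become visible only at commit. It ignores network time and the real channel rule that an empty transmission queue also ends a batch, so read the numbers as an illustration, not a prediction. The function and parameter names are hypothetical.

```python
from typing import List, Sequence


def batch_delays(arrivals: Sequence[int], batch_size: int, batch_interval: int) -> List[int]:
    """Return, per message, the wait between arrival and batch commit.

    `arrivals` must be sorted and use the same time unit as `batch_interval`.
    """
    if batch_size < 1 or batch_interval < 0:
        raise ValueError("batch_size must be >= 1 and batch_interval >= 0")
    delays: List[int] = []
    i, n = 0, len(arrivals)
    while i < n:
        deadline = arrivals[i] + batch_interval  # interval starts at first message
        j = i
        while j < n and j - i < batch_size and arrivals[j] <= deadline:
            j += 1
        # A full batch commits when its last member arrives, otherwise at the deadline.
        commit = arrivals[j - 1] if j - i == batch_size else deadline
        delays.extend(commit - t for t in arrivals[i:j])
        i = j
    return delays


if __name__ == "__main__":
    arrivals_ms = [0, 10, 20, 30, 40, 50]
    print(batch_delays(arrivals_ms, batch_size=3, batch_interval=1000))
    print(batch_delays(arrivals_ms, batch_size=50, batch_interval=100))
```

In both runs the first message of each batch carries the largest delay, which is the per-message cost that larger batches trade for throughput.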

Client Connection Latency

Each new TCP and TLS handshake costs round trips. Client reconnect after failover adds seconds. A CCDT with multiple hosts lets clients try the next connection name quickly. Local bindings (the application connects directly to a queue manager on the same host) remove the network hop for co-located workloads. Remote clients over VPN add tens of milliseconds per hop, so measure from the client machine, not only from the queue manager host.
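A back-of-envelope estimate shows why connection reuse matters. The sketch assumes one round trip for the TCP handshake and two for a full TLS 1.2 handshake (one for TLS 1.3); it leaves out the MQ channel's own startup negotiation and any authentication calls, so real figures will be higher. The function names are hypothetical.

```python
def connect_setup_ms(rtt_ms: float, tls_round_trips: int = 2) -> float:
    """Rough setup cost: 1 RTT for TCP plus the TLS handshake round trips."""
    tcp_round_trips = 1
    return rtt_ms * (tcp_round_trips + tls_round_trips)


def setup_overhead_per_message_ms(rtt_ms: float, messages_per_connection: int,
                                  tls_round_trips: int = 2) -> float:
    """Spread the setup cost over every message sent on one connection."""
    if messages_per_connection < 1:
        raise ValueError("messages_per_connection must be >= 1")
    return connect_setup_ms(rtt_ms, tls_round_trips) / messages_per_connection


if __name__ == "__main__":
    # 40 ms round trip over a VPN: connect-per-message versus a pooled connection.
    print(setup_overhead_per_message_ms(40, 1))
    print(setup_overhead_per_message_ms(40, 1000))
```

Connecting once per message pays the full setup cost every time, while a pooled connection spreads it until it is negligible.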

Browse Versus Destructive Get

Browsing inspects messages without removing them—useful for monitoring, not for low-latency consumption paths that need exactly-once dequeue. Destructive MQGET removes the message; browse-then-get patterns double I/O. Match options (MQMO_MATCH_MSG_ID) add search latency on deep queues—design keys and queue depth accordingly.

Pub/Sub Latency

Publish to multiple subscribers adds fan-out cost. Durable subscribers may write additional logs. Topic routing through hierarchies is usually fast; subscriber application lag is not. For market data, many shops use specialized middleware or non-persistent topics with dedicated consumers—MQ can work when depth stays shallow.

z/OS Considerations

Queue sharing groups add coupling facility access time, usually low microseconds to milliseconds when the CF is healthy. CF structure contention raises latency for all members. Large messages offloaded to shared message data sets (SMDS) behave differently than small messages held entirely in CF structures. Measure on-CP time with SMF and MQ accounting where available.

Latency Versus Throughput Tradeoff

Optimizing one often pressures the other. Large channel batches and aggressive consumer batching raise throughput but stretch per-message latency. A low-latency design uses shallow queues, fast disks, minimal sync where safe, co-located apps, small messages, and enough consumer threads to keep depth near zero. A high-throughput batch design accepts seconds of latency overnight. State your goal per queue in architecture documents.

Tutorial: Timestamp Correlation

text

Producer payload (JSON example):
{ "orderId": "A123", "producedAt": "2026-05-17T10:00:00.123Z" }
 
Consumer on MQGET:
  latency_ms = now() - parse(producedAt)
  log histogram: p50, p95, p99
 
Also log:
  putCompletionTime (app after MQPUT)
  getArrivalTime (before business logic)
Separate middleware latency from processing latency.
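The outline above might look like the following in Python. It is a minimal sketch that assumes the JSON payload shown, with a `producedAt` field, and producer and consumer clocks that are synchronized (for example via NTP); without that, cross-host differences are meaningless. The helper names are hypothetical, and the actual MQGET call is left out.

```python
import json
import statistics
from datetime import datetime
from typing import Dict, Sequence


def parse_iso(ts: str) -> datetime:
    # Older Python versions reject a trailing "Z", so normalise it first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def end_to_end_ms(message_body: str, received_at: datetime) -> float:
    """Latency from the producer's timestamp to the consumer's receive time."""
    produced_at = parse_iso(json.loads(message_body)["producedAt"])
    return (received_at - produced_at).total_seconds() * 1000.0


def split_ms(put_done: datetime, get_arrival: datetime, processed: datetime) -> Dict[str, float]:
    """Separate middleware latency from business processing latency."""
    return {
        "middleware_ms": (get_arrival - put_done).total_seconds() * 1000.0,
        "processing_ms": (processed - get_arrival).total_seconds() * 1000.0,
    }


def percentiles(samples_ms: Sequence[float]) -> Dict[str, float]:
    """p50, p95 and p99 from at least two samples."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

A consumer would call `end_to_end_ms` right after each get, append the result to a list or histogram, and report `percentiles` periodically; averages alone hide the tail that usually breaks an SLA.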

Tutorial: DISPLAY Depth During Spike

shell

DISPLAY QLOCAL('ORDERS.IN') CURDEPTH IPPROCS OPPROCS
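* IPPROCS counts handles open for input (getters); OPPROCS counts handles open for output (putters)
* Rising CURDEPTH + IPPROCS at zero or flat -> too few consumers attached
* High OPPROCS + low IPPROCS -> producers outnumber consumers
* After fix: CURDEPTH stable near zero -> queue-wait latency drops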
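If you capture the command output in a script, a small parser can turn it into a first-pass diagnosis. The sketch below assumes the usual `NAME(value)` layout of MQSC output; the sample text is illustrative rather than copied from a real queue manager, and the thresholds and wording in `diagnose` are made-up examples. IPPROCS is read as the number of handles open for input and OPPROCS as the number open for output.

```python
import re
from typing import Dict

# MQSC prints attributes as NAME(value) pairs, several per line.
ATTR = re.compile(r"([A-Z]+)\(([^)]*)\)")

# Illustrative sample only; real output varies by version and platform.
SAMPLE = """\
AMQ8409I: Display Queue details.
   QUEUE(ORDERS.IN)                        TYPE(QLOCAL)
   CURDEPTH(50000)                         IPPROCS(0)
   OPPROCS(4)
"""


def parse_display(output: str) -> Dict[str, str]:
    """Collect every NAME(value) pair from MQSC DISPLAY output."""
    return {name: value.strip() for name, value in ATTR.findall(output)}


def diagnose(attrs: Dict[str, str]) -> str:
    """Very rough reading of a single snapshot; trends need repeated samples."""
    depth = int(attrs["CURDEPTH"])
    getters = int(attrs["IPPROCS"])
    putters = int(attrs["OPPROCS"])
    if depth > 0 and getters == 0:
        return "no consumer has the queue open for input"
    if depth > 0 and putters > getters:
        return "more putters than getters; depth may keep rising"
    return "no obvious imbalance from this snapshot"


if __name__ == "__main__":
    print(diagnose(parse_display(SAMPLE)))
```

A single snapshot cannot show whether depth is rising, so sample the command on an interval and compare successive CURDEPTH values before drawing conclusions.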

Explainer: Waiting in Line

Latency is how long you stand in line at the café. Throughput is how many customers leave per minute. A barista making ten drinks at once (batching) serves the line faster, but everyone in the batch waits until the whole batch is ready, and the first person to order waits longest.

Explain Like I'm Five

Latency is how long you wait for your turn on the slide. If ten kids are ahead of you, you wait longer—even if the slide is very fast once you climb on.

Practice Exercises

Exercise 1

If depth is 20,000 and consumers dequeue 2,000 per second, estimate average queue wait ignoring service time.

Exercise 2

Name two tuning changes that lower latency but might lower peak throughput.

Exercise 3

Design a timestamp scheme to split middleware latency from application latency.

Frequently Asked Questions

Test Your Knowledge

1. Latency measures:

Time for a message to travel the path
Messages per hour only
Queue name length
Cipher bit count

2. Deep queues usually:

Increase wait time for new messages
Reduce wait time
Eliminate TLS
Remove logs

3. Large channel BATCHSZ can:

Raise throughput but add batching delay
Remove all latency
Disable listeners
Delete messages

4. Syncpoint on put/get:

Adds coordination and I/O latency
Removes all delay
Only affects topics
Skips channels
