Network Issues

IBM MQ does not replace the network; it depends on reliable TCP between clients, listeners, and remote queue managers. When operators say “MQ is down,” the queue manager process is often healthy while packets never reach port 1414. Network issues show up as connection refused in LASTCHLERR, channels stuck in RETRYING status, client reason codes 2059 (MQRC_Q_MGR_NOT_AVAILABLE) or 2009 (MQRC_CONNECTION_BROKEN), or intermittent failures during long-running batches when firewalls drop idle sessions. Beginners troubleshoot application code while the firewall team has not yet opened the return path for ephemeral ports. This tutorial teaches layered diagnosis: DNS and CONNAME, listener and port, firewall and NAT, VPN and cloud security groups, MTU and fragmentation, and load balancers. It also shows how to work with network teams using evidence from both queue managers, not just a ping from a laptop that is not the MQ server host.

TCP Path for Message Channels

The sending queue manager's channel initiator opens an outbound TCP connection to the CONNAME target, for example payments.corp.example.com(1414). The packets cross corporate routing, possibly NAT, and must arrive at the receiver host, where a listener is bound to the same port. Return traffic must be permitted for established connections. Asymmetric firewall rules, which allow outbound traffic from A to B but block the return, produce classic “works one way” mysteries. Document the full five-tuple for each production pair: source IP, source port range, destination IP, destination port, and protocol (TCP).

Network symptom to likely cause

Symptom                     Likely cause                  Who fixes
--------------------------  ----------------------------  ---------------------------
Connection refused          Listener down or wrong port   MQ ops
Timeout                     Firewall drop or wrong IP     Network
Intermittent RETRY          Idle timeout or flaky link    Network, plus MQ ops (HBINT)
Works by IP, not by name    DNS failure                   DNS team
Fails only large messages   MTU or VPN limit              Network
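The first rows of the table can be told apart with a single TCP connect attempt run on the MQ server host. The sketch below is a minimal illustration using only the Python standard library; the function name `probe_tcp` and its return labels are assumptions made for this example, not part of any MQ tooling.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify one TCP connect attempt as open, refused, timeout, or dns_failure."""
    try:
        # create_connection resolves the name, then tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        # Name resolution failed before any packet was sent to the target.
        return "dns_failure"
    except ConnectionRefusedError:
        # The host answered with a reset: reachable, but nothing listens on the port.
        return "refused"
    except socket.timeout:
        # No answer at all: typical of a silent firewall drop or a wrong IP.
        return "timeout"
    except OSError as exc:
        # Anything else (for example, no route to host) is reported by errno name.
        return f"error:{errno.errorcode.get(exc.errno, exc.errno)}"
```

Calling `probe_tcp("payments.corp.example.com", 1414)` from the sending queue manager's host gives a result that maps directly onto the "Who fixes" column: "refused" points to the listener, "timeout" to the network path, and "dns_failure" to name resolution.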

CONNAME DNS and Hostnames

CONNAME embeds a hostname or IP address and a port in MQSC, for example ALTER CHANNEL('PARIS.TO.LONDON') CHLTYPE(SDR) CONNAME('host(1414)'). If DNS resolution for the host fails on the sending server, the connect never starts. If DNS returns a load balancer VIP that does not forward to the MQ listener, you see a timeout. Test resolution from the actual MQ server host, not from your workstation. After data center migrations, stale DNS entries are a top cause of Monday outages. Prefer fully qualified names in the CMDB and in CONNAME to avoid ambiguous short names.
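To test resolution from the MQ server host, it helps to split the CONNAME value into host and port first. The following is a rough sketch, not an official parser: the helper names `parse_conname` and `resolve` are hypothetical, and it assumes the common forms host(port), a bare host (which falls back to the default TCP port 1414), and a comma-separated list of such entries.

```python
from __future__ import annotations

import re
import socket

DEFAULT_MQ_PORT = 1414  # used when CONNAME gives no port in parentheses

# One entry: a host name or IP, optionally followed by (port).
_ENTRY = re.compile(r"^\s*([^()\s,]+)\s*(?:\(\s*(\d+)\s*\))?\s*$")


def parse_conname(conname: str) -> list[tuple[str, int]]:
    """Split a CONNAME string into (host, port) pairs."""
    targets = []
    for entry in conname.split(","):
        match = _ENTRY.match(entry)
        if not match:
            raise ValueError(f"unrecognised CONNAME entry: {entry!r}")
        host, port = match.group(1), match.group(2)
        targets.append((host, int(port) if port else DEFAULT_MQ_PORT))
    return targets


def resolve(host: str, port: int) -> list[str]:
    """Return the sorted, de-duplicated addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})
```

Running `resolve` on the sending server and comparing the addresses with the expected listener IP or VIP shows quickly whether a stale DNS entry is involved.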

Listeners and Ports

    DISPLAY LSSTATUS('TCP.LISTENER') ALL
    DISPLAY LISTENER('TCP.LISTENER') PORT TRPTYPE
    * Sender must match PORT in CONNAME
    DISPLAY CHANNEL('PARIS.TO.LONDON') CONNAME

The listener STATUS must be RUNNING, and its PORT must match the port in the sender's CONNAME. Multiple listeners on different ports require consistent documentation: partners often still connect to 1414 while your new standard is 1415. Cloud images may expose an external port through NAT that differs from the listener's internal PORT attribute. State the external-to-internal mapping explicitly in the firewall ticket.

Firewalls NAT and Ephemeral Ports

An outbound connect from QM_A uses a source port from the ephemeral range on A's host, and the firewall must allow return traffic to that port. Stateful firewalls usually handle this; asymmetric ACLs do not. NAT at the edge changes the source IP that the partner sees, so update any CHLAUTH TYPE(ADDRESSMAP) rules that match on IP address. Channels that sit idle for long periods may hit the firewall session timeout. Channel heartbeats (HBINT) and TCP keepalive (KAINT, where supported) keep sessions alive, so network and MQ teams should set these intervals below the firewall idle timer.
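The alignment between heartbeat interval and firewall idle timer is simple arithmetic, sketched below. The function name and the safety factor of 2 are assumptions chosen for illustration; the right margin depends on the firewall and on how the teams agree to tune it.

```python
def heartbeat_keeps_session_alive(
    hbint_seconds: int,
    firewall_idle_seconds: int,
    safety_factor: float = 2.0,  # assumed margin, not an IBM recommendation
) -> bool:
    """Return True if heartbeats should arrive well before the firewall idle timer expires."""
    if hbint_seconds <= 0:
        # HBINT(0) means no heartbeats flow, so an idle channel sends nothing.
        return False
    return hbint_seconds * safety_factor <= firewall_idle_seconds
```

For example, an HBINT of 300 seconds fits comfortably under a 3600-second idle timer, but not under a 300-second one, where a single delayed heartbeat lets the firewall drop the session.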

VPN MPLS and Cloud Security Groups

  • VPN down—all channels to partner subnet fail together; check VPN monitor first.
  • MPLS routing change—partial reachability; traceroute from MQ host.
  • AWS security group—allow inbound 1414 from partner CIDR only.
  • Kubernetes NetworkPolicy—pod egress to MQ namespace blocked.
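Whether a security group or similar rule covers the sender can be checked offline before raising a ticket. This is a small sketch using Python's standard `ipaddress` module; the function name `source_allowed` is hypothetical, and the addresses in the usage note are examples only. Remember to test the address the partner actually sees, which after NAT may differ from the host's own IP.

```python
from __future__ import annotations

import ipaddress


def source_allowed(source_ip: str, allowed_cidrs: list[str]) -> bool:
    """Return True if source_ip falls inside at least one allowed CIDR block."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        # strict=False accepts CIDRs written with host bits set, e.g. 10.1.1.5/16.
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in allowed_cidrs
    )
```

With an inbound rule for 10.1.0.0/16, a sender at 10.1.1.5 is covered, while the same sender is not covered by a rule for 10.2.2.0/24.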

MTU Fragmentation and Large Messages

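Very large messages sent over a VPN with a low MTU can fail or hang while small test messages succeed. Network teams diagnose this with ping tests that set the don't-fragment flag at increasing packet sizes. MAXMSGL on the channel and the queue must also allow the message size, so both network and MQ limits matter. Do not confuse an MQ message size limit, which is reported as an explicit MQ error, with a network black hole, where packets silently vanish.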
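The packet sizes used in an MTU ping test follow from fixed header sizes. The sketch below assumes IPv4 with no IP or TCP options (20-byte IP header, 8-byte ICMP header, 20-byte TCP header); tunnels and options reduce the usable size further. The function names are illustrative.

```python
IPV4_HEADER_BYTES = 20  # without IP options
ICMP_HEADER_BYTES = 8
TCP_HEADER_BYTES = 20   # without TCP options


def ping_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def tcp_mss_for_mtu(mtu: int) -> int:
    """Largest TCP segment payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
```

For a standard 1500-byte MTU this gives a 1472-byte ping payload and a 1460-byte MSS. If a don't-fragment ping of 1472 bytes fails across the VPN while a smaller one succeeds, the path MTU is lower than 1500 and large MQ transmissions may be affected.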

Load Balancers and Proxies

Layer-4 passthrough preserves the MQ protocol end to end. TLS termination at the balancer requires the balancer to present certificates that partners trust, and it may break mutual TLS unless the design is reworked. Sticky sessions help when multiple queue manager instances sit behind a VIP. Cluster and multi-instance designs need an architecture review before a load balancer is inserted.

Evidence Package for Network Teams

  1. Source and destination IP address and port, taken from DISPLAY CHSTATUS and CONNAME.
  2. Exact UTC timestamp of the failed attempt.
  3. TCP test result from the MQ server host, not from a laptop.
  4. DISPLAY LSSTATUS output for the listener on the receiver.
  5. Whether the channel uses TLS or plain TCP, according to its TRPTYPE and SSLCIPH.
  6. Packet capture snippet, if policy allows.
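One way to keep the evidence consistent between incidents is to collect it in a small record and render it as ticket text. This is only a sketch: the class `ChannelEvidence`, its field names, and the output layout are assumptions for illustration, and the values still have to be gathered by hand or by your own tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ChannelEvidence:
    channel: str
    source_ip: str
    dest_ip: str
    dest_port: int
    failed_at: datetime   # must be timezone-aware
    tcp_test: str         # result of the connect test run on the MQ host
    listener_status: str  # STATUS value from DISPLAY LSSTATUS on the receiver
    sslciph: str = ""     # a blank SSLCIPH means the channel runs without TLS

    def to_ticket_text(self) -> str:
        """Render the evidence as plain text with the timestamp normalised to UTC."""
        if self.failed_at.tzinfo is None:
            raise ValueError("failed_at must be timezone-aware")
        utc = self.failed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        security = f"TLS ({self.sslciph})" if self.sslciph else "plain TCP"
        lines = [
            f"Channel: {self.channel}",
            f"Path: {self.source_ip} -> {self.dest_ip}:{self.dest_port} (TCP)",
            f"Failed at (UTC): {utc}",
            f"TCP test from MQ host: {self.tcp_test}",
            f"Listener status on receiver: {self.listener_status}",
            f"Transport security: {security}",
        ]
        return "\n".join(lines)
```

Forcing the timestamp to UTC avoids the common back-and-forth where the network team searches firewall logs in the wrong time zone.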

Explainer: Phone Line Versus Conversation

MQ messages are the conversation. TCP is the phone line. Network issues mean the phones cannot connect—fixing what you say in the conversation (application code) does not help until the line works.

Explain Like I'm Five: Network Issues

You are trying to call your friend but the phone wires between your houses are broken—so your message never gets there no matter how loud you shout into the phone.

Practice Exercises

Exercise 1

Write a firewall request ticket for a sender (SDR) channel from 10.1.1.5 to the listener at 10.2.2.5:1414, including return traffic.

Exercise 2

Compare the symptoms of connection refused, timeout, and reason code 2035 (MQRC_NOT_AUTHORIZED).

Exercise 3

List the checks to make when the CONNAME hostname resolves with nslookup but the channel is still in RETRYING status.

Frequently Asked Questions

Test Your Knowledge

1. CONNAME specifies:

  • Host and port to connect
  • Queue depth
  • COBOL program
  • JCL class

2. Connection refused usually means:

  • No listener on port
  • 2035
  • Successful put
  • DLQ only

3. XMITQ grows during outage because:

  • Messages wait for channel
  • All messages deleted
  • TLS only
  • No persistence

4. DNS failure affects:

  • CONNAME hostname resolution
  • MAXDEPTH
  • DEFBIND
  • TRIGTYPE
Read time: 20 min
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation