Log Shipping

Log shipping is how many enterprises move IBM MQ recovery data from a primary datacenter to a disaster recovery (DR) site when they are not using continuous synchronous replication products for every queue manager. The queue manager writes persistent changes to its recovery log. With linear logging on distributed platforms, or log archiving on z/OS, filled log extents are kept as archive logs on disk or tape instead of being overwritten. Shipping copies those files to the backup location by rsync, object storage replication, SAN async mirror, or a mainframe tape pipeline. After primary loss, the backup queue manager replays the shipped logs to reconstruct queues and messages as they existed at the last consistent recovery point.

Beginners hear "logs" and think only of troubleshooting. For DR, logs are the message insurance policy. This tutorial explains circular versus archive logs, shipping schedules, consistency, replay on the backup queue manager, lag and RPO, tooling patterns, z/OS differences, and pitfalls that leave DR queue managers empty or inconsistent.

Circular Logs and Archive Logs

Distributed queue managers write to a set of log extents whose count, size, and location come from the Log stanza of qm.ini (LogPrimaryFiles, LogSecondaryFiles, LogFilePages, and LogPath). While the queue manager runs, it writes all durable puts, gets, transactional commits, and object changes to the current log extent. With circular logging (LogType=CIRCULAR), extents are reused once restart recovery no longer needs them, so there is nothing to archive and nothing to ship. With linear logging (LogType=LINEAR), a filled extent is kept and a new extent becomes active; the filled extents are the archive logs that shipping depends on. Archive logs accumulate until pruned per retention policy. Shipping can target archive logs only (simpler, higher lag) or include a coordinated copy of the active log and queue files during a controlled quiesce (lower lag, requires a brief outage). Check your release documentation for dmpmqlog, rcdmqimg, rcrmqobj, and the backup utilities, because names and options vary slightly by platform.
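
Because archive-based shipping only works with linear logging, it can help to check the log type before building a pipeline. The following is a minimal sketch, assuming the usual qm.ini layout with a "Log:" stanza header at the start of a line and indented LogType=... beneath it. The function names log_type and can_ship_logs are hypothetical, and the qm.ini path in the usage note is an assumption about a default install.

```shell
#!/bin/sh
# Print the LogType value from the Log stanza of a qm.ini file.
log_type() {
    awk '
        # A stanza header is an unindented word followed by a colon.
        /^[A-Za-z]+:/ { in_log = ($1 == "Log:") }
        in_log && /LogType=/ {
            sub(/.*LogType=/, "")
            gsub(/[ \t\r]/, "")
            print
            exit
        }
    ' "$1"
}

# Succeed only when the queue manager uses linear logging.
can_ship_logs() {
    [ "$(log_type "$1")" = "LINEAR" ]
}
```

A call such as can_ship_logs /var/mqm/qmgrs/QM1/qm.ini || echo "QM1 uses circular logging" could then gate the rest of a shipping script.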

Log shipping approaches

Approach | Typical lag | RPO impact | Notes
Archive log copy on schedule | Minutes to hours | Interval-bound | Script after log archive event
Storage array async mirror | Seconds | Small but non-zero | Whole volume; watch consistency groups
Quiesced snapshot | Near zero at snapshot | Low for that point | Brief stop or hold traffic
Product replication (RDQM etc.) | Milliseconds to seconds | Often lowest | HA/DR product, not manual ship

What Gets Shipped

A minimal shipping package includes archive log files and the queue manager log configuration metadata needed for replay. Queue files on distributed platforms, or page sets on z/OS, hold message data; logs describe how to apply changes consistently. DR procedures often ship the entire queue manager data directory tree after a quiesce, or ship logs plus a periodic queue file sync. Your storage vendor's documentation and the IBM documentation for your version govern which combinations are supported. Never mix logs from different queue manager instances or different start times without following media recovery procedures. Label shipments with log range, timestamp, and checksum.
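
The labeling advice can be sketched as a pair of helper functions. This is an illustration only, assuming extents named like S0000123.LOG sit in one staging directory and that GNU sha256sum is available. The function names and the LABEL.txt and SHA256SUMS file names are hypothetical.

```shell
#!/bin/sh
# Write LABEL.txt (log range and UTC timestamp) and SHA256SUMS
# for the extents staged in directory $1.
write_label() {
    (
        cd "$1" || exit 1
        first= last=
        for f in S*.LOG; do
            [ -e "$f" ] || { echo "no extents in $1" >&2; exit 1; }
            [ -n "$first" ] || first=$f
            last=$f
        done
        sha256sum S*.LOG > SHA256SUMS
        printf 'range=%s..%s\ncreated=%s\n' "$first" "$last" \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" > LABEL.txt
    )
}

# Re-check every extent in directory $1 against SHA256SUMS.
verify_shipment() {
    ( cd "$1" && sha256sum -c SHA256SUMS >/dev/null )
}
```

The sender would run write_label on the staging directory, and the DR landing zone would run verify_shipment on the received copy before cataloging it.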

Shipping Pipeline Design

  1. Primary generates archive log—monitor directory growth.
  2. Agent copies new files to secure transfer (SFTP, S3 replication, mainframe tape).
  3. DR landing zone validates integrity and virus-scans if policy requires.
  4. Catalog records highest log sequence applied on backup.
  5. Backup queue manager media recovery applies logs on activation.
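
Steps 1, 2, and 4 above can be sketched as a comparison between the extents on the primary and the highest sequence number already cataloged. This is a minimal sketch, assuming extents follow the S0000123.LOG naming pattern; extent_seq and pending_extents are hypothetical helper names. It does not detect whether the newest extent is still being written, so a real agent would need to exclude the current active extent.

```shell
#!/bin/sh
# Print the numeric sequence of an extent name such as S0000123.LOG.
extent_seq() {
    name=${1##*/}       # drop any directory part
    name=${name#S}      # drop the leading S
    name=${name%.LOG}   # drop the suffix
    # Strip leading zeros; an all-zero name becomes 0.
    echo "$name" | sed 's/^0*//; s/^$/0/'
}

# Print extents in directory $1 whose sequence is greater than $2.
pending_extents() {
    for f in "$1"/S*.LOG; do
        [ -e "$f" ] || continue
        if [ "$(extent_seq "$f")" -gt "$2" ]; then
            echo "${f##*/}"
        fi
    done
}
```

For example, pending_extents /var/mqm/log/QM1/active 122 would list the extents numbered 123 and above, which the agent then hands to the transfer step.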

Automate alerts when shipping falls behind—if the newest archive on DR is four hours older than primary, RPO is four hours. Operations dashboards should show lag in minutes, not only success/fail ping.
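
A lag check of this kind could look like the sketch below. It assumes GNU date (for the -d option) and ISO 8601 UTC timestamps for the newest archive on each side. The function names and the threshold argument are hypothetical.

```shell
#!/bin/sh
# Seconds since the epoch for an ISO 8601 UTC timestamp (GNU date assumed).
to_epoch() { date -u -d "$1" +%s; }

# Whole minutes between primary newest ($1) and DR newest ($2).
lag_minutes() {
    echo $(( ( $(to_epoch "$1") - $(to_epoch "$2") ) / 60 ))
}

# Report lag and return non-zero when it exceeds the threshold $3 (minutes).
check_lag() {
    lag=$(lag_minutes "$1" "$2")
    if [ "$lag" -gt "$3" ]; then
        echo "ALERT lag=${lag}m threshold=${3}m"
        return 1
    fi
    echo "OK lag=${lag}m"
}
```

With the four-hour case described above, check_lag 2026-05-17T18:32:00Z 2026-05-17T14:32:00Z 15 reports a lag of 240 minutes and returns non-zero, which a scheduler or monitoring agent can turn into an alert.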

Replay on the Backup Queue Manager

When DR activates, operators place shipped logs in the paths the backup queue manager expects, then start or restart it with recovery. On distributed platforms, a backup queue manager replays copied extents with strmqm -r and is activated with strmqm -a before a normal start. MQ replays the logs to rebuild in-memory and on-disk state. Restart can take a long time on large hubs, so factor that into RTO. If a range of logs is missing, recovery stops or warns; do not ignore AMQ messages about log integrity. Practice replay quarterly on an isolated DR queue manager to measure duration without touching production.
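
Since a missing range stops recovery, it may be worth checking the landing zone for gaps before starting replay. The sketch below assumes extents named like S0000123.LOG in a single directory; find_gaps is a hypothetical helper name, and it only finds holes between the lowest and highest extent present.

```shell
#!/bin/sh
# Print each sequence number missing between the lowest and
# highest extent found in directory $1.
find_gaps() {
    for f in "$1"/S*.LOG; do
        echo "${f##*/}"
    done |
    sed -n 's/^S0*\([0-9][0-9]*\)\.LOG$/\1/p' |
    sort -n |
    awk 'NR > 1 { for (n = prev + 1; n < $1; n++) print n } { prev = $1 }'
}
```

Empty output means the range is contiguous. Any number printed identifies an extent to fetch again from the primary or from archive storage before replay is attempted.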

RPO and Lag

The achievable recovery point objective (RPO) equals the worst-case gap between the last shipped log and the failure moment, plus any in-flight transactions not yet committed to the log. Fifteen-minute shipping implies an RPO of about fifteen minutes plus transfer time, unless you also mirror active logs. Non-persistent messages may never appear on DR, so state that in the plan. Syncpoint boundaries matter: in-doubt transactions after replay need the same transaction manager coordination as in HA failover scenarios.

z/OS Log Shipping

z/OS queue managers use the bootstrap data set (BSDS) to catalog log data sets, and page sets to store messages. DR may copy archive logs to tape and restore them on a DR LPAR, or use storage replication for the log and page set volumes. Coupling facility queue sharing groups add complexity, because member recovery differs from standalone queue manager DR. Work with the IBM z/OS MQ documentation for your release; terminology includes archive log data sets, copy bundles, and recovery procedures in the operations guide. RACF and SMF may audit who mounted DR volumes.

Security and Compliance

Logs contain message payloads for persistent traffic, so encrypt shipping paths (SFTP over SSH, TLS for other transfers, encrypted buckets). Restrict DR landing zone access to break-glass roles. Retention on DR must meet legal hold requirements without exceeding storage budgets; automate pruning after a successful DR test apply where it is safe to do so.

Common Pitfalls

  • Shipping logs but never testing replay—discover corruption during real disaster.
  • Clock skew breaking file ordering—use NTP on primary and DR.
  • Primary continues writing after DR promotion—split brain if both restart.
  • Insufficient archive retention on primary—logs pruned before DR copy arrives.
  • Endianness or platform mismatch: confirm that IBM supports the platform pair (for example, Linux primary to Windows DR) before relying on it.

Tutorial: Monitor Archive Lag

```shell
# Primary: list newest archive (paths vary by install)
ls -lt /var/mqm/log/QM1/active/ | head

# DR: compare highest RBA or log number cataloged
# dr-log-catalog.txt should record:
#   LAST_SHIPPED_ARCHIVE=S0000123.LOG
#   LAST_SHIPPED_TIME=2026-05-17T14:32:00Z

# Alert if primary newest minus DR newest > 15 minutes

# On DR test: place logs, strmqm QM1_DR, review AMQ7xxx recovery messages
```
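
The catalog file mentioned in the tutorial could be maintained with two small helpers. This is a sketch under the assumption that the catalog is a plain KEY=VALUE text file with the two keys shown; update_catalog and catalog_get are hypothetical names. Writing to a temporary file and renaming it keeps a reader from seeing a half-written catalog.

```shell
#!/bin/sh
# Record the newest shipped extent ($2) and its timestamp ($3)
# in catalog file $1, replacing the previous contents.
update_catalog() {
    tmp="$1.tmp.$$"
    printf 'LAST_SHIPPED_ARCHIVE=%s\nLAST_SHIPPED_TIME=%s\n' "$2" "$3" > "$tmp" &&
        mv "$tmp" "$1"
}

# Print the value stored for key $2 in catalog file $1.
catalog_get() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}
```

After each verified transfer, the shipping job would call update_catalog dr-log-catalog.txt with the extent name and time, and the monitoring job would read both values back with catalog_get.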

Explainer: Photocopying the Diary

The queue manager keeps a diary of every important change. Log shipping photocopies new diary pages to a safe house. If the original diary burns, the backup copy lets you reconstruct what happened—up to the last page you copied.

Explain Like I'm Five

Every time you finish a chapter in your coloring book, Mom takes a picture and sends it to Grandma's house. If our house floods, Grandma has pictures of all chapters up to the last picture—maybe not the page you were coloring right when the flood came.

Practice Exercises

Exercise 1

Calculate RPO if archive logs ship every ten minutes and primary fails five minutes after a successful ship.

Exercise 2

Design an alert for shipping lag using log sequence numbers.

Exercise 3

Contrast log shipping with RDQM replication in three bullet points.

Frequently Asked Questions

Test Your Knowledge

1. Log shipping moves:

  • Log data to DR for replay
  • Only topic strings
  • JCL job cards
  • TCP buffers only

2. Archive logs are created when:

  • An active log extent fills and is archived
  • Channel binds
  • Listener starts
  • Topic is published

3. Longer shipping interval generally means:

  • Worse RPO
  • Better RPO
  • No effect
  • Faster RTO always

4. Backup QM recovery uses:

  • Replay of shipped logs
  • Only DISPLAY QLOCAL
  • Deleting BSDS
  • Disabling TLS
Published
Read time: 21 min
Author: MainframeMaster
Verified: IBM MQ 9.3 documentation