Copying, merge-like patterns, and inline data with REPRO

REPRO is deliberately simple: one reader, one writer, deterministic record movement. Real projects still say “merge” because business language conflates combining two files with copying one file into another while overwriting duplicates. On z/OS, those richer patterns almost always become a small pipeline: SORT steps to order and merge streams, temporary datasets to hold intermediate results, optional key windows on REPRO, and REPLACE semantics when you truly intend overlays. This page explains how to think about merges without expecting REPRO to become a database engine, how “inline” datasets fit into jobs, and when to reach for SORT or application code instead of another REPRO flag you misremembered from a forum post.

REPRO as a copy primitive, not a join engine

Treat REPRO like a pipe between two ends. It can subset by key or RBA range, skip leading records, cap counts, and replace duplicates per policy, but it does not understand business rules like “take the newest address per customer from two sources with different columns.” When stakeholders say merge, translate their sentence: if they mean union-sort-load, use SORT. If they mean overlay existing keys with a nightly file, consider REPRO REPLACE after proving ordering. If they mean survivorship rules, you need a program or ETL tool.
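The knobs REPRO does offer can be sketched in a minimal IDCAMS step (all dataset and DD names here are placeholders): SKIP drops leading records, COUNT caps how many are copied, and REPLACE overwrites duplicate keys.

```text
//SUBSET   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//IN       DD DSN=PROD.EXTRACT.DAILY,DISP=SHR
//OUT      DD DSN=PROD.CUST.MASTER,DISP=OLD
//SYSIN    DD *
  REPRO INFILE(IN) OUTFILE(OUT) -
        SKIP(100) COUNT(5000) REPLACE
/*
```

Nothing in that list resembles a join or survivorship rule, which is exactly the point of the translation exercise above.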

Merge patterns that survive audits

Common multi-source patterns (illustrative)
  • Sort, merge, then single REPRO. When to use: two sequential extracts must become one ordered stream before a KSDS load. Caution: DFSORT MERGE requires sorted inputs; follow with duplicate handling consistent with the DEFINE.
  • Sequential REPRO legs into scratch VSAM. When to use: stage vendor files separately, then combine with a final sorted pass. Caution: more steps, but the restart boundaries are clearer and the restart design is easier.
  • REPRO with FROMKEY/TOKEY extracts. When to use: splitting a large KSDS into regional targets. Caution: each leg still obeys ordering inside the selected key range.
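The first pattern can be sketched with a DFSORT MERGE step; the dataset names and the 10-byte key starting in column 1 are assumptions for illustration, not a prescription.

```text
//MERGE1   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=VENDOR.A.EXTRACT,DISP=SHR
//SORTIN02 DD DSN=VENDOR.B.EXTRACT,DISP=SHR
//SORTOUT  DD DSN=&&MERGED,DISP=(,PASS),UNIT=SYSDA,
//            SPACE=(CYL,(10,5)),DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
* Both inputs must already be in key order.
  MERGE FIELDS=(1,10,CH,A)
* Keep the first record per key, drop later duplicates.
  SUM FIELDS=NONE
/*
```

A following REPRO step then loads &&MERGED into the cluster, with duplicate policy matching whatever SUM (or its absence) decided here.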

Instream and temporary datasets

Instream DD *

Instream data appears directly in the JCL stream. It is convenient for tiny character samples and classroom jobs. For production VSAM loads containing packed fields or binary keys, prefer a sequential dataset created by a trusted upstream system. If you must embed test rows, keep LRECL and code page implications in mind—FTP and editor conversions love to corrupt binary silently.
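For the classroom case the paragraph allows, an instream sample might look like this (TEST.CUST.KSDS and the record layout are invented for illustration):

```text
//LOADTEST EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SAMPLE   DD *
CUST0001 SMITH JOHN
CUST0002 JONES MARY
/*
//VSOUT    DD DSN=TEST.CUST.KSDS,DISP=OLD
//SYSIN    DD *
  REPRO INFILE(SAMPLE) OUTFILE(VSOUT)
/*
```

Note that every record here is plain printable text; the moment a key contains packed or binary bytes, this style stops being trustworthy.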

Temporary datasets (&&dsn)

A job might SORT to &&TEMP1, then REPRO from &&TEMP1 into the VSAM cluster. The double ampersand names a temporary dataset that the system deletes at job termination unless your site policy retains it. This pattern keeps the job self-contained and avoids polluting master catalogs with scratch DSNs you forget to delete.

//SORT1   EXEC PGM=SORT
//SORTOUT DD DSN=&&TEMP1,DISP=(,PASS),UNIT=SYSDA,
//           SPACE=(CYL,(5,5)),
//           DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)
//* ... SORT control statements write sorted rows to &&TEMP1 ...
//COPY1   EXEC PGM=IDCAMS
//TEMP1   DD DSN=&&TEMP1,DISP=(OLD,DELETE)
//VSOUT   DD DSN=PROD.CUST.MASTER,DISP=OLD
//SYSIN   DD *
  REPRO INFILE(TEMP1) OUTFILE(VSOUT) REPLACE
/*

REPLACE as controlled overlay

REPLACE is powerful for rerunning a nightly file into a KSDS when keys represent natural business identifiers and the file is authoritative. It is dangerous when two feeds accidentally reuse keys for different meanings. Pair REPLACE decisions with data stewardship: who owns the key namespace, what happens on partial job failure, and whether audit requires row-level change history instead of blind overwrite.
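The contrast is easy to show side by side. In this hedged sketch, NIGHTLY and MASTER are placeholder DD names; without REPLACE, IDCAMS treats each duplicate key as an error and gives up on the command after a few of them, while with REPLACE the incoming record silently overwrites the existing one.

```text
//SYSIN   DD *
  /* Default: duplicate keys are errors; the copy stops */
  /* after a small number of them.                      */
  REPRO INFILE(NIGHTLY) OUTFILE(MASTER)

  /* Overlay: duplicate keys overwrite existing records */
  /* -- only when the feed owns the key namespace.      */
  REPRO INFILE(NIGHTLY) OUTFILE(MASTER) REPLACE
/*
```

Run one or the other, not both; they are shown together only to make the policy difference visible in the run book.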

Key-range extracts as “logical copy”

FROMKEY and TOKEY let you carve a region of a large KSDS into a smaller extract for testing or regional distribution. Each extract still needs correct ordering within the range if you reload into another KSDS. For an ESDS, the analogous operands are FROMADDRESS and TOADDRESS, which work on byte addresses (RBAs); for an RRDS, FROMNUMBER and TONUMBER work on relative record numbers. Naming these operands correctly in run books prevents on-call engineers from guessing whether the failing job used keys or RBAs.
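Sketched per cluster type, with placeholder DD names and fake key, RBA, and record-number values:

```text
  /* KSDS: primary key window                        */
  REPRO INFILE(BIGKSDS) OUTFILE(REGION1) -
        FROMKEY(CUST1000) TOKEY(CUST1999)

  /* ESDS: byte-address (RBA) window                 */
  REPRO INFILE(LOGESDS) OUTFILE(SLICE1) -
        FROMADDRESS(0) TOADDRESS(409599)

  /* RRDS: relative record number window             */
  REPRO INFILE(BIGRRDS) OUTFILE(SUBSET1) -
        FROMNUMBER(1) TONUMBER(5000)
```

Putting the exact operand pair in the run book, next to the cluster type, removes the guesswork the paragraph warns about.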

Operational checklist for multi-step pipelines

  • Restart boundaries: Document which step reruns after each abend code.
  • Space peaks: Temporary datasets may need as much space as the full merged stream.
  • Catalog contention: Parallel jobs copying into related clusters can serialize on user catalogs; schedule consciously.
  • Message archives: Keep SORT and IDCAMS SYSPRINT together in the ticket for forensic timelines.

Practice exercises

  1. Sketch a three-step JCL diagram: two extracts, SORT MERGE, REPRO into KSDS.
  2. Explain when you would refuse REPLACE and insist on DELETE+DEFINE instead.
  3. Write a FROMKEY/TOKEY example comment block for a regional extract (use fake keys).
  4. List two failure modes specific to temporary datasets that disappear before debugging finishes.

Explain like I'm five

REPRO is one funnel pouring one pitcher of juice into one bottle. If you want to mix apple juice and orange juice first, you need a bowl and a spoon—that is SORT. “Inline” is pouring a tiny taste straight from a paper cup you taped to the fridge door: fine for a sip, silly for a gallon. Big restaurants use big pitchers on the counter, not sticky notes with juice drawn in crayon.

Test your knowledge

1. You must combine two sorted sequential files into one sorted stream for KSDS load. Which component is the usual workhorse?

  • IEFBR14 alone
  • DFSORT MERGE or equivalent sort product
  • LISTCAT
  • DISPLAY

2. Why are instream DD * datasets uncommon for production VSAM loads?

  • IBM forbids them
  • Binary and packed data do not survive line-oriented JCL instream comfortably; sequential files are safer
  • They are always faster
  • They replace the catalog

3. FROMKEY/TOKEY on REPRO primarily do what?

  • Sort the file
  • Limit the copy to a primary key range on a KSDS
  • Define a cluster
  • Change CI size
Published
Read time: 12 min
Author: MainframeMaster
Reviewed by: MainframeMaster team
Verified: IBM REPRO single-input model; multi-source via SORT
Sources: IBM z/OS DFSMS Access Method Services; DFSORT guides
Applies to: z/OS VSAM and sequential copy pipelines