REPRO is deliberately simple: one reader, one writer, deterministic record movement. Real projects still say “merge” because business language conflates combining two files with copying one file into another while overwriting duplicates. On z/OS, those richer patterns almost always become a small pipeline: SORT steps to order and merge streams, temporary datasets to hold intermediate results, optional key windows on REPRO, and REPLACE semantics when you truly intend overlays. This page explains how to think about merges without expecting REPRO to become a database engine, how “inline” datasets fit into jobs, and when to reach for SORT or application code instead of another REPRO flag you misremembered from a forum post.
Treat REPRO like a pipe between two ends. It can subset by key or RBA range, skip leading records, cap counts, and replace duplicates per policy, but it does not understand business rules like “take the newest address per customer from two sources with different columns.” When stakeholders say merge, translate their sentence: if they mean union-sort-load, use SORT. If they mean overlay existing keys with a nightly file, consider REPRO REPLACE after proving ordering. If they mean survivorship rules, you need a program or ETL tool.
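As a sketch of the mechanical knobs REPRO does offer (dataset names and values here are illustrative, not from any real job):

```jcl
//COPY     EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* SKIP drops leading records, COUNT caps the copy,  */
  /* REPLACE overlays records whose keys already exist */
  REPRO INDATASET(PROD.FEED.DAILY)   -
        OUTDATASET(PROD.CUST.MASTER) -
        SKIP(1) COUNT(500000) REPLACE
/*
```

Everything in that step is positional plumbing; none of it expresses a business rule about which record should win.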
| Pattern | When to use | Caution |
|---|---|---|
| Sort merge then single REPRO | Two sequential extracts must become one ordered stream before KSDS load. | DFSORT MERGE requires sorted inputs; follow with duplicate handling consistent with DEFINE. |
| Sequential REPRO legs into scratch VSAM | Stage vendor files separately, then combine with a final sorted pass. | More steps, but each one is a clean restart boundary, which simplifies recovery design. |
| REPRO with FROMKEY/TOKEY extracts | Splitting a large KSDS into regional targets. | Each leg still obeys ordering inside the selected key range. |
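A hedged sketch of the first table row, assuming two already-sorted extracts keyed on a 10-byte character field starting in column 1 (all dataset names are illustrative):

```jcl
//MERGE    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=PROD.FEED.EAST,DISP=SHR
//SORTIN02 DD DSN=PROD.FEED.WEST,DISP=SHR
//SORTOUT  DD DSN=&&MERGED,DISP=(,PASS),UNIT=SYSDA,
//            SPACE=(CYL,5),DCB=(LRECL=80,RECFM=FB)
//SYSIN    DD *
* MERGE assumes both inputs arrive sorted; SUM FIELDS=NONE
* keeps one record per duplicate key before the KSDS load
  MERGE FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE
/*
//LOAD     EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//MERGED   DD DSN=&&MERGED,DISP=(OLD,DELETE)
//VSOUT    DD DSN=PROD.CUST.MASTER,DISP=OLD
//SYSIN    DD *
  REPRO INFILE(MERGED) OUTFILE(VSOUT)
/*
```

Whether you drop duplicates in SORT or let REPRO REPLACE handle them is a policy decision; pick one and keep it consistent with the cluster's DEFINE.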
Instream data appears directly in the JCL stream. It is convenient for tiny character samples and classroom jobs. For production VSAM loads containing packed fields or binary keys, prefer a sequential dataset created by a trusted upstream system. If you must embed test rows, keep LRECL and code page implications in mind—FTP and editor conversions love to corrupt binary silently.
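For a classroom-sized character sample only, an instream load might look like this (fixed 80-byte rows keyed on the first six characters; names and values are invented for illustration):

```jcl
//LOADTST  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//INREC    DD *
000100SMITH   EAST
000200JONES   WEST
/*
//VSOUT    DD DSN=TEST.CUST.KSDS,DISP=OLD
//SYSIN    DD *
  REPRO INFILE(INREC) OUTFILE(VSOUT)
/*
```

This works precisely because the rows are plain displayable characters; the moment a packed decimal or binary key enters the picture, the instream approach becomes a code-page accident waiting to happen.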
A job might SORT to &&TEMP1, then REPRO from &&TEMP1 into the VSAM cluster. The double ampersand names a temporary catalog entry deleted at successful job end unless your site policy retains them. This pattern keeps the job self-contained and avoids polluting master catalogs with scratch DSNs you forget to delete.
```jcl
//TEMP1   DD DSN=&&TEMP1,DISP=(,PASS),UNIT=SYSDA,SPACE=(CYL,5),
//           DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)
//* ... write sorted rows to &&TEMP1 ...
//VSOUT   DD DISP=OLD,DSN=PROD.CUST.MASTER
//SYSIN   DD *
  REPRO INFILE(TEMP1) OUTFILE(VSOUT) REPLACE
/*
```
REPLACE is powerful for rerunning a nightly file into a KSDS when keys represent natural business identifiers and the file is authoritative. It is dangerous when two feeds accidentally reuse keys for different meanings. Pair REPLACE decisions with data stewardship: who owns the key namespace, what happens on partial job failure, and whether audit requires row-level change history instead of blind overwrite.
FROMKEY and TOKEY let you carve a region of a large KSDS into a smaller extract for testing or regional distribution. Each extract still needs correct ordering within the range if you reload into another KSDS. For ESDS, the analogous operands are FROMADDRESS and TOADDRESS, which work on byte addresses (RBAs); for RRDS, FROMNUMBER and TONUMBER, which work on relative record numbers. Naming these operands correctly in run books prevents on-call engineers from guessing whether the failing job used keys or RBAs.
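A sketch of a keyed extract, with the ESDS and RRDS analogues noted in comments (cluster names and key values are illustrative):

```jcl
//XTRACT   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* KSDS: carve one key range into a regional extract */
  REPRO INDATASET(PROD.CUST.MASTER) -
        OUTDATASET(TEST.CUST.EAST)  -
        FROMKEY(100000) TOKEY(199999)
  /* ESDS analogue: FROMADDRESS(rba) TOADDRESS(rba)    */
  /* RRDS analogue: FROMNUMBER(nnn)  TONUMBER(nnn)     */
/*
```

Writing the exact operand names into the run book is cheap insurance; "the range options" is not a phrase you want in an incident bridge at 3 a.m.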
REPRO is one funnel pouring one pitcher of juice into one bottle. If you want to mix apple juice and orange juice first, you need a bowl and a spoon—that is SORT. “Inline” is pouring a tiny taste straight from a paper cup you taped to the fridge door: fine for a sip, silly for a gallon. Big restaurants use big pitchers on the counter, not sticky notes with juice drawn in crayon.
1. You must combine two sorted sequential files into one sorted stream for KSDS load. Which component is the usual workhorse?
2. Why are instream DD * datasets uncommon for production VSAM loads?
3. FROMKEY/TOKEY on REPRO primarily do what?