Initial load is the moment an empty VSAM cluster becomes a populated production file—or the moment a conversion project proves it can. Unlike a tiny test file where you type six records by hand, real loads move millions of rows from sequential extracts, legacy dumps, or another VSAM cluster into a freshly defined structure. The mechanical tool is almost always IDCAMS REPRO, but the engineering work is everything around REPRO: proving the cluster attributes, sorting keyed input, choosing REPLACE semantics, reconciling counts, and making sure the catalog story matches what applications will open Monday morning. This page focuses on that surrounding discipline so beginners understand why “just run REPRO” is never the whole job.
Initial load establishes the first complete population of records (and for a KSDS, the index structure that mirrors those keys). It usually happens once per major release, after a full redefine, or during platform migration. Steady-state processing afterward might insert or update records through COBOL programs, CICS transactions, or smaller batch extracts. Because initial load often runs with elevated authority and tight downtime windows, change tickets should list prerequisites explicitly: sort product level, temporary dataset space, restart checkpoints, and back-out steps if REPRO abends halfway.
A KSDS keeps records ordered by the primary key. REPRO writes sequentially and expects the input stream in ascending key order so the index can be built consistently as records arrive. If your extract is keyed on customer number but sorted by load timestamp, you must re-sort by customer number before REPRO. The primary key of a KSDS must be unique; the UNIQUEKEY and NONUNIQUEKEY attributes belong to alternate indexes, not to the base cluster. If the extract can carry duplicate primary keys, decide before the load which record should survive and remove the others in the sort or a preceding step, because REPRO without REPLACE rejects the duplicates. Skipping the sort because “the data looked sorted” is a common source of weekend pages.
```jcl
//* Illustrative pattern: sort the extract by key, then load the cluster
//S001     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DISP=SHR,DSN=extract.seq
//* Temporary data set is passed to the load step, not cataloged
//SORTOUT  DD DISP=(NEW,PASS),DSN=&&SORTED,
//            SPACE=(CYL,(50,10)),UNIT=SYSDA
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
//R001     EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//IN1      DD DISP=(OLD,DELETE),DSN=&&SORTED
//OUTVS    DD DISP=OLD,DSN=PROD.CUST.MASTER
//SYSIN    DD *
  REPRO INFILE(IN1) OUTFILE(OUTVS)
/*
```
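One way to keep unsorted data away from the cluster is to make the load step conditional on the sort step's return code. The following is a minimal sketch, assuming the step names S001 and R001 from the job above; the statement names CHKSORT and ENDCHK are hypothetical, and site standards may prefer COND parameters or scheduler-level dependencies instead.

```jcl
//* Sketch only: run the load step solely when the sort ended with RC=0.
//* CHKSORT and ENDCHK are hypothetical statement names.
//CHKSORT  IF (S001.RC = 0) THEN
//R001     EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//IN1      DD DISP=(OLD,DELETE),DSN=&&SORTED
//OUTVS    DD DISP=OLD,DSN=PROD.CUST.MASTER
//SYSIN    DD *
  REPRO INFILE(IN1) OUTFILE(OUTVS)
/*
//ENDCHK   ENDIF
```

A guard like this protects a normal run from top to bottom. It does not replace a documented restart procedure, which still has to say which step a rerun may begin from.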
Entry-sequenced loads preserve arrival order. You still care about record length and block size compatibility, but you do not sort by a VSAM primary key because there is none. Logical delete flags or reorganization strategies may still apply depending on application design.
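An entry-sequenced load can therefore be as small as a single IDCAMS step with no sort in front of it. The sketch below assumes two hypothetical cataloged data set names and uses the INDATASET/OUTDATASET form, which lets IDCAMS allocate both data sets dynamically.

```jcl
//* Sketch only: load an ESDS straight from the extract, with no sort step.
//* Both data set names are hypothetical.
//LOADES   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  REPRO INDATASET(PROD.AUDIT.EXTRACT) -
        OUTDATASET(PROD.AUDIT.ESDS)
/*
```

Records land in the ESDS in the order they appear in the extract, so any ordering the application relies on must already be present in the input.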
Relative record datasets load by slot. Input ordering interacts with SKIP/COUNT and with how your application maps business keys to RRNs. Initial load jobs should document which RRNs are intentionally empty so later online programs do not treat them as errors.
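As a rough illustration of SKIP and COUNT on a relative record load, the sketch below bypasses the first 100 input records and copies at most 5000 after that. The data set names and both numeric values are hypothetical. How the copied records map to relative record numbers depends on the source type and should be confirmed against the IBM documentation for your release.

```jcl
//* Sketch only: skip 100 leading input records, then copy at most 5000.
//* Data set names and the SKIP/COUNT values are hypothetical.
//LOADRR   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//IN1      DD DISP=SHR,DSN=PROD.SLOT.EXTRACT
//OUTRR    DD DISP=OLD,DSN=PROD.SLOT.RRDS
//SYSIN    DD *
  REPRO INFILE(IN1) OUTFILE(OUTRR) -
        SKIP(100) COUNT(5000)
/*
```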
| Phase | What to verify |
|---|---|
| Prove the empty cluster | LISTCAT ALL; confirm RECORDSIZE, KEYS, volumes, and free space align with expected input LRECL and key layout. |
| Prepare sequential input | Validate record length, key field alignment, and character set. For KSDS, sort by primary key ascending and resolve duplicate primary keys before the load. |
| Run REPRO | Use INFILE/OUTFILE or INDATASET/OUTDATASET consistently with site standards; add REPLACE only when policy allows overlay of duplicate keys. |
| Post-load verification | LISTCAT again, run record-count reconciliation, execute a read-only sample program or utility, and archive SYSPRINT. |
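The first and last phases in the table can be sketched as two small IDCAMS steps wrapped around the load. This is an illustration under assumptions, not a site standard: the step names PRECHK and POSTCHK are hypothetical, the cluster name is the one used in the example job, and PRINT stands in here for the read-only sample utility.

```jcl
//* Sketch only: catalog check before the load.
//PRECHK   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES(PROD.CUST.MASTER) ALL
/*
//* ... sort and REPRO steps run here ...
//* Sketch only: catalog check and read-only sample after the load.
//POSTCHK  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES(PROD.CUST.MASTER) ALL
  PRINT INDATASET(PROD.CUST.MASTER) CHARACTER COUNT(10)
/*
```

For reconciliation, the REC-TOTAL statistic in the second LISTCAT can be compared with the record count the sort reported and with the count REPRO wrote to SYSPRINT.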
REPLACE tells REPRO it may overlay records in the target when keys or RRNs collide. That is powerful during reruns of a load job after a failure, but dangerous if two different business feeds accidentally share keys. REUSE is a different mechanism: on REPRO it asks for the target to be opened as a reusable dataset and loaded from the beginning, which requires the cluster to have been defined with the REUSE attribute. Confirm the exact behavior in the IBM documentation for your release. Security and audit teams may require DELETE+DEFINE instead of overlay semantics for certain datasets. Follow governance, not personal taste.
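A rerun with overlay semantics might look like the sketch below. It assumes the sorted input was kept in a cataloged data set, here given the hypothetical name PROD.CUST.SORTED, so that the rerun does not depend on a temporary data set from the failed job.

```jcl
//* Sketch only: rerun of the load step after a partial load.
//* REPLACE overlays target records whose keys match the input.
//* PROD.CUST.SORTED is a hypothetical cataloged copy of the sorted input.
//R001     EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//IN1      DD DISP=SHR,DSN=PROD.CUST.SORTED
//OUTVS    DD DISP=OLD,DSN=PROD.CUST.MASTER
//SYSIN    DD *
  REPRO INFILE(IN1) OUTFILE(OUTVS) REPLACE
/*
```

Where policy forbids overlays, the same step would run without REPLACE against a cluster that has been deleted and redefined first.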
Loading a KSDS is like filling a sticker book where every sticker has a number and the book only works if you put numbers in order. If you jump from 5 to 9 then back to 7, the book’s tabs get confused and the pages tear. Sorting first is your grown-up checking the numbers on the kitchen table before you paste. REPRO is the pasting step; it does not babysit the numbering for you.
1. You load a KSDS from a sequential file. The sort step failed but the operator restarted only the REPRO step. What is the highest risk?
2. Which IDCAMS command typically precedes the first REPRO into a brand-new cluster?
3. Why capture SYSPRINT from the load job?