When you migrate from an older or legacy system to a new one, the data often does not match the new system's expected layout and formats. Record lengths and field positions differ; dates may be 2-digit year or in a different format; codes and status values may use different values. DFSORT can transform legacy data into the format required by the new system: reordering and rebuilding records with INREC and OUTREC, replacing codes with FINDREP or with a lookup table (JOINKEYS), converting dates with built-in date conversion and century windowing, normalizing case and padding, and converting fixed-length to variable-length output when the target expects VB. This page explains how to use DFSORT for legacy system data transformation: record layout changes, code mapping, date conversion, and format normalization so that the output is ready for the new system.
Legacy systems were built with different record layouts, date conventions, and code sets. The new system expects specific column positions, 4-digit years, and standard codes. If you simply copy the old file to the new, loads may fail or data may be misinterpreted. Transformation converts the old layout and values into the new: same information, different format. DFSORT can do much of this in one or a few steps without writing a custom program.
| Technique | DFSORT tool | Purpose |
|---|---|---|
| Record layout | INREC/OUTREC BUILD | Reorder fields, change positions, add constants for target layout |
| Code replacement | FINDREP or JOINKEYS lookup | Map legacy codes to new system codes |
| Date conversion | INREC/OUTREC date conversion | YYMMDD to YYYYMMDD, format normalization |
| Case/padding | Translation, UPPERCASE, edit masks | Normalize text and numeric format |
| FB to VB | OUTFIL FTOV | Produce variable-length output for target system |
The target system expects fields at certain positions and lengths. Use BUILD= in INREC or OUTREC to define the new record. You specify each segment as position, length, format (e.g. 1,20,CH for the first 20 bytes; 21,8,PD for a packed field). You can take fields from any position in the input and place them in the output at new positions. Add constants (e.g. C\' \' for blanks, 1,8,\'DEFAULT\') and use edit masks for numeric fields. INREC runs before the sort and INCLUDE/OMIT, so use INREC when the new layout or converted values must be used for sorting or filtering. Use OUTREC when only the final output needs the new layout.
Legacy systems often used different code values than the new system. For a simple one-to-one replacement (e.g. old "X" to new "ACTIVE"), use FINDREP in INREC or OUTREC: FINDREP=(IN=C\'X\',OUT=C\'ACTIVE\'). You can list multiple IN/OUT pairs. For a many-to-one or table-driven mapping, use a lookup file. The lookup file has two columns: legacy code and new code. Use JOINKEYS: F1 is your data (key = legacy code field), F2 is the lookup (key = legacy code). REFORMAT to output the new code from F2 and the rest of the record from F1. Every record gets the translated code from the lookup. If a legacy code has no match, JOINKEYS UNPAIRED handling determines whether the record is dropped or passed through with a default.
Legacy data often has 2-digit years (YYMMDD or MMDDYY) or other formats. The new system usually expects YYYYMMDD or a standard format. Use DFSORT date conversion in INREC or OUTREC. You specify the input format (e.g. position and format of the legacy date) and the output format (e.g. YYYYMMDD at a new position). Use century windowing so that 2-digit years are interpreted correctly (e.g. 50–99 as 1950–1999, 00–49 as 2000–2049). The date conversion and century windowing tutorials describe the exact control statement syntax. After conversion, all date fields are in one format and sort/compare correctly.
Legacy data may have mixed case or inconsistent padding (e.g. "ABC" vs "abc", or "7" vs "007"). The new system may expect uppercase and leading zeros. Use translation (TR=) or UPPERCASE in INREC/OUTREC to normalize case. Use edit masks or BUILD with fixed-length output to normalize numeric padding. The data normalization tutorial covers these in detail. Normalize in INREC if the sort key or INCLUDE/OMIT uses these fields; otherwise OUTREC is enough.
If the legacy system produced fixed-length records (RECFM=FB) and the target expects variable-length (RECFM=VB), use OUTFIL with FTOV. Build the data portion with BUILD= so that you only include the needed bytes (e.g. trim trailing spaces). DFSORT writes each record with an RDW followed by the data; the output DD must be allocated with RECFM=VB. So the transformation step can both change layout and convert FB to VB. See the fixed-to-variable conversion tutorial for syntax.
123456789INREC BUILD=(1,10, /* key - move to 1-10 */ 11,20, /* name - move to 11-30 */ 31,2, /* legacy status */ C' ', /* filler for new code at 33-34 */ 35,8, /* legacy date YYMMDD */ ...) OUTREC FINDREP=(IN=(33,2,C'01'),OUT=(33,2,C'A1'), IN=(33,2,C'02'),OUT=(33,2,C'A2')), BUILD=(1,34,35,8,DATE1=(Y2Y,CE1),...) /* expand date */
This is illustrative. INREC builds a record with key, name, legacy status, filler, and legacy date. OUTREC applies FINDREP to replace legacy status codes 01/02 with new codes A1/A2, then BUILD includes a date conversion (DATE1= with century) so the 8-byte legacy date becomes a 4-digit year format. Exact syntax (DATE1=, CE1, etc.) is product-dependent; see your manual.
When one step is too complex, split the work. Step 1: normalize and convert dates, write to &&WORK. Step 2: join with lookup for code translation, write to &&WORK2. Step 3: final BUILD and FTOV if needed, write to target. Each step is easier to test and maintain. Use temporary datasets between steps as in batch processing automation.
Imagine the old system wrote labels in a different language and different order (name first, then number). The new system wants the number first and the name in capital letters. DFSORT is like a helper that rewrites each label: it moves the number to the front, writes the name in capitals, and replaces old words with new words using a dictionary (lookup table). At the end, every label looks the way the new system expects.
1. What is the main use of INREC and OUTREC in legacy data transformation?
2. When would you use FINDREP in a legacy transformation?
3. How can you map legacy codes to new codes when the mapping is not a simple find-replace?
4. Why normalize dates when moving data from a legacy system?
5. What is the role of fixed-to-variable (FTOV) in legacy transformation?