How do I transform legacy data layout with DFSORT?

Use INREC or OUTREC to build a new record: specify field positions and lengths from the input, reorder them, add constants, convert dates (DATE1, JFY, etc.), and edit numeric fields. The output record layout matches the target system. Use FINDREP for simple code replacements and JOINKEYS with a lookup table for code translation.

How do I replace legacy codes with new system codes?

For a direct replacement (one old value to one new value), use FINDREP=(IN=old,OUT=new) in INREC or OUTREC. For a table-driven mapping, use a lookup file and JOINKEYS: join on the legacy code, REFORMAT to output the new code from the lookup and the rest from the main file.

How do I convert legacy dates to the new format?

Use DFSORT date conversion in INREC or OUTREC. For 2-digit year (YYMMDD) to 4-digit (YYYYMMDD), use the product date conversion with century windowing (e.g. window 1950–2049). Specify input format and output format so all dates are normalized. See the date conversion and century windowing tutorials for syntax.

Can DFSORT change record length (FB to VB) for legacy output?

Yes. Use OUTFIL with FTOV (fixed to variable) so the output is RECFM=VB. Build the data portion with BUILD= (e.g. only needed fields, trimmed). DFSORT writes the RDW and variable-length data. Allocate the output DD with RECFM=VB and appropriate LRECL. See fixed-to-variable conversion.

When should I use INREC vs OUTREC for legacy transformation?

Use INREC when the transformed data must be used for sorting or filtering (INCLUDE/OMIT)—e.g. converted date or new code used as sort key or in conditions. Use OUTREC when only the final output needs the new layout and formats; sort and filter use the original data.

Legacy System Data Transformation with DFSORT

Legacy System Data Transformation

When you migrate from an older or legacy system to a new one, the data often does not match the new system's expected layout and formats. Record lengths and field positions differ; dates may be 2-digit year or in a different format; codes and status values may use different values. DFSORT can transform legacy data into the format required by the new system: reordering and rebuilding records with INREC and OUTREC, replacing codes with FINDREP or with a lookup table (JOINKEYS), converting dates with built-in date conversion and century windowing, normalizing case and padding, and converting fixed-length to variable-length output when the target expects VB. This page explains how to use DFSORT for legacy system data transformation: record layout changes, code mapping, date conversion, and format normalization so that the output is ready for the new system.

Real World Use Cases

Why Transform Legacy Data?

Legacy systems were built with different record layouts, date conventions, and code sets. The new system expects specific column positions, 4-digit years, and standard codes. If you simply copy the old file to the new, loads may fail or data may be misinterpreted. Transformation converts the old layout and values into the new: same information, different format. DFSORT can do much of this in one or a few steps without writing a custom program.

Transformation techniques with DFSORT
Technique	DFSORT tool	Purpose
Record layout	INREC/OUTREC BUILD	Reorder fields, change positions, add constants for target layout
Code replacement	FINDREP or JOINKEYS lookup	Map legacy codes to new system codes
Date conversion	INREC/OUTREC date conversion	YYMMDD to YYYYMMDD, format normalization
Case/padding	Translation, UPPERCASE, edit masks	Normalize text and numeric format
FB to VB	OUTFIL FTOV	Produce variable-length output for target system

Record Layout: INREC and OUTREC

The target system expects fields at certain positions and lengths. Use BUILD= in INREC or OUTREC to define the new record. You specify each segment as position, length, format (e.g. 1,20,CH for the first 20 bytes; 21,8,PD for a packed field). You can take fields from any position in the input and place them in the output at new positions. Add constants (e.g. C\' \' for blanks, 1,8,\'DEFAULT\') and use edit masks for numeric fields. INREC runs before the sort and INCLUDE/OMIT, so use INREC when the new layout or converted values must be used for sorting or filtering. Use OUTREC when only the final output needs the new layout.

Code Replacement: FINDREP and Lookup Tables

Legacy systems often used different code values than the new system. For a simple one-to-one replacement (e.g. old "X" to new "ACTIVE"), use FINDREP in INREC or OUTREC: FINDREP=(IN=C\'X\',OUT=C\'ACTIVE\'). You can list multiple IN/OUT pairs. For a many-to-one or table-driven mapping, use a lookup file. The lookup file has two columns: legacy code and new code. Use JOINKEYS: F1 is your data (key = legacy code field), F2 is the lookup (key = legacy code). REFORMAT to output the new code from F2 and the rest of the record from F1. Every record gets the translated code from the lookup. If a legacy code has no match, JOINKEYS UNPAIRED handling determines whether the record is dropped or passed through with a default.

Date Conversion

Legacy data often has 2-digit years (YYMMDD or MMDDYY) or other formats. The new system usually expects YYYYMMDD or a standard format. Use DFSORT date conversion in INREC or OUTREC. You specify the input format (e.g. position and format of the legacy date) and the output format (e.g. YYYYMMDD at a new position). Use century windowing so that 2-digit years are interpreted correctly (e.g. 50–99 as 1950–1999, 00–49 as 2000–2049). The date conversion and century windowing tutorials describe the exact control statement syntax. After conversion, all date fields are in one format and sort/compare correctly.

Normalizing Case and Padding

Legacy data may have mixed case or inconsistent padding (e.g. "ABC" vs "abc", or "7" vs "007"). The new system may expect uppercase and leading zeros. Use translation (TR=) or UPPERCASE in INREC/OUTREC to normalize case. Use edit masks or BUILD with fixed-length output to normalize numeric padding. The data normalization tutorial covers these in detail. Normalize in INREC if the sort key or INCLUDE/OMIT uses these fields; otherwise OUTREC is enough.

Fixed to Variable (FTOV)

If the legacy system produced fixed-length records (RECFM=FB) and the target expects variable-length (RECFM=VB), use OUTFIL with FTOV. Build the data portion with BUILD= so that you only include the needed bytes (e.g. trim trailing spaces). DFSORT writes each record with an RDW followed by the data; the output DD must be allocated with RECFM=VB. So the transformation step can both change layout and convert FB to VB. See the fixed-to-variable conversion tutorial for syntax.

Example: layout and code transformation

text

1
2
3
4
5
6
7
8
9
  INREC BUILD=(1,10,           /* key - move to 1-10 */
            11,20,          /* name - move to 11-30 */
            31,2,           /* legacy status */
            C'  ',          /* filler for new code at 33-34 */
            35,8,           /* legacy date YYMMDD */
            ...)
  OUTREC FINDREP=(IN=(33,2,C'01'),OUT=(33,2,C'A1'),
                  IN=(33,2,C'02'),OUT=(33,2,C'A2')),
          BUILD=(1,34,35,8,DATE1=(Y2Y,CE1),...)  /* expand date */

This is illustrative. INREC builds a record with key, name, legacy status, filler, and legacy date. OUTREC applies FINDREP to replace legacy status codes 01/02 with new codes A1/A2, then BUILD includes a date conversion (DATE1= with century) so the 8-byte legacy date becomes a 4-digit year format. Exact syntax (DATE1=, CE1, etc.) is product-dependent; see your manual.

Multi-Step Transformation

When one step is too complex, split the work. Step 1: normalize and convert dates, write to &&WORK. Step 2: join with lookup for code translation, write to &&WORK2. Step 3: final BUILD and FTOV if needed, write to target. Each step is easier to test and maintain. Use temporary datasets between steps as in batch processing automation.

Best Practices

Document the legacy layout and the target layout (field list, positions, formats) so the BUILD specification is clear and auditable.
Use a lookup table for code mapping when there are many codes or when the mapping may change; keep the table in a dataset and join in DFSORT.
Validate date conversion with known dates (e.g. 991231 → 19991231) and document the century window.
Test with a small extract first; compare record counts and spot-check key fields before full migration.
Consider writing a reject file (OUTFIL with OMIT for records that fail lookup or conversion) so you can fix and reprocess.

Explain It Like I'm Five

Imagine the old system wrote labels in a different language and different order (name first, then number). The new system wants the number first and the name in capital letters. DFSORT is like a helper that rewrites each label: it moves the number to the front, writes the name in capitals, and replaces old words with new words using a dictionary (lookup table). At the end, every label looks the way the new system expects.

Exercises

Legacy record: positions 1–10 key, 11–30 name, 31–32 status code. Target wants: 1–10 key, 11–40 name (padded), 41–42 new status. Status "01" → "A1", "02" → "A2". Write INREC BUILD and OUTREC FINDREP (conceptually).
Why use JOINKEYS instead of FINDREP for code translation when there are 100 different legacy codes?
Legacy date at 41,6 is YYMMDD. Target wants YYYYMMDD at 41,8. What DFSORT feature do you use, and in INREC or OUTREC?
When would you use INREC for transformation instead of OUTREC?

Quiz

Test Your Knowledge

1. What is the main use of INREC and OUTREC in legacy data transformation?

Only sorting
To change record layout and content: reorder fields, convert dates and numeric formats, insert constants, translate codes, and build records that match the target system layout
Only INCLUDE
Only for VB

2. When would you use FINDREP in a legacy transformation?

Only for dates
When the legacy system used specific codes or strings that must be replaced with the new system codes or standard values (e.g. old status "A" to new "ACTIVE"); FINDREP does literal replacement
Only in SUM
Only for dedup

3. How can you map legacy codes to new codes when the mapping is not a simple find-replace?

Only FINDREP
Use a lookup table: one file has legacy code and new code; use JOINKEYS to join the legacy data with the lookup on the legacy code and REFORMAT to output the new code (and the rest of the record) so each record gets the translated value
Only INCLUDE
DFSORT cannot do lookups

4. Why normalize dates when moving data from a legacy system?

Only for display
Legacy systems often used 2-digit years (YYMMDD) or other formats; the new system may require 4-digit years (YYYYMMDD) or a specific format. Converting in INREC/OUTREC ensures correct sort order and avoids century errors
Dates cannot be converted
Only in OUTFIL

5. What is the role of fixed-to-variable (FTOV) in legacy transformation?

Only for sorting
Legacy systems often produced fixed-length (FB) files; the new system may expect variable-length (VB). Use OUTFIL with FTOV (and BUILD to trim or shorten records) to produce VB output from FB input
Only for INREC
FTOV is only for reports