Character conversion in DFSORT means translating the bytes in a field from one form to another: for example lowercase to uppercase (a–z to A–Z), uppercase to lowercase, EBCDIC to ASCII, or binary to its hexadecimal character representation. You can do this in both INREC and OUTREC by adding the TRAN= parameter to a field in FIELDS= (or its alias, BUILD=). When you convert in INREC, the translated value is what the sort and every later step (OUTREC, OUTFIL) see, because INREC runs before the sort. The INCLUDE and OMIT statements are the exception: DFSORT applies them before INREC, so they still test the original input bytes. If you uppercase a name field in INREC and sort on that field, the order is by the uppercase value. This page covers the main TRAN= options (LTOU, UTOL, ETOA, ATOE, HEX, UNHEX, ALTSEQ), how to apply them to specific fields in INREC, and when to use INREC vs OUTREC for character conversion.
Mainframe data is typically in EBCDIC. Sometimes you need to normalize text (e.g. all uppercase for sort or display), produce lowercase for an application, or convert to ASCII for a downstream system. When the converted value must drive the sort order, do the conversion in INREC so that the sort sees the translated record. A filter on the converted value has to run after INREC, for example as OUTFIL INCLUDE= or OMIT=, because the INCLUDE and OMIT statements are applied to the original input record. For example: uppercase the name in INREC and sort on that field for case-insensitive order; or convert a key field to ASCII in INREC if you are building a key for a later comparison that expects ASCII. If you only need the converted value in the output file and the sort should use the original data, use OUTREC instead.
In INREC FIELDS= you list the output record as a sequence of items. Each item can be (position,length) to copy bytes as-is, or (position,length,TRAN=option) to copy and translate. The position and length refer to the input field; that many bytes are read, translated, and placed in the output at the next available position (or at an explicit output column if you prefix the item with c:). So to convert the first 20 bytes to uppercase:
INREC FIELDS=(1,20,TRAN=LTOU,21,60)
The first 20 bytes are translated with TRAN=LTOU (lower to upper) and placed at the start of the output; bytes 21–80 are copied unchanged. The reformatted record is 80 bytes. The sort, and any OUTFIL INCLUDE/OMIT, then see the uppercased first 20 bytes; the INCLUDE and OMIT statements run before INREC and still see the original case.
Common translation options and what they do:
| Option | Meaning | When to use |
|---|---|---|
| LTOU | Lower to upper: a-z → A-Z | Normalize names or keys for sort |
| UTOL | Upper to lower: A-Z → a-z | Produce lowercase output |
| ETOA | EBCDIC to ASCII | Output for ASCII consumers |
| ATOE | ASCII to EBCDIC | Input came from ASCII |
| HEX | Binary → hex characters (2 chars per byte) | Readable hex dump |
| UNHEX | Hex characters → binary | Restore from hex form |
| ALTSEQ | Custom table (ALTSEQ statement) | Custom mapping |
LTOU and UTOL affect only letters; digits and symbols are unchanged. ETOA and ATOE convert between EBCDIC and ASCII code pages; the field length stays the same. HEX produces two EBCDIC characters per input byte (the digits 0–9 and the letters A–F), so the output length for that field doubles; UNHEX does the reverse and halves it. ALTSEQ uses a table defined by the ALTSEQ control statement for custom byte-to-byte mapping. BIT and UNBIT convert between binary and EBCDIC '0'/'1' character representation. Check your DFSORT manual for the full list and any product-specific options.
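As an illustration of a custom mapping, the sketch below uses TRAN=ALTSEQ to replace binary zeros with blanks. It assumes 80-byte fixed-length records and a copy operation; the choice of X'00' and X'40' is only an example.

```text
* Map X'00' (binary zero) to X'40' (EBCDIC blank).
  ALTSEQ CODE=(0040)
  OPTION COPY
  OUTREC FIELDS=(1,80,TRAN=ALTSEQ)
```

Each pair in CODE= is a "from" byte followed by a "to" byte in hex, so further pairs can be added inside the same parentheses, separated by commas.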
You can translate some fields and copy others in one INREC. List each item in order. For example: uppercase bytes 1–20, copy 21–50 as-is, lowercase 51–70, copy 71–80:
INREC FIELDS=(1,20,TRAN=LTOU,21,30,51,20,TRAN=UTOL,71,10)
The output record is built left to right: 20 bytes (uppercased), 30 bytes (unchanged), 20 bytes (lowercased), and 10 bytes (unchanged), for 80 bytes in total. The sort sees this reformatted record, so if you specify SORT FIELDS=(1,20,CH,A), the order is by the uppercased name.
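As a sketch of how these pieces fit together, the statements below pair this INREC with a sort on the uppercased name and an output-side filter. The OUTFIL filter and the literal SMITH are illustrative assumptions, not part of the original example; OUTFIL INCLUDE= is used because it tests the reformatted record.

```text
* Uppercase 1-20, copy 21-50, lowercase 51-70, copy 71-80.
  INREC FIELDS=(1,20,TRAN=LTOU,21,30,51,20,TRAN=UTOL,71,10)
* Sort on the uppercased name in the reformatted record.
  SORT FIELDS=(1,20,CH,A)
* Hypothetical filter: keep names starting with SMITH, any input case.
  OUTFIL INCLUDE=(1,5,CH,EQ,C'SMITH')
```

Lines with an asterisk in column 1 are comments; the control statements themselves start in column 2 or later.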
INREC: Conversion happens before the sort. The sort key and everything that runs after INREC (OUTREC, and OUTFIL including OUTFIL INCLUDE/OMIT) see the translated data. The INCLUDE and OMIT statements run before INREC, so they compare against the original bytes. Use INREC when you need a case-insensitive sort (uppercase the key in INREC, sort on it), or when an OUTFIL filter must compare against the converted value.
OUTREC: Conversion happens after the sort. Only the written output is translated; the sort used the original bytes. Use OUTREC when you want the sort order based on the original data but the output file should show uppercase, lowercase, or ASCII.
You can combine both. For example, use INREC to append an uppercased copy of the key to the end of the record, sort on that copy, and use OUTREC to drop it again, so the output keeps the original case.
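One way to sort case-insensitively while keeping the original case in the output is to sort on a temporary uppercased copy of the key. This is a minimal sketch that assumes 80-byte fixed-length records with the name in bytes 1–20; adjust the positions to your own layout.

```text
* Keep the original 80 bytes and append an uppercased copy
* of the name at 81-100.
  INREC FIELDS=(1,80,1,20,TRAN=LTOU)
* Sort on the uppercased copy, not on the original name.
  SORT FIELDS=(81,20,CH,A)
* Drop the copy so the output record is the original 80 bytes.
  OUTREC FIELDS=(1,80)
```

The trade-off is a longer record during the sort (100 bytes instead of 80) in exchange for output that is byte-for-byte the original data.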
TRAN=ETOA (EBCDIC to ASCII) and TRAN=ATOE (ASCII to EBCDIC) convert between the two encodings. EBCDIC is the native mainframe encoding; ASCII is used on many other platforms. When you send a file to a Unix or Windows system that expects ASCII, you might use OUTREC with TRAN=ETOA on the relevant fields (or on the whole record, if it contains only character data). When you read a file that was created in ASCII and need to process it on the mainframe, you might use INREC with TRAN=ATOE so that the sort and the later OUTREC and OUTFIL steps see EBCDIC. The field length in bytes is unchanged; only the byte values are translated according to the code page or default mapping.
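A minimal sketch of both directions, assuming 80-byte fixed-length records that contain only character data (translating packed-decimal or binary fields would corrupt them). The two blocks belong to separate job steps. The first copies an EBCDIC file and writes it as ASCII:

```text
* Copy the file and write every byte as ASCII.
  OPTION COPY
  OUTREC FIELDS=(1,80,TRAN=ETOA)
```

The second converts an ASCII input file to EBCDIC before sorting on bytes 1–20:

```text
* Translate to EBCDIC first so the sort compares EBCDIC bytes.
  INREC FIELDS=(1,80,TRAN=ATOE)
  SORT FIELDS=(1,20,CH,A)
```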
TRAN=HEX converts each byte to two EBCDIC characters representing its hexadecimal value. So byte X'C1' becomes the two characters 'C' and '1'. The output length for that field is twice the input length. This is useful for creating a readable hex dump of a portion of the record. TRAN=UNHEX does the reverse: two hex characters (0–9, A–F) become one binary byte. If you use HEX in INREC, the record layout changes (that field doubles in size and every field after it shifts right); ensure SORT FIELDS=, OUTREC, and OUTFIL use the positions and lengths of the reformatted record. The INCLUDE and OMIT statements run before INREC and keep using the original input positions.
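To make the position shift concrete, here is a sketch that assumes 80-byte fixed-length records with a 4-byte binary field in bytes 1–4 and a 10-byte character key in input bytes 5–14. Both field definitions are assumptions for illustration.

```text
* Input 1-4 (binary) becomes 8 hex characters in output 1-8.
* Input 5-80 is copied to output 9-84; the record is now 84 bytes.
  INREC FIELDS=(1,4,TRAN=HEX,5,76)
* The key that was at input 5-14 is now at 9-18.
  SORT FIELDS=(9,10,CH,A)
```

The arithmetic is: the hex field adds 4 bytes (4 becomes 8), so every position after it moves right by 4.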
For variable-length records, the first 4 bytes are the RDW (Record Descriptor Word). Do not translate the RDW; it must remain valid. Typically you specify the RDW as the first item (e.g. 1,4) and then apply TRAN= only to data fields (e.g. 5,76,TRAN=LTOU). Check your product manual for INREC and variable-length record rules.
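A sketch of that layout, assuming every record carries at least 76 data bytes after the RDW so that the field 5,76 is always present. Shorter records need different handling, which the product manual describes. The sort key at 5,20 is hypothetical.

```text
* Copy the 4-byte RDW untouched, then uppercase data bytes 5-80.
  INREC FIELDS=(1,4,5,76,TRAN=LTOU)
* Positions in a variable-length record count the RDW, so the
* first data byte is position 5.
  SORT FIELDS=(5,20,CH,A)
```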
Imagine you have a list of names written in mixed small and capital letters. Before sorting the list, you want to treat "smith" and "SMITH" as the same. So you make a copy of each name in all capitals (or all small letters) and sort by that copy. The original list still has the mixed letters, but the order is decided by the "all one case" copy. Character conversion in INREC is like that: we change the letters (or the whole alphabet/code) in the record before the sort looks at it, so the sort sees the converted version. We can turn small letters into big letters (LTOU), big into small (UTOL), or change the whole alphabet (EBCDIC to ASCII). We do it in INREC when we want the sort to use the converted version. One thing to remember: the INCLUDE/OMIT "keep/drop" rules look at each record before INREC changes it, so only keep/drop rules on the output side (in OUTFIL) see the converted version.
1. What does TRAN=LTOU do in INREC FIELDS?
2. When should you use character conversion in INREC instead of OUTREC?
3. What is TRAN=ETOA used for?
4. Can you translate only part of the record in INREC?
5. What does TRAN=HEX do?