MainframeMaster

Character Translation

Character translation in DFSORT means converting the bytes in a field from one form to another: for example, lowercase to uppercase, uppercase to lowercase, EBCDIC to ASCII, or binary to hexadecimal character representation. You do this with the TRAN= parameter in INREC or OUTREC FIELDS=, specifying the starting position and length of the field and the translation option. When you use INREC, the sort sees the translated value, because INREC runs before the sort. The INCLUDE and OMIT statements are processed before INREC, so they always test the original input record. When you use OUTREC, the sort sees the original record; only the final output shows the translated value. This page covers the main TRAN= options, when to use INREC vs OUTREC, and how to use a custom translation table with ALTSEQ.

What Is Character Translation?

In mainframe data, characters are stored in EBCDIC. Sometimes you need to change them: make all letters uppercase for a sort key, make them lowercase for a downstream system, or convert the whole field to ASCII for a file that will be sent to a PC or Unix system. Translation means replacing each byte in a field according to a rule: one set of codes maps to another. DFSORT provides built-in rules (e.g. LTOU, ETOA) and lets you define your own with ALTSEQ. The result replaces the original bytes one for one, except with HEX/UNHEX and BIT/UNBIT, where the output length differs from the input length.
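As a rough illustration of "one set of codes maps to another", the sketch below models a lowercase-to-uppercase rule as a 256-entry byte table in Python. Python is not part of DFSORT; the function name `build_ltou_table` and the choice of EBCDIC code page 037 (Python's `cp037` codec) are assumptions made only for this example.

```python
# Model of a byte-for-byte translation rule (this is not DFSORT itself).
# Assumption: EBCDIC code page 037, via Python's built-in "cp037" codec.

def build_ltou_table() -> bytes:
    """Return a 256-byte table mapping EBCDIC a-z to A-Z; every other byte maps to itself."""
    table = bytearray(range(256))  # identity mapping to start with
    lower = "abcdefghijklmnopqrstuvwxyz".encode("cp037")
    upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".encode("cp037")
    for source, target in zip(lower, upper):
        table[source] = target
    return bytes(table)

LTOU_TABLE = build_ltou_table()

field = "Smith, j 42".encode("cp037")
translated = field.translate(LTOU_TABLE)  # look up each byte in the table

print(translated.decode("cp037"))  # SMITH, J 42
```

Digits, the comma, and the blanks pass through unchanged because their table entries still point to themselves, and the field keeps its length.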

TRAN= Options

The following options are commonly available. Exact names and availability depend on your DFSORT product and release.

Common TRAN= options

| Option | Meaning | When to use |
|--------|---------|-------------|
| LTOU | Lowercase to uppercase (a-z → A-Z) | Normalize names or keys for sort or display |
| UTOL | Uppercase to lowercase (A-Z → a-z) | Produce lowercase output (e.g. for Unix) |
| ETOA | EBCDIC to ASCII | Output for ASCII systems (PC, Unix) |
| ATOE | ASCII to EBCDIC | Input originated from ASCII |
| HEX | Binary → two EBCDIC hex chars per byte | Readable hex dump or export |
| UNHEX | Two hex chars → one binary byte | Restore from hex form |
| BIT | Binary → EBCDIC '0'/'1' (8 chars per byte) | Bit representation |
| UNBIT | Eight '0'/'1' chars → one byte | Restore from bit form |
| ALTSEQ | User-defined table (ALTSEQ statement) | Custom character mapping |

LTOU and UTOL affect only letters (a–z and A–Z); other characters (digits, spaces, punctuation) are unchanged. ETOA and ATOE convert the entire byte value according to the EBCDIC/ASCII code page. HEX turns each input byte into two output characters (e.g. X'C1' → "C1"), so the output field is twice the input length; UNHEX does the reverse. ALTSEQ uses a table you define in the ALTSEQ control statement.
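The length changes for HEX/UNHEX and BIT/UNBIT can be modeled in a few lines of Python. This is only a sketch of the byte-level effect, not DFSORT code; the function names are hypothetical and code page 037 (`cp037`) is assumed for the EBCDIC characters.

```python
# Model of the HEX/UNHEX and BIT/UNBIT conversions (this is not DFSORT itself).
# Assumption: output characters are EBCDIC code page 037 ("cp037" codec).

def tran_hex(field: bytes) -> bytes:
    """Each input byte becomes two EBCDIC hex characters."""
    return field.hex().upper().encode("cp037")

def tran_unhex(field: bytes) -> bytes:
    """Each pair of EBCDIC hex characters becomes one byte."""
    return bytes.fromhex(field.decode("cp037"))

def tran_bit(field: bytes) -> bytes:
    """Each input byte becomes eight EBCDIC '0'/'1' characters."""
    return "".join(f"{b:08b}" for b in field).encode("cp037")

def tran_unbit(field: bytes) -> bytes:
    """Each group of eight EBCDIC '0'/'1' characters becomes one byte."""
    text = field.decode("cp037")
    return bytes(int(text[i:i + 8], 2) for i in range(0, len(text), 8))

raw = bytes([0xC1, 0x0F])          # two binary bytes
hex_chars = tran_hex(raw)          # EBCDIC characters "C10F"
bit_chars = tran_bit(raw)          # EBCDIC characters "1100000100001111"

print(len(raw), len(hex_chars), len(bit_chars))  # 2 4 16
print(hex_chars.decode("cp037"))                 # C10F
print(tran_unhex(hex_chars) == raw)              # True
print(tran_unbit(bit_chars) == raw)              # True
```

When you plan the output layout, allow twice the input length for a HEX field and eight times the input length for a BIT field.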

Syntax: Where to Specify TRAN=

In both INREC and OUTREC you specify the field by starting position and length, then the translation option. The general form in FIELDS= is:

  INREC FIELDS=(start,length,TRAN=option,...)
  OUTREC FIELDS=(start,length,TRAN=option,...)

You can list multiple fields. Each translated field has its own (start, length, TRAN=option). Fields that are just copied have (start, length) without TRAN. Example: translate the first 20 bytes to uppercase and the next 30 to lowercase, copy the rest:

  OUTREC FIELDS=(1,20,TRAN=LTOU,21,30,TRAN=UTOL,51,30)
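To make the field list concrete, the following Python sketch models how an output record is built from (start, length, translation) items such as the ones in the statement above. It is a hedged model rather than DFSORT behavior in full: `build_record` is a hypothetical helper, the sample data is invented, and code page 037 is assumed.

```python
# Model of building a record from a FIELDS-style list (this is not DFSORT itself).
# Assumption: EBCDIC code page 037 ("cp037" codec); sample data is invented.

LOWER = "abcdefghijklmnopqrstuvwxyz".encode("cp037")
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".encode("cp037")
LTOU = bytes.maketrans(LOWER, UPPER)   # lowercase -> uppercase table
UTOL = bytes.maketrans(UPPER, LOWER)   # uppercase -> lowercase table

def build_record(record: bytes, fields) -> bytes:
    """Append each (start, length, table) item in order; start is 1-based."""
    out = bytearray()
    for start, length, table in fields:
        chunk = record[start - 1:start - 1 + length]
        out += chunk.translate(table) if table is not None else chunk
    return bytes(out)

# 80-byte input: name (20 bytes), department (30 bytes), other data (30 bytes)
record = ("John Smith".ljust(20) + "ACCOUNTS PAYABLE".ljust(30) + "x1".ljust(30)).encode("cp037")

# Mirrors FIELDS=(1,20,TRAN=LTOU,21,30,TRAN=UTOL,51,30)
fields = [(1, 20, LTOU), (21, 30, UTOL), (51, 30, None)]
result = build_record(record, fields).decode("cp037")

print(result[:20].rstrip())    # JOHN SMITH
print(result[20:50].rstrip())  # accounts payable
print(result[50:].rstrip())    # x1
```

Each item is handled independently, so one field can be uppercased while the next is lowercased and the last is copied unchanged.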

INREC vs OUTREC for Translation

INREC runs before the sort, so any field you translate in INREC is the value that the sort key sees. If you uppercase a name in INREC and sort on that name, the order is by the uppercase value. The INCLUDE and OMIT statements are different: DFSORT processes them before INREC, so their comparisons always use the original, untranslated bytes. To filter on a translated value, use the INCLUDE= or OMIT= operand of OUTFIL, which is applied after INREC and OUTREC.

OUTREC runs after the sort. The sort and INCLUDE/OMIT have already run on the original record. So you use OUTREC for translation when you only need the converted value in the final output—for example, writing a file in ASCII for a Unix application, or producing a report with lowercase labels. The sort order and filtering are based on the untranslated data.

Rule of thumb: if the translated value must determine the sort order, use INREC. If it is only for the final output layout, use OUTREC. The INCLUDE and OMIT statements see the original record in both cases.
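The effect on sort order can be seen in a small Python model. In EBCDIC the lowercase letters (X'81' and up) collate before the uppercase letters (X'C1' and up), so a byte-order sort of mixed-case names differs from a sort of the uppercased names. The sample names are invented, code page 037 is assumed, and the variable names are only labels for the two approaches.

```python
# Model of translate-then-sort versus sort-then-translate (this is not DFSORT itself).
# Assumption: EBCDIC code page 037 ("cp037" codec); sample names are invented.

LTOU = bytes.maketrans(
    "abcdefghijklmnopqrstuvwxyz".encode("cp037"),
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ".encode("cp037"),
)

names = [n.encode("cp037") for n in ["Clark", "adams", "Baker", "davis"]]

# INREC-style: translate first, then sort on the translated bytes.
inrec_style = sorted(n.translate(LTOU) for n in names)

# OUTREC-style: sort on the original bytes, then translate for output.
outrec_style = [n.translate(LTOU) for n in sorted(names)]

print([n.decode("cp037") for n in inrec_style])   # ['ADAMS', 'BAKER', 'CLARK', 'DAVIS']
print([n.decode("cp037") for n in outrec_style])  # ['ADAMS', 'DAVIS', 'BAKER', 'CLARK']
```

Both outputs contain uppercase names, but only the first is in alphabetical order, because only there did the sort compare the uppercase bytes.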

Example: Lowercase to Uppercase (LTOU)

Input has a 20-byte name field at position 11. You want the output to have that name in uppercase (e.g. for a report or for a key). Using OUTREC:

  SORT FIELDS=COPY
  OUTREC FIELDS=(1,10,11,20,TRAN=LTOU,31,50)

Bytes 1–10 are copied, bytes 11–30 (the 20-byte name starting at position 11) are translated to uppercase, and bytes 31–80 are copied. If you needed to sort by that name in uppercase order, you would use INREC instead and then SORT FIELDS=(11,20,CH,A).

Example: EBCDIC to ASCII (ETOA)

You are writing a file that will be transferred to a Unix or Windows system. The receiving system expects ASCII. Translate the entire record (or the character portions) with ETOA in OUTREC so that the output file is in ASCII. Because OUTREC runs after the sort, any sort in the same step still orders records by the original EBCDIC bytes; convert in INREC instead only if the sort must use the ASCII values.

  SORT FIELDS=COPY
  OUTREC FIELDS=(1,80,TRAN=ETOA)
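For a sense of what changes at the byte level, the Python sketch below converts an invented 80-byte EBCDIC record to ASCII. It assumes code page 037 through Python's `cp037` codec; the table DFSORT actually uses for ETOA may differ for some code points, so treat this purely as an illustration.

```python
# Model of the effect of an EBCDIC-to-ASCII translation (this is not DFSORT itself).
# Assumption: code page 037 for EBCDIC; DFSORT's own ETOA table may differ for some bytes.

ebcdic_record = "HELLO 123".ljust(80).encode("cp037")

# Decode the EBCDIC bytes to text, then encode that text as ASCII.
ascii_record = ebcdic_record.decode("cp037").encode("ascii")

print(ebcdic_record[:9].hex().upper())  # C8C5D3D3D640F1F2F3
print(ascii_record[:9].hex().upper())   # 48454C4C4F20313233
print(len(ascii_record))                # 80
```

Every byte changes, including blanks (X'40' becomes X'20') and digits, while the record length stays the same. Letters collate before digits in EBCDIC but after them in ASCII, which is why a sort on the translated bytes would produce a different order.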

Custom Translation with ALTSEQ

When you need a mapping that is not one of the built-in options, use the ALTSEQ control statement to define a translation table. In ALTSEQ you specify which input character codes map to which output codes. Then in INREC or OUTREC you use TRAN=ALTSEQ on the field. The table is applied to each byte in that field, and bytes you do not list are left unchanged. Use this for custom code pages, replacing specific characters (e.g. turning control characters into blanks), or any one-to-one byte mapping. Because the mapping is byte for byte, it can replace characters but cannot remove them. The ALTSEQ syntax is product-dependent; see your DFSORT manual.
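The idea of a user-defined table can be sketched in Python as an identity mapping with a few entries overridden. This models the concept only and does not show ALTSEQ statement syntax; the helper `build_table` and the two mapping pairs are hypothetical examples.

```python
# Model of a user-defined one-to-one byte mapping (this is not DFSORT itself).
# The mapping pairs below are hypothetical, chosen only for illustration.

CUSTOM_PAIRS = [
    (0x00, 0x40),  # binary zero -> EBCDIC blank
    (0x05, 0x40),  # EBCDIC horizontal tab -> EBCDIC blank
]

def build_table(pairs) -> bytes:
    """Start from the identity mapping and override only the listed bytes."""
    table = bytearray(range(256))
    for source, target in pairs:
        table[source] = target
    return bytes(table)

table = build_table(CUSTOM_PAIRS)

field = b"\xC1\x00\xC2\x05\xC3"   # EBCDIC A, X'00', B, tab, C
cleaned = field.translate(table)  # apply the table to every byte in the field

print(cleaned.hex().upper())      # C140C240C3
```

The field keeps its five-byte length: the two unwanted bytes are replaced with blanks, and the letters map to themselves.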

Explain It Like I'm Five

Imagine you have a line of blocks with letters on them. "Translation" means we have a rule: every small letter we turn into a big letter. So "a" becomes "A", "b" becomes "B", and so on. We don't change numbers or spaces. DFSORT does that for a whole chunk of the record: you say "translate bytes 11 to 30 with the rule: lowercase to uppercase," and it does it. If you do it before the sort (INREC), the sort sees the big letters. If you do it only at the end (OUTREC), the sort still saw the small letters, but the written file has big letters.

Exercises

  1. Write OUTREC FIELDS= to translate bytes 41–80 to uppercase and leave the rest of the record unchanged (assume record length 80).
  2. You need to sort by a 10-byte name at position 5, but the data has mixed case. Should you use INREC or OUTREC for TRAN=LTOU? Why?
  3. When would you use TRAN=ETOA in OUTREC instead of INREC?

Quiz

Test Your Knowledge

1. What is character translation in DFSORT?

  • Sorting by character
  • Converting characters in a field from one set to another—e.g. lowercase to uppercase, EBCDIC to ASCII—using the TRAN= parameter in INREC or OUTREC
  • Only copying fields
  • Only in OUTFIL

2. When should you use INREC vs OUTREC for translation?

  • Always OUTREC
  • Use INREC when the translated value must be used for sorting, because INREC runs before the sort. Use OUTREC when you only need the translated value in the final output
  • Always INREC
  • Only OUTREC for reports

3. What does TRAN=ALTSEQ do?

  • Sort by alternate key
  • Translates characters using a user-defined table specified in the ALTSEQ control statement—so you can map specific input bytes to specific output bytes
  • Only for numeric fields
  • Converts to ASCII only

4. What is the difference between TRAN=HEX and TRAN=UNHEX?

  • They are the same
  • HEX converts each binary byte to two EBCDIC hex characters (0-9, A-F). UNHEX converts pairs of EBCDIC hex characters back to one binary byte per pair
  • Only HEX is valid
  • UNHEX is for uppercase only

5. Can you translate multiple different fields in one INREC or OUTREC?

  • No, only one field
  • Yes—list each field with its own (position, length, TRAN=option) in FIELDS=; each field can have a different TRAN= value
  • Only in OUTREC
  • Only with ALTSEQ