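UTF-16 is a Unicode encoding built on 16-bit code units: a character in the Basic Multilingual Plane (BMP) takes one code unit, and a character outside it takes a surrogate pair of two code units. UCS-2, by contrast, has no surrogate pairs and can represent only BMP characters. Many COBOL runtimes store NATIONAL text as UTF-16 or UCS-2 internally.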

      * NATIONAL conversions (often UTF-16 internally)
       01 WIDE-TEXT PIC N(40).
       01 DISP-TEXT PIC X(120).
           MOVE FUNCTION NATIONAL-OF("𐍈 – Gothic letter") TO WIDE-TEXT
           MOVE FUNCTION DISPLAY-OF(WIDE-TEXT) TO DISP-TEXT
           DISPLAY DISP-TEXT

Aspect | Description | Example |
---|---|---|
Encoding | 16-bit code units | UTF-16BE/LE |
Non-BMP | Surrogate pairs | U+D800..U+DFFF |
Conversions | NATIONAL-OF/DISPLAY-OF | FUNCTION DISPLAY-OF(WIDE-TEXT) |
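
To make the surrogate-pair row concrete, here is a minimal sketch of how a non-BMP code point splits into two 16-bit code units. It uses the Gothic letter from the sample above (U+10348, decimal 66376). The program name and data names are illustrative assumptions, and the arithmetic is done in plain decimal rather than through any runtime-specific Unicode facility.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SURROGATE-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * U+10348 (Gothic letter hwair) as a decimal code point
       01 CODE-POINT      PIC 9(7) VALUE 66376.
       01 OFFSET-VALUE    PIC 9(7).
       01 HIGH-SURROGATE  PIC 9(5).
       01 LOW-SURROGATE   PIC 9(5).
       PROCEDURE DIVISION.
      * Subtract 65536 (hex 10000) to get a 20-bit offset
           SUBTRACT 65536 FROM CODE-POINT GIVING OFFSET-VALUE
      * Upper 10 bits feed the high surrogate, lower 10 the low one
           DIVIDE OFFSET-VALUE BY 1024
               GIVING HIGH-SURROGATE REMAINDER LOW-SURROGATE
      * Add the bases 55296 (hex D800) and 56320 (hex DC00)
           ADD 55296 TO HIGH-SURROGATE
           ADD 56320 TO LOW-SURROGATE
           DISPLAY "High surrogate (decimal): " HIGH-SURROGATE
           DISPLAY "Low surrogate (decimal):  " LOW-SURROGATE
           STOP RUN.
```

For U+10348 the offset is 840, so the pair is 55296 (hex D800) and 57160 (hex DF48). On a runtime where each PIC N position holds one UTF-16 code unit, this single character would therefore occupy two positions of WIDE-TEXT.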
1. What distinguishes UTF-16 from UCS-2?
2. Which COBOL feature commonly maps to UTF-16?
3. What must be considered when writing UTF-16 to files?
4. How do you convert between UTF-16 NATIONAL and DISPLAY data?
5. Which encoding is best for ASCII-heavy data interchange?