MainframeMaster

COBOL Character Encoding

Character encoding in COBOL is fundamental to text data handling, determining how characters are represented as numeric values. Understanding character encoding is essential for proper text processing, internationalization, and data exchange between different systems and platforms.

Understanding Character Encoding

A character encoding maps each character to a numeric code. The same character can have different codes under different encodings: the letter A is X'C1' in EBCDIC but X'41' in ASCII. A COBOL program that assumes the wrong encoding will misread text, sort it in an unexpected order, or corrupt it when exchanging data with another platform.

Common Character Encodings

1. EBCDIC Encoding

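EBCDIC (Extended Binary Coded Decimal Interchange Code) is the native character encoding on IBM mainframes, so it is the encoding most mainframe COBOL applications work with. Its letters are not contiguous: A through I occupy X'C1' to X'C9', J through R occupy X'D1' to X'D9', and S through Z occupy X'E2' to X'E9'. The digits X'F0' to X'F9' sort after the letters.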

cobol
       WORKING-STORAGE SECTION.
       01  EBCDIC-CHARACTERS.
           05  LETTER-A        PIC X(1) VALUE 'A'.
           05  LETTER-Z        PIC X(1) VALUE 'Z'.
           05  DIGIT-0         PIC X(1) VALUE '0'.
           05  DIGIT-9         PIC X(1) VALUE '9'.
           05  SPACE-CHAR      PIC X(1) VALUE SPACE.
           05  PERIOD-CHAR     PIC X(1) VALUE '.'.

      * The hex literals are EBCDIC code points. On an EBCDIC system
      * DISPLAY shows the characters they encode, not the hex digits.
       01  EBCDIC-VALUES.
           05  A-VALUE         PIC X(1) VALUE X'C1'.
           05  Z-VALUE         PIC X(1) VALUE X'E9'.
           05  ZERO-VALUE      PIC X(1) VALUE X'F0'.
           05  NINE-VALUE      PIC X(1) VALUE X'F9'.
           05  SPACE-VALUE     PIC X(1) VALUE X'40'.

       PROCEDURE DIVISION.
       DISPLAY-EBCDIC-VALUES.
           DISPLAY 'Letter A EBCDIC: ' A-VALUE
           DISPLAY 'Letter Z EBCDIC: ' Z-VALUE
           DISPLAY 'Digit 0 EBCDIC: ' ZERO-VALUE
           DISPLAY 'Digit 9 EBCDIC: ' NINE-VALUE
           DISPLAY 'Space EBCDIC: ' SPACE-VALUE.
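
To see which code the runtime actually assigns to a character, the ORD intrinsic function can help. The following is a minimal sketch, not part of the original listing; the program name SHOWCODE and the field CODE-POINT are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SHOWCODE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CODE-POINT          PIC 9(3).
       PROCEDURE DIVISION.
      * ORD returns the 1-based position of a character in the
      * native collating sequence, so subtract 1 for the byte value.
           COMPUTE CODE-POINT = FUNCTION ORD('A') - 1
           DISPLAY 'Native code of A: ' CODE-POINT
           EVALUATE CODE-POINT
               WHEN 193
                   DISPLAY 'Runtime uses EBCDIC'
               WHEN 65
                   DISPLAY 'Runtime uses ASCII'
               WHEN OTHER
                   DISPLAY 'Unrecognized native encoding'
           END-EVALUATE
           GOBACK.
```

On an EBCDIC system the code is 193 (X'C1'); on an ASCII system it is 065 (X'41').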

2. ASCII Encoding

ASCII (American Standard Code for Information Interchange) is the base encoding on most distributed (non-mainframe) platforms. Data exchanged between those systems and EBCDIC-based mainframe COBOL applications must be converted.

cobol
       WORKING-STORAGE SECTION.
      * On an EBCDIC compiler these literals are stored as EBCDIC;
      * only the hex values below are ASCII codes on every platform.
       01  ASCII-CHARACTERS.
           05  ASCII-A         PIC X(1) VALUE 'A'.
           05  ASCII-Z         PIC X(1) VALUE 'Z'.
           05  ASCII-0         PIC X(1) VALUE '0'.
           05  ASCII-9         PIC X(1) VALUE '9'.

       01  ASCII-VALUES.
           05  ASCII-A-VALUE   PIC X(1) VALUE X'41'.
           05  ASCII-Z-VALUE   PIC X(1) VALUE X'5A'.
           05  ASCII-0-VALUE   PIC X(1) VALUE X'30'.
           05  ASCII-9-VALUE   PIC X(1) VALUE X'39'.

       PROCEDURE DIVISION.
       DISPLAY-ASCII-VALUES.
           DISPLAY 'Letter A ASCII: ' ASCII-A-VALUE
           DISPLAY 'Letter Z ASCII: ' ASCII-Z-VALUE
           DISPLAY 'Digit 0 ASCII: ' ASCII-0-VALUE
           DISPLAY 'Digit 9 ASCII: ' ASCII-9-VALUE.
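
One practical consequence of the two encodings is sort order: ASCII places digits before letters, while EBCDIC places them after. The sketch below shows the same comparison giving different answers depending on the native encoding. The program name COLLSEQ and the field names are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLLSEQ.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SAMPLE-LETTER       PIC X(1) VALUE 'A'.
       01  SAMPLE-DIGIT        PIC X(1) VALUE '1'.
       PROCEDURE DIVISION.
      * Alphanumeric comparisons use the native collating sequence.
           IF SAMPLE-LETTER < SAMPLE-DIGIT
               DISPLAY 'Letters sort before digits (EBCDIC order)'
           ELSE
               DISPLAY 'Digits sort before letters (ASCII order)'
           END-IF
           GOBACK.
```

Code that compares or sorts mixed alphanumeric keys may therefore behave differently after a move between platforms, even when every character has been converted correctly.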

3. Unicode Support

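Unicode assigns a code point to characters from virtually every writing system, which makes it the usual choice for international text processing. UTF-8, UTF-16, and UTF-32 are its encoding forms: different ways of storing the same code points as bytes.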

cobol
       WORKING-STORAGE SECTION.
       01  UNICODE-DATA.
           05  UTF8-STRING         PIC X(100).
           05  UTF16-STRING        PIC X(200).
           05  UNICODE-LENGTH      PIC 9(3).
           05  CONVERSION-STATUS   PIC X(1).
               88  CONVERSION-SUCCESS  VALUE 'S'.
               88  CONVERSION-FAILED   VALUE 'F'.

       01  INTERNATIONAL-TEXT.
           05  ENGLISH-TEXT        PIC X(50) VALUE 'Hello World'.
           05  SPANISH-TEXT        PIC X(50) VALUE 'Hola Mundo'.
           05  FRENCH-TEXT         PIC X(50) VALUE 'Bonjour Monde'.
           05  GERMAN-TEXT         PIC X(50) VALUE 'Hallo Welt'.

       PROCEDURE DIVISION.
      * PROCESS-INTERNATIONAL-TEXT is not shown in this excerpt.
       PROCESS-UNICODE-TEXT.
           PERFORM CONVERT-TO-UTF8
           PERFORM VALIDATE-UNICODE-DATA
           PERFORM PROCESS-INTERNATIONAL-TEXT
           GOBACK.

      * UNICODE-CONVERTER is assumed to be a user-written subprogram;
      * it is not a COBOL built-in.
       CONVERT-TO-UTF8.
           CALL 'UNICODE-CONVERTER' USING ENGLISH-TEXT
                                          UTF8-STRING
                                          CONVERSION-STATUS
               ON EXCEPTION
                   MOVE 'F' TO CONVERSION-STATUS
           END-CALL.

       VALIDATE-UNICODE-DATA.
           IF CONVERSION-SUCCESS
               DISPLAY 'Unicode conversion successful'
           ELSE
               DISPLAY 'Unicode conversion failed'
           END-IF.
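
Where the compiler supports national data, the conversion may not need an external subprogram at all. The sketch below assumes IBM Enterprise COBOL, where USAGE NATIONAL items hold UTF-16 and the NATIONAL-OF and DISPLAY-OF intrinsic functions convert between code pages. Other compilers may not implement these functions. The program name UNIDEMO and the field names are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNIDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  NATIVE-TEXT         PIC X(11) VALUE 'Hello World'.
      * National items hold UTF-16, one code unit per position
       01  UTF16-TEXT          PIC N(11) USAGE NATIONAL.
      * Allow up to 3 UTF-8 bytes per UTF-16 code unit
       01  UTF8-TEXT           PIC X(33).
       PROCEDURE DIVISION.
      * With no CCSID argument, the source code page is assumed to
      * be the one set by the CODEPAGE compiler option.
           MOVE FUNCTION NATIONAL-OF(NATIVE-TEXT) TO UTF16-TEXT
      * CCSID 1208 identifies UTF-8
           MOVE FUNCTION DISPLAY-OF(UTF16-TEXT, 1208) TO UTF8-TEXT
           GOBACK.
```

Going through UTF-16 first keeps the example to two documented steps: native code page to national, then national to the target CCSID.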

Character Encoding Conversion

1. EBCDIC to ASCII Conversion

Convert between EBCDIC and ASCII encodings for system interoperability.

cobol
       WORKING-STORAGE SECTION.
       01  CONVERSION-TABLE.
           05  EBCDIC-ASCII-MAP OCCURS 256 TIMES.
               10  EBCDIC-CHAR     PIC X(1).
               10  ASCII-CHAR      PIC X(1).

       01  CONVERSION-DATA.
           05  INPUT-STRING        PIC X(100).
           05  OUTPUT-STRING       PIC X(100).
           05  STRING-LENGTH       PIC 9(3).
           05  CONVERSION-TYPE     PIC X(1).
               88  EBCDIC-TO-ASCII     VALUE 'E'.
               88  ASCII-TO-EBCDIC     VALUE 'A'.

      * Work fields used by the loops below
       01  CONVERSION-WORK.
           05  CHAR-INDEX          PIC 9(3).
           05  TABLE-INDEX         PIC 9(3).
           05  SEARCH-CHAR         PIC X(1).
           05  FOUND-CHAR          PIC X(1).

       PROCEDURE DIVISION.
      * INITIALIZE-CONVERSION-TABLE, CONVERT-ASCII-TO-EBCDIC and
      * VALIDATE-CONVERSION are not shown in this excerpt.
       CONVERT-CHARACTER-ENCODING.
           PERFORM INITIALIZE-CONVERSION-TABLE
           PERFORM CONVERT-STRING
           PERFORM VALIDATE-CONVERSION
           GOBACK.

       CONVERT-STRING.
           PERFORM VARYING CHAR-INDEX FROM 1 BY 1
                   UNTIL CHAR-INDEX > STRING-LENGTH
               IF EBCDIC-TO-ASCII
                   PERFORM CONVERT-EBCDIC-TO-ASCII
               ELSE
                   PERFORM CONVERT-ASCII-TO-EBCDIC
               END-IF
           END-PERFORM.

       CONVERT-EBCDIC-TO-ASCII.
           MOVE INPUT-STRING(CHAR-INDEX:1) TO SEARCH-CHAR
           PERFORM FIND-ASCII-EQUIVALENT
           MOVE FOUND-CHAR TO OUTPUT-STRING(CHAR-INDEX:1).

       FIND-ASCII-EQUIVALENT.
      * Default when no table entry matches: ASCII '?' (X'3F')
           MOVE X'3F' TO FOUND-CHAR
           PERFORM VARYING TABLE-INDEX FROM 1 BY 1
                   UNTIL TABLE-INDEX > 256
               IF EBCDIC-CHAR(TABLE-INDEX) = SEARCH-CHAR
                   MOVE ASCII-CHAR(TABLE-INDEX) TO FOUND-CHAR
                   EXIT PERFORM
               END-IF
           END-PERFORM.
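
The table search above can cost up to 256 comparisons per character. For a fixed character set, a single INSPECT ... CONVERTING statement is a common alternative. The sketch below covers only uppercase letters, digits, and the space, with both character sets written as hex literals so that it behaves the same on any platform. The program name E2ADEMO and all field names are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. E2ADEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * EBCDIC codes for A-I, J-R, S-Z, 0-9 and space
       01  EBCDIC-PARTS.
           05  FILLER PIC X(9)  VALUE X'C1C2C3C4C5C6C7C8C9'.
           05  FILLER PIC X(9)  VALUE X'D1D2D3D4D5D6D7D8D9'.
           05  FILLER PIC X(8)  VALUE X'E2E3E4E5E6E7E8E9'.
           05  FILLER PIC X(10) VALUE X'F0F1F2F3F4F5F6F7F8F9'.
           05  FILLER PIC X(1)  VALUE X'40'.
       01  EBCDIC-SET REDEFINES EBCDIC-PARTS PIC X(37).
      * ASCII codes for the same 37 characters, in the same order
       01  ASCII-PARTS.
           05  FILLER PIC X(9)  VALUE X'414243444546474849'.
           05  FILLER PIC X(9)  VALUE X'4A4B4C4D4E4F505152'.
           05  FILLER PIC X(8)  VALUE X'535455565758595A'.
           05  FILLER PIC X(10) VALUE X'30313233343536373839'.
           05  FILLER PIC X(1)  VALUE X'20'.
       01  ASCII-SET REDEFINES ASCII-PARTS PIC X(37).
      * The text 'HELLO 42' encoded in EBCDIC
       01  WORK-STRING         PIC X(8) VALUE X'C8C5D3D3D640F4F2'.
       PROCEDURE DIVISION.
      * Each byte found in EBCDIC-SET is replaced by the byte at
      * the same position in ASCII-SET; other bytes are unchanged.
           INSPECT WORK-STRING CONVERTING EBCDIC-SET TO ASCII-SET
           IF WORK-STRING = X'48454C4C4F203432'
               DISPLAY 'WORK-STRING now holds ASCII HELLO 42'
           END-IF
           GOBACK.
```

Bytes outside the two sets pass through untouched, so a production version would need sets covering every character the data can contain.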

2. Character Set Validation

Validate character data to ensure it conforms to expected encoding standards.

cobol
       WORKING-STORAGE SECTION.
       01  VALIDATION-RULES.
           05  ALLOWED-CHARS       PIC X(100) VALUE
               'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 '.
      * Three digits so a count of 100 cannot wrap around to zero
           05  INVALID-CHARS       PIC 9(3) VALUE 0.
           05  MAX-INVALID         PIC 9(2) VALUE 5.

       01  VALIDATION-RESULTS.
           05  VALIDATION-STATUS   PIC X(1).
               88  VALIDATION-PASSED   VALUE 'P'.
               88  VALIDATION-FAILED   VALUE 'F'.
           05  ERROR-MESSAGE       PIC X(50).

      * Input and work fields for the validation loops
       01  VALIDATION-WORK.
           05  INPUT-STRING        PIC X(100).
           05  STRING-LENGTH       PIC 9(3).
           05  CHAR-INDEX          PIC 9(3).
           05  VALID-INDEX         PIC 9(3).
           05  CURRENT-CHAR        PIC X(1).
           05  CHARACTER-FOUND-FLAG PIC X(1).
               88  CHARACTER-FOUND     VALUE 'Y'.
               88  CHARACTER-NOT-FOUND VALUE 'N'.

       PROCEDURE DIVISION.
       VALIDATE-CHARACTER-ENCODING.
           MOVE 'P' TO VALIDATION-STATUS
           MOVE 0 TO INVALID-CHARS
           PERFORM VARYING CHAR-INDEX FROM 1 BY 1
                   UNTIL CHAR-INDEX > STRING-LENGTH
               PERFORM CHECK-CHARACTER-VALIDITY
           END-PERFORM
           IF INVALID-CHARS > MAX-INVALID
               MOVE 'F' TO VALIDATION-STATUS
               MOVE 'Too many invalid characters' TO ERROR-MESSAGE
           END-IF
           GOBACK.

       CHECK-CHARACTER-VALIDITY.
           MOVE INPUT-STRING(CHAR-INDEX:1) TO CURRENT-CHAR
           PERFORM SEARCH-VALID-CHARACTERS
           IF CHARACTER-NOT-FOUND
               ADD 1 TO INVALID-CHARS
           END-IF.

       SEARCH-VALID-CHARACTERS.
           MOVE 'N' TO CHARACTER-FOUND-FLAG
           PERFORM VARYING VALID-INDEX FROM 1 BY 1
                   UNTIL VALID-INDEX > 100
               IF ALLOWED-CHARS(VALID-INDEX:1) = CURRENT-CHAR
                   MOVE 'Y' TO CHARACTER-FOUND-FLAG
                   EXIT PERFORM
               END-IF
           END-PERFORM.
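
When the rule is simply "every character must belong to a fixed set", a user-defined class in SPECIAL-NAMES can replace the nested loops with a single condition. This is a minimal sketch of that alternative. The program name CLSDEMO, the class name ALLOWED-CHARACTER, and the sample fields are illustrative assumptions. The letter ranges are split at I/J and R/S because a single 'A' THRU 'Z' range would also admit the non-letter codes that sit between those letters in EBCDIC.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLSDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           CLASS ALLOWED-CHARACTER IS 'A' THRU 'I'
                                      'J' THRU 'R'
                                      'S' THRU 'Z'
                                      '0' THRU '9'
                                      ' '.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  GOOD-FIELD          PIC X(10) VALUE 'ORDER 0042'.
       01  BAD-FIELD           PIC X(10) VALUE 'order#0042'.
       PROCEDURE DIVISION.
      * A class condition is true only if every character in the
      * field belongs to the class.
           IF GOOD-FIELD IS ALLOWED-CHARACTER
               DISPLAY 'GOOD-FIELD passes'
           END-IF
           IF BAD-FIELD IS NOT ALLOWED-CHARACTER
               DISPLAY 'BAD-FIELD contains a disallowed character'
           END-IF
           GOBACK.
```

Both messages are displayed. Unlike the loop version, a class test is all-or-nothing, so it fits cases where no invalid characters are tolerated rather than a threshold such as MAX-INVALID.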

Internationalization Support

1. Locale-Specific Processing

Implement locale-specific character processing for international applications.

cobol
       WORKING-STORAGE SECTION.
       01  LOCALE-DATA.
           05  CURRENT-LOCALE      PIC X(5).
               88  US-LOCALE           VALUE 'en_US'.
               88  SPANISH-LOCALE      VALUE 'es_ES'.
               88  FRENCH-LOCALE       VALUE 'fr_FR'.
               88  GERMAN-LOCALE       VALUE 'de_DE'.
           05  LOCALE-CHARSET      PIC X(10).
           05  LOCALE-ENCODING     PIC X(10).

       01  INTERNATIONAL-TEXT.
           05  LOCALIZED-TEXT      PIC X(100).
           05  ORIGINAL-TEXT       PIC X(100).
           05  TRANSLATION-STATUS  PIC X(1).
               88  TRANSLATION-SUCCESS VALUE 'S'.
               88  TRANSLATION-FAILED  VALUE 'F'.

       PROCEDURE DIVISION.
      * SET-LOCALE-ENCODING is not shown in this excerpt.
       PROCESS-LOCALE-SPECIFIC-TEXT.
           PERFORM DETERMINE-LOCALE-SETTINGS
           PERFORM SET-LOCALE-ENCODING
           PERFORM PROCESS-LOCALIZED-TEXT
           GOBACK.

       DETERMINE-LOCALE-SETTINGS.
           EVALUATE CURRENT-LOCALE
               WHEN 'en_US'
                   MOVE 'ASCII' TO LOCALE-CHARSET
                   MOVE 'UTF-8' TO LOCALE-ENCODING
               WHEN 'es_ES'
                   MOVE 'ISO-8859-1' TO LOCALE-CHARSET
                   MOVE 'UTF-8' TO LOCALE-ENCODING
               WHEN 'fr_FR'
                   MOVE 'ISO-8859-1' TO LOCALE-CHARSET
                   MOVE 'UTF-8' TO LOCALE-ENCODING
               WHEN 'de_DE'
                   MOVE 'ISO-8859-1' TO LOCALE-CHARSET
                   MOVE 'UTF-8' TO LOCALE-ENCODING
           END-EVALUATE.

      * LOCALE-PROCESSOR is assumed to be a user-written subprogram;
      * it is not a COBOL built-in.
       PROCESS-LOCALIZED-TEXT.
           CALL 'LOCALE-PROCESSOR' USING ORIGINAL-TEXT
                                         CURRENT-LOCALE
                                         LOCALIZED-TEXT
                                         TRANSLATION-STATUS
               ON EXCEPTION
                   MOVE 'F' TO TRANSLATION-STATUS
           END-CALL.
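
As the list of locales grows, the EVALUATE in DETERMINE-LOCALE-SETTINGS can be replaced by a lookup table, which also gives a natural place to handle an unknown locale. The sketch below is one possible arrangement, not the original author's design. The program name LOCTABLE and the table field names are illustrative assumptions, and the locale and character set values are copied from the listing above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOCTABLE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * One 15-byte row per locale: 5-byte name, 10-byte charset
       01  LOCALE-ROWS.
           05  FILLER PIC X(15) VALUE 'en_USASCII'.
           05  FILLER PIC X(15) VALUE 'es_ESISO-8859-1'.
           05  FILLER PIC X(15) VALUE 'fr_FRISO-8859-1'.
           05  FILLER PIC X(15) VALUE 'de_DEISO-8859-1'.
       01  LOCALE-TABLE REDEFINES LOCALE-ROWS.
           05  LOCALE-ENTRY OCCURS 4 TIMES INDEXED BY LOC-IDX.
               10  ENTRY-LOCALE    PIC X(5).
               10  ENTRY-CHARSET   PIC X(10).
       01  CURRENT-LOCALE      PIC X(5)  VALUE 'fr_FR'.
       01  LOCALE-CHARSET      PIC X(10) VALUE SPACES.
       PROCEDURE DIVISION.
      * Serial SEARCH starts at the current index value
           SET LOC-IDX TO 1
           SEARCH LOCALE-ENTRY
               AT END
                   DISPLAY 'Unknown locale: ' CURRENT-LOCALE
               WHEN ENTRY-LOCALE(LOC-IDX) = CURRENT-LOCALE
                   MOVE ENTRY-CHARSET(LOC-IDX) TO LOCALE-CHARSET
                   DISPLAY 'Charset: ' LOCALE-CHARSET
           END-SEARCH
           GOBACK.
```

With CURRENT-LOCALE set to 'fr_FR' the program displays the ISO-8859-1 entry. Adding a locale then means adding a row and raising the OCCURS count rather than editing procedural code.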

Best Practices for Character Encoding

Common Character Encoding Issues