MainframeMaster

COBOL Tutorial

Internationalization in COBOL

Internationalization (i18n) is the practice of designing and implementing COBOL programs that can be easily adapted to different languages, regions, and cultural conventions. This includes supporting multiple character sets, date formats, currency symbols, numeric formats, and text messages.

Why Internationalization Matters

  • Enables applications to serve global markets with a single codebase
  • Reduces maintenance costs by centralizing language-specific data
  • Improves user experience in different regions
  • Supports compliance with international standards and regulations
  • Facilitates easier translation and localization efforts

Understanding Internationalization vs Localization

Internationalization (i18n) is the technical design that enables a program to support multiple locales without code modifications. It involves:

  • Separating text from program logic
  • Using locale-aware functions and data structures
  • Supporting multiple character encodings
  • Designing flexible formatting routines

Localization (l10n) is the process of adapting the program for a specific locale, including:

  • Translating user messages
  • Adjusting date/time formats
  • Applying regional currency formats
  • Setting locale-specific conventions

Character Set Handling

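COBOL's most basic internationalization hook is character set management. The ALPHABET clause in the SPECIAL-NAMES paragraph gives a name to a character code set or collating sequence: NATIVE (the platform's own, typically EBCDIC or ASCII), STANDARD-1 (ASCII), EBCDIC, or a user-defined ordering. That name can then be referenced elsewhere, for example in a PROGRAM COLLATING SEQUENCE clause or a file's CODE-SET clause.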

ALPHABET Clause Example

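```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. I18N-CHAR-SET.

       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
       *> NATIVE names the platform's own character set and order
           ALPHABET NATIONAL-CHARS IS NATIVE
       *> User-defined alphabet: characters collate in the order
       *> listed, and each character may appear only once
           ALPHABET INTERNATIONAL-CHARS IS
               "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
               "abcdefghijklmnopqrstuvwxyz"
               "0123456789"
               " !#$%&'()*+,-./:;<=>?@[]^_{|}~"
       *> Named character class for use in class conditions
           CLASS ALNUM-CHARS IS
               "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
               "abcdefghijklmnopqrstuvwxyz"
               "0123456789".

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  TEXT-FIELD          PIC X(100).

       PROCEDURE DIVISION.
           MOVE "Invoice42" TO TEXT-FIELD
       *> Test only the 9 characters that were filled in
           IF TEXT-FIELD(1:9) IS ALNUM-CHARS
               DISPLAY "Only letters and digits"
           END-IF
           GOBACK.
```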

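For modern applications, use Unicode support when your COBOL compiler provides it. The syntax is compiler-specific: IBM Enterprise COBOL, for example, stores UTF-16 text in PIC N items with USAGE NATIONAL and converts with the NATIONAL-OF and DISPLAY-OF intrinsic functions. Check your compiler's documentation before relying on the sketch below.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNICODE-SUPPORT.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> USAGE NATIONAL holds Unicode text (UTF-16 on IBM
       *> Enterprise COBOL); other compilers may differ
       01  UNICODE-TEXT        PIC N(200) USAGE NATIONAL.
       *> Ordinary alphanumeric field in the native code page
       01  ASCII-FIELD         PIC X(100).

       PROCEDURE DIVISION.
           MOVE "Hello" TO ASCII-FIELD
       *> Convert native alphanumeric data to national (Unicode)
           MOVE FUNCTION NATIONAL-OF(ASCII-FIELD) TO UNICODE-TEXT
       *> Convert back; the result is truncated to 100 characters
           MOVE FUNCTION DISPLAY-OF(UNICODE-TEXT) TO ASCII-FIELD
           DISPLAY ASCII-FIELD
           GOBACK.
```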

Locale-Aware Data Handling

Date and Time Formatting

Store dates internally in a canonical format (YYYYMMDD, Julian, or UNIX timestamp) and format them for display based on locale:

```cobol
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Canonical internal form: YYYYMMDD
       01  WS-INTERNAL-DATE    PIC 9(8) VALUE 20240315.
       *> Same storage viewed as year, month and day
       01  WS-DATE-PARTS REDEFINES WS-INTERNAL-DATE.
           05  WS-YEAR         PIC 9(4).
           05  WS-MONTH        PIC 9(2).
           05  WS-DAY          PIC 9(2).
       01  WS-FORMATTED-DATE   PIC X(20).
       01  WS-LOCALE           PIC X(5) VALUE "en-US".

       PROCEDURE DIVISION.
           MOVE SPACES TO WS-FORMATTED-DATE
       *> Format based on locale
           EVALUATE WS-LOCALE
               WHEN "en-US"
                   STRING WS-MONTH "/" WS-DAY "/" WS-YEAR
                       DELIMITED BY SIZE
                       INTO WS-FORMATTED-DATE
                   END-STRING
       *> en-GB and fr-FR both use day/month/year
               WHEN "en-GB"
               WHEN "fr-FR"
                   STRING WS-DAY "/" WS-MONTH "/" WS-YEAR
                       DELIMITED BY SIZE
                       INTO WS-FORMATTED-DATE
                   END-STRING
       *> Unknown locale: fall back to ISO 8601
               WHEN OTHER
                   STRING WS-YEAR "-" WS-MONTH "-" WS-DAY
                       DELIMITED BY SIZE
                       INTO WS-FORMATTED-DATE
                   END-STRING
           END-EVALUATE
           DISPLAY WS-FORMATTED-DATE
           GOBACK.
```

Currency Formatting

Store currency values as numbers and format them with appropriate symbols and separators:

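```cobol
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LOCALE           PIC X(5) VALUE "de-DE".
       01  WS-AMOUNT           PIC S9(10)V99 VALUE 1234567.89.
       *> Edited with US punctuation first; separators swapped later
       01  WS-AMOUNT-FORMATTED PIC -Z,ZZZ,ZZZ,ZZ9.99.
       01  WS-FORMATTED-AMT    PIC X(30).
       *> Defaults are the en-US conventions
       01  WS-CURRENCY-SYMBOL  PIC X(3) VALUE "$".
       01  WS-SEPARATORS.
           05  WS-THOUSAND-SEP PIC X VALUE ",".
           05  WS-DECIMAL-SEP  PIC X VALUE ".".

       PROCEDURE DIVISION.
       *> Determine formatting based on locale
           EVALUATE WS-LOCALE
               WHEN "en-US"
                   MOVE "$"   TO WS-CURRENCY-SYMBOL
                   MOVE ","   TO WS-THOUSAND-SEP
                   MOVE "."   TO WS-DECIMAL-SEP
               WHEN "de-DE"
                   MOVE "EUR" TO WS-CURRENCY-SYMBOL
                   MOVE "."   TO WS-THOUSAND-SEP
                   MOVE ","   TO WS-DECIMAL-SEP
               WHEN "fr-FR"
       *> The byte length of the euro sign depends on the
       *> source encoding (one byte to three bytes)
                   MOVE "€"   TO WS-CURRENCY-SYMBOL
                   MOVE " "   TO WS-THOUSAND-SEP
                   MOVE ","   TO WS-DECIMAL-SEP
           END-EVALUATE
       *> Format the amount, then swap "," and "." in one pass
           MOVE WS-AMOUNT TO WS-AMOUNT-FORMATTED
           INSPECT WS-AMOUNT-FORMATTED
               CONVERTING ",." TO WS-SEPARATORS
           MOVE SPACES TO WS-FORMATTED-AMT
       *> TRIM is a COBOL 2014 intrinsic; older compilers lack it
           STRING WS-CURRENCY-SYMBOL DELIMITED BY SPACE
                  " "                DELIMITED BY SIZE
                  FUNCTION TRIM(WS-AMOUNT-FORMATTED)
                                     DELIMITED BY SIZE
               INTO WS-FORMATTED-AMT
           END-STRING
           DISPLAY WS-FORMATTED-AMT
           GOBACK.
```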

Numeric Formatting

Numeric display should respect regional conventions:

  • Decimal separator: period (.) in the US and UK, comma (,) in much of continental Europe
  • Thousands separator: comma (,) in the US, period (.) in Germany, space in France
  • Digit grouping: usually groups of three, but not everywhere; the Indian system writes 1234567 as 12,34,567
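
When a program only ever serves comma-decimal locales, the standard DECIMAL-POINT IS COMMA clause is one way to get these conventions without any runtime logic. The following is a minimal sketch under that assumption; the program name and field names are hypothetical. The clause swaps the roles of the period and comma in PICTURE strings and numeric literals at compile time, so it cannot switch conventions per user at run time.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DECIMAL-COMMA.

       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
       *> Swap the meaning of "." and "," for this whole program
           DECIMAL-POINT IS COMMA.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> The numeric literal now uses a comma as decimal point
       01  WS-VALUE            PIC 9(7)V99 VALUE 1234567,89.
       *> In the edited picture "." groups and "," is the decimal
       01  WS-EDITED           PIC Z.ZZZ.ZZ9,99.

       PROCEDURE DIVISION.
           MOVE WS-VALUE TO WS-EDITED
       *> Displays 1.234.567,89
           DISPLAY WS-EDITED
           GOBACK.
```

For programs that must serve several locales from one executable, editing with one convention and converting the separators at display time is usually the more flexible choice.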

Message Externalization

Hardcoded messages force a code change and recompile for every translation. Instead, externalize all user-facing text and look it up by message ID and language:

```cobol
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-MESSAGE-CODE     PIC X(10).
       01  WS-LANGUAGE         PIC X(5) VALUE "en-US".
       01  WS-MESSAGE-TEXT     PIC X(100).
       01  MESSAGE-TABLE.
       *> The table entry needs a name so SEARCH can refer to it
           05  MSG-ENTRY OCCURS 100 TIMES INDEXED BY MSG-IDX.
               10  MSG-ID      PIC X(10).
               10  MSG-LANG    PIC X(5).
               10  MSG-TEXT    PIC X(100).

       PROCEDURE DIVISION.
       MAIN-PARA.
       *> Load message table at program start
           PERFORM LOAD-MESSAGE-TABLE
       *> Retrieve message by ID and language
           MOVE "ERR001" TO WS-MESSAGE-CODE
           MOVE "en-US"  TO WS-LANGUAGE
           PERFORM GET-MESSAGE
           DISPLAY WS-MESSAGE-TEXT
       *> Stop here so control does not fall into the paragraphs
           STOP RUN.

       LOAD-MESSAGE-TABLE.
       *> In production, load from database or file
           MOVE SPACES TO MESSAGE-TABLE
           MOVE "ERR001"                TO MSG-ID(1)
           MOVE "en-US"                 TO MSG-LANG(1)
           MOVE "File not found"        TO MSG-TEXT(1)
           MOVE "ERR001"                TO MSG-ID(2)
           MOVE "es-ES"                 TO MSG-LANG(2)
           MOVE "Archivo no encontrado" TO MSG-TEXT(2)
           MOVE "ERR001"                TO MSG-ID(3)
           MOVE "fr-FR"                 TO MSG-LANG(3)
           MOVE "Fichier introuvable"   TO MSG-TEXT(3).

       GET-MESSAGE.
           SET MSG-IDX TO 1
           SEARCH MSG-ENTRY
       *> No translation found: fall back to the message code
               AT END
                   MOVE WS-MESSAGE-CODE TO WS-MESSAGE-TEXT
               WHEN MSG-ID(MSG-IDX) = WS-MESSAGE-CODE
                AND MSG-LANG(MSG-IDX) = WS-LANGUAGE
                   MOVE MSG-TEXT(MSG-IDX) TO WS-MESSAGE-TEXT
           END-SEARCH.
```

Locale Detection and Configuration

Create a mechanism to detect and configure the current locale:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOCALE-DETECT.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ENV-LANG         PIC X(20).
       01  WS-CURRENT-LOCALE   PIC X(5).
       01  WS-LOCALE-INFO.
           05  LOCALE-LANG     PIC X(2).
           05  LOCALE-REGION   PIC X(2).
           05  LOCALE-ENCODING PIC X(10).
           05  LOCALE-DATE-FMT PIC X(10).
           05  LOCALE-TIME-FMT PIC X(10).

       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM DETECT-LOCALE
           PERFORM LOAD-LOCALE-CONFIG
           PERFORM MAIN-LOGIC
           GOBACK.

       DETECT-LOCALE.
       *> ACCEPT ... FROM ENVIRONMENT is a vendor extension
       *> (GnuCOBOL, Micro Focus). LANG usually looks like
       *> "en_US.UTF-8", so keep 5 characters and change "_" to "-"
           ACCEPT WS-ENV-LANG FROM ENVIRONMENT "LANG"
               ON EXCEPTION
                   MOVE SPACES TO WS-ENV-LANG
           END-ACCEPT
           MOVE WS-ENV-LANG(1:5) TO WS-CURRENT-LOCALE
           INSPECT WS-CURRENT-LOCALE REPLACING ALL "_" BY "-"
           IF WS-CURRENT-LOCALE = SPACES
               MOVE "en-US" TO WS-CURRENT-LOCALE
           END-IF
       *> Parse language and region
           MOVE SPACES TO WS-LOCALE-INFO
           UNSTRING WS-CURRENT-LOCALE DELIMITED BY "-"
               INTO LOCALE-LANG LOCALE-REGION
           END-UNSTRING.

       LOAD-LOCALE-CONFIG.
       *> All locales here use UTF-8 and 24-hour HH:MM:SS times
           MOVE "UTF-8"    TO LOCALE-ENCODING
           MOVE "HH:MM:SS" TO LOCALE-TIME-FMT
           EVALUATE WS-CURRENT-LOCALE
               WHEN "en-US"
                   MOVE "MM/DD/YYYY" TO LOCALE-DATE-FMT
               WHEN "en-GB"
                   MOVE "DD/MM/YYYY" TO LOCALE-DATE-FMT
               WHEN "de-DE"
                   MOVE "DD.MM.YYYY" TO LOCALE-DATE-FMT
               WHEN OTHER
                   MOVE "YYYY-MM-DD" TO LOCALE-DATE-FMT
           END-EVALUATE.

       MAIN-LOGIC.
       *> Placeholder for application logic: show the settings
           DISPLAY "Locale: " WS-CURRENT-LOCALE
           DISPLAY "Date format: " LOCALE-DATE-FMT
           DISPLAY "Time format: " LOCALE-TIME-FMT.
```

Best Practices for Internationalization

Common Internationalization Challenges

When implementing internationalization, watch out for these common pitfalls:

  • Hardcoded separators: Never assume period/comma separators for decimals
  • Fixed-length messages: Different languages have different lengths for the same message
  • Context-dependent text: Word order varies between languages
  • Cultural conventions: Consider right-to-left languages, different calendars, etc.
  • Time zones: Properly handle UTC conversions and daylight saving time
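
The fixed-length and word-order pitfalls are often handled with message templates that carry a placeholder, so that each translation can put the variable part wherever its grammar needs it. Here is a minimal sketch of that idea, not a definitive implementation. The `%1` marker, the program name, and all field names are hypothetical, and the code assumes each template contains exactly one `%1` and that the argument has no embedded spaces.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MSG-TEMPLATE.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Hypothetical template; "%1" marks the argument position
       01  WS-TEMPLATE         PIC X(60)
               VALUE "Die Datei %1 wurde nicht gefunden".
       01  WS-ARGUMENT         PIC X(20) VALUE "KUNDEN.DAT".
       01  WS-BEFORE           PIC X(60).
       01  WS-AFTER            PIC X(60).
       01  WS-BEFORE-LEN       PIC 9(3).
       01  WS-PTR              PIC 9(3).
       01  WS-MESSAGE          PIC X(100).

       PROCEDURE DIVISION.
           MOVE SPACES TO WS-BEFORE WS-AFTER WS-MESSAGE
       *> Split the template around the placeholder
           UNSTRING WS-TEMPLATE DELIMITED BY "%1"
               INTO WS-BEFORE COUNT IN WS-BEFORE-LEN
                    WS-AFTER
           END-UNSTRING
       *> Rebuild the message, keeping the spaces around "%1"
           MOVE 1 TO WS-PTR
           IF WS-BEFORE-LEN > 0
               STRING WS-BEFORE(1:WS-BEFORE-LEN) DELIMITED BY SIZE
                   INTO WS-MESSAGE WITH POINTER WS-PTR
               END-STRING
           END-IF
           STRING WS-ARGUMENT DELIMITED BY SPACE
                  WS-AFTER    DELIMITED BY SIZE
               INTO WS-MESSAGE WITH POINTER WS-PTR
           END-STRING
       *> Displays: Die Datei KUNDEN.DAT wurde nicht gefunden
           DISPLAY WS-MESSAGE
           GOBACK.
```

An English template such as "File %1 not found" would pass through the same code unchanged, which is the point: the word order lives in the translated data, not in the program. The output field is sized generously because translations are often longer than the original text.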

Example: Internationalized Report Program

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. I18N-REPORT.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT MESSAGE-FILE ASSIGN TO "MSG.TXT"
               ORGANIZATION IS INDEXED
               ACCESS IS RANDOM
               RECORD KEY IS MSG-KEY
               FILE STATUS IS MSG-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  MESSAGE-FILE.
       01  MESSAGE-REC.
           05  MSG-KEY         PIC X(15).
           05  MSG-TEXT        PIC X(100).
           05  MSG-LANG        PIC X(5).

       WORKING-STORAGE SECTION.
       01  MSG-STATUS          PIC XX.
       01  WS-MSG-CODE         PIC X(10).
       01  WS-LOCALE           PIC X(5) VALUE "en-US".
       01  WS-TODAY.
           05  WS-YEAR         PIC 9(4).
           05  WS-MONTH        PIC 9(2).
           05  WS-DAY          PIC 9(2).
       01  WS-FORMATTED-DATE   PIC X(20) VALUE SPACES.
       01  WS-FORMATTED-AMOUNT PIC X(30) VALUE SPACES.

       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM LOAD-MESSAGES
           PERFORM FORMAT-DATE
           PERFORM FORMAT-AMOUNT
           PERFORM DISPLAY-REPORT
           CLOSE MESSAGE-FILE
           STOP RUN.

       LOAD-MESSAGES.
       *> Messages are read on demand, so only open the file here
           OPEN INPUT MESSAGE-FILE
           IF MSG-STATUS NOT = "00"
               DISPLAY "Cannot open message file: " MSG-STATUS
           END-IF.

       FORMAT-DATE.
       *> The first 8 characters of CURRENT-DATE are YYYYMMDD
           MOVE FUNCTION CURRENT-DATE(1:8) TO WS-TODAY
       *> Apply locale-specific formatting
           EVALUATE WS-LOCALE(1:2)
               WHEN "en"
       *> MM/DD/YYYY for en-US, DD/MM/YYYY for other English
                   IF WS-LOCALE = "en-US"
                       STRING WS-MONTH "/" WS-DAY "/" WS-YEAR
                           DELIMITED BY SIZE
                           INTO WS-FORMATTED-DATE
                       END-STRING
                   ELSE
                       STRING WS-DAY "/" WS-MONTH "/" WS-YEAR
                           DELIMITED BY SIZE
                           INTO WS-FORMATTED-DATE
                       END-STRING
                   END-IF
               WHEN "de"
       *> Format as DD.MM.YYYY
                   STRING WS-DAY "." WS-MONTH "." WS-YEAR
                       DELIMITED BY SIZE
                       INTO WS-FORMATTED-DATE
                   END-STRING
               WHEN OTHER
       *> ISO format YYYY-MM-DD
                   STRING WS-YEAR "-" WS-MONTH "-" WS-DAY
                       DELIMITED BY SIZE
                       INTO WS-FORMATTED-DATE
                   END-STRING
           END-EVALUATE.

       FORMAT-AMOUNT.
       *> Format currency amount based on locale
       *> Implementation details for currency formatting
           CONTINUE.

       DISPLAY-REPORT.
       *> Paragraphs take no parameters; pass the code in a field
           MOVE "RPT001" TO WS-MSG-CODE
           PERFORM GET-MESSAGE-DISPLAY
           DISPLAY "Date: " WS-FORMATTED-DATE
           DISPLAY "Amount: " WS-FORMATTED-AMOUNT.

       GET-MESSAGE-DISPLAY.
           MOVE WS-MSG-CODE TO MSG-KEY
           READ MESSAGE-FILE
           IF MSG-STATUS = "00"
               DISPLAY MSG-TEXT
           ELSE
       *> Message missing or file unavailable: show the code
               DISPLAY WS-MSG-CODE
           END-IF.
```

Explain It Like I'm 5 Years Old:

Think of internationalization like having a TV channel that can show programs in lots of different languages! Just like how some people speak English, others speak Spanish or French, computer programs need to be able to talk to people in their own language. Instead of having a separate TV for each language, we make one TV that can switch between different languages. That's what internationalization does for COBOL programs - it lets one program work for people in America, Germany, France, Japan, and lots of other places, all by changing the language and how things look based on where the person lives!

Key Takeaways

  • Internationalization enables programs to support multiple languages and regions
  • Separate user-facing text from program code
  • Use canonical formats for internal data storage
  • Apply locale-specific formatting only at display time
  • Leverage built-in functions and modern character set support
  • Test thoroughly with multiple locales and character encodings

Test Your Knowledge

1. What is the primary purpose of internationalization in COBOL programs?

  • To make code run faster
  • To support multiple languages and regional settings
  • To improve file access
  • To reduce program size

2. What is the difference between internationalization and localization?

  • They are the same
  • Internationalization is designing for multiple locales; localization is adapting for a specific locale
  • Internationalization is for European languages only
  • Localization is obsolete

3. Which COBOL feature is most important for handling international character sets?

  • PIC clauses
  • ALPHABET clause
  • COLLATING SEQUENCE clause
  • STRING and UNSTRING verbs

4. How should currency amounts be handled in internationalized COBOL programs?

  • Always use fixed currency symbols
  • Store numeric values separately and format based on locale
  • Hardcode currency codes
  • Use only US dollars

5. What is the recommended approach for date and time formatting in international applications?

  • Hardcode one format
  • Use intrinsic functions with locale parameters
  • Always use Gregorian calendar
  • Only support US date format