MainframeMaster

COBOL Text Processing Concepts

Text processing turns raw lines into structured fields you can validate and store. Focus on predictable delimiters and defensive parsing. This covers parsing, normalization, and validation of text data in COBOL programs.

Tokenize a Delimited Line

```cobol
      *> Working-storage definitions
       01  IN-LINE    PIC X(120)
           VALUE 'ID=123|NAME=ALICE|CITY=AUSTIN'.
       01  LABEL-FLD  PIC X(10).
       01  ID-FLD     PIC 9(9).
       01  NAME-FLD   PIC X(30).
       01  CITY-FLD   PIC X(20).
       01  P          PIC 9(3) VALUE 1.

      *> Procedure division: labels go to the scratch field
           UNSTRING IN-LINE DELIMITED BY '|' OR '='
               INTO LABEL-FLD ID-FLD
                    LABEL-FLD NAME-FLD
                    LABEL-FLD CITY-FLD
               WITH POINTER P
           END-UNSTRING.
```

This pattern alternates label and value. FILLER cannot be named as a receiving field, so each label goes to a scratch field (LABEL-FLD) and each value goes to its own field. UNSTRING splits on either delimiter: '|' separates the pairs and '=' separates a label from its value.
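A more defensive variant keeps each label instead of discarding it, so the program can confirm that the line really has the expected shape. The sketch below is a minimal, self-contained example under stated assumptions: the program name, LABEL-1 to LABEL-3, and TOKEN-COUNT are hypothetical names introduced for illustration, and the compiler is assumed to accept `*>` comments (for example GnuCOBOL).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TOKDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  IN-LINE      PIC X(120)
           VALUE 'ID=123|NAME=ALICE|CITY=AUSTIN'.
      *> Hypothetical fields that keep the labels for checking.
       01  LABEL-1      PIC X(10).
       01  LABEL-2      PIC X(10).
       01  LABEL-3      PIC X(10).
       01  ID-FLD       PIC 9(9).
       01  NAME-FLD     PIC X(30).
       01  CITY-FLD     PIC X(20).
       01  P            PIC 9(3).
       01  TOKEN-COUNT  PIC 9(2).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Reset the pointer and the tally before every UNSTRING.
           MOVE 1 TO P
           MOVE 0 TO TOKEN-COUNT
           UNSTRING IN-LINE DELIMITED BY '|' OR '='
               INTO LABEL-1 ID-FLD
                    LABEL-2 NAME-FLD
                    LABEL-3 CITY-FLD
               WITH POINTER P
               TALLYING IN TOKEN-COUNT
               ON OVERFLOW
                   DISPLAY 'Warning: extra tokens ignored'
           END-UNSTRING
      *> Accept the line only if all six tokens arrived in order.
           IF TOKEN-COUNT NOT = 6
               DISPLAY 'Error: expected 6 tokens, got ' TOKEN-COUNT
           ELSE
               IF LABEL-1 = 'ID' AND LABEL-2 = 'NAME'
                                 AND LABEL-3 = 'CITY'
                   DISPLAY 'ID=' ID-FLD
                   DISPLAY 'NAME=' NAME-FLD
                   DISPLAY 'CITY=' CITY-FLD
               ELSE
                   DISPLAY 'Error: unexpected label order'
               END-IF
           END-IF
           STOP RUN.
```

Checking the labels costs three extra fields, but it catches lines whose pairs arrive in a different order or with a missing label, which a positional parse would otherwise store in the wrong field.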

Normalize Case and Trim

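```cobol
           MOVE FUNCTION UPPER-CASE(NAME-FLD) TO NAME-FLD
           MOVE FUNCTION LOWER-CASE(CITY-FLD) TO CITY-FLD
      *> A shorter target truncates on the right; it does not trim
           MOVE NAME-FLD TO NAME-FLD-TRIMMED
           MOVE CITY-FLD TO CITY-FLD-TRIMMED
```

Use intrinsic functions for case conversion. Moving a value into a shorter PIC field does not trim it: the move drops whatever lies beyond the target length, whether spaces or data, and the target is still padded to its declared size. Use a shorter field only when it is known to be long enough for the significant data. Where the compiler supports it, FUNCTION TRIM removes the padding for length checks and concatenation. Normalize data early in processing so that later comparisons are consistent.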
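When FUNCTION TRIM is not available, a common idiom finds the significant length of a field by counting the leading spaces of its reversed value. The sketch below assumes a compiler that accepts a function identifier as the INSPECT TALLYING subject (GnuCOBOL does); the program name, TRAIL-COUNT, and NAME-LEN are hypothetical names used only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRIMDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  NAME-FLD     PIC X(30) VALUE 'alice smith'.
       01  TRAIL-COUNT  PIC 9(3).
       01  NAME-LEN     PIC 9(3).
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE FUNCTION UPPER-CASE(NAME-FLD) TO NAME-FLD
      *> Trailing spaces of NAME-FLD are the leading spaces
      *> of the reversed value.
           MOVE 0 TO TRAIL-COUNT
           INSPECT FUNCTION REVERSE(NAME-FLD)
               TALLYING TRAIL-COUNT FOR LEADING SPACES
           COMPUTE NAME-LEN =
               FUNCTION LENGTH(NAME-FLD) - TRAIL-COUNT
      *> A zero-length reference modification is invalid,
      *> so guard the all-spaces case.
           IF NAME-LEN > 0
               DISPLAY '[' NAME-FLD(1:NAME-LEN) ']'
           ELSE
               DISPLAY 'Name is blank'
           END-IF
           STOP RUN.
```

With the sample value this displays `[ALICE SMITH]`. The computed length can then drive reference modification wherever the padding must not be carried along.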

Parse CSV Data

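```cobol
       01  CSV-LINE     PIC X(200)
           VALUE '123,Alice Smith,Austin,TX'.
       01  FIELD-COUNT  PIC 9 VALUE 0.
       01  P            PIC 9(3) VALUE 1.
       01  FIELD-1      PIC X(20).
       01  FIELD-2      PIC X(30).
       01  FIELD-3      PIC X(20).
       01  FIELD-4      PIC X(10).

           UNSTRING CSV-LINE DELIMITED BY ','
               INTO FIELD-1 FIELD-2 FIELD-3 FIELD-4
               WITH POINTER P
               TALLYING IN FIELD-COUNT
           END-UNSTRING.
```

CSV parsing uses the comma as the delimiter. The pointer P must be defined and set to 1 before the UNSTRING runs. TALLYING adds the number of receiving fields that were filled to FIELD-COUNT, so reset it to 0 before each line and check afterwards that it matches the number of fields you expect. A plain UNSTRING does not understand quoted fields, so a value that contains a comma is split in two.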
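The field-count check described above might look like the following sketch. It combines the tally with ON OVERFLOW, which fires when data remains after the last receiving field has been filled (that is, when the line has too many fields). The program name and the CSV-STATUS flag with its 88-level conditions are hypothetical names introduced for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CSV-LINE     PIC X(200)
           VALUE '123,Alice Smith,Austin,TX'.
       01  FIELD-COUNT  PIC 9(2).
       01  P            PIC 9(3).
       01  FIELD-1      PIC X(20).
       01  FIELD-2      PIC X(30).
       01  FIELD-3      PIC X(20).
       01  FIELD-4      PIC X(10).
      *> Hypothetical status flag for this sketch.
       01  CSV-STATUS   PIC X VALUE 'Y'.
           88  CSV-OK   VALUE 'Y'.
           88  CSV-BAD  VALUE 'N'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Reset all state so nothing leaks in from a previous line.
           MOVE 1 TO P
           MOVE 0 TO FIELD-COUNT
           MOVE SPACES TO FIELD-1 FIELD-2 FIELD-3 FIELD-4
           SET CSV-OK TO TRUE
           UNSTRING CSV-LINE DELIMITED BY ','
               INTO FIELD-1 FIELD-2 FIELD-3 FIELD-4
               WITH POINTER P
               TALLYING IN FIELD-COUNT
               ON OVERFLOW
      *>           More than four fields on the line.
                   SET CSV-BAD TO TRUE
           END-UNSTRING
      *> Fewer than four fields leaves the tally short.
           IF FIELD-COUNT NOT = 4
               SET CSV-BAD TO TRUE
           END-IF
           IF CSV-OK
               DISPLAY 'Parsed ' FIELD-COUNT ' fields'
           ELSE
               DISPLAY 'Error: malformed CSV line'
           END-IF
           STOP RUN.
```

Clearing the receiving fields first matters because UNSTRING leaves untouched fields as they were, so a short line would otherwise keep values from the previous record.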

Fixed-Position Records

```cobol
           MOVE IN-LINE(1:2)   TO STATE
           MOVE IN-LINE(3:5)   TO ZIP
           MOVE IN-LINE(8:20)  TO STREET
           MOVE IN-LINE(28:50) TO CITY
```

Reference modification is best for fixed columns. The format is field(start:length), where start is the 1-based position and length is the number of characters, so STREET occupies columns 8 to 27 and CITY begins at column 28. Keep a spec documenting the column ranges.
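One way to keep the column spec next to the code is to describe the record as a group item and let a single MOVE distribute the columns. The sketch below is an illustration under assumptions: the layout (state in columns 1 to 2, ZIP in 3 to 7, street in 8 to 27, city in 28 to 77), the AR- field names, the program name, and the sample address are all hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FIXDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  IN-LINE          PIC X(120).
      *> Hypothetical layout: the column ranges are assumptions.
       01  ADDR-REC.
           05  AR-STATE     PIC X(2).
           05  AR-ZIP       PIC X(5).
           05  AR-STREET    PIC X(20).
           05  AR-CITY      PIC X(50).
           05  FILLER       PIC X(43).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Build a sample fixed-position line for the demo.
           MOVE SPACES TO IN-LINE
           MOVE 'TX' TO IN-LINE(1:2)
           MOVE '78701' TO IN-LINE(3:5)
           MOVE '100 CONGRESS AVE' TO IN-LINE(8:20)
           MOVE 'AUSTIN' TO IN-LINE(28:50)
      *> One group MOVE fills every field by position.
           MOVE IN-LINE TO ADDR-REC
           DISPLAY 'STATE : ' AR-STATE
           DISPLAY 'ZIP   : ' AR-ZIP
           DISPLAY 'STREET: ' AR-STREET
           DISPLAY 'CITY  : ' AR-CITY
           STOP RUN.
```

The group layout and reference modification extract the same bytes. The layout version puts the widths in one place, so a change to the record format touches the data description rather than every MOVE.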

Validate Parsed Data

```cobol
           EVALUATE TRUE
               WHEN ID-FLD NOT NUMERIC
                   DISPLAY 'Error: ID must be numeric'
               WHEN ID-FLD = 0
                   DISPLAY 'Error: ID cannot be zero'
               WHEN NAME-FLD = SPACES
                   DISPLAY 'Error: Name is required'
               WHEN FUNCTION LENGTH(FUNCTION TRIM(CITY-FLD)) < 2
                   DISPLAY 'Error: City name too short'
               WHEN OTHER
                   DISPLAY 'Data validation passed'
           END-EVALUATE.
```

Always validate parsed data: check that numeric fields are numeric, that required values are present, and that lengths are reasonable. The checks run in order and stop at the first failure. Use FUNCTION TRIM to remove leading and trailing spaces before length checks.
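A numeric check is most useful when it runs on the raw text, before the value is moved into a numeric field. The sketch below assumes the ID was first parsed into an alphanumeric field; ID-TEXT, TRAIL-COUNT, ID-LEN, and the program name are hypothetical names, and the sample value '12A' is chosen so that the check fails.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VALDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Hypothetical raw-text holder for the parsed ID.
       01  ID-TEXT      PIC X(9) VALUE '12A'.
       01  TRAIL-COUNT  PIC 9(2).
       01  ID-LEN       PIC 9(2).
       01  ID-FLD       PIC 9(9) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Find the significant length by counting trailing spaces.
           MOVE 0 TO TRAIL-COUNT
           INSPECT FUNCTION REVERSE(ID-TEXT)
               TALLYING TRAIL-COUNT FOR LEADING SPACES
           COMPUTE ID-LEN = FUNCTION LENGTH(ID-TEXT) - TRAIL-COUNT
           IF ID-LEN = 0
               DISPLAY 'Error: ID is required'
           ELSE
      *>       Test only the significant characters; padding
      *>       spaces would make the class test fail.
               IF ID-TEXT(1:ID-LEN) IS NUMERIC
                   MOVE ID-TEXT(1:ID-LEN) TO ID-FLD
                   DISPLAY 'ID accepted: ' ID-FLD
               ELSE
                   DISPLAY 'Error: ID must be numeric'
               END-IF
           END-IF
           STOP RUN.
```

With the sample value the program displays `Error: ID must be numeric`. Because the class test covers every significant character, an embedded space or letter is rejected rather than silently converted.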