String manipulation in COBOL is done with INSPECT (count, replace, translate), STRING (concatenate), and UNSTRING (split). This page explains each verb, typical uses, and how to combine them for text and character processing.
Strings are pieces of text such as names, addresses, and codes. Sometimes you need to count something (how many spaces?), change something (replace commas with spaces), put pieces together (first name + space + last name), or take them apart (split "Smith, John" into last and first). The string-manipulation verbs do these jobs so that you do not have to write a character-by-character loop every time.
INSPECT works on one field and counts occurrences (TALLYING), replaces characters or strings (REPLACING), or translates characters (CONVERTING). TALLYING only reads the field; REPLACING and CONVERTING modify it in place, and its length never changes. Modifiers control scope: ALL (every occurrence), LEADING (only at the start), FIRST (first occurrence only, available with REPLACING), and BEFORE/AFTER INITIAL (only in a portion of the string).
| Form | Purpose | Example |
|---|---|---|
| TALLYING | Count occurrences of characters or strings. | INSPECT WS-TEXT TALLYING WS-CNT FOR ALL SPACES. |
| REPLACING | Replace characters or strings in place. | INSPECT WS-TEXT REPLACING ALL "," BY SPACE. |
| CONVERTING | Translate characters (e.g. case). | INSPECT WS-TEXT CONVERTING "abc" TO "ABC". |
| TALLYING ... REPLACING | Count and replace in one pass. | INSPECT WS-TEXT TALLYING WS-CNT FOR LEADING "0" REPLACING LEADING "0" BY SPACE. |
```cobol
       WORKING-STORAGE SECTION.
       01  WS-TEXT  PIC X(40) VALUE ' HELLO, WORLD '.
       01  WS-CNT   PIC 9(2)  VALUE 0.

       PROCEDURE DIVISION.
           *> Count all spaces (WS-CNT starts at 0)
           INSPECT WS-TEXT TALLYING WS-CNT FOR ALL SPACES

           *> Replace all commas with space
           INSPECT WS-TEXT REPLACING ALL ',' BY SPACE

           *> Convert lowercase to uppercase (each char in the first
           *> literal maps to the same position in the second)
           INSPECT WS-TEXT CONVERTING 'abcdefghijklmnopqrstuvwxyz'
                                   TO 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
```
TALLYING adds to the tally field (it does not reset it), so you normally INITIALIZE it or MOVE 0 to it first. REPLACING changes the inspected field itself, and the replacement must be the same length as the text it replaces. CONVERTING is a one-to-one character map: the first and second literals must be the same length, and each character in the field that appears in the first literal is replaced by the character at the same position in the second. Because every form preserves the field length, INSPECT cannot remove characters and close the gap; for that you need a loop or UNSTRING.
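The accumulating behaviour of TALLYING is easy to overlook, so here is a minimal sketch of it. The program name and data names (`TALLYDEMO`, `WS-TEXT`, `WS-CNT`) are illustrative only, and the layout assumes a compiler that accepts `*>` comments.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TALLYDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TEXT  PIC X(20) VALUE 'A,B,C,D'.
       01  WS-CNT   PIC 9(2)  VALUE 0.
       PROCEDURE DIVISION.
           *> First count: WS-CNT goes from 0 to 3
           INSPECT WS-TEXT TALLYING WS-CNT FOR ALL ','
           DISPLAY 'After first INSPECT:  ' WS-CNT
           *> No reset: the 3 matches are added again, giving 6
           INSPECT WS-TEXT TALLYING WS-CNT FOR ALL ','
           DISPLAY 'After second INSPECT: ' WS-CNT
           *> Reset first, and the count is 3 again
           MOVE 0 TO WS-CNT
           INSPECT WS-TEXT TALLYING WS-CNT FOR ALL ','
           DISPLAY 'After reset:          ' WS-CNT
           STOP RUN.
```

If the tally is reused inside a loop, the MOVE 0 belongs inside the loop as well.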
ALL means every occurrence in the string. LEADING means only consecutive occurrences at the start of the string (e.g. leading zeros or leading spaces). FIRST means only the first occurrence. BEFORE INITIAL literal restricts the operation to the portion of the string before the first occurrence of the literal; AFTER INITIAL literal restricts to the portion after. These can be combined (e.g. REPLACING LEADING SPACES BY ZEROS, or TALLYING only AFTER INITIAL ",").
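The scope modifiers are easiest to compare side by side. The following sketch uses made-up fields (`WS-AMOUNT`, `WS-DATE`, `WS-CSV`, `WS-CODE`) and shows the expected contents in comments; it is an illustration under those assumptions, not a complete catalogue of INSPECT options.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SCOPEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMOUNT  PIC X(8)  VALUE '00012050'.
       01  WS-DATE    PIC X(10) VALUE '2024-01-15'.
       01  WS-CSV     PIC X(12) VALUE 'AB,CD,EF,GH'.
       01  WS-CODE    PIC X(7)  VALUE 'A-B-A-B'.
       01  WS-CNT     PIC 9(2)  VALUE 0.
       PROCEDURE DIVISION.
           *> LEADING: only the run of zeros at the start changes
           INSPECT WS-AMOUNT REPLACING LEADING '0' BY SPACE
           *> WS-AMOUNT is now '   12050'
           DISPLAY '[' WS-AMOUNT ']'

           *> FIRST: only the first hyphen is replaced
           INSPECT WS-DATE REPLACING FIRST '-' BY '/'
           *> WS-DATE is now '2024/01-15'
           DISPLAY '[' WS-DATE ']'

           *> AFTER INITIAL: count commas that follow the first comma
           INSPECT WS-CSV TALLYING WS-CNT
               FOR ALL ',' AFTER INITIAL ','
           *> WS-CNT is now 2
           DISPLAY WS-CNT

           *> BEFORE INITIAL: change only what precedes the first 'B'
           INSPECT WS-CODE REPLACING ALL 'A' BY 'X'
               BEFORE INITIAL 'B'
           *> WS-CODE is now 'X-B-A-B'
           DISPLAY '[' WS-CODE ']'
           STOP RUN.
```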
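STRING concatenates two or more sending items into one receiving field. You list the sending items; they are placed into the receiver left to right. WITH POINTER names a numeric field holding the next position to fill in the receiver (1-based); you must set it before the statement runs, and STRING advances it by one for each character transferred. DELIMITED BY SIZE means use the full length of the sending item; DELIMITED BY a literal or identifier means stop when that delimiter is encountered (or at the end of the item), and the delimiter itself is not copied. STRING changes only the positions it fills; it does not pad the rest of the receiver with spaces, so clear the receiver first if it may hold old data. If the receiver is too small, the characters that do not fit are dropped and the overflow condition is raised (which ON OVERFLOW can trap); the pointer minus its starting value tells you how many characters were transferred.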
```cobol
       WORKING-STORAGE SECTION.
       01  WS-FIRST  PIC X(10) VALUE 'JOHN'.
       01  WS-LAST   PIC X(15) VALUE 'SMITH'.
       01  WS-FULL   PIC X(30).
       01  WS-PTR    PIC 9(2).

       PROCEDURE DIVISION.
           *> STRING does not blank the receiver, so clear it first
           MOVE SPACES TO WS-FULL
           MOVE 1 TO WS-PTR
           STRING WS-FIRST DELIMITED BY SIZE
                  SPACE    DELIMITED BY SIZE
                  WS-LAST  DELIMITED BY SIZE
               INTO WS-FULL
               WITH POINTER WS-PTR
           END-STRING
           *> All 10 + 1 + 15 characters are sent, so WS-FULL holds
           *> 'JOHN', 7 spaces, 'SMITH', then spaces; WS-PTR is 27
```
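To see what happens when the receiver is too small, the sketch below deliberately strings 15 characters into a 10-character field and traps the result with the standard ON OVERFLOW phrase. The field names and values (`WS-CITY`, `WS-STATE`, `WS-OUT`) are invented for the illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STROVFL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CITY   PIC X(12) VALUE 'SPRINGFIELD'.
       01  WS-STATE  PIC X(2)  VALUE 'IL'.
       01  WS-OUT    PIC X(10).
       01  WS-PTR    PIC 9(2).
       PROCEDURE DIVISION.
           MOVE SPACES TO WS-OUT
           MOVE 1 TO WS-PTR
           STRING WS-CITY  DELIMITED BY SPACE
                  ', '     DELIMITED BY SIZE
                  WS-STATE DELIMITED BY SIZE
               INTO WS-OUT
               WITH POINTER WS-PTR
               ON OVERFLOW
                   DISPLAY 'Receiver too small, data truncated'
               NOT ON OVERFLOW
                   DISPLAY 'All data fit'
           END-STRING
           *> Only 10 characters fit: WS-OUT holds 'SPRINGFIEL'
           *> and WS-PTR is 11 (start value 1 + 10 transferred)
           DISPLAY '[' WS-OUT ']'
           DISPLAY WS-PTR
           STOP RUN.
```

Note that the pointer field must be large enough to hold the receiver length plus one, which is why `WS-PTR` has two digits here.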
If the data in a sending item is shorter than the item (e.g. PIC X(10) containing "JOHN"), DELIMITED BY SIZE still sends all 10 characters, including the trailing spaces. To send only the significant part, use DELIMITED BY SPACE so that the transfer stops at the first space, or use a different delimiter that matches your data. Note that DELIMITED BY SPACE also stops at an embedded space, so "MARY ANN" would be sent as "MARY".
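A minimal sketch of the DELIMITED BY SPACE approach, reusing the name fields from the earlier example (the program name `STRTRIM` is illustrative):

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRTRIM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-FIRST  PIC X(10) VALUE 'JOHN'.
       01  WS-LAST   PIC X(15) VALUE 'SMITH'.
       01  WS-FULL   PIC X(30).
       PROCEDURE DIVISION.
           MOVE SPACES TO WS-FULL
           *> Each name stops at its first space; the separator
           *> is a single SPACE sent in full
           STRING WS-FIRST DELIMITED BY SPACE
                  SPACE    DELIMITED BY SIZE
                  WS-LAST  DELIMITED BY SPACE
               INTO WS-FULL
           END-STRING
           *> WS-FULL holds 'JOHN SMITH' followed by 20 spaces
           DISPLAY '[' WS-FULL ']'
           STOP RUN.
```

This works because neither name contains an embedded space; for data that may, a different delimiter or an explicit length is the safer choice.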
UNSTRING splits one source string into multiple receiving fields. You specify the source and the receivers. DELIMITED BY lists the delimiters that separate values (e.g. a comma, a space, or several alternatives joined with OR). UNSTRING moves each successive piece into the next receiver until the source is exhausted or every receiver has been filled. For each receiver, DELIMITER IN captures which delimiter ended the piece and COUNT IN captures how many characters the piece contained. TALLYING IN adds the number of receivers filled to a counter; like the INSPECT tally it is not reset, so set it to zero first.
```cobol
       WORKING-STORAGE SECTION.
       01  WS-LINE   PIC X(40) VALUE 'SMITH,JOHN,123 MAIN ST'.
       01  WS-LAST   PIC X(15).
       01  WS-FIRST  PIC X(10).
       01  WS-ADDR   PIC X(20).
       01  WS-TALLY  PIC 9(2)  VALUE 0.

       PROCEDURE DIVISION.
           UNSTRING WS-LINE DELIMITED BY ','
               INTO WS-LAST
                    WS-FIRST
                    WS-ADDR
               TALLYING IN WS-TALLY
           END-UNSTRING
           *> WS-LAST 'SMITH', WS-FIRST 'JOHN',
           *> WS-ADDR '123 MAIN ST', WS-TALLY 3
```
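DELIMITER IN and COUNT IN are attached to individual receivers, which the following sketch tries to make concrete. The input line, the mixed delimiters, and all data names (`WS-F1`, `WS-D1`, `WS-C1`, and so on) are assumptions chosen for the illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTRCNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LINE   PIC X(20) VALUE 'SMITH,JOHN;CHICAGO'.
       01  WS-F1     PIC X(10).
       01  WS-F2     PIC X(10).
       01  WS-F3     PIC X(10).
       01  WS-D1     PIC X.
       01  WS-D2     PIC X.
       01  WS-C1     PIC 9(2).
       01  WS-C2     PIC 9(2).
       01  WS-C3     PIC 9(2).
       01  WS-TALLY  PIC 9(2)  VALUE 0.
       PROCEDURE DIVISION.
           *> ALL SPACE treats the trailing spaces as one delimiter
           UNSTRING WS-LINE DELIMITED BY ',' OR ';' OR ALL SPACE
               INTO WS-F1 DELIMITER IN WS-D1 COUNT IN WS-C1
                    WS-F2 DELIMITER IN WS-D2 COUNT IN WS-C2
                    WS-F3 COUNT IN WS-C3
               TALLYING IN WS-TALLY
           END-UNSTRING
           *> WS-F1 'SMITH'   ended by ',' with count 5
           *> WS-F2 'JOHN'    ended by ';' with count 4
           *> WS-F3 'CHICAGO' with count 7; WS-TALLY is 3
           DISPLAY WS-F1 ' ' WS-D1 ' ' WS-C1
           DISPLAY WS-F2 ' ' WS-D2 ' ' WS-C2
           DISPLAY WS-F3 ' ' WS-C3
           DISPLAY WS-TALLY
           STOP RUN.
```

COUNT IN reports the length of the piece found in the source, so it can exceed the receiver size when a piece is truncated on the way in.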
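If the source holds more delimited values than there are receivers, the extra data is not transferred and the overflow condition is raised, which ON OVERFLOW can detect. To collect a variable number of values, one option is to loop with WITH POINTER, unstringing one value at a time into an entry of an OCCURS table. If the source holds fewer values than there are receivers, the leftover receivers are not changed and keep whatever they held before. Always initialize receivers, or check TALLYING and COUNT IN, when the format can vary.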
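A defensive pattern for input whose number of fields can vary might look like the sketch below. The three-field layout and the names (`WS-LAST`, `WS-FIRST`, `WS-CITY`) are hypothetical; the input here intentionally supplies only two values.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTRCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LINE   PIC X(20) VALUE 'SMITH,JOHN'.
       01  WS-LAST   PIC X(10).
       01  WS-FIRST  PIC X(10).
       01  WS-CITY   PIC X(10).
       01  WS-TALLY  PIC 9(2).
       PROCEDURE DIVISION.
           *> Clear receivers and tally so stale data cannot leak
           MOVE SPACES TO WS-LAST WS-FIRST WS-CITY
           MOVE 0 TO WS-TALLY
           UNSTRING WS-LINE DELIMITED BY ','
               INTO WS-LAST WS-FIRST WS-CITY
               TALLYING IN WS-TALLY
               ON OVERFLOW
                   DISPLAY 'More values than receivers'
           END-UNSTRING
           *> Two values were present, so WS-TALLY is 2 and
           *> WS-CITY still holds the spaces moved in above
           IF WS-TALLY < 3
               DISPLAY 'Only ' WS-TALLY ' field(s) found'
           END-IF
           STOP RUN.
```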
MOVE copies data from a sending item to a receiving item. For alphanumeric (PIC X) fields, MOVE copies left to right; if the receiver is longer, the right is padded with spaces; if the receiver is shorter, the right is truncated. No conversion or parsing is done—just byte-for-byte copy (with padding/truncation). Use MOVE for simple assignment; use STRING when you are building a value from several pieces and UNSTRING when you are splitting one value into several.
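The padding and truncation rules for alphanumeric MOVE can be checked with a very small program. The field names below are illustrative.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MOVEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SRC    PIC X(8)  VALUE 'ABCDEFGH'.
       01  WS-LONG   PIC X(12).
       01  WS-SHORT  PIC X(4).
       PROCEDURE DIVISION.
           *> Longer receiver: padded on the right with spaces
           MOVE WS-SRC TO WS-LONG
           *> WS-LONG holds 'ABCDEFGH' followed by 4 spaces
           DISPLAY '[' WS-LONG ']'
           *> Shorter receiver: truncated on the right
           MOVE WS-SRC TO WS-SHORT
           *> WS-SHORT holds 'ABCD'
           DISPLAY '[' WS-SHORT ']'
           STOP RUN.
```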
INSPECT REPLACING cannot shorten the string (e.g. remove a character and close the gap). To remove characters you can use a loop: move character by character to a work field, skipping the unwanted characters. Or use UNSTRING with multiple receivers if the data is delimiter-separated. For complex parsing, consider intrinsic functions (e.g. substring, length) if your compiler supports them—see string-functions.
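The character-by-character loop described above can be sketched with reference modification, as follows. The phone-number input and the names `WS-IN`, `WS-OUT`, `WS-I`, and `WS-J` are assumptions for the example; the loop limit 12 is simply the declared length of `WS-IN`.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRIPCHR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-IN    PIC X(12) VALUE '555-123-4567'.
       01  WS-OUT   PIC X(12) VALUE SPACES.
       01  WS-I     PIC 9(2).
       01  WS-J     PIC 9(2)  VALUE 0.
       PROCEDURE DIVISION.
           *> Copy every character except '-' to the next free
           *> position in WS-OUT, which closes the gaps
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 12
               IF WS-IN(WS-I:1) NOT = '-'
                   ADD 1 TO WS-J
                   MOVE WS-IN(WS-I:1) TO WS-OUT(WS-J:1)
               END-IF
           END-PERFORM
           *> WS-OUT holds '5551234567' followed by 2 spaces;
           *> WS-J (10) is the length of the compacted text
           DISPLAY '[' WS-OUT ']'
           STOP RUN.
```

Because `WS-OUT` starts as spaces, the positions freed by the removed characters end up as trailing padding.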
1. Which INSPECT form would you use to count how many commas are in a field?
2. To join LAST-NAME, a comma, and FIRST-NAME into one field you would use:
3. UNSTRING is used to: