Reference modification in COBOL allows you to access a substring (portion) of a data item by specifying the starting position and length. It provides a powerful way to extract specific parts of fields, parse fixed-format data, manipulate strings, and access portions of records without using intermediate fields or complex MOVE statements. Understanding reference modification is essential for efficient data processing and string manipulation in COBOL programs.
Reference modification lets you access a substring of a data item using position-based syntax. Instead of working with the entire field, you specify exactly which characters you want. This is similar to slicing in other programming languages, with two differences: positions start at 1, and the second value is a length rather than an end position.
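Reference modification is useful for extracting specific parts of fields, parsing fixed-format data, and testing portions of records without intermediate fields.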
The syntax for reference modification is:
data-item(start-position:length)
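Where:
- data-item is the field you want a portion of
- start-position is the position of the first character to access, counting from 1
- length is the number of characters to access; it is optional, but the colon is always required
Both start-position and length can be numeric literals, numeric data items, or arithmetic expressions.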

       WORKING-STORAGE SECTION.
       01  CUSTOMER-NAME    PIC X(30) VALUE 'JOHN SMITH'.
       01  FIRST-NAME       PIC X(10).
       01  LAST-NAME        PIC X(20).

       PROCEDURE DIVISION.
       MAIN-PARA.
           *> Extract first 4 characters (first name)
           MOVE CUSTOMER-NAME(1:4) TO FIRST-NAME
           DISPLAY 'First Name: ' FIRST-NAME
           *> Output: First Name: JOHN

           *> Extract characters 6-10 (last name starts at position 6)
           MOVE CUSTOMER-NAME(6:5) TO LAST-NAME
           DISPLAY 'Last Name: ' LAST-NAME
           *> Output: Last Name: SMITH

           STOP RUN.

In this example:
- CUSTOMER-NAME(1:4) accesses the first 4 characters: "JOHN"
- CUSTOMER-NAME(6:5) accesses 5 characters starting at position 6: "SMITH"

The most common use of reference modification is extracting substrings from fields. This is especially useful for parsing fixed-format data.
A very common use case is extracting year, month, and day from a date stored as YYYYMMDD:

       WORKING-STORAGE SECTION.
       *> DAY is a reserved word, so the fields use a WS- prefix.
       01  DATE-FIELD       PIC 9(8) VALUE 20240115.
       01  WS-YEAR          PIC 9(4).
       01  WS-MONTH         PIC 9(2).
       01  WS-DAY           PIC 9(2).
       01  YEAR-MONTH       PIC 9(6).

       PROCEDURE DIVISION.
       EXTRACT-DATE-COMPONENTS.
           *> Extract year (positions 1-4)
           MOVE DATE-FIELD(1:4) TO WS-YEAR
           *> Result: 2024

           *> Extract month (positions 5-6)
           MOVE DATE-FIELD(5:2) TO WS-MONTH
           *> Result: 01

           *> Extract day (positions 7-8)
           MOVE DATE-FIELD(7:2) TO WS-DAY
           *> Result: 15

           *> Extract year and month together (positions 1-6)
           MOVE DATE-FIELD(1:6) TO YEAR-MONTH
           *> Result: 202401

           DISPLAY 'Date: ' WS-YEAR '/' WS-MONTH '/' WS-DAY
           *> Output: Date: 2024/01/15

           STOP RUN.

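This technique is useful because each component is read directly by position: the date stays in a single field, and no group-level breakdown or REDEFINES of the date is needed.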
Reference modification is ideal for parsing fixed-format records where fields are at known positions:

       WORKING-STORAGE SECTION.
       01  INPUT-RECORD         PIC X(80).
       01  RECORD-TYPE          PIC X(2).
       01  CUSTOMER-ID          PIC X(10).
       01  TRANSACTION-AMOUNT   PIC X(12).
       01  TRANSACTION-DATE     PIC X(8).

       PROCEDURE DIVISION.
       PARSE-FIXED-RECORD.
           *> Assume record format:
           *> Positions 1-2:   Record type
           *> Positions 3-12:  Customer ID
           *> Positions 13-24: Transaction amount
           *> Positions 25-32: Transaction date

           *> Sample record; the blanks pad each field to its width
           MOVE '01CUST12345   5000.00   20240115' TO INPUT-RECORD

           MOVE INPUT-RECORD(1:2) TO RECORD-TYPE
           *> Extracts: "01"

           MOVE INPUT-RECORD(3:10) TO CUSTOMER-ID
           *> Extracts: "CUST12345 "

           MOVE INPUT-RECORD(13:12) TO TRANSACTION-AMOUNT
           *> Extracts: "  5000.00   "

           MOVE INPUT-RECORD(25:8) TO TRANSACTION-DATE
           *> Extracts: "20240115"

           DISPLAY 'Type: ' RECORD-TYPE
           DISPLAY 'Customer: ' CUSTOMER-ID
           DISPLAY 'Amount: ' TRANSACTION-AMOUNT
           DISPLAY 'Date: ' TRANSACTION-DATE

           STOP RUN.

You can use variables for start position and length, making reference modification dynamic:

       WORKING-STORAGE SECTION.
       01  SOURCE-FIELD         PIC X(50)
                                VALUE 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
       01  START-POS            PIC 9(2) VALUE 5.
       01  EXTRACT-LENGTH       PIC 9(2) VALUE 10.
       01  EXTRACTED-PORTION    PIC X(50).

       PROCEDURE DIVISION.
       DYNAMIC-EXTRACTION.
           *> Extract using variables
           MOVE SOURCE-FIELD(START-POS:EXTRACT-LENGTH)
               TO EXTRACTED-PORTION
           *> Extracts 10 characters starting at position 5:
           *> "EFGHIJKLMN"

           DISPLAY 'Extracted: ' EXTRACTED-PORTION

           *> Change position and extract again
           MOVE 10 TO START-POS
           MOVE 5 TO EXTRACT-LENGTH
           MOVE SOURCE-FIELD(START-POS:EXTRACT-LENGTH)
               TO EXTRACTED-PORTION
           *> Extracts 5 characters starting at position 10: "JKLMN"

           DISPLAY 'Extracted: ' EXTRACTED-PORTION

           STOP RUN.

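Using variables lets you compute the start position and length at run time, for example when scanning a field in a loop or when the layout depends on the contents of the record.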
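As one illustration of positions computed at run time, the sketch below scans a name for its first space and then splits the field around it. The program name and the fields SPACE-POS and IDX are hypothetical, and the sketch assumes the name contains a single space between two words; it is not meant as a general name parser.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPLITNAME.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FULL-NAME        PIC X(30) VALUE 'JOHN SMITH'.
       01  SPACE-POS        PIC 9(2)  VALUE 0.
       01  IDX              PIC 9(2).
       01  FIRST-NAME       PIC X(30).
       01  LAST-NAME        PIC X(30).
       PROCEDURE DIVISION.
       MAIN-PARA.
           *> Look at one character at a time until a space is found
           PERFORM VARYING IDX FROM 1 BY 1
                   UNTIL IDX > 30 OR SPACE-POS > 0
               IF FULL-NAME(IDX:1) = SPACE
                   MOVE IDX TO SPACE-POS
               END-IF
           END-PERFORM

           *> Only split when the space is strictly inside the field
           IF SPACE-POS > 1 AND SPACE-POS < 30
               MOVE FULL-NAME(1 : SPACE-POS - 1) TO FIRST-NAME
               MOVE FULL-NAME(SPACE-POS + 1 :)   TO LAST-NAME
           ELSE
               MOVE FULL-NAME TO FIRST-NAME
               MOVE SPACES    TO LAST-NAME
           END-IF

           *> FIRST-NAME holds "JOHN", LAST-NAME holds "SMITH",
           *> both padded with trailing spaces
           DISPLAY 'First: ' FIRST-NAME
           DISPLAY 'Last:  ' LAST-NAME
           STOP RUN.
```

Both the start position and the length are arithmetic expressions here, so no intermediate length field is needed.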
You can omit the length to get all characters from the start position to the end of the field:

       WORKING-STORAGE SECTION.
       01  FULL-NAME        PIC X(30) VALUE 'JOHN SMITH'.
       01  LAST-PART        PIC X(30).

       PROCEDURE DIVISION.
       EXTRACT-TO-END.
           *> Get all characters from position 6 to the end
           MOVE FULL-NAME(6:) TO LAST-PART
           *> Result: "SMITH" plus the trailing spaces of FULL-NAME
           *> (positions 6 through 30)

           DISPLAY 'Last Part: ' LAST-PART

           STOP RUN.

Omitting the length is useful when you want everything from a certain position to the end, without needing to calculate the remaining length.
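To make the equivalence concrete, the following sketch compares the omitted-length form with the same substring written with an explicitly computed length. START-POS and REMAINING-LEN are hypothetical names introduced for the illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TOEND.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FULL-NAME        PIC X(30) VALUE 'JOHN SMITH'.
       01  START-POS        PIC 9(2)  VALUE 6.
       01  REMAINING-LEN    PIC 9(2).
       PROCEDURE DIVISION.
       MAIN-PARA.
           *> Remaining length = field length - start + 1 = 25
           COMPUTE REMAINING-LEN =
               FUNCTION LENGTH(FULL-NAME) - START-POS + 1

           *> Both forms refer to positions 6 through 30
           IF FULL-NAME(START-POS:) =
              FULL-NAME(START-POS:REMAINING-LEN)
               DISPLAY 'Same substring, length ' REMAINING-LEN
           END-IF
           STOP RUN.
```

The omitted-length form avoids the COMPUTE entirely and cannot drift out of sync if the field's PICTURE changes.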
You can use reference modification directly in IF statements and other comparisons:

       WORKING-STORAGE SECTION.
       01  CUSTOMER-ID      PIC X(10) VALUE 'CUST123456'.
       01  RECORD-TYPE      PIC X(2).

       PROCEDURE DIVISION.
       VALIDATE-WITH-REF-MOD.
           *> Check if customer ID starts with "CUST"
           IF CUSTOMER-ID(1:4) = 'CUST'
               DISPLAY 'Valid customer ID format'
           ELSE
               DISPLAY 'Invalid customer ID format'
           END-IF

           *> Check the first two characters
           IF CUSTOMER-ID(1:2) = 'CU'
               DISPLAY 'Starts with CU'
           END-IF

           *> Extract and use in condition
           MOVE CUSTOMER-ID(1:2) TO RECORD-TYPE
           IF RECORD-TYPE = 'CU'
               DISPLAY 'Customer record type'
           END-IF

           STOP RUN.

This allows you to validate data formats, check prefixes or suffixes, and make decisions based on specific portions of fields without extracting to intermediate variables first.
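The examples above test prefixes; a suffix check works the same way once the start position is counted from the end of the field. The sketch below is a minimal illustration with a hypothetical field WS-FILE-NAME, and it assumes the value fills the whole 12-character field so the suffix really sits in positions 9 through 12.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUFFIXCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-FILE-NAME     PIC X(12) VALUE 'REPORT01.TXT'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           *> The last 4 characters of a 12-character field
           *> start at position 12 - 4 + 1 = 9
           IF WS-FILE-NAME(9:4) = '.TXT'
               DISPLAY 'Text file'
           ELSE
               DISPLAY 'Other file type'
           END-IF
           STOP RUN.
```

If the value could be shorter than the field, the trailing spaces would occupy the last positions and the comparison would fail, so variable-length values need the actual content length first.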
Extracting date components:

       WORKING-STORAGE SECTION.
       *> DAY is a reserved word, so the fields use a WS- prefix.
       01  DATE-YYYYMMDD    PIC 9(8) VALUE 20240115.
       01  WS-YEAR          PIC 9(4).
       01  WS-MONTH         PIC 9(2).
       01  WS-DAY           PIC 9(2).

       PROCEDURE DIVISION.
       EXTRACT-DATE.
           MOVE DATE-YYYYMMDD(1:4) TO WS-YEAR    *> 2024
           MOVE DATE-YYYYMMDD(5:2) TO WS-MONTH   *> 01
           MOVE DATE-YYYYMMDD(7:2) TO WS-DAY.    *> 15

Parsing an account number:

       WORKING-STORAGE SECTION.
       01  ACCOUNT-NUMBER     PIC X(12) VALUE '123456789012'.
       01  BRANCH-CODE        PIC X(4).
       01  ACCOUNT-SEQUENCE   PIC X(8).

       PROCEDURE DIVISION.
       PARSE-ACCOUNT.
           *> First 4 digits are branch code
           MOVE ACCOUNT-NUMBER(1:4) TO BRANCH-CODE
           *> Last 8 digits are account sequence
           MOVE ACCOUNT-NUMBER(5:8) TO ACCOUNT-SEQUENCE.

Extracting the components of a composite key:

       WORKING-STORAGE SECTION.
       01  COMPOSITE-KEY      PIC X(20) VALUE 'REGION01DEPT05EMP123'.
       01  REGION-CODE        PIC X(8).
       01  DEPARTMENT-CODE    PIC X(6).
       01  EMPLOYEE-NUMBER    PIC X(6).

       PROCEDURE DIVISION.
       EXTRACT-KEY-COMPONENTS.
           *> "REGION01"
           MOVE COMPOSITE-KEY(1:8)  TO REGION-CODE
           *> "DEPT05"
           MOVE COMPOSITE-KEY(9:6)  TO DEPARTMENT-CODE
           *> "EMP123"
           MOVE COMPOSITE-KEY(15:6) TO EMPLOYEE-NUMBER.

Validating a format:

       WORKING-STORAGE SECTION.
       01  TRANSACTION-CODE   PIC X(10) VALUE 'TXN1234567'.

       PROCEDURE DIVISION.
       VALIDATE-FORMAT.
           *> Check if it starts with "TXN"
           IF TRANSACTION-CODE(1:3) = 'TXN'
               DISPLAY 'Valid transaction code prefix'
           END-IF

           *> Check if positions 4-6 are numeric
           IF TRANSACTION-CODE(4:3) IS NUMERIC
               DISPLAY 'Numeric sequence found'
           END-IF.

The following table shows when to use reference modification and when another string manipulation technique is a better fit:
| Technique | Best Use | Example |
|---|---|---|
| Reference Modification | Extracting known positions, simple substring access | DATE-FIELD(1:4) to get year |
| STRING | Concatenating multiple fields into one | STRING A DELIMITED BY SIZE B INTO RESULT |
| UNSTRING | Parsing delimited data, splitting on separators | UNSTRING FIELD DELIMITED BY "," INTO PART1 PART2 |
| MOVE with editing | Formatting entire fields | MOVE AMOUNT TO FORMATTED-AMOUNT (with PIC editing) |
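To contrast the first and third rows of the table, here is a minimal sketch of UNSTRING splitting comma-delimited data, a case where the field boundaries are not at fixed positions and reference modification alone would need a manual scan. The field names and the sample value are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSPLIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CSV-LINE         PIC X(20) VALUE 'SMITH,JOHN,42'.
       01  PART-1           PIC X(10).
       01  PART-2           PIC X(10).
       01  PART-3           PIC X(10).
       PROCEDURE DIVISION.
       MAIN-PARA.
           *> Each comma ends one piece; the pieces may have any length
           UNSTRING CSV-LINE DELIMITED BY ','
               INTO PART-1 PART-2 PART-3
           END-UNSTRING

           *> PART-1 = "SMITH", PART-2 = "JOHN", PART-3 = "42",
           *> each padded with trailing spaces
           DISPLAY PART-1
           DISPLAY PART-2
           DISPLAY PART-3
           STOP RUN.
```

When every record puts the same field at the same columns, reference modification is the simpler tool; when separators decide where fields end, UNSTRING does the scanning for you.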
Important limitations and considerations when using reference modification:
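Always validate positions that are computed at run time. A start position or length that falls outside the field is not necessarily reported: unless range checking is enabled (for example, the SSRANGE option in IBM Enterprise COBOL), the program may silently read or overwrite adjacent storage.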

       WORKING-STORAGE SECTION.
       01  SOURCE-FIELD     PIC X(20) VALUE 'ABCDEFGHIJ'.
       01  RESULT-FIELD     PIC X(20).
       01  START-POS        PIC 9(2).
       01  EXTRACT-LENGTH   PIC 9(2).

       PROCEDURE DIVISION.
       SAFE-EXTRACTION.
           MOVE 15 TO START-POS
           MOVE 10 TO EXTRACT-LENGTH

           *> Validate before using: positions 15-24 do not fit in
           *> a 20-character field, so this takes the ELSE branch
           IF START-POS > 0 AND
              START-POS <= 20 AND
              EXTRACT-LENGTH > 0 AND
              (START-POS + EXTRACT-LENGTH - 1) <= 20
               MOVE SOURCE-FIELD(START-POS:EXTRACT-LENGTH)
                   TO RESULT-FIELD
           ELSE
               DISPLAY 'ERROR: Invalid position or length'
           END-IF.

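The check above hard-codes the field size as 20. A variation, sketched here under the assumption that the compiler supports the LENGTH intrinsic function, takes the size from the field itself so the test stays correct if the PICTURE changes. FIELD-LEN is a hypothetical helper field.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SAFEREF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SOURCE-FIELD     PIC X(20) VALUE 'ABCDEFGHIJ'.
       01  RESULT-FIELD     PIC X(20).
       01  FIELD-LEN        PIC 9(4).
       01  START-POS        PIC 9(4)  VALUE 15.
       01  EXTRACT-LENGTH   PIC 9(4)  VALUE 10.
       PROCEDURE DIVISION.
       MAIN-PARA.
           *> Ask the compiler for the field size instead of
           *> repeating the literal 20
           MOVE FUNCTION LENGTH(SOURCE-FIELD) TO FIELD-LEN

           *> The last position touched is start + length - 1
           IF START-POS >= 1 AND EXTRACT-LENGTH >= 1
              AND START-POS + EXTRACT-LENGTH - 1 <= FIELD-LEN
               MOVE SOURCE-FIELD(START-POS:EXTRACT-LENGTH)
                   TO RESULT-FIELD
               DISPLAY 'Extracted: ' RESULT-FIELD
           ELSE
               DISPLAY 'Range ' START-POS ':' EXTRACT-LENGTH
                       ' does not fit in ' FIELD-LEN ' characters'
           END-IF
           STOP RUN.
```

With the values shown, the range would end at position 24, so the program takes the ELSE branch and reports the problem instead of reading past the field.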
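Reference modification is not limited to reading. A reference-modified item can also be the receiving field of a MOVE (the operand after TO), in which case only the selected characters change and the rest of the field is left as it was. For example, MOVE 'XX' TO FIELD-A(3:2) overwrites positions 3 and 4 of FIELD-A. The same range rules apply: the start position and length must stay inside the field.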
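In standard COBOL a reference-modified item may appear as the receiving operand of a MOVE, so part of a field can be changed in place. The sketch below replaces two separator characters in a hypothetical PHONE-NUMBER field without touching the digits around them.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REFMODWR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  PHONE-NUMBER     PIC X(12) VALUE '555 123 4567'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           *> Overwrite only positions 4 and 8 (the two spaces)
           MOVE '-' TO PHONE-NUMBER(4:1)
           MOVE '-' TO PHONE-NUMBER(8:1)

           *> Displays: 555-123-4567
           DISPLAY PHONE-NUMBER
           STOP RUN.
```

If the sending value is shorter than the referenced length, the usual alphanumeric MOVE rules pad the referenced portion with spaces, so the length should match the intended replacement.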
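Reference modification itself is cheap. With literal positions the compiler can resolve the substring at compile time; with variable positions the offset and length are computed at run time, which can add up inside tight loops over large fields. Consider the trade-off between clarity and performance for your specific use case.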
Reference-modified items can be used as sending fields in STRING, for example to reformat a date:

       WORKING-STORAGE SECTION.
       01  DATE-INPUT       PIC 9(8) VALUE 20240115.
       01  FORMATTED-DATE   PIC X(10).

       PROCEDURE DIVISION.
       BUILD-FORMATTED-DATE.
           *> Build formatted date: MM/DD/YYYY
           STRING DATE-INPUT(5:2) DELIMITED BY SIZE
                  '/'             DELIMITED BY SIZE
                  DATE-INPUT(7:2) DELIMITED BY SIZE
                  '/'             DELIMITED BY SIZE
                  DATE-INPUT(1:4) DELIMITED BY SIZE
               INTO FORMATTED-DATE
           END-STRING
           *> Result: "01/15/2024"

           DISPLAY 'Formatted Date: ' FORMATTED-DATE.

The positions to extract can also depend on the record type:

       WORKING-STORAGE SECTION.
       01  RECORD-TYPE      PIC X(2).
       01  INPUT-RECORD     PIC X(80).
       01  EXTRACTED-FIELD  PIC X(20).

       PROCEDURE DIVISION.
       CONDITIONAL-EXTRACT.
           *> Extract different fields based on record type
           MOVE INPUT-RECORD(1:2) TO RECORD-TYPE

           IF RECORD-TYPE = '01'
               *> Type 01: Extract positions 10-29
               MOVE INPUT-RECORD(10:20) TO EXTRACTED-FIELD
           ELSE
               IF RECORD-TYPE = '02'
                   *> Type 02: Extract positions 15-34
                   MOVE INPUT-RECORD(15:20) TO EXTRACTED-FIELD
               ELSE
                   *> Default: Extract positions 5-24
                   MOVE INPUT-RECORD(5:20) TO EXTRACTED-FIELD
               END-IF
           END-IF.

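An alternative sketch of the same idea picks only the start position by record type and then performs a single extraction with a variable. FIELD-START, the group layout of INPUT-RECORD, and the sample value 'WIDGET' are hypothetical and exist only to make the program runnable on its own.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CONDEXTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Sample type-02 record: the payload starts at position 15
       01  INPUT-RECORD.
           05  FILLER       PIC X(2)  VALUE '02'.
           05  FILLER       PIC X(12) VALUE SPACES.
           05  FILLER       PIC X(20) VALUE 'WIDGET'.
           05  FILLER       PIC X(46) VALUE SPACES.
       01  FIELD-START      PIC 9(2).
       01  EXTRACTED-FIELD  PIC X(20).
       PROCEDURE DIVISION.
       MAIN-PARA.
           *> Choose the start position from the record type
           EVALUATE INPUT-RECORD(1:2)
               WHEN '01'  MOVE 10 TO FIELD-START
               WHEN '02'  MOVE 15 TO FIELD-START
               WHEN OTHER MOVE 5  TO FIELD-START
           END-EVALUATE

           *> One extraction, driven by the chosen position
           MOVE INPUT-RECORD(FIELD-START:20) TO EXTRACTED-FIELD

           *> EXTRACTED-FIELD holds "WIDGET" padded with spaces
           DISPLAY 'Extracted: ' EXTRACTED-FIELD
           STOP RUN.
```

Keeping the positions in one EVALUATE makes it easier to add a record type later, at the cost of one extra working field.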
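Follow these best practices when using reference modification: validate computed positions and lengths before using them, document the record layout next to the code that depends on it, prefer reference modification for data at known fixed positions, and use UNSTRING for delimited data.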
Think of reference modification like cutting a piece of cake: the data item is the whole cake, the start position is where you begin cutting, and the length is how many slices you take.
So CUSTOMER-NAME(1:5) is like saying "give me 5 slices starting from the first slice": you get the first 5 characters! Unlike a real slice, though, the piece never leaves the cake. If you move a new value into CUSTOMER-NAME(1:5), those 5 characters of the original field change.
Complete these exercises to reinforce your understanding:
Create a program that reads a date in YYYYMMDD format and uses reference modification to extract and display the year, month, and day separately.
Create a program that parses a 12-character account number where positions 1-4 are branch code, 5-8 are account type, and 9-12 are sequence number. Use reference modification to extract each component.
Create a program that validates transaction codes. The code should start with "TXN" (positions 1-3), have numeric characters in positions 4-7, and end with a check digit in position 8. Use reference modification to check each part.
Create a program that uses variables for start position and length. Allow the user to specify which portion of a field to extract, then use reference modification with those variables.
Create a program that reads a date in YYYYMMDD format and uses reference modification to extract components, then builds a formatted date string in DD/MM/YYYY format using STRING.
1. What is the syntax for COBOL reference modification?
2. What does CUSTOMER-NAME(1:10) access?
3. Can you modify part of a field by using reference modification on the receiving field of a MOVE?
4. What is a common use of reference modification?
5. How do you get all characters from position 5 to the end?
6. What happens if the start position exceeds the field length?