MainframeMaster

COBOL Tutorial

COBOL File Organization Types

Progress0 of 0 lessons

Sequential Files

Sequential files are the simplest and most fundamental file organization in COBOL. As the name suggests, records in a sequential file are stored and accessed in sequence, one after another, from beginning to end. This organization closely mirrors the historical tape-based storage systems where COBOL originated.

Key Characteristics

  • Records must be accessed in the order they were written (sequentially)
  • Adding a new record typically requires rewriting the entire file or appending to the end
  • No direct access to individual records without reading through all preceding records
  • Simple structure with minimal overhead
  • Efficient for batch processing of entire files
  • Well-suited for archiving, logging, and reporting

ENVIRONMENT DIVISION Specification

Sequential files are defined in the ENVIRONMENT DIVISION using the ORGANIZATION IS SEQUENTIAL clause (or by omitting the ORGANIZATION clause, as SEQUENTIAL is the default).

cobol
1
2
3
4
5
6
7
ENVIRONMENT DIVISION. INPUT-OUTPUT SECTION. FILE-CONTROL. SELECT CUSTOMER-FILE ASSIGN TO CUSTFILE ORGANIZATION IS SEQUENTIAL ACCESS MODE IS SEQUENTIAL FILE STATUS IS CUSTOMER-FILE-STATUS.

For sequential files, the ACCESS MODE IS SEQUENTIAL clause is the only valid option. The FILE STATUS clause is optional but recommended for error handling.

Practical Example

cobol
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
IDENTIFICATION DIVISION. PROGRAM-ID. SEQFILE. ENVIRONMENT DIVISION. INPUT-OUTPUT SECTION. FILE-CONTROL. SELECT CUSTOMER-FILE ASSIGN TO "CUSTOMERS.DAT" ORGANIZATION IS SEQUENTIAL ACCESS MODE IS SEQUENTIAL FILE STATUS IS CUST-FILE-STATUS. SELECT REPORT-FILE ASSIGN TO "REPORT.TXT" ORGANIZATION IS SEQUENTIAL ACCESS MODE IS SEQUENTIAL FILE STATUS IS REPORT-FILE-STATUS. DATA DIVISION. FILE SECTION. FD CUSTOMER-FILE. 01 CUSTOMER-RECORD. 05 CUSTOMER-ID PIC 9(5). 05 CUSTOMER-NAME PIC X(30). 05 CUSTOMER-ADDRESS PIC X(50). 05 CUSTOMER-BALANCE PIC S9(7)V99. FD REPORT-FILE. 01 REPORT-RECORD PIC X(80). WORKING-STORAGE SECTION. 01 FILE-STATUS. 05 CUST-FILE-STATUS PIC XX. 05 REPORT-FILE-STATUS PIC XX. 01 EOF-FLAG PIC X VALUE 'N'. 88 END-OF-FILE VALUE 'Y'. 01 RECORD-COUNT PIC 9(5) VALUE ZEROS. PROCEDURE DIVISION. MAIN-PROCESS. PERFORM INITIALIZATION PERFORM PROCESS-RECORDS UNTIL END-OF-FILE PERFORM FINALIZATION STOP RUN. INITIALIZATION. OPEN INPUT CUSTOMER-FILE OPEN OUTPUT REPORT-FILE IF CUST-FILE-STATUS NOT = "00" DISPLAY "Error opening customer file: " CUST-FILE-STATUS MOVE "Y" TO EOF-FLAG END-IF. PROCESS-RECORDS. READ CUSTOMER-FILE AT END MOVE "Y" TO EOF-FLAG NOT AT END ADD 1 TO RECORD-COUNT PERFORM PROCESS-CUSTOMER-RECORD END-READ. PROCESS-CUSTOMER-RECORD. * Process each record sequentially MOVE SPACES TO REPORT-RECORD STRING "Customer: " DELIMITED BY SIZE CUSTOMER-ID DELIMITED BY SIZE " - " DELIMITED BY SIZE CUSTOMER-NAME DELIMITED BY SIZE " - Balance: $" DELIMITED BY SIZE CUSTOMER-BALANCE DELIMITED BY SIZE INTO REPORT-RECORD END-STRING WRITE REPORT-RECORD. FINALIZATION. CLOSE CUSTOMER-FILE MOVE SPACES TO REPORT-RECORD STRING "Total records processed: " DELIMITED BY SIZE RECORD-COUNT DELIMITED BY SIZE INTO REPORT-RECORD END-STRING WRITE REPORT-RECORD CLOSE REPORT-FILE.

This example demonstrates a typical sequential file processing pattern: open the file, read records sequentially until the end of file, process each record, and close the file. The AT END clause detects when all records have been read.

Advantages of Sequential Files

  • Simple structure and implementation
  • Efficient for processing all records in a file
  • Minimal overhead in terms of storage and processing
  • Ideal for archival data that is rarely updated
  • Well-suited for batch processing operations
  • Efficient for report generation

Limitations of Sequential Files

  • No direct access to individual records
  • Inefficient for applications requiring random access
  • Updating existing records is cumbersome and inefficient
  • Records must be processed in sequence from beginning to end
  • Adding or deleting records requires rewriting the file
  • Not suitable for interactive applications with random updates

Indexed Files (VSAM KSDS)

Indexed files, typically implemented as VSAM Key Sequenced Data Sets (KSDS) in mainframe environments, are a powerful file organization that combines the advantages of sequential access with the ability to directly access individual records using keys. These files maintain indexes that allow COBOL programs to quickly locate specific records without reading through the entire file.

Key Characteristics

  • Records can be accessed both sequentially and randomly by key
  • Each record contains at least one key field that uniquely identifies it
  • Supports primary keys and alternate (secondary) keys
  • Allows efficient updates, insertions, and deletions of individual records
  • Maintains index structures for rapid record location
  • Supports variable-length records

ENVIRONMENT DIVISION Specification

Indexed files are defined using the ORGANIZATION IS INDEXED clause, along with a RECORD KEY clause to specify the primary key. Additional ALTERNATE RECORD KEY clauses can define secondary access paths.

cobol
1
2
3
4
5
6
7
8
9
ENVIRONMENT DIVISION. INPUT-OUTPUT SECTION. FILE-CONTROL. SELECT CUSTOMER-FILE ASSIGN TO CUSTVSAM ORGANIZATION IS INDEXED ACCESS MODE IS DYNAMIC RECORD KEY IS CUSTOMER-ID ALTERNATE RECORD KEY IS CUSTOMER-NAME WITH DUPLICATES FILE STATUS IS CUSTOMER-FILE-STATUS.

For indexed files, you can choose from three access modes:

  • SEQUENTIAL - Process records in key sequence
  • RANDOM - Access records directly by key
  • DYNAMIC - Switch between sequential and random access as needed

Comprehensive Example

cobol
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
IDENTIFICATION DIVISION. PROGRAM-ID. IDXFILE. ENVIRONMENT DIVISION. INPUT-OUTPUT SECTION. FILE-CONTROL. SELECT CUSTOMER-FILE ASSIGN TO "CUSTVSAM" ORGANIZATION IS INDEXED ACCESS MODE IS DYNAMIC RECORD KEY IS CUSTOMER-ID ALTERNATE RECORD KEY IS CUSTOMER-NAME WITH DUPLICATES FILE STATUS IS CUST-FILE-STATUS. DATA DIVISION. FILE SECTION. FD CUSTOMER-FILE. 01 CUSTOMER-RECORD. 05 CUSTOMER-ID PIC 9(5). 05 CUSTOMER-NAME PIC X(30). 05 CUSTOMER-ADDRESS PIC X(50). 05 CUSTOMER-BALANCE PIC S9(7)V99. WORKING-STORAGE SECTION. 01 WORKING-VARIABLES. 05 CUST-FILE-STATUS PIC XX. 05 OPERATION-FLAG PIC X. 88 EXIT-PROGRAM VALUE 'X'. 05 MENU-OPTION PIC 9. 05 SEARCH-ID PIC 9(5). 05 NEW-BALANCE PIC S9(7)V99. PROCEDURE DIVISION. MAIN-PROCESS. PERFORM INITIALIZATION PERFORM UNTIL EXIT-PROGRAM DISPLAY "CUSTOMER FILE OPERATIONS" DISPLAY "1. Add Customer" DISPLAY "2. Find Customer by ID" DISPLAY "3. List All Customers" DISPLAY "4. Update Customer Balance" DISPLAY "5. Delete Customer" DISPLAY "X. Exit" DISPLAY "Enter option: " WITH NO ADVANCING ACCEPT OPERATION-FLAG EVALUATE OPERATION-FLAG WHEN '1' PERFORM ADD-CUSTOMER WHEN '2' PERFORM FIND-CUSTOMER-BY-ID WHEN '3' PERFORM LIST-ALL-CUSTOMERS WHEN '4' PERFORM UPDATE-CUSTOMER WHEN '5' PERFORM DELETE-CUSTOMER WHEN 'X' OR 'x' SET EXIT-PROGRAM TO TRUE WHEN OTHER DISPLAY "Invalid option" END-EVALUATE END-PERFORM PERFORM FINALIZATION STOP RUN. INITIALIZATION. OPEN I-O CUSTOMER-FILE IF CUST-FILE-STATUS = "35" DISPLAY "Customer file does not exist. Creating new file." CLOSE CUSTOMER-FILE OPEN OUTPUT CUSTOMER-FILE CLOSE CUSTOMER-FILE OPEN I-O CUSTOMER-FILE END-IF IF CUST-FILE-STATUS NOT = "00" DISPLAY "Error opening customer file: " CUST-FILE-STATUS SET EXIT-PROGRAM TO TRUE END-IF. ADD-CUSTOMER. MOVE SPACES TO CUSTOMER-RECORD DISPLAY "Enter Customer ID: " WITH NO ADVANCING ACCEPT CUSTOMER-ID DISPLAY "Enter Customer Name: " WITH NO ADVANCING ACCEPT CUSTOMER-NAME DISPLAY "Enter Customer Address: " WITH NO ADVANCING ACCEPT CUSTOMER-ADDRESS DISPLAY "Enter Customer Balance: " WITH NO ADVANCING ACCEPT CUSTOMER-BALANCE WRITE CUSTOMER-RECORD INVALID KEY DISPLAY "Error: Customer ID already exists." NOT INVALID KEY DISPLAY "Customer added successfully." END-WRITE. FIND-CUSTOMER-BY-ID. DISPLAY "Enter Customer ID to find: " WITH NO ADVANCING ACCEPT SEARCH-ID MOVE SEARCH-ID TO CUSTOMER-ID READ CUSTOMER-FILE KEY IS CUSTOMER-ID INVALID KEY DISPLAY "Customer not found." NOT INVALID KEY DISPLAY "Customer ID: " CUSTOMER-ID DISPLAY "Customer Name: " CUSTOMER-NAME DISPLAY "Customer Address: " CUSTOMER-ADDRESS DISPLAY "Customer Balance: " CUSTOMER-BALANCE END-READ. LIST-ALL-CUSTOMERS. MOVE LOW-VALUES TO CUSTOMER-ID START CUSTOMER-FILE KEY >= CUSTOMER-ID INVALID KEY DISPLAY "No records in file." NOT INVALID KEY PERFORM READ-NEXT-RECORD END-START. READ-NEXT-RECORD. READ CUSTOMER-FILE NEXT AT END DISPLAY "End of file reached." NOT AT END DISPLAY CUSTOMER-ID ": " CUSTOMER-NAME " - Balance: $" CUSTOMER-BALANCE PERFORM READ-NEXT-RECORD END-READ. UPDATE-CUSTOMER. DISPLAY "Enter Customer ID to update: " WITH NO ADVANCING ACCEPT SEARCH-ID MOVE SEARCH-ID TO CUSTOMER-ID READ CUSTOMER-FILE INVALID KEY DISPLAY "Customer not found." NOT INVALID KEY DISPLAY "Current balance: $" CUSTOMER-BALANCE DISPLAY "Enter new balance: $" WITH NO ADVANCING ACCEPT NEW-BALANCE MOVE NEW-BALANCE TO CUSTOMER-BALANCE REWRITE CUSTOMER-RECORD INVALID KEY DISPLAY "Error updating customer." NOT INVALID KEY DISPLAY "Customer updated successfully." END-REWRITE END-READ. DELETE-CUSTOMER. DISPLAY "Enter Customer ID to delete: " WITH NO ADVANCING ACCEPT SEARCH-ID MOVE SEARCH-ID TO CUSTOMER-ID DELETE CUSTOMER-FILE INVALID KEY DISPLAY "Customer not found." NOT INVALID KEY DISPLAY "Customer deleted successfully." END-DELETE. FINALIZATION. CLOSE CUSTOMER-FILE.

This example demonstrates a comprehensive indexed file application that allows adding, finding, listing, updating, and deleting records. It showcases both random access (by key) and sequential access (when listing all records).

Advantages of Indexed Files

  • Support for both sequential and random access patterns
  • Efficient retrieval of individual records by key
  • Multiple access paths through alternate keys
  • Efficient record updates, insertions, and deletions
  • Well-suited for interactive applications
  • Support for variable-length records

Limitations of Indexed Files

  • Higher overhead than sequential files due to index maintenance
  • Increased storage requirements for indexes
  • Performance can degrade as files grow and require reorganization
  • More complex to set up and maintain
  • Processing overhead for insert/update operations to maintain indexes

Relative Files (VSAM RRDS)

Relative files, implemented as VSAM Relative Record Data Sets (RRDS) in mainframe environments, are a file organization where records are identified by their relative position within the file, similar to an array. Each record is assigned a unique relative record number that corresponds to its location, allowing for both direct access by position and sequential processing.

Key Characteristics

  • Records are accessed by their relative position (relative record number)
  • No key field required within the record itself
  • Records can be fixed-length or variable-length
  • Empty slots can exist (deleted records leave gaps that can be reused)
  • Efficient for applications where records are naturally identified by numeric position
  • Combines random access capability with efficient storage

ENVIRONMENT DIVISION Specification

Relative files are defined using the ORGANIZATION IS RELATIVE clause, along with a RELATIVE KEY clause to specify the data field that will hold the relative record number for access.

cobol
1
2
3
4
5
6
7
8
ENVIRONMENT DIVISION. INPUT-OUTPUT SECTION. FILE-CONTROL. SELECT PRODUCT-FILE ASSIGN TO PRODRRDS ORGANIZATION IS RELATIVE ACCESS MODE IS DYNAMIC RELATIVE KEY IS PRODUCT-NUMBER FILE STATUS IS PRODUCT-FILE-STATUS.

Like indexed files, relative files support three access modes:

  • SEQUENTIAL - Process records in physical order
  • RANDOM - Access records directly by relative record number
  • DYNAMIC - Switch between sequential and random access

Advantages of Relative Files

  • Efficient direct access by record number
  • No need for key fields within the records
  • Lower overhead compared to indexed files (no index maintenance)
  • Efficient for applications where records are naturally numbered
  • Deleted record slots can be reused
  • Good performance for both sequential and random access patterns

Limitations of Relative Files

  • Records can only be accessed by position, not by content
  • Inefficient for sparse data (many empty slots waste space)
  • Record deletion can create fragmentation
  • No built-in support for alternate access paths
  • Requires application to manage the mapping between data and position
  • Maximum file size limited by the size of the relative key field

Line Sequential Files

Line sequential files are a specialized file organization primarily used in UNIX, Linux, and Windows environments. Unlike traditional sequential files, line sequential files store records with explicit line terminators (such as carriage return and/or line feed characters). This format makes them compatible with standard text editors and non-COBOL applications.

Key Characteristics

  • Each record ends with line terminators (CR, LF, or CR/LF)
  • Only available in COBOL for non-mainframe environments
  • Suitable for text files that need editing outside of COBOL
  • Limited to displayable characters (no binary data)
  • Records are processed sequentially
  • Compatible with standard text processing tools

ENVIRONMENT DIVISION Specification

Line sequential files are defined using the ORGANIZATION IS LINE SEQUENTIAL clause. This is only available in COBOL implementations for UNIX/Linux and Windows systems.

cobol
1
2
3
4
5
6
ENVIRONMENT DIVISION. INPUT-OUTPUT SECTION. FILE-CONTROL. SELECT LOG-FILE ASSIGN TO "APPLOG.TXT" ORGANIZATION IS LINE SEQUENTIAL FILE STATUS IS LOG-FILE-STATUS.

Line sequential files only support sequential access mode. They are typically used for log files, reports, and data exchange with other applications.

Advantages of Line Sequential Files

  • Compatible with standard text editors and processing tools
  • Easily read by non-COBOL applications
  • Ideal for log files and reports
  • Human-readable without special viewers
  • Simple to create and process
  • Well-suited for data exchange with other systems

Limitations of Line Sequential Files

  • Not available in mainframe COBOL environments
  • Cannot contain binary or low-value data
  • Only sequential access is supported
  • Less efficient storage than binary sequential files
  • Record length variability can complicate processing
  • Limited to text data only

File Organization Selection Criteria

Selecting the appropriate file organization for your COBOL application is a critical design decision that affects performance, maintenance, and flexibility. This choice should be based on a careful analysis of your application's requirements and access patterns.

Decision Factors

FactorConsiderationRecommended Organization
Access PatternSequential processing of entire fileSequential
Access PatternRandom access by business keyIndexed (VSAM KSDS)
Access PatternRandom access by positionRelative (VSAM RRDS)
Update FrequencyFrequent updates to existing recordsIndexed or Relative
Update FrequencyAppend-only operationsSequential
File SizeLarge files with frequent random accessIndexed (with appropriate buffer size)
Record IdentificationNatural business key existsIndexed
Record IdentificationRecords naturally identified by positionRelative
External CompatibilityNeeds to be read by text editorsLine Sequential

File Organization Comparison

FeatureSequentialIndexedRelativeLine Sequential
Random AccessNoYes (by key)Yes (by position)No
Update in PlaceNoYesYesNo
Storage OverheadLowHighMediumMedium
Processing OverheadLowHighMediumLow
Multiple Access PathsNoYes (alternate keys)NoNo
Typical Use CaseBatch processing, logsInteractive applicationsArray-like storageText files, data exchange

Selection Process

  1. Analyze access patterns: Determine how the data will be accessed most frequently. Is it sequential processing of the entire file, or is random access to specific records required?
  2. Consider update frequency: Will records be updated frequently, or is the file primarily for reading? Are new records added to the end or inserted throughout?
  3. Assess data volume: Consider the size of the file and number of records. Larger files with random access needs benefit from indexed organization.
  4. Identify natural keys: Does your data have natural identifiers (like customer IDs) that would make good keys for indexed files?
  5. Evaluate performance requirements: Critical applications with high-volume transactions may need the efficiency of a specific organization.
  6. Consider platform constraints: Some file organizations may not be available in certain COBOL environments or platforms.

Example Scenarios

  • Customer Master File: Frequent random lookups by customer ID and updates to individual records suggest an indexed file with the customer ID as the primary key.
  • Transaction Log: Records are only appended and processed in sequence, making asequential file the most efficient choice.
  • Fixed Size Array-like Data: A product catalog with stable, numbered items could use arelative file with the product number as the relative key.
  • Export File for Other Systems: When data needs to be shared with non-COBOL applications on Windows/UNIX, a line sequential file provides the best compatibility.
  • High-Volume Order Processing: An application that requires both sequential batch processing and random lookups might use an indexed file with appropriate buffer sizing.

Test Your Knowledge

1. Which file organization type uses record keys to access data directly?

  • Sequential files
  • Indexed files (VSAM KSDS)
  • Line sequential files
  • Stream files

2. What is the main characteristic of sequential file organization?

  • Records must be accessed in sequence, one after another
  • Records can be accessed in any order
  • Records are accessed by relative record number
  • Records are stored in XML format

3. Which COBOL file organization is best suited for applications requiring frequent inserts, updates, and random access?

  • Sequential
  • Indexed (VSAM KSDS)
  • Relative (VSAM RRDS)
  • Line sequential

4. How are records accessed in a relative file organization (VSAM RRDS)?

  • By a primary key
  • By a secondary key
  • By a relative record number
  • By sequential access only

5. Which file organization would be most appropriate for a log file that only has new records appended to the end?

  • Indexed (VSAM KSDS)
  • Sequential
  • Relative (VSAM RRDS)
  • VSAM ESDS

Frequently Asked Questions