COBOL File Organizations

Choosing the right file organization is one of the most important decisions in COBOL application design. The file organization you select determines how records are stored, how they can be accessed, and what operations are possible. This choice directly impacts application performance, storage requirements, and maintenance complexity.

This guide helps you understand when to use each file organization type, compares their characteristics, and provides practical decision-making criteria for selecting the best organization for your specific application needs.

Overview of File Organizations

COBOL supports three primary file organizations:

  • Sequential - Records stored and accessed in physical order
  • Indexed - Records accessed by key values with automatic index maintenance (VSAM KSDS)
  • Relative - Records accessed by relative record number (VSAM RRDS)

Each organization has distinct characteristics, advantages, and limitations. Understanding these differences is essential for making informed design decisions.

Comparison Matrix

The following table compares the key characteristics of each file organization:

File Organization Comparison

| Characteristic | Sequential | Indexed (KSDS) | Relative (RRDS) |
|---|---|---|---|
| Access Methods | Sequential only | Sequential, Random, Dynamic | Sequential, Random, Dynamic |
| Record Length | Fixed or Variable | Fixed or Variable | Fixed only |
| Key Requirement | None | Primary key required | Relative key (record number) |
| Random Access | Not supported | By key value | By record number |
| Update Support | Limited (must recreate file) | Full support (READ, REWRITE, DELETE) | Full support (READ, REWRITE, DELETE) |
| Storage Overhead | Minimal | 30-50% for indexes | Minimal (may waste empty slots) |
| Performance (Sequential) | Fastest | Good (with index overhead) | Good |
| Performance (Random) | Not applicable | Fast (via index) | Very fast (direct access) |
| Alternate Keys | Not supported | Supported | Not supported |
| Best For | Batch processing, logs, archives | Transaction processing, master files | Position-based access, fixed records |

Sequential File Organization

Characteristics

Sequential files store records in the order they are written. Records must be accessed sequentially from the beginning of the file. This is the simplest file organization with minimal overhead.

When to Use Sequential Files

  • Batch Processing: When processing entire files from start to finish
  • Log Files: Files that are only appended and read sequentially
  • Archival Data: Historical data that is rarely updated
  • Report Generation: Creating reports from complete data sets
  • File Copying: Duplicating or backing up files
  • No Random Access Needed: When you don't need to access specific records directly

Example: Sequential File Processing

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEQUENTIAL-PROCESSING.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT TRANSACTION-FILE
               ASSIGN TO "TRANS.DAT"
               ORGANIZATION IS SEQUENTIAL
               ACCESS MODE IS SEQUENTIAL
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  TRANSACTION-FILE.
       01  TRANSACTION-RECORD.
           05  TRANS-DATE          PIC 9(8).
           05  TRANS-AMOUNT        PIC S9(7)V99.
           05  TRANS-DESCRIPTION   PIC X(50).

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.
           88  END-OF-FILE         VALUE "10".
       01  TOTAL-AMOUNT            PIC S9(9)V99 VALUE ZEROS.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN INPUT TRANSACTION-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           PERFORM PROCESS-TRANSACTIONS
               UNTIL END-OF-FILE

           CLOSE TRANSACTION-FILE
           DISPLAY "Total amount: " TOTAL-AMOUNT
           STOP RUN.

       PROCESS-TRANSACTIONS.
           READ TRANSACTION-FILE
               AT END
                   CONTINUE
               NOT AT END
                   ADD TRANS-AMOUNT TO TOTAL-AMOUNT
                   DISPLAY "Processed: " TRANS-DESCRIPTION
           END-READ.
```

This example demonstrates typical sequential file processing: opening the file, reading records one by one until end of file, processing each record, and closing the file. Sequential access is straightforward and efficient for this use case.

Advantages of Sequential Files

  • Simplest structure - easiest to understand and implement
  • Fastest performance for sequential processing
  • Minimal storage overhead
  • No index maintenance required
  • Ideal for batch processing operations

Limitations of Sequential Files

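  • No random access - records must be read in order from the beginning
  • Limited update capabilities - a record can be rewritten in place (OPEN I-O followed by READ and REWRITE), but inserting records or changing their length requires recreating the file
  • No DELETE statement - removing a record means copying the file without it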
  • Not suitable for transaction processing systems

Indexed File Organization (VSAM KSDS)

Characteristics

Indexed files use keys to organize records and maintain indexes that allow direct access to records by key value. They support both sequential and random access, making them versatile for many applications.

When to Use Indexed Files

  • Transaction Processing: Systems that need to look up specific records by key
  • Master Files: Reference data that is frequently accessed and updated
  • Mixed Access Patterns: Applications needing both sequential and random access
  • Frequent Updates: Files with records that are frequently added, modified, or deleted
  • Alternate Keys: When you need multiple access paths to the same data
  • Variable-Length Records: When record sizes vary
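
The alternate-key case in the list above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes a hypothetical customer file whose CUSTOMER-NAME field was defined as an alternate key allowing duplicates when the file was created (on z/OS this also means an alternate index and path defined outside the program). The file name and the search value "SMITH" are illustrative only.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ALT-KEY-LOOKUP.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO "CUSTOMER.IDX"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS CUSTOMER-ID
               ALTERNATE RECORD KEY IS CUSTOMER-NAME
                   WITH DUPLICATES
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUSTOMER-ID         PIC 9(5).
           05  CUSTOMER-NAME       PIC X(30).
           05  CUSTOMER-BALANCE    PIC S9(7)V99.

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN INPUT CUSTOMER-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           *> Look up by name instead of by the primary key.
           *> KEY IS selects which index the READ uses.
           MOVE "SMITH" TO CUSTOMER-NAME
           READ CUSTOMER-FILE
               KEY IS CUSTOMER-NAME
               INVALID KEY
                   DISPLAY "No customer with that name"
               NOT INVALID KEY
                   DISPLAY "First match: " CUSTOMER-ID
           END-READ

           CLOSE CUSTOMER-FILE
           STOP RUN.
```

Because the alternate key allows duplicates, a successful READ returns the first matching record; further matches would be retrieved with READ NEXT.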

Example: Indexed File with Random Access

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INDEXED-ACCESS.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO "CUSTOMER.IDX"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS CUSTOMER-ID
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUSTOMER-ID         PIC 9(5).
           05  CUSTOMER-NAME       PIC X(30).
           05  CUSTOMER-BALANCE    PIC S9(7)V99.

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.
           88  RECORD-FOUND        VALUE "00".
           88  RECORD-NOT-FOUND    VALUE "23".
       01  SEARCH-ID               PIC 9(5).

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN INPUT CUSTOMER-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           DISPLAY "Enter customer ID to search: "
           ACCEPT SEARCH-ID

           MOVE SEARCH-ID TO CUSTOMER-ID
           READ CUSTOMER-FILE
               INVALID KEY
                   DISPLAY "Customer " SEARCH-ID " not found"
               NOT INVALID KEY
                   DISPLAY "Customer found:"
                   DISPLAY " Name: " CUSTOMER-NAME
                   DISPLAY " Balance: " CUSTOMER-BALANCE
           END-READ

           CLOSE CUSTOMER-FILE
           STOP RUN.
```

This example shows random access to an indexed file. The program can directly access any customer record by setting the key (CUSTOMER-ID) and reading. The index allows the system to locate the record without reading through preceding records.

Advantages of Indexed Files

  • Supports both sequential and random access
  • Fast random access by key value
  • Supports alternate keys for multiple access paths
  • Handles variable-length records
  • Full update support (READ, WRITE, REWRITE, DELETE)
  • Automatic index maintenance
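
Combining random and sequential access in one program is usually done with ACCESS MODE IS DYNAMIC. The following is a minimal sketch under stated assumptions: the customer file layout is borrowed from the random-access example, and the key range 10000 to 19999 is a hypothetical choice for illustration. START positions the file by key (the random step), and READ NEXT then walks forward in key order (the sequential step).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BROWSE-RANGE.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO "CUSTOMER.IDX"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS CUSTOMER-ID
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUSTOMER-ID         PIC 9(5).
           05  CUSTOMER-NAME       PIC X(30).
           05  CUSTOMER-BALANCE    PIC S9(7)V99.

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.
       01  BROWSE-SWITCH           PIC X VALUE "N".
           88  BROWSE-DONE         VALUE "Y".

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN INPUT CUSTOMER-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           *> Random step: position at the first key >= 10000
           MOVE 10000 TO CUSTOMER-ID
           START CUSTOMER-FILE
               KEY IS NOT LESS THAN CUSTOMER-ID
               INVALID KEY
                   SET BROWSE-DONE TO TRUE
           END-START

           *> Sequential step: read forward until the range ends
           PERFORM UNTIL BROWSE-DONE
               READ CUSTOMER-FILE NEXT RECORD
                   AT END
                       SET BROWSE-DONE TO TRUE
                   NOT AT END
                       IF CUSTOMER-ID > 19999
                           SET BROWSE-DONE TO TRUE
                       ELSE
                           DISPLAY CUSTOMER-ID " " CUSTOMER-NAME
                       END-IF
               END-READ
               *> Status 10 or above (end of file or an error)
               *> also stops the loop
               IF FILE-STATUS-CODE NOT < "10"
                   SET BROWSE-DONE TO TRUE
               END-IF
           END-PERFORM

           CLOSE CUSTOMER-FILE
           STOP RUN.
```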

Limitations of Indexed Files

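  • Storage overhead for indexes (this guide uses 30-50% as a rough planning figure; the actual overhead depends on key length, record size, and free-space settings)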
  • Requires periodic reorganization for optimal performance
  • Slightly slower than sequential for pure sequential processing
  • More complex than sequential files
  • Requires key field definition

Relative File Organization (VSAM RRDS)

Characteristics

Relative files use relative record numbers to identify record positions. Records are stored in fixed-size slots, allowing direct access by record number. This organization is efficient for position-based access patterns.

When to Use Relative Files

  • Position-Based Access: When records are naturally identified by position
  • Fixed-Length Records: When all records are the same size
  • Frequent Deletion: When records are frequently deleted and slots reused
  • Direct Access: When you can map business keys to record numbers
  • Slot Management: When you need efficient slot-based record management

Example: Relative File Access

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RELATIVE-ACCESS.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INVENTORY-FILE
               ASSIGN TO "INVENTORY.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS RECORD-NUMBER
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  INVENTORY-FILE.
       01  INVENTORY-RECORD.
           05  ITEM-CODE           PIC X(10).
           05  ITEM-DESCRIPTION    PIC X(40).
           05  QUANTITY-ON-HAND    PIC 9(5).
           05  UNIT-PRICE          PIC S9(5)V99.

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.
       01  RECORD-NUMBER           PIC 9(5).

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN I-O INVENTORY-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           *> Access record at position 100
           MOVE 100 TO RECORD-NUMBER
           READ INVENTORY-FILE
               INVALID KEY
                   DISPLAY "Record " RECORD-NUMBER " not found"
               NOT INVALID KEY
                   DISPLAY "Item: " ITEM-CODE
                   DISPLAY "Description: " ITEM-DESCRIPTION
                   DISPLAY "Quantity: " QUANTITY-ON-HAND
                   DISPLAY "Price: " UNIT-PRICE
           END-READ

           CLOSE INVENTORY-FILE
           STOP RUN.
```

This example demonstrates direct access to a relative file by record number. The relative key (RECORD-NUMBER) specifies which slot to access, allowing very fast direct access when you know the record position.

Advantages of Relative Files

  • Very fast direct access by record number
  • Efficient slot management for deletion and re-insertion
  • Supports both sequential and random access
  • Minimal storage overhead
  • Simple key structure (just record number)

Limitations of Relative Files

  • Fixed-length records only
  • Requires mapping business keys to record numbers
  • May waste space if many slots are empty
  • No alternate keys supported
  • Less flexible than indexed files

Decision-Making Guide

Work through the following four steps to choose the appropriate file organization:

Step 1: Determine Access Pattern

Question: How will records be accessed?

  • If only sequential (processing entire file): Consider Sequential
  • If random access by key needed: Consider Indexed
  • If random access by position needed: Consider Relative
  • If both sequential and random needed: Consider Indexed or Relative

Step 2: Consider Update Requirements

Question: How frequently will records be updated?

  • If rarely or never updated: Sequential may be sufficient
  • If frequently updated: Indexed or Relative required
  • If frequent inserts/deletes: Indexed or Relative required
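
To make the update requirement concrete, the sketch below shows the three in-place operations that an indexed file supports and a sequential file does not fully support. It is an illustration under assumptions, not a definitive implementation: the customer layout is borrowed from the indexed example, the file is assumed to exist already, and the customer IDs and values are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UPDATE-CUSTOMERS.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO "CUSTOMER.IDX"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS CUSTOMER-ID
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUSTOMER-ID         PIC 9(5).
           05  CUSTOMER-NAME       PIC X(30).
           05  CUSTOMER-BALANCE    PIC S9(7)V99.

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           *> I-O mode is required for REWRITE and DELETE
           OPEN I-O CUSTOMER-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           *> Modify: read by key, change a field, rewrite
           MOVE 10001 TO CUSTOMER-ID
           READ CUSTOMER-FILE
               INVALID KEY
                   DISPLAY "Customer 10001 not found"
               NOT INVALID KEY
                   ADD 50.00 TO CUSTOMER-BALANCE
                   REWRITE CUSTOMER-RECORD
                       INVALID KEY
                           DISPLAY "Rewrite failed"
                   END-REWRITE
           END-READ

           *> Insert: the index places the record by key
           MOVE 10002 TO CUSTOMER-ID
           MOVE "NEW CUSTOMER" TO CUSTOMER-NAME
           MOVE ZERO TO CUSTOMER-BALANCE
           WRITE CUSTOMER-RECORD
               INVALID KEY
                   DISPLAY "Insert failed: " FILE-STATUS-CODE
           END-WRITE

           *> Delete: in random mode the key alone identifies
           *> the record, so no prior READ is needed
           MOVE 10003 TO CUSTOMER-ID
           DELETE CUSTOMER-FILE RECORD
               INVALID KEY
                   DISPLAY "Customer 10003 not found"
           END-DELETE

           CLOSE CUSTOMER-FILE
           STOP RUN.
```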

Step 3: Evaluate Record Characteristics

Question: What are the record characteristics?

  • If variable-length records: Indexed or Sequential
  • If fixed-length records: Any organization (consider Relative if position-based)
  • If need alternate keys: Indexed only

Step 4: Assess Performance Requirements

Question: What are the performance priorities?

  • If maximum sequential speed: Sequential
  • If fast random access: Indexed or Relative
  • If minimal storage overhead: Sequential or Relative
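
The four steps above can be condensed into a single EVALUATE, shown here as a rough sketch rather than a complete selection procedure. The flag names are hypothetical and the ordering of the branches is one reasonable reading of the guide: key-based needs win first, position-based access with fixed-length records points to Relative, and everything else falls back to Sequential.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHOOSE-FILE-ORG.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Hypothetical requirement flags ("Y" or "N");
       *> set them to describe the file being designed
       01  NEEDS-KEY-ACCESS        PIC X VALUE "Y".
       01  NEEDS-POSITION-ACCESS   PIC X VALUE "N".
       01  NEEDS-ALTERNATE-KEYS    PIC X VALUE "N".
       01  FREQUENT-UPDATES        PIC X VALUE "N".
       01  VARIABLE-LENGTH         PIC X VALUE "N".
       01  RECOMMENDATION          PIC X(10).

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           EVALUATE TRUE
               *> Steps 1 and 3: key or alternate-key access
               WHEN NEEDS-ALTERNATE-KEYS = "Y"
                 OR NEEDS-KEY-ACCESS = "Y"
                   MOVE "INDEXED" TO RECOMMENDATION
               *> Steps 1 and 3: position access, fixed length
               WHEN NEEDS-POSITION-ACCESS = "Y"
                AND VARIABLE-LENGTH = "N"
                   MOVE "RELATIVE" TO RECOMMENDATION
               *> Position access with variable-length records:
               *> use the position as an indexed key instead
               WHEN NEEDS-POSITION-ACCESS = "Y"
                   MOVE "INDEXED" TO RECOMMENDATION
               *> Step 2: frequent inserts, updates, deletes
               WHEN FREQUENT-UPDATES = "Y"
                   MOVE "INDEXED" TO RECOMMENDATION
               *> Step 4: sequential-only, minimal overhead
               WHEN OTHER
                   MOVE "SEQUENTIAL" TO RECOMMENDATION
           END-EVALUATE

           DISPLAY "Suggested organization: " RECOMMENDATION
           STOP RUN.
```

With the sample flag values shown, the first branch is taken and the program suggests INDEXED. Real decisions also weigh factors the flags do not capture, such as data volume and storage cost.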

Real-World Use Cases

Use Case 1: Customer Master File

Requirements: Look up customers by ID, update customer information, generate reports

Choice: Indexed File (VSAM KSDS)

Reasoning: Requires random access by customer ID, frequent updates, and both sequential (reports) and random (lookups) access patterns. Indexed files provide the necessary flexibility.

Use Case 2: Transaction Log File

Requirements: Append transactions, read sequentially for audit reports, rarely updated

Choice: Sequential File

Reasoning: Only sequential access needed, records are appended (not inserted), no random access required. Sequential files are simplest and most efficient for this use case.

Use Case 3: Inventory Lookup by Bin Number

Requirements: Access items by bin location (position-based), fixed record size, frequent updates

Choice: Relative File (VSAM RRDS)

Reasoning: Access is position-based (bin number maps to record number), records are fixed-length, and relative files provide efficient direct access by position.
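
As a sketch of this use case, the bin number can serve directly as the relative key, so no index lookup or key translation is needed. The example assumes the inventory layout from the relative-file example, an existing file, and hypothetical values for the bin number (42) and the quantity received (25).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BIN-UPDATE.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INVENTORY-FILE
               ASSIGN TO "INVENTORY.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS BIN-NUMBER
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  INVENTORY-FILE.
       01  INVENTORY-RECORD.
           05  ITEM-CODE           PIC X(10).
           05  ITEM-DESCRIPTION    PIC X(40).
           05  QUANTITY-ON-HAND    PIC 9(5).
           05  UNIT-PRICE          PIC S9(5)V99.

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.
       *> The bin number doubles as the relative record number
       01  BIN-NUMBER              PIC 9(5).
       01  QUANTITY-RECEIVED       PIC 9(5) VALUE 25.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN I-O INVENTORY-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           *> Bin 42 lives in slot 42
           MOVE 42 TO BIN-NUMBER
           READ INVENTORY-FILE
               INVALID KEY
                   DISPLAY "Bin " BIN-NUMBER " is empty"
               NOT INVALID KEY
                   ADD QUANTITY-RECEIVED TO QUANTITY-ON-HAND
                   REWRITE INVENTORY-RECORD
                       INVALID KEY
                           DISPLAY "Rewrite failed"
                   END-REWRITE
           END-READ

           CLOSE INVENTORY-FILE
           STOP RUN.
```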

Performance Considerations

Sequential Files

Sequential files provide the best performance for sequential processing:

  • No index overhead means faster reads
  • Optimal for batch processing large files
  • Minimal CPU and I/O for sequential access
  • Best choice when processing entire files

Indexed Files

Indexed files balance sequential and random access performance:

  • Index lookups provide fast random access
  • Sequential access is good but slower than pure sequential files
  • Index maintenance adds overhead for updates
  • Requires periodic reorganization for optimal performance
  • Consider buffer allocation for performance tuning (BUFND sets the number of data buffers, BUFNI the number of index buffers)

Relative Files

Relative files excel at direct position-based access:

  • Very fast direct access by record number
  • Efficient sequential access
  • Minimal overhead for position-based operations
  • Performance depends on key-to-position mapping efficiency

Migration Between Organizations

Sometimes you need to change file organization. This requires creating a new file and copying data:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CONVERT-FILE-ORG.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           *> Source: Sequential file
           SELECT SOURCE-FILE
               ASSIGN TO "SOURCE.DAT"
               ORGANIZATION IS SEQUENTIAL
               ACCESS MODE IS SEQUENTIAL
               FILE STATUS IS SOURCE-STATUS.

           *> Target: Indexed file
           SELECT TARGET-FILE
               ASSIGN TO "TARGET.IDX"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS SEQUENTIAL
               RECORD KEY IS TARGET-KEY
               FILE STATUS IS TARGET-STATUS.

       DATA DIVISION.
       FILE SECTION.
       *> Field names are unique per record so that the
       *> RECORD KEY reference above is unambiguous.
       FD  SOURCE-FILE.
       01  SOURCE-RECORD.
           05  SOURCE-KEY          PIC 9(5).
           05  SOURCE-DATA         PIC X(75).

       FD  TARGET-FILE.
       01  TARGET-RECORD.
           05  TARGET-KEY          PIC 9(5).
           05  TARGET-DATA         PIC X(75).

       WORKING-STORAGE SECTION.
       01  SOURCE-STATUS           PIC XX.
           88  END-OF-SOURCE       VALUE "10".
       01  TARGET-STATUS           PIC XX.
       01  RECORD-COUNT            PIC 9(6) VALUE ZEROS.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN INPUT SOURCE-FILE
           IF SOURCE-STATUS NOT = "00"
               DISPLAY "Error opening source file: " SOURCE-STATUS
               STOP RUN
           END-IF

           OPEN OUTPUT TARGET-FILE
           IF TARGET-STATUS NOT = "00"
               DISPLAY "Error opening target file: " TARGET-STATUS
               CLOSE SOURCE-FILE
               STOP RUN
           END-IF

           PERFORM COPY-RECORDS
               UNTIL END-OF-SOURCE

           CLOSE SOURCE-FILE
           CLOSE TARGET-FILE
           DISPLAY "Conversion complete. Records copied: "
               RECORD-COUNT
           STOP RUN.

       COPY-RECORDS.
           READ SOURCE-FILE
               AT END
                   CONTINUE
               NOT AT END
                   *> Both records are 80 bytes with the key first.
                   *> Sequential-mode WRITE expects ascending keys.
                   MOVE SOURCE-RECORD TO TARGET-RECORD
                   WRITE TARGET-RECORD
                   IF TARGET-STATUS NOT = "00"
                       DISPLAY "Error writing record: "
                           TARGET-STATUS
                   ELSE
                       ADD 1 TO RECORD-COUNT
                   END-IF
           END-READ.
```

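This example shows how to convert a sequential file to an indexed file. The program reads from the sequential source file and writes to the indexed target file. Because the target is opened in sequential access mode, the source records must arrive in ascending key order with no duplicate keys; otherwise the WRITE fails with a sequence-error or duplicate-key status. Similar patterns can be used for other conversions.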
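
When the source is not already in key order, which a sequential-mode load into an indexed file requires, a SORT step can produce an ordered copy first. The following is a minimal sketch assuming the 80-byte layout of the conversion example (a 5-digit key followed by 75 bytes of data). The names SORTED.DAT and SORTWORK.TMP are hypothetical, and the conversion program would then read SORTED.DAT instead of SOURCE.DAT.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORT-BEFORE-LOAD.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT UNSORTED-FILE
               ASSIGN TO "SOURCE.DAT"
               ORGANIZATION IS SEQUENTIAL.
           SELECT SORTED-FILE
               ASSIGN TO "SORTED.DAT"
               ORGANIZATION IS SEQUENTIAL.
           *> Work file used internally by the SORT statement
           SELECT SORT-WORK
               ASSIGN TO "SORTWORK.TMP".

       DATA DIVISION.
       FILE SECTION.
       FD  UNSORTED-FILE.
       01  UNSORTED-RECORD         PIC X(80).

       FD  SORTED-FILE.
       01  SORTED-RECORD           PIC X(80).

       *> The sort description names the key to order by
       SD  SORT-WORK.
       01  SORT-RECORD.
           05  SORT-KEY            PIC 9(5).
           05  SORT-DATA           PIC X(75).

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           *> SORT opens, reads, writes, and closes the files
           *> itself; the program must not open them first
           SORT SORT-WORK
               ON ASCENDING KEY SORT-KEY
               USING UNSORTED-FILE
               GIVING SORTED-FILE

           DISPLAY "Sort step finished"
           STOP RUN.
```

Sorting does not remove duplicate keys, so the load program still needs to check the status of each WRITE.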

Best Practices

General Best Practices

  • Choose file organization based on primary access pattern
  • Consider future requirements, not just current needs
  • Test performance with realistic data volumes
  • Document the rationale for file organization choice
  • Consider storage costs and maintenance requirements

Sequential File Best Practices

  • Use for batch processing and archival data
  • Consider blocking factors for performance
  • Use OPEN EXTEND to append records to the end of an existing file
  • Plan for file recreation if updates are needed
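
Appending with OPEN EXTEND might look like the sketch below. It is illustrative only: the file name AUDIT.LOG, the record layout, and the message text are hypothetical, and the file is assumed to exist already (without the OPTIONAL phrase in SELECT, opening a missing file fails with a non-zero status).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. APPEND-LOG.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT LOG-FILE
               ASSIGN TO "AUDIT.LOG"
               ORGANIZATION IS SEQUENTIAL
               ACCESS MODE IS SEQUENTIAL
               FILE STATUS IS LOG-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  LOG-FILE.
       01  LOG-RECORD.
           05  LOG-DATE            PIC 9(8).
           05  LOG-TEXT            PIC X(72).

       WORKING-STORAGE SECTION.
       01  LOG-STATUS              PIC XX.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           *> EXTEND positions the file after its last record,
           *> so existing records are preserved
           OPEN EXTEND LOG-FILE
           IF LOG-STATUS NOT = "00"
               DISPLAY "Error opening log file: " LOG-STATUS
               STOP RUN
           END-IF

           ACCEPT LOG-DATE FROM DATE YYYYMMDD
           MOVE "Nightly batch completed" TO LOG-TEXT
           WRITE LOG-RECORD
           IF LOG-STATUS NOT = "00"
               DISPLAY "Error writing log record: " LOG-STATUS
           END-IF

           CLOSE LOG-FILE
           STOP RUN.
```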

Indexed File Best Practices

  • Choose meaningful, stable primary keys
  • Use alternate keys judiciously (they add overhead)
  • Plan for periodic reorganization
  • Optimize buffer allocation (BUFND for data buffers, BUFNI for index buffers)
  • Monitor index performance and space usage

Relative File Best Practices

  • Design efficient key-to-position mapping
  • Handle empty slots appropriately
  • Consider record size carefully (fixed-length requirement)
  • Plan for slot reuse after deletions
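
Empty-slot handling and slot reuse can be sketched as follows, assuming the inventory layout from the relative-file example and an existing file. The slot number and item values are hypothetical. The sketch relies on two standard behaviors: DELETE on an empty slot and WRITE to an occupied slot both raise the INVALID KEY condition.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REUSE-SLOT.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INVENTORY-FILE
               ASSIGN TO "INVENTORY.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS RECORD-NUMBER
               FILE STATUS IS FILE-STATUS-CODE.

       DATA DIVISION.
       FILE SECTION.
       FD  INVENTORY-FILE.
       01  INVENTORY-RECORD.
           05  ITEM-CODE           PIC X(10).
           05  ITEM-DESCRIPTION    PIC X(40).
           05  QUANTITY-ON-HAND    PIC 9(5).
           05  UNIT-PRICE          PIC S9(5)V99.

       WORKING-STORAGE SECTION.
       01  FILE-STATUS-CODE        PIC XX.
       01  RECORD-NUMBER           PIC 9(5).

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN I-O INVENTORY-FILE
           IF FILE-STATUS-CODE NOT = "00"
               DISPLAY "Error opening file: " FILE-STATUS-CODE
               STOP RUN
           END-IF

           *> Free slot 100; INVALID KEY means it held no record
           MOVE 100 TO RECORD-NUMBER
           DELETE INVENTORY-FILE RECORD
               INVALID KEY
                   DISPLAY "Slot 100 was already empty"
           END-DELETE

           *> Reuse the slot; INVALID KEY here would mean the
           *> slot is still occupied
           MOVE 100 TO RECORD-NUMBER
           MOVE "WIDGET-01" TO ITEM-CODE
           MOVE "Replacement widget" TO ITEM-DESCRIPTION
           MOVE 10 TO QUANTITY-ON-HAND
           MOVE 4.95 TO UNIT-PRICE
           WRITE INVENTORY-RECORD
               INVALID KEY
                   DISPLAY "Slot 100 is occupied: "
                       FILE-STATUS-CODE
           END-WRITE

           CLOSE INVENTORY-FILE
           STOP RUN.
```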

Summary

Choosing the right file organization is critical for COBOL application success. Sequential files excel at batch processing, indexed files provide flexibility for transaction processing, and relative files offer efficient position-based access. Consider your access patterns, update requirements, record characteristics, and performance needs when making your choice.

Remember:

  • Sequential - Best for batch processing and sequential-only access
  • Indexed - Best for key-based access with update requirements
  • Relative - Best for position-based access with fixed-length records

There's no one-size-fits-all solution. The best file organization depends on your specific application requirements. Take time to analyze your access patterns, test with realistic data, and choose the organization that best fits your needs.

Test Your Knowledge

1. Which file organization is best for a transaction processing system that needs to look up customer records by customer ID?

  • Sequential file organization
  • Indexed file organization (VSAM KSDS)
  • Relative file organization (VSAM RRDS)
  • Line sequential file organization

2. What is the primary advantage of sequential file organization?

  • Random access by key
  • Fastest performance for processing entire files sequentially
  • Support for variable-length records
  • Automatic indexing

3. Which file organization requires fixed-length records?

  • Sequential files
  • Indexed files (VSAM KSDS)
  • Relative files (VSAM RRDS)
  • All file organizations

4. What additional storage overhead do indexed files require?

  • No additional overhead
  • 10-20% for indexes
  • 30-50% for indexes
  • 100% for indexes

5. Which file organization supports both sequential and random access?

  • Sequential files only
  • Indexed files (VSAM KSDS) and relative files (VSAM RRDS)
  • Sequential files and indexed files
  • Only indexed files

6. For a log file that only appends new records and is read sequentially from beginning to end, which file organization is most appropriate?

  • Indexed file organization
  • Relative file organization
  • Sequential file organization
  • Any organization would work equally well

7. What is a key consideration when choosing between indexed and relative file organizations?

  • Both support variable-length records equally well
  • Indexed files use business keys, relative files use record positions
  • Relative files are always faster
  • Indexed files cannot be updated

8. Which file organization would be best for a file that is frequently updated with records being added, modified, and deleted?

  • Sequential files
  • Indexed files (VSAM KSDS)
  • Relative files (VSAM RRDS)
  • All organizations handle updates equally well
