Random access in a VSAM Key Sequenced Data Set (KSDS) means retrieving a single record by supplying its key value. You do not read records one after another in order; instead, you tell VSAM which key you want, and VSAM uses the index component to find the control interval that contains that record and returns it. This is the main way applications look up a customer by ID, an order by number, or any entity by its primary key. This page explains how random access works, how it differs from sequential and dynamic access, how to use it in COBOL (ACCESS IS RANDOM, READ by key), and how the index makes it efficient.
In random access you request a specific record by an identifier. For a KSDS, that identifier is the primary key (or an alternate key if you defined one). Your program sets the key field to the value you want, issues a READ, and the access method returns that record, or a file status indicating that no record with that key exists. The important point is that VSAM does not scan the file from the beginning. It uses the index: the index component holds a B-tree style structure (sequence set and index set) that maps key values to the control intervals in the data component. Given a key, VSAM traverses the index to find the pointer to the correct data CI, reads that CI, and then locates the record within the CI. A single random read therefore costs a handful of I/Os (one per index level that is not already buffered, plus one for the data CI), and that cost grows only with the depth of the index, not with the number of records in the file.
Random access is contrasted with sequential access, where you read records in key order (READ NEXT, or a backward read such as READ PREVIOUS where the compiler or environment supports it). In sequential access you do not supply a key for each read; you establish a position (e.g. at the start of the file or after a START key) and then read one record after another. Sequential access is used for batch reports, full file processing, or walking a range of keys. Random access is used for lookups: given one key, get that one record. Many online transactions (for example under CICS) and batch lookup programs use random access to fetch a record by customer number, account number, or order ID.
In COBOL you choose an access mode in the SELECT statement. ACCESS IS RANDOM means you will read (and optionally update or delete) by key; each READ supplies the key. ACCESS IS SEQUENTIAL means you will read in key order; a plain READ returns the next record, and the NEXT keyword is optional in this mode (backward reads such as READ PREVIOUS are available only where the compiler supports them). ACCESS IS DYNAMIC means you can do both in the same program: for example, you might do a START to position at a key, then READ NEXT to process a range, or you might do a random READ by key on the same file. The file structure is the same (KSDS); only the way you use READ (and START) changes.
| Access type | How it works |
|---|---|
| Random (by key) | Supply key value; VSAM uses index to find CI; one READ returns the record |
| Sequential | READ NEXT / READ PREV in key order; no key supplied per read |
| Dynamic | Mix random and sequential in the same program (e.g. START key then READ NEXT) |
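
The difference between the modes is easiest to see in a program that uses both styles on one file. The following is a minimal sketch, assuming the CUSTFILE layout used elsewhere on this page; the program name, the key range in WS-LOW-KEY and WS-HIGH-KEY, and the WS-DONE flag are hypothetical names introduced for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUSTDYN.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTFILE ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS IS DYNAMIC
               RECORD KEY IS CUST-ID
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTFILE.
       01  CUST-REC.
           05  CUST-ID         PIC X(10).
           05  CUST-NAME       PIC X(40).
           05  CUST-BAL        PIC S9(7)V99.
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS      PIC XX.
      * Hypothetical range of keys to browse.
       01  WS-LOW-KEY          PIC X(10) VALUE 'CUST000100'.
       01  WS-HIGH-KEY         PIC X(10) VALUE 'CUST000199'.
       01  WS-DONE             PIC X     VALUE 'N'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT CUSTFILE
           IF WS-CUST-STATUS NOT = '00'
               DISPLAY 'OPEN failed, status ' WS-CUST-STATUS
               STOP RUN
           END-IF
      * Random part: one READ by key, no positioning needed.
           MOVE 'CUST000001' TO CUST-ID
           READ CUSTFILE
               INVALID KEY
                   DISPLAY 'Not found: ' CUST-ID
               NOT INVALID KEY
                   DISPLAY 'Found: ' CUST-NAME
           END-READ
      * Sequential part: position at the first key >= WS-LOW-KEY.
           MOVE WS-LOW-KEY TO CUST-ID
           START CUSTFILE KEY IS NOT LESS THAN CUST-ID
               INVALID KEY
                   MOVE 'Y' TO WS-DONE
           END-START
      * Read forward until the range ends, the file ends, or an
      * error status comes back.
           PERFORM UNTIL WS-DONE = 'Y'
               READ CUSTFILE NEXT RECORD
               IF WS-CUST-STATUS NOT = '00'
                   OR CUST-ID > WS-HIGH-KEY
                   MOVE 'Y' TO WS-DONE
               ELSE
                   DISPLAY CUST-ID ' ' CUST-NAME
               END-IF
           END-PERFORM
           CLOSE CUSTFILE
           STOP RUN.
```

Under ACCESS IS DYNAMIC the NEXT keyword is what distinguishes a sequential read from a random one, so it is required in the loop.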
When you issue a random READ with a key value, the following happens. First, VSAM takes the key you provided and searches the index component. The index has two main levels: the index set (top) and the sequence set (bottom). The index set contains separator keys and pointers to the next level down; the sequence set has one entry per data control interval, with the high key of that CI and a pointer to that CI. VSAM compares your key to the separators and sequence set entries to find the one data CI that could contain your key. It then reads that data CI from the data component (one I/O), searches within the CI for the exact record (records in a CI are in key order), and returns the record. If no record in that CI has the requested key, the key does not exist in the file and VSAM returns a "record not found" condition. The number of physical I/Os is roughly one per index level that is not already in a buffer (often one or two) plus one for the data CI, so random access stays efficient even for large files.
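The sequence-set step can be pictured with a small in-memory model. This is only an illustrative sketch, not how VSAM stores its index (real index records use key compression and binary pointers): the table SEQ-SET, its three entries, and the CI numbers are all hypothetical. The program finds the first entry whose high key is greater than or equal to the search key, which is the only CI that could hold that key.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEQSET.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sequence set: one entry per data CI, holding
      * the highest key in that CI (10 bytes) and a CI number.
       01  SEQ-SET-VALUES.
           05  FILLER          PIC X(12) VALUE 'CUST00010001'.
           05  FILLER          PIC X(12) VALUE 'CUST00020002'.
           05  FILLER          PIC X(12) VALUE 'CUST00030003'.
       01  SEQ-SET REDEFINES SEQ-SET-VALUES.
           05  SEQ-ENTRY OCCURS 3 TIMES INDEXED BY SEQ-IX.
               10  SEQ-HIGH-KEY    PIC X(10).
               10  SEQ-CI-NO       PIC 99.
       01  WS-SEARCH-KEY       PIC X(10) VALUE 'CUST000150'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Serial search from the first entry; entries are in key order.
           SET SEQ-IX TO 1
           SEARCH SEQ-ENTRY
               AT END
                   DISPLAY 'Key is above every CI: not found'
               WHEN SEQ-HIGH-KEY (SEQ-IX) >= WS-SEARCH-KEY
                   DISPLAY 'Key ' WS-SEARCH-KEY
                       ' can only be in CI ' SEQ-CI-NO (SEQ-IX)
           END-SEARCH
           STOP RUN.
```

With the values shown, the key CUST000150 is higher than the first entry's high key (CUST000100) and not higher than the second's (CUST000200), so the search stops at the second entry. Only that one data CI would then need to be read and scanned.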
Of the VSAM organizations, KSDS is the one built around a key index. ESDS and RRDS do not have keys in the same sense: ESDS is entry-sequenced (order of write), and you access by relative byte address (RBA); RRDS is slot-based, and you access by relative record number (RRN). So when we speak of "random access" in the sense of "read by key," we mean KSDS. For ESDS you can still do "direct" access if you know the RBA (e.g. from a previous read or from another index); that is sometimes called random access by RBA, but it is not key-based.
To use random access in COBOL you define the file as INDEXED and specify ACCESS IS RANDOM. The RECORD KEY clause identifies the data-name that holds the key value. That data-name must be part of the record (typically the first part of the record, or at least a field that matches the key length and position defined when the cluster was created). The key length and offset in the record must match the KEYS(length offset) you used in DEFINE CLUSTER. For example, if the key is 10 bytes at offset 0, your RECORD KEY field should be a 10-byte field at the same position in the record layout.
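```cobol
       FILE-CONTROL.
           SELECT CUSTFILE ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS IS RANDOM
               RECORD KEY IS CUST-ID
               FILE STATUS IS WS-CUST-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  CUSTFILE.
       01  CUST-REC.
      * Primary key: 10 bytes at offset 0, matching KEYS(10 0).
           05  CUST-ID         PIC X(10).
           05  CUST-NAME       PIC X(40).
           05  CUST-BAL        PIC S9(7)V99.
       WORKING-STORAGE SECTION.
      * Two-character status field named in the SELECT.
       01  WS-CUST-STATUS      PIC XX.
```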
Here, CUST-ID is the primary key. When you want to read a record by key, you move the key value to CUST-ID, then execute READ CUSTFILE. No NEXT or PREV is used; the READ uses the current value of the record key to find the record. After the READ, FILE STATUS indicates success (e.g. 00) or not found (e.g. 23) or other conditions. The record area is filled only if the read is successful.
The typical pattern is: OPEN the file (INPUT or I-O), move the key to the record key field, then READ the file. The READ statement for random access is just READ file-name (no NEXT). The key must be set before each READ. Example:
```cobol
           MOVE 'CUST000001' TO CUST-ID.
           READ CUSTFILE
               INVALID KEY
                   DISPLAY 'Customer not found: ' CUST-ID
               NOT INVALID KEY
                   DISPLAY 'Name: ' CUST-NAME ' Balance: ' CUST-BAL
           END-READ.
```
If a record with key CUST000001 exists, it is returned in CUST-REC and NOT INVALID KEY runs. If not, INVALID KEY runs and FILE STATUS is set (e.g. 23 for "record not found"). You should always check FILE STATUS or use INVALID KEY so that missing keys are handled. After a successful random READ you can also REWRITE to update the record in place or DELETE to remove it, if the file was opened I-O.
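The read-then-update case mentioned above might look like the following sketch. It assumes the same CUSTFILE layout; the program name, the paragraph UPDATE-BALANCE, and the amount in WS-PAYMENT are hypothetical and stand in for whatever change the application makes.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUSTUPD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTFILE ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS IS RANDOM
               RECORD KEY IS CUST-ID
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTFILE.
       01  CUST-REC.
           05  CUST-ID         PIC X(10).
           05  CUST-NAME       PIC X(40).
           05  CUST-BAL        PIC S9(7)V99.
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS      PIC XX.
      * Hypothetical payment amount to add to the balance.
       01  WS-PAYMENT          PIC S9(7)V99 VALUE 25.00.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * I-O mode is required for REWRITE and DELETE.
           OPEN I-O CUSTFILE
           IF WS-CUST-STATUS NOT = '00'
               DISPLAY 'OPEN failed, status ' WS-CUST-STATUS
               STOP RUN
           END-IF
           MOVE 'CUST000001' TO CUST-ID
           READ CUSTFILE
               INVALID KEY
                   DISPLAY 'Customer not found: ' CUST-ID
               NOT INVALID KEY
                   PERFORM UPDATE-BALANCE
           END-READ
           CLOSE CUSTFILE
           STOP RUN.
       UPDATE-BALANCE.
      * Change non-key fields only: REWRITE replaces the record
      * whose key is currently in CUST-ID.
           ADD WS-PAYMENT TO CUST-BAL
           REWRITE CUST-REC
               INVALID KEY
                   DISPLAY 'REWRITE failed, status ' WS-CUST-STATUS
           END-REWRITE.
```

Note that REWRITE and DELETE differ in their operand: REWRITE names the record (CUST-REC), while DELETE names the file (DELETE CUSTFILE).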
After a READ, the FILE STATUS field (if you specified one in the SELECT) contains a two-character code. Common values for a random READ: 00 means success; 23 means the record was not found (no record with that key); 30 means a permanent error (e.g. an I/O error). Other codes report conditions such as a duplicate key on a WRITE (22), a READ against a file that is not open for INPUT or I-O (47), or implementor-defined errors (9x, which includes VSAM logic errors). Your program should test the status and branch accordingly so that "key not found" is not treated as a hard failure when it is a valid case (e.g. a new customer ID).
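One way to keep "not found" separate from real errors is to drive the logic from the status field with condition names. The sketch below assumes the same CUSTFILE layout; the program name, the 88-level names CUST-OK and CUST-NOT-FOUND, and the return code 12 are hypothetical choices, not a convention required by COBOL or VSAM.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUSTSTAT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTFILE ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS IS RANDOM
               RECORD KEY IS CUST-ID
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTFILE.
       01  CUST-REC.
           05  CUST-ID         PIC X(10).
           05  CUST-NAME       PIC X(40).
           05  CUST-BAL        PIC S9(7)V99.
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS      PIC XX.
           88  CUST-OK                   VALUE '00'.
           88  CUST-NOT-FOUND            VALUE '23'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT CUSTFILE
           IF NOT CUST-OK
               DISPLAY 'OPEN failed, status ' WS-CUST-STATUS
               STOP RUN
           END-IF
           MOVE 'CUST000001' TO CUST-ID
      * No INVALID KEY phrase: the status code drives the logic.
           READ CUSTFILE
           EVALUATE TRUE
               WHEN CUST-OK
                   DISPLAY 'Name: ' CUST-NAME
               WHEN CUST-NOT-FOUND
      * A valid business case, not a failure.
                   DISPLAY 'No such customer: ' CUST-ID
               WHEN OTHER
                   DISPLAY 'READ error, status ' WS-CUST-STATUS
                   MOVE 12 TO RETURN-CODE
           END-EVALUATE
           CLOSE CUSTFILE
           STOP RUN.
```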
If your KSDS has an alternate index with a path defined over it, you can also do random access by alternate key. In COBOL you declare the field with ALTERNATE RECORD KEY (adding WITH DUPLICATES if the alternate key is not unique). To read by it, move the value to the alternate key field and name that field in the READ (READ file-name KEY IS alternate-key-name); without the KEY IS phrase, a random READ uses the primary key. The access method uses the alternate index to find the record, or the first of the duplicates. This is useful when you need to look up by something other than the primary key, for example by social security number or by department code, while the primary key might be a unique ID.
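A minimal sketch of a read by alternate key follows. It assumes a record layout that differs from the earlier example: the 4-byte CUST-DEPT field, its position, the value 'D042', and the program name are hypothetical, and the alternate index and path are assumed to exist already with a matching key length and offset. On z/OS the path typically also needs its own DD statement in the JCL, which is not shown here.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUSTALT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTFILE ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS IS RANDOM
               RECORD KEY IS CUST-ID
               ALTERNATE RECORD KEY IS CUST-DEPT WITH DUPLICATES
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTFILE.
       01  CUST-REC.
           05  CUST-ID         PIC X(10).
      * Hypothetical alternate key field (department code).
           05  CUST-DEPT       PIC X(4).
           05  CUST-NAME       PIC X(40).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS      PIC XX.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT CUSTFILE
           IF WS-CUST-STATUS NOT = '00'
               DISPLAY 'OPEN failed, status ' WS-CUST-STATUS
               STOP RUN
           END-IF
           MOVE 'D042' TO CUST-DEPT
      * KEY IS switches the key of reference to the alternate key.
           READ CUSTFILE KEY IS CUST-DEPT
               INVALID KEY
                   DISPLAY 'No customer in dept ' CUST-DEPT
               NOT INVALID KEY
                   DISPLAY 'First match: ' CUST-ID ' ' CUST-NAME
      * Status 02: another record has the same alternate key.
                   IF WS-CUST-STATUS = '02'
                       DISPLAY 'More records share this key'
                   END-IF
           END-READ
           CLOSE CUSTFILE
           STOP RUN.
```

With ACCESS IS RANDOM only the first record for a duplicated alternate key is returned. Retrieving the rest would need ACCESS IS DYNAMIC and READ NEXT after the keyed read.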
Use random access when your program needs to fetch one or a few records by key without processing the whole file. Typical uses: online lookups (e.g. CICS: user enters customer ID, program reads that customer); batch jobs that read a transaction file and look up a master record by key for each transaction; and any application where the unit of work is "get record for this key." Use sequential access when you need to process all records in key order (reports, batch updates that walk the file), and use dynamic access when you mix both—for example, position at a key with START then READ NEXT for a range, or occasionally do a random READ in the middle of sequential processing.
Random access is efficient because the index is small relative to the data and each lookup does a bounded number of I/Os (index traversal plus one data CI read). For very high random read rates, buffer tuning (BUFND, BUFNI) and CI size can matter: a larger index buffer pool reduces index I/Os; a smaller data CI can reduce the amount of data read per random hit. In practice, random access by key is one of the strengths of KSDS and is used heavily in production.
To perform a random read on a KSDS in COBOL:
1. Define the file with ORGANIZATION IS INDEXED, ACCESS IS RANDOM, and RECORD KEY pointing to the key field in the record.
2. Open the file with OPEN INPUT or OPEN I-O.
3. Move the key value you want to look up into the RECORD KEY field (e.g. MOVE search-key TO CUST-ID).
4. Execute READ file-name with INVALID KEY and NOT INVALID KEY to handle found vs not found.
5. If NOT INVALID KEY, the record is in the record area; use it or REWRITE/DELETE if opened I-O.
6. Repeat from step 3 for each new key, or close the file when done.
No START is required for a pure random read; the key in the record area is used each time.
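Put together, the steps could look like the sketch below, which looks up several keys in one run. It assumes the same CUSTFILE layout; the program name and the three key values in WS-KEY-VALUES are hypothetical stand-ins for keys that a real job would take from a transaction file or a screen.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUSTLOOK.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Step 1: indexed file, random access, key field named.
           SELECT CUSTFILE ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS IS RANDOM
               RECORD KEY IS CUST-ID
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTFILE.
       01  CUST-REC.
           05  CUST-ID         PIC X(10).
           05  CUST-NAME       PIC X(40).
           05  CUST-BAL        PIC S9(7)V99.
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS      PIC XX.
      * Hypothetical keys to look up.
       01  WS-KEY-VALUES.
           05  FILLER          PIC X(10) VALUE 'CUST000001'.
           05  FILLER          PIC X(10) VALUE 'CUST000042'.
           05  FILLER          PIC X(10) VALUE 'CUST999999'.
       01  WS-KEY-TABLE REDEFINES WS-KEY-VALUES.
           05  WS-KEY          PIC X(10) OCCURS 3 TIMES.
       01  WS-I                PIC 9 VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Step 2: open the file.
           OPEN INPUT CUSTFILE
           IF WS-CUST-STATUS NOT = '00'
               DISPLAY 'OPEN failed, status ' WS-CUST-STATUS
               STOP RUN
           END-IF
      * Steps 3 to 6: set the key, READ, handle the result, repeat.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 3
               MOVE WS-KEY (WS-I) TO CUST-ID
               READ CUSTFILE
                   INVALID KEY
                       DISPLAY 'Not found: ' CUST-ID
                   NOT INVALID KEY
                       DISPLAY CUST-ID ' ' CUST-NAME
               END-READ
           END-PERFORM
           CLOSE CUSTFILE
           STOP RUN.
```

The keys do not have to be in any order: each READ is an independent index lookup.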
Imagine a big shelf of boxes, each with a number. Random access is when you say, "Give me box number 42." You don't look at box 1, then 2, then 3—you have a list (the index) that says "box 42 is on this shelf, in this spot," and you go straight there. That list is the VSAM index; the box number is the key. So you get the one box you asked for without walking past all the others.
1. Which VSAM type allows random access by key?
2. In COBOL, what must you set before a random READ?
3. What does VSAM use to find a record on a random read?