VSAM stands for Virtual Storage Access Method. It is both the name of an access method and the family of dataset types (KSDS, ESDS, RRDS, LDS) that use it. The architecture describes how VSAM organizes data, how it uses virtual storage (memory) and DASD, and what components—clusters, data and index components, control intervals, control areas, and the catalog—make it work. This page explains the Virtual Storage Access Method architecture so you can see how the pieces fit together on z/OS.
The name has three parts. "Virtual Storage" refers to the z/OS virtual storage (memory) architecture. When VSAM was introduced (OS/VS), the system had moved to virtual memory: address spaces, paging, and buffers in central storage. VSAM was designed to use that: it allocates buffers in virtual storage, reads control intervals from DASD into those buffers, and satisfies program READ requests from the buffer (or triggers a read if the needed CI is not present). So the access method is "virtual storage"-aware: it uses memory as a cache and works in fixed-size units (CIs) that match how the system moves data.
"Access Method" means VSAM is the interface and implementation that applications use to get to the data. The program does not read raw blocks from the device; it opens a cluster by name, issues READ (by key or RBA or RRN), WRITE, REWRITE, DELETE, and the access method translates those into the right I/O operations on the right control intervals. So the architecture includes the API (open, read, write, close) and the internal logic (index lookup, buffer management, CI I/O).
At the top level, a VSAM dataset is a cluster: one logical name that the program uses. The cluster has a type (KSDS, ESDS, RRDS, or LDS). For KSDS, the cluster has two components: a data component (the records, in key order) and an index component (the structure that maps key values to the control interval containing that record). For ESDS, RRDS, and LDS, there is only a data component (LDS has no record structure; it is a byte stream). The cluster and its components are defined in the catalog (ICF). When the program opens the cluster, the system finds the catalog entry, locates the data (and index, if any) on DASD, and the access method uses buffers and CI I/O to serve the program's requests.
| Component | Role |
|---|---|
| Cluster | Logical dataset; one name used in JCL and by programs |
| Data component | Stores the actual records (KSDS, ESDS, RRDS) or bytes (LDS) |
| Index component | KSDS only; maps keys to CI locations (sequence set + index set) |
| Control interval (CI) | Unit of I/O; holds records, free space, RDFs, CIDF |
| Control area (CA) | Group of CIs; allocation and split unit |
| Catalog (ICF) | Where cluster and component names and attributes are stored |
The cluster is the single logical entity. In JCL you reference the cluster name (DSN=cluster.name). The cluster has attributes: type (INDEXED, NONINDEXED, NUMBERED, LINEAR), RECORDSIZE, and, for KSDS, KEYS and optionally FREESPACE. Under the hood, the cluster is implemented as one or two components. For KSDS, the data component holds the records in key order in control intervals; the index component holds the sequence set (the lowest level, with one entry per data CI) and the index set (the upper levels). For ESDS and RRDS there is only a data component. For LDS there is a single data component with no record structure. The program normally does not open the data or index component directly; it opens the cluster, and VSAM uses the components.
The control interval (CI) is the unit of transfer between DASD and virtual storage. When VSAM needs a record, it identifies the CI that contains it (from the index for KSDS, or from the RBA or RRN for ESDS and RRDS), and if that CI is not already in a buffer, it reads the whole CI. The CI size is fixed at define time (for example 4 KB or 8 KB, up to a maximum of 32 KB). Inside the CI are the records (or slots for RRDS), free space, record descriptor fields (RDFs) that describe record lengths, and a CI descriptor field (CIDF) that describes the free space. So the CI is a small "page" of data that VSAM manages as a unit. Multiple CIs are grouped into a control area (CA). The CA is the unit of space allocation and the unit that splits (a CA split) when an insert finds no free CI left in it. A CA occupies at least one track and, for most datasets, at most one cylinder. When you specify space in DEFINE CLUSTER (CYLINDERS, TRACKS, RECORDS, KILOBYTES, or MEGABYTES, each with a primary and a secondary amount), VSAM derives the CA size from those values.
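The CI layout described above can be sketched in a few lines of Python. This is a simplified illustration, not VSAM's exact on-disk format: it assumes a 4-byte CIDF at the end of the CI (free-space offset, then free-space length), one 3-byte RDF per record written right to left in front of the CIDF, and a flag byte of zero. Real VSAM can share RDFs across consecutive records of equal length; the function names `build_ci` and `read_cidf` are hypothetical.

```python
import struct

CIDF_LEN = 4   # CI descriptor field: free-space offset + free-space length
RDF_LEN = 3    # record descriptor field: 1 flag byte + 2-byte record length


def build_ci(records, ci_size=4096):
    """Pack records into one simplified control interval image.

    Records fill the CI from the left. RDFs are written right to left,
    just before the CIDF, which occupies the last four bytes.
    """
    data_len = sum(len(r) for r in records)
    control_len = RDF_LEN * len(records) + CIDF_LEN
    free_len = ci_size - data_len - control_len
    if free_len < 0:
        raise ValueError("records do not fit in one CI")

    ci = bytearray(ci_size)
    ci[:data_len] = b"".join(records)

    # One RDF per record (a simplification), placed ahead of the CIDF.
    pos = ci_size - CIDF_LEN
    for rec in records:
        pos -= RDF_LEN
        ci[pos:pos + RDF_LEN] = struct.pack(">BH", 0x00, len(rec))

    # CIDF: where the free space starts and how long it is.
    ci[-CIDF_LEN:] = struct.pack(">HH", data_len, free_len)
    return bytes(ci)


def read_cidf(ci):
    """Return (free_space_offset, free_space_length) from the CIDF."""
    return struct.unpack(">HH", ci[-CIDF_LEN:])
```

With two records of 100 and 50 bytes in a 4096-byte CI, `read_cidf(build_ci([b"A" * 100, b"B" * 50]))` returns `(150, 3936)`: 150 bytes of data, 6 bytes of RDFs, 4 bytes of CIDF, and the rest free.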
This design affects performance. A larger CI means fewer I/Os when reading sequentially (more records per read) but more data transferred per random read. A smaller CI reduces the storage each buffer needs. FREESPACE(ci ca) reserves a percentage of each CI, and a percentage of the CIs in each CA, when the dataset is loaded, so that later inserts (KSDS and variable-length RRDS) do not cause splits too often. So the CI/CA architecture is central to both how VSAM stores data and how you tune it.
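As a rough worked example of the FREESPACE arithmetic, the sketch below estimates how much room is held back at load time. It assumes simple round-down integer arithmetic and ignores control information; VSAM's exact rounding rules may differ, and the function name `freespace_plan` is hypothetical.

```python
def freespace_plan(ci_size, cis_per_ca, ci_pct, ca_pct):
    """Estimate the space FREESPACE(ci_pct ca_pct) reserves at load time.

    Assumes plain round-down arithmetic; treat the result as an estimate.
    """
    free_bytes_per_ci = ci_size * ci_pct // 100
    free_cis_per_ca = cis_per_ca * ca_pct // 100
    return {
        "free_bytes_per_ci": free_bytes_per_ci,
        "free_cis_per_ca": free_cis_per_ca,
        "loaded_cis_per_ca": cis_per_ca - free_cis_per_ca,
    }
```

For a 4096-byte CI, 180 CIs per CA, and FREESPACE(20 10), this gives about 819 free bytes in each loaded CI and 18 CIs left empty in each CA, so 162 CIs per CA receive records during the load.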
For KSDS, the index component is a B-tree style structure. The sequence set is the lowest level: it has one entry per data CI, holding the high key of that CI and a pointer to it. The index set is built on top: its entries point to sequence set records (or to lower index set records). To find a record by key, VSAM starts at the top of the index set, compares the key to the entries, moves down one level at a time, and eventually reaches the sequence set, which identifies the data CI. Then it reads that CI (if it is not already in a buffer) and finds the record. So the index component is what makes key-based random access fast; without it VSAM would have to scan the data. ESDS, RRDS, and LDS have no index component; access is by RBA, RRN, or byte offset.
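The top-down search can be illustrated with a small two-level index. The data below is made up for the example, and the structure is deliberately simplified (one index set record, uncompressed keys); it only shows the "first entry whose high key is not lower than the search key" rule at each level.

```python
from bisect import bisect_left

# Hypothetical index: each entry is (high_key, pointer).
# Sequence set records point to data CI numbers.
SEQUENCE_SET = [
    [("C", 0), ("F", 1), ("K", 2)],   # sequence set record 0
    [("P", 3), ("T", 4), ("Z", 5)],   # sequence set record 1
]
# The single index set record points to sequence set records.
INDEX_SET = [("K", 0), ("Z", 1)]


def find_entry(entries, key):
    """Return the pointer of the first entry whose high key is >= key."""
    high_keys = [high for high, _ in entries]
    i = bisect_left(high_keys, key)
    if i == len(entries):
        return None            # key is higher than every key in this record
    return entries[i][1]


def locate_data_ci(key):
    """Walk index set -> sequence set and return the data CI number."""
    seq_no = find_entry(INDEX_SET, key)
    if seq_no is None:
        return None
    return find_entry(SEQUENCE_SET[seq_no], key)
```

Here `locate_data_ci("D")` returns 1, because "D" falls in the CI whose high key is "F", and `locate_data_ci("L")` returns 3. A key above "Z" returns `None`, which in this sketch stands for "beyond the current end of the dataset".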
VSAM datasets are cataloged in the Integrated Catalog Facility (ICF). The catalog stores the cluster name, type, attributes (RECORDSIZE, KEYS, etc.), and the locations of the data and index components (volume and extent information). When you open a cluster by name, the system looks up the name in the catalog and finds the component locations; the access method then uses that information to open the underlying data (and index) and satisfy I/O. So the catalog is part of the architecture: it is the name-to-storage binding. For normal access to a cataloged VSAM dataset you do not code UNIT or VOL=SER in JCL; the catalog supplies that information.
The access method supports several ways to read and write. The following table summarizes the main access modes that the architecture supports.
| Mode | Description |
|---|---|
| Sequential | Read or write in order (key order for KSDS, entry order for ESDS, slot order for RRDS) |
| Random | Read/write by key (KSDS), by RBA (ESDS), or by RRN (RRDS) |
| Skip sequential | Retrieve selected records in ascending key or RRN order, skipping forward over the ones not needed (KSDS, RRDS) |
| Dynamic | Mix sequential and random in the same open |
Sequential access follows the order of the records (key order for KSDS, entry order for ESDS, slot order for RRDS). Random access uses a key, RBA, or RRN to go directly to a record. Skip-sequential access retrieves a series of records in ascending key or RRN order while skipping those in between; for KSDS, VSAM moves forward through the sequence set instead of searching the index from the top for each request. Dynamic access means you can mix sequential and random requests in the same open. In every mode, the access method uses the index (for KSDS), the RBA, or the RRN to find the correct CI and record, and uses buffers to minimize I/O.
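For ESDS and fixed-length RRDS, going "directly" to a record is mostly arithmetic. The sketch below shows one way to picture it, under stated assumptions: CIs are laid out back to back so an RBA divides cleanly by the CI size, and each fixed-length RRDS slot carries its own 3-byte RDF with a 4-byte CIDF at the end of the CI. The function names are hypothetical.

```python
CIDF_LEN = 4   # assumed size of the CI descriptor field
RDF_LEN = 3    # assumed size of one record descriptor field


def ci_for_rba(rba, ci_size):
    """Map a relative byte address to (CI number, offset inside that CI)."""
    return divmod(rba, ci_size)


def ci_for_rrn(rrn, record_size, ci_size):
    """Map a relative record number (1-based) to (CI number, slot index)."""
    if rrn < 1:
        raise ValueError("RRNs start at 1")
    # Assumption: every slot costs record_size bytes plus one RDF.
    slots_per_ci = (ci_size - CIDF_LEN) // (record_size + RDF_LEN)
    return divmod(rrn - 1, slots_per_ci)
```

With a 4096-byte CI, `ci_for_rba(10000, 4096)` gives `(2, 1808)`. With 100-byte records, 39 slots fit in each CI under these assumptions, so `ci_for_rrn(40, 100, 4096)` gives `(1, 0)`: the first slot of the second CI.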
When your program issues a READ, the access method checks whether the CI that contains the record is already in a buffer (in virtual storage). If yes, it returns the record from the buffer. If no, it schedules a read of that CI from DASD into a buffer and then returns the record. So virtual storage holds the buffers; the access method manages which CIs are in which buffers (and which have been updated and need to be written back). You can influence this with AMP parameters (e.g. BUFND, BUFNI, STRNO) in JCL or in the ACB in the program. More buffers can improve hit rates for random or sequential access. So the "Virtual Storage" in the name is reflected in this buffer-based, CI-sized I/O model.
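The buffer check described above behaves like a small cache keyed by CI number. The following sketch uses a least-recently-used replacement policy purely as an assumption for illustration; VSAM's actual buffer management depends on the buffering technique in use and also tracks updated buffers, which this example omits. The class name `BufferPool` and the `read_ci` callback are hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Toy CI buffer pool with least-recently-used replacement."""

    def __init__(self, capacity, read_ci):
        self.capacity = capacity       # number of buffers (compare BUFND)
        self.read_ci = read_ci         # callback that does the "physical" read
        self.buffers = OrderedDict()   # CI number -> CI contents
        self.physical_reads = 0

    def get(self, ci_no):
        """Return the CI contents, reading from 'DASD' only on a miss."""
        if ci_no in self.buffers:
            self.buffers.move_to_end(ci_no)    # mark as most recently used
            return self.buffers[ci_no]
        self.physical_reads += 1
        data = self.read_ci(ci_no)
        if len(self.buffers) >= self.capacity:
            self.buffers.popitem(last=False)   # drop the least recently used
        self.buffers[ci_no] = data
        return data
```

With two buffers and the request sequence CI 0, 1, 0, 2, 1, the pool performs four physical reads: the second request for CI 0 is a hit, but CI 1 has been evicted by the time it is requested again. Raising the capacity to three turns that last request into a hit, which is the effect more buffers are meant to have.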
VSAM is the set of rules and structures the computer uses to store and find records on disk. The "Virtual Storage" part means it uses the computer's memory (virtual storage) as a workspace: it loads chunks of data (control intervals) into memory and then finds or updates the record you want. The "Access Method" part means it is the standard way programs ask for data: open a file by name, read by key or position, write, close. So the architecture is: a catalog that knows where each file is, a structure (control intervals and areas) that organizes the bytes on disk, and the access method code that uses memory and disk to satisfy your program's requests.
1. What is the unit of I/O in VSAM?
2. Which VSAM type has an index component?
3. What does "Virtual Storage" refer to in VSAM?