MainframeMaster

VSAM Tutorial

VSAM (Virtual Storage Access Method) is IBM's high-performance access method for organizing and accessing data on mainframe direct-access storage (DASD). It stores data in a proprietary format and supports four dataset types: Key Sequenced (KSDS), Entry Sequenced (ESDS), Relative Record (RRDS), and Linear (LDS). Each type suits different access patterns—by key, by position, or as a byte stream. This tutorial explains what VSAM is, how the four types differ, how data is organized in control intervals and control areas, and how to define and access VSAM datasets with IDCAMS and JCL.

Explain Like I'm Five: What Is VSAM?

Imagine a filing system where you can find a folder either by its number (like "folder 5") or by a label on it (like "Smith"). VSAM is the mainframe's way of organizing data so the computer can find records quickly. Some VSAM files are like a dictionary: you look up a word (the key) and get the page. Others are like a diary: you read from the beginning or jump to a byte position. The system keeps everything in fixed-size "chunks" (control intervals) so that reading and writing are efficient. You don't create these files with normal JCL; you use a special utility (IDCAMS) to define them, and then programs open them by name.

What Is VSAM?

VSAM is both a type of dataset and the access method that manages it. As an access method, it provides more than simple sequential or direct I/O: it manages indexing (for KSDS), space (control intervals and control areas), and catalog integration. VSAM datasets are stored in a format that only VSAM understands; you cannot edit or browse them with standard tools like ISPF edit. They are used mainly for application data—customer files, transaction logs, lookup tables—not for source code, JCL, or load modules. VSAM supports fixed-length and variable-length records (except RRDS, which uses fixed-length slots), and it can handle very large datasets with efficient access by key or position.

The Four VSAM Dataset Types

The way records are stored and accessed depends on the dataset type. Choosing the right type affects whether you can insert and delete records, whether you need a key, and how you address records.

VSAM dataset types

| Type | Full name | Access | Typical use |
| --- | --- | --- | --- |
| KSDS | Key Sequenced Data Set | By key value; sequential in key order | Customer/inventory files; random and sequential |
| ESDS | Entry Sequenced Data Set | By RBA or sequential; append only | Logs, audit trails, sequential processing |
| RRDS | Relative Record Data Set | By relative record number (slot) | Position-based; fixed-length slots |
| LDS | Linear Data Set | Byte-addressable; no record structure | Db2, system; byte stream |

Key Sequenced Data Set (KSDS)

KSDS is the most common VSAM type. Each record has a key (one or more contiguous bytes at a fixed offset). Records are stored in ascending key order. The key must be unique. You can read a record by key (random access), read in key sequence (sequential), insert new records, update in place, and delete records. VSAM maintains an index that maps key values to the physical location of the record. KSDS has two components: a data component (the records) and an index component (the key index). When you define a KSDS you specify KEYS(length offset) and RECORDSIZE(average maximum). FREESPACE(ci ca) reserves space in each control interval and control area for insertions so that the file does not need to be reorganized too often.

Entry Sequenced Data Set (ESDS)

ESDS stores records in the order they were written. There is no key. Records are addressed by Relative Byte Address (RBA), the byte offset of the record from the start of the dataset. You can read sequentially (in physical order) or by RBA. New records are added only at the end; you cannot delete or insert in the middle. Records can be fixed or variable length. ESDS is often used for log files, audit trails, and data that is written once and read sequentially or by known RBA. IMS, for example, can keep database data in ESDS.

Relative Record Data Set (RRDS)

RRDS is organized as a set of fixed-length slots. Each slot has a relative record number (RRN): 1, 2, 3, and so on. You access a record by its RRN. Slots can be empty (no record) or full. You can add, update, and delete records in place. In a fixed-length RRDS, all records must be the same length. RRDS is useful when the application derives the record number (e.g. day of year, slot index). It supports both sequential and direct access. A variation is VRRDS (variable-length RRDS), which allows variable-length records addressed by relative record number.

Linear Data Set (LDS)

LDS is a byte-addressable stream. VSAM does not impose any record structure; the dataset is just a contiguous range of bytes. The application (or a product like Db2) interprets the content. LDS is used for Db2 tablespaces, data-in-virtual, and some z/OS system functions. Application programs use it less often than KSDS or ESDS. Access is typically in "pages" (e.g. 4K) and the program must manage layout and boundaries.

Control Intervals and Control Areas

VSAM organizes data in control intervals (CIs) and control areas (CAs). A control interval is the unit of transfer between memory and disk: when VSAM reads or writes, it does so in whole CIs. CI sizes range from 512 bytes to 32 KB; 4 KB is a common size for the data component, and VSAM picks a size if you do not specify one. A CI holds data records, free space, record descriptor fields (RDFs) that describe the records, and a CI descriptor field (CIDF). Multiple CIs are grouped into a control area. A CA is typically between one track and one cylinder in size. Space is requested in DEFINE CLUSTER in cylinders, tracks, or records, for example CYLINDERS(10 5). VSAM does not use RECFM or BLKSIZE like non-VSAM datasets; blocking is internal to the CI.
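The CI size can be requested when the dataset is defined. The following is a minimal sketch, not a tuned definition: it assumes a KSDS with 10-byte keys, asks for a 4 KB data CI with CONTROLINTERVALSIZE, and names the data and index components explicitly. The dataset name, component names, and volume are illustrative assumptions.

```jcl
//DEFCISZ  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Cluster-level attributes (names and volume are examples) */
  DEFINE CLUSTER ( -
    NAME(USER.CUSTOMER.VSAM.KSDS) -
    INDEXED -
    VOLUMES(SYSVOL) -
    RECORDSIZE(200 300) -
    KEYS(10 0) -
    CYLINDERS(10 5) -
  ) -
  DATA ( -
    NAME(USER.CUSTOMER.VSAM.KSDS.DATA) -
    CONTROLINTERVALSIZE(4096) -
  ) -
  INDEX ( -
    NAME(USER.CUSTOMER.VSAM.KSDS.INDEX) -
  )
/*
```

Leaving CONTROLINTERVALSIZE off the INDEX component lets VSAM choose the index CI size itself, which is usually a reasonable starting point.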

Defining VSAM Datasets with IDCAMS

VSAM datasets are created with the IDCAMS (Access Method Services) utility. The main command is DEFINE CLUSTER. You run IDCAMS as a program (PGM=IDCAMS), pass SYSIN with the commands, and SYSPRINT receives the listing. The cluster is the logical dataset; for KSDS it has a data component and an index component. After a successful DEFINE, the dataset is cataloged (you do not use CATLG in JCL for the dataset itself).

```jcl
//DEFKSDS  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE CLUSTER ( -
    NAME(USER.CUSTOMER.VSAM.KSDS) -
    INDEXED -
    VOLUMES(SYSVOL) -
    RECORDSIZE(200 300) -
    KEYS(10 0) -
    FREESPACE(20 10) -
    SHAREOPTIONS(2 3) -
    CYLINDERS(10 5) -
  )
/*
```

INDEXED means KSDS. RECORDSIZE(200 300) gives the average and maximum record length in bytes. KEYS(10 0) means the key is 10 bytes long starting at offset 0. FREESPACE(20 10) reserves 20% of each CI and 10% of the CIs in each CA for insertions. SHAREOPTIONS(2 3) controls cross-region and cross-system sharing. The space parameter allocates 10 primary and 5 secondary cylinders; in IDCAMS syntax this is written CYLINDERS(10 5). For an ESDS you use NONINDEXED and omit KEYS and FREESPACE. For RRDS you use NUMBERED and a fixed RECORDSIZE (same average and maximum).
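The ESDS and RRDS variants described above could look like the following sketch. The dataset names, volume, record sizes, and space amounts are illustrative assumptions, not values from a real system.

```jcl
//DEFOTHER EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* ESDS: NONINDEXED, no KEYS or FREESPACE */
  DEFINE CLUSTER ( -
    NAME(USER.AUDIT.VSAM.ESDS) -
    NONINDEXED -
    VOLUMES(SYSVOL) -
    RECORDSIZE(200 300) -
    CYLINDERS(10 5) -
  )
  /* RRDS: NUMBERED, average and maximum length equal */
  DEFINE CLUSTER ( -
    NAME(USER.SLOTS.VSAM.RRDS) -
    NUMBERED -
    VOLUMES(SYSVOL) -
    RECORDSIZE(200 200) -
    CYLINDERS(10 5) -
  )
/*
```

The comment lines are indented because a line with /* in columns 1 and 2 would end the in-stream SYSIN data.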

Accessing VSAM in JCL

To use a VSAM dataset in a program, you reference it in JCL with a DD statement. You specify DSN= the dataset name and DISP=SHR (shared) or DISP=OLD (exclusive). You do not specify UNIT or VOLUME—the catalog supplies the volume. VSAM datasets are always cataloged; DISP=CATLG/UNCATLG are not used for allocation. To create or delete a VSAM dataset you use IDCAMS, not JCL allocation with DISP=(NEW,CATLG).

```jcl
//STEP1    EXEC PGM=MYPROG
//CUSTFILE DD DSN=USER.CUSTOMER.VSAM.KSDS,
//            DISP=SHR
```

The optional AMP parameter on the DD statement can specify BUFND (number of data buffers), BUFNI (number of index buffers), STRNO (number of concurrent request strings), and other options to tune performance or sharing.
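As a sketch of how AMP might be coded, the DD statement below requests 8 data buffers, 4 index buffers, and 2 strings. The values are illustrative assumptions, not tuning recommendations; suitable numbers depend on the access pattern and CI size.

```jcl
//STEP1    EXEC PGM=MYPROG
//CUSTFILE DD DSN=USER.CUSTOMER.VSAM.KSDS,
//            DISP=SHR,
//            AMP=('BUFND=8,BUFNI=4,STRNO=2')
```

Extra data buffers tend to help sequential processing, while extra index buffers tend to help random access by key.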

Step-by-Step: Defining a KSDS

  1. Decide the dataset name, key length and offset, record size (average and max), and space (e.g. cylinders or records).
  2. Run IDCAMS with SYSPRINT and SYSIN. In SYSIN, use DEFINE CLUSTER with NAME(...), INDEXED, VOLUMES(...), RECORDSIZE(...), KEYS(...), FREESPACE(...), SHAREOPTIONS(...), and a space parameter such as CYLINDERS(...), TRACKS(...), or RECORDS(...).
  3. Check the return code and SYSPRINT. If the DEFINE succeeds, the cluster is cataloged.
  4. In application JCL, reference the dataset with DD DSN=...,DISP=SHR (or OLD). The program opens it by ddname and uses the appropriate COBOL or other language file verbs (READ, WRITE, REWRITE, DELETE for KSDS).
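Steps 3 and 4 can be sketched as a verification step followed by a conditional application step. LISTCAT lists the catalog entry for the cluster, and the COND parameter bypasses the application step unless the check ended with return code 0. The step names and program name are illustrative assumptions.

```jcl
//CHKKSDS  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES(USER.CUSTOMER.VSAM.KSDS) ALL
/*
//* Bypass STEP1 if the CHKKSDS return code is not 0
//STEP1    EXEC PGM=MYPROG,COND=(0,NE,CHKKSDS)
//CUSTFILE DD DSN=USER.CUSTOMER.VSAM.KSDS,
//            DISP=SHR
```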

Step-by-Step: Reading a VSAM KSDS by Key

  1. In the program, define an FD and record layout. The key field must match the KEYS(length offset) of the cluster.
  2. Open the file (INPUT, I-O, or EXTEND as appropriate). For random read by key, use OPEN INPUT or I-O.
  3. Move the key value to the key field in the record area (or the key field in the FD, depending on language).
  4. Issue READ file KEY IS key-field (or equivalent). VSAM looks up the key in the index and returns the record.
  5. Check the file status. Handle the record-not-found condition (no record with that key; in COBOL, file status 23 or the INVALID KEY clause) and any other errors.

Access Methods: Sequential, Direct, Skip Sequential

VSAM supports sequential access (records in key order for KSDS, or physical order for ESDS), direct access by key (KSDS), RBA (ESDS), or RRN (RRDS), and skip-sequential access (position at selected keys in ascending order and read forward from each one, skipping the records in between). The program declares the access mode for the file and then uses the corresponding read/write verbs. Buffering (BUFND, BUFNI) and STRNO determine how many buffers and concurrent strings are used and can improve performance for sequential or random access.
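Keyed positioning followed by sequential reading can be tried without an application program by using the IDCAMS PRINT command. The sketch below starts at one key and lists records in key order up to another. The key values are hypothetical 10-byte keys chosen to match KEYS(10 0), and the dataset must already contain records.

```jcl
//PRTKEYS  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Position at FROMKEY, then list in key order up to TOKEY */
  PRINT INDATASET(USER.CUSTOMER.VSAM.KSDS) -
    FROMKEY(CUST000100) -
    TOKEY(CUST000199) -
    CHARACTER
/*
```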

Common IDCAMS Operations

Best Practices

Test Your Knowledge

1. Which VSAM type is accessed by a key field in the record?

  • ESDS
  • RRDS
  • LDS
  • KSDS

2. In VSAM, the unit of I/O between memory and disk is the:

  • Block
  • Record
  • Control interval
  • Track

3. How do you define a new VSAM dataset?

  • JCL DD with DISP=(NEW,CATLG)
  • IDCAMS DEFINE CLUSTER
  • IEBGENER
  • ICKDSF