VSAM Data Component

Every VSAM cluster has at least one physical component that holds the actual data: the data component. For Key Sequenced Data Sets (KSDS) there is also an index component, but the data component is where the records—or for Linear Data Sets (LDS), the raw byte stream—live. Understanding the data component helps you see how VSAM organizes space (control intervals and control areas), how naming works (cluster name vs data component name), and why you never open the data component directly in your programs. This page covers the purpose of the VSAM data component, what it stores for each dataset type, naming conventions, and how it fits into the cluster.

What Is the Data Component?

The data component is the part of a VSAM cluster that contains the actual stored data. It has its own name in the catalog (often the cluster name with a suffix such as .DATA), its own space allocation on DASD, and its own extents. When your program reads or writes a record, the access method uses the data component to perform the I/O: it locates the correct control interval (CI), brings it into a buffer if needed, and returns or updates the record. So the data component is the "data" half of the cluster; the other half, for KSDS only, is the index component, which holds the key-to-CI mapping and is used for key-based access.

You never open the data component by name. You open the cluster. The catalog entry for the cluster points to the data component (and for KSDS, the index component). When you allocate the cluster in JCL with DSN=cluster.name, the system looks up the cluster in the catalog and finds the data and index component names and their locations. The access method then uses those components to satisfy your READ, WRITE, REWRITE, and DELETE requests. So from a programmer's point of view, the data component is invisible; from an administrative or LISTCAT point of view, you see it as a separate catalog entry under the cluster.
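As a minimal sketch of that point, the step below allocates a VSAM file by its cluster name only. The program name MYPROG, the step name, and the DD name CUSTFILE are hypothetical placeholders; the cluster name is the one used as an example elsewhere on this page.

```jcl
//* Hypothetical application step: MYPROG and CUSTFILE are placeholders.
//* Only the cluster name appears on the DD statement; the catalog
//* resolves it to the data (and index) component.
//STEP010  EXEC PGM=MYPROG
//CUSTFILE DD DSN=USERID.CUSTOMER.VSAM,DISP=SHR
```

Nothing in the step mentions USERID.CUSTOMER.VSAM.DATA; that name matters only in catalog listings and administrative work.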

What the Data Component Stores by Dataset Type

The contents and layout of the data component depend on the type of VSAM cluster. In all cases the data component is organized in control intervals (CIs) and control areas (CAs). Within those structures, how records are stored varies.

Data component content by VSAM type

Type   What the data component stores
KSDS   Records in key order in CIs; key is part of each record
ESDS   Records in entry order (append order); accessed by RBA
RRDS   Fixed-length records in numbered slots (RRN); empty slots allowed
LDS    Byte stream only; no record structure, no keys

For KSDS, records are stored in key order. Each record includes its key; the index component holds pointers from key values to the CI that contains that record. For ESDS, records are stored in the order they were written (entry order); there is no key and no index, and access is by relative byte address (RBA). For RRDS, the data component is divided into fixed-length slots; each slot has a relative record number (RRN), and a slot can be empty (never written, or deleted) or contain a record. For LDS, the data component is a contiguous byte stream with no record boundaries; it is used by products such as DB2 or zFS that manage their own structure on top of the bytes.

Data Component and Control Intervals

The data component is divided into control intervals. A control interval (CI) is a fixed-size block (e.g. 4KB or 8KB) that VSAM uses as the unit of I/O. When VSAM needs a record, it determines which CI contains it (from the index for KSDS, or from RBA/RRN for ESDS/RRDS), reads that entire CI into a buffer if it is not already there, and then extracts or updates the record. So the data component is not just a flat file of records; it is a sequence of CIs, each containing one or more records (or parts of spanned records), free space, and control information (RDFs, CIDF). The size of the CI (CISZ) is set at define time and affects both how many records fit in one I/O and how much free space is available for inserts.
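A rough worked example shows how CISZ limits the number of records per I/O. It assumes a 4096-byte CI, hypothetical 100-byte fixed-length records, a 4-byte CIDF, 3-byte RDFs, two RDFs describing the whole run of equal-length records, and no free space reserved. Treat it as an estimate under those assumptions, not a sizing rule.

```latex
\[
\text{usable bytes} = \text{CISZ} - \text{CIDF} - 2 \times \text{RDF}
                    = 4096 - 4 - 2 \times 3 = 4086
\]
\[
\text{records per CI} = \left\lfloor \frac{4086}{100} \right\rfloor = 40
\]
```

The remaining 86 bytes are too small for another record. Reserving CI free space with FREESPACE lowers the count at load time, and variable-length records generally need more RDFs, which also lowers it.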

Naming Conventions for the Data Component

When you define a cluster with IDCAMS DEFINE CLUSTER, you can specify the data component name explicitly with the DATA clause, e.g. DATA(NAME(USERID.MYFILE.KSDS.DATA)). If you omit the NAME in the DATA clause, IDCAMS generates a name for the data component. The usual convention is to take the cluster name and append a suffix such as .DATA. So for a cluster named USERID.CUSTOMER.VSAM, the data component might be USERID.CUSTOMER.VSAM.DATA. The exact generated name can depend on the IDCAMS implementation and options; the important point is that the data component has a distinct name that appears in the catalog and in LISTCAT output, but that name is not used in JCL or in application code. Only the cluster name is used there.
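The following sketch defines a cluster without DATA or INDEX clauses, so the component names are left to IDCAMS. The attribute values and the volume serial VOL001 are illustrative assumptions, and the generated names you actually get should be confirmed in the catalog rather than assumed.

```jcl
//DEFGEN   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* No DATA(NAME(...)) or INDEX(NAME(...)): names are generated */
  DEFINE CLUSTER ( -
    NAME(USERID.CUSTOMER.VSAM) -
    INDEXED -
    RECORDSIZE(100 200) -
    KEYS(12 0) -
    CYLINDERS(5 2) -
    VOLUMES(VOL001))
/*
```

With a cluster name like this one, the data component would typically appear as USERID.CUSTOMER.VSAM.DATA, as described above.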

Data set naming rules apply to the data component name as to any other dataset: maximum 44 characters for the full name, qualifiers of 1–8 characters, and so on. Because the data component name is often cluster-name plus a suffix, the cluster name must leave room for that suffix (e.g. .DATA is 5 characters) if you let IDCAMS generate it. Many shops use explicit DATA(NAME(...)) and INDEX(NAME(...)) in DEFINE CLUSTER so that the component names are predictable and documented.

Defining the Data Component

You do not define the data component by itself. You define the cluster, and the cluster definition includes the data component (and for KSDS, the index component). In the DEFINE CLUSTER command, cluster-level parameters (e.g. RECORDSIZE, FREESPACE, VOLUMES, CYLINDERS) apply to the cluster as a whole and are used to build the data component (and index component) attributes. You can also specify parameters under the DATA clause to override or supply data-component-specific values, such as the data component name or CISZ. The following example shows a KSDS definition with an explicit data component name.

//DEFVSAM  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE CLUSTER ( -
    NAME(USERID.APPL.DATA.KSDS) -
    INDEXED -
    RECORDSIZE(100 200) -
    KEYS(12 0) -
    FREESPACE(10 5) -
    CYLINDERS(5 2) -
    VOLUMES(VOL001)) -
  DATA (NAME(USERID.APPL.DATA.KSDS.DATA)) -
  INDEX (NAME(USERID.APPL.DATA.KSDS.INDEX))
/*

The cluster is USERID.APPL.DATA.KSDS, the data component is USERID.APPL.DATA.KSDS.DATA, and the index component is USERID.APPL.DATA.KSDS.INDEX. RECORDSIZE and FREESPACE describe the data component. Because CYLINDERS is coded at the cluster level, the space the index needs is taken out of that amount and the remainder goes to the data component. In JCL you use DSN=USERID.APPL.DATA.KSDS only.
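To illustrate a data-component-specific override, the variant below adds CONTROLINTERVALSIZE (abbreviated CISZ) under the DATA clause. The value 4096 is an assumption chosen for illustration, not a recommendation; the other values repeat the example above.

```jcl
//DEFVSAM2 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Same KSDS as above, with a data-level CI size (assumed value) */
  DEFINE CLUSTER ( -
    NAME(USERID.APPL.DATA.KSDS) -
    INDEXED -
    RECORDSIZE(100 200) -
    KEYS(12 0) -
    FREESPACE(10 5) -
    CYLINDERS(5 2) -
    VOLUMES(VOL001)) -
  DATA ( -
    NAME(USERID.APPL.DATA.KSDS.DATA) -
    CONTROLINTERVALSIZE(4096)) -
  INDEX ( -
    NAME(USERID.APPL.DATA.KSDS.INDEX))
/*
```

Coding the CI size under DATA leaves the index CI size for VSAM to choose, which is usually what you want unless you have a specific reason to set it.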

Data Component Only: ESDS, RRDS, LDS

For ESDS, RRDS, and LDS there is no index component. The cluster has a single component: the data component. So the cluster entry in the catalog points to one component. For ESDS the data component holds records in entry order; for RRDS it holds fixed-length records in slots; for LDS it holds a byte stream. Naming is the same idea: the cluster has one name (e.g. USERID.LOG.ESDS), and the data component might be USERID.LOG.ESDS.DATA. You still reference only the cluster name when allocating or opening the file.
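The sketch below defines one cluster of each data-only type. The names USERID.SLOT.RRDS and USERID.PAGE.LDS, the record sizes, the space values, and the volume serial are hypothetical. Note that none of the definitions has an INDEX clause, and the LDS has no RECORDSIZE because it has no record structure.

```jcl
//DEFDATA  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* ESDS: entry-sequenced, data component only */
  DEFINE CLUSTER ( -
    NAME(USERID.LOG.ESDS) -
    NONINDEXED -
    RECORDSIZE(100 200) -
    CYLINDERS(5 2) -
    VOLUMES(VOL001)) -
  DATA (NAME(USERID.LOG.ESDS.DATA))

  /* RRDS: fixed-length slots, average = maximum record size */
  DEFINE CLUSTER ( -
    NAME(USERID.SLOT.RRDS) -
    NUMBERED -
    RECORDSIZE(100 100) -
    CYLINDERS(5 2) -
    VOLUMES(VOL001)) -
  DATA (NAME(USERID.SLOT.RRDS.DATA))

  /* LDS: byte stream, no RECORDSIZE and no keys */
  DEFINE CLUSTER ( -
    NAME(USERID.PAGE.LDS) -
    LINEAR -
    CYLINDERS(5 2) -
    VOLUMES(VOL001)) -
  DATA (NAME(USERID.PAGE.LDS.DATA))
/*
```

The organization keyword (NONINDEXED, NUMBERED, or LINEAR) is what distinguishes the three definitions; INDEXED is the keyword that would add an index component.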

Multi-Extent and Multi-Volume Data Components

The data component can span multiple extents and multiple volumes. When you specify VOLUMES or let SMS manage placement, the data component can grow through secondary allocation and can use more than one volume. The cluster, the data component, and the index component each have their own catalog entry; the volume and extent information is recorded with the data and index components, because those are the objects that occupy space. So when you run LISTCAT, you see the cluster and under it the data component (and for KSDS the index component) with their respective volume and extent data. The data component is what actually consumes the DASD space for the records; the cluster is the logical umbrella.
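A minimal LISTCAT job for inspecting those entries might look like the following. It uses the KSDS cluster name from the earlier example; the step name is arbitrary.

```jcl
//LISTVSAM EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* List the cluster entry and its component entries in full */
  LISTCAT ENTRIES(USERID.APPL.DATA.KSDS) ALL
/*
```

You name the cluster on ENTRIES; the output then lists the cluster followed by its data and index entries, and the volume and extent details appear under the components rather than under the cluster.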

Key Takeaways

  • The data component holds the actual records (or for LDS, the byte stream); every VSAM type has a data component.
  • KSDS also has an index component; ESDS, RRDS, and LDS have only the data component.
  • You never use the data component name in JCL or in programs; you use the cluster name.
  • The data component is organized in control intervals (CIs) and control areas (CAs); CISZ and FREESPACE are defined at cluster define time.
  • Data component naming often uses a suffix like .DATA on the cluster name; you can set it explicitly in DEFINE CLUSTER with DATA(NAME(...)).

Explain Like I'm Five

The cluster is like the front door of a building. The data component is the room where all the boxes (records) are stored. When you ask for a box, you go through the front door (open the cluster); the system knows which room has the boxes and goes there to get or put a box. You never say "open the data component"—you always say "open the cluster," and the data component is the place where the real stuff lives.

Test Your Knowledge

1. What does the VSAM data component contain?

  • Only the index
  • The actual records (or byte stream for LDS)
  • Only catalog information
  • JCL statements

2. In JCL, which name do you use for a VSAM dataset?

  • The data component name
  • The index component name
  • The cluster name
  • The catalog name

3. Which VSAM type has only a data component (no index component)?

  • KSDS only
  • ESDS, RRDS, and LDS
  • All types have an index
  • Only LDS

Author: MainframeMaster
Reviewed by: MainframeMaster team
Verified: IBM z/OS 2.5 documentation
Sources: IBM DFSMS Access Method Services, z/OS VSAM documentation
Applies to: z/OS 2.5