The control interval (CI) is the fixed-size unit that VSAM uses to move data between disk and memory. When VSAM needs a record, it does not read just that record—it reads the entire CI that contains the record. When it updates a record, it may write back the whole CI. So the CI is the unit of I/O. Its size (CISZ) is set when you define the cluster and affects how many records fit in one read, how much free space is available for inserts, and how the data component is organized. This page explains what a control interval is, what it contains (records, free space, RDFs, CIDF), how CI size is chosen, and how it affects performance.
A control interval is a contiguous block of bytes in the data component (or in the index component for index CIs). It has a fixed size—for example 4096 (4KB) or 8192 (8KB) bytes—that you set at define time with the CISZ parameter (or let the system default). The data component is made up of a sequence of CIs. When your program reads a record, the access method figures out which CI contains that record (from the key, RBA, or RRN), reads that entire CI into a buffer if it is not already there, and then extracts the record from the CI. So the CI is the "page" or "block" that VSAM uses for all data transfer. There is no such thing as reading half a CI; I/O is always in whole CIs.
A data CI typically contains: (1) the logical records (or, for a fixed-length RRDS, the record slots), packed from the beginning of the CI; (2) free space, which is unused bytes available for future inserts (reserved by FREESPACE in KSDS and variable-length RRDS); (3) record descriptor fields (RDFs), which are 3-byte fields that describe the records (e.g. their length); and (4) a control interval descriptor field (CIDF), the last 4 bytes of the CI, which holds the offset and length of the free space. So the layout is: [records] [free space] [RDFs] [CIDF]. VSAM uses the RDFs and CIDF to know where each record starts and ends and where the free space begins. You do not define this layout yourself; the access method manages it.
| Part | Description |
|---|---|
| Records | Logical records (or RRDS slot entries) packed from the start of the CI |
| Free space | Reserved space for inserts (KSDS, variable-length RRDS); percentage set by FREESPACE |
| RDFs | Record descriptor fields: length and optional control info for each record |
| CIDF | Control interval descriptor field at end of CI: points to free space, etc. |
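The layout in the table can be sketched in a few lines of Python. This is a simplified model, not VSAM's actual implementation: it assumes one 3-byte RDF per record (real VSAM can describe runs of equal-length records with a pair of RDFs), ignores spanned records, and the function names `build_ci` and `parse_ci` are hypothetical.

```python
import struct

CIDF_LEN = 4  # 2-byte free-space offset + 2-byte free-space length
RDF_LEN = 3   # 1 control byte + 2-byte record length (assumption: one RDF per record)


def build_ci(records, cisz):
    """Pack records into a CI image laid out as [records][free space][RDFs][CIDF]."""
    data = b"".join(records)
    control = RDF_LEN * len(records) + CIDF_LEN
    free_len = cisz - len(data) - control
    if free_len < 0:
        raise ValueError("records do not fit in one CI")
    # RDFs are stored right to left: the first record's RDF sits next to the CIDF.
    rdfs = b"".join(struct.pack(">BH", 0x00, len(r)) for r in reversed(records))
    cidf = struct.pack(">HH", len(data), free_len)
    return data + bytes(free_len) + rdfs + cidf


def parse_ci(ci):
    """Recover the records and free-space information from a CI image."""
    free_off, free_len = struct.unpack(">HH", ci[-CIDF_LEN:])
    rdf_area = ci[free_off + free_len:-CIDF_LEN]
    records, pos = [], 0
    # Walk the RDFs from right to left so records come back in stored order.
    for i in range(len(rdf_area) - RDF_LEN, -1, -RDF_LEN):
        _flags, length = struct.unpack(">BH", rdf_area[i:i + RDF_LEN])
        records.append(ci[pos:pos + length])
        pos += length
    return records, free_off, free_len
```

Building a 512-byte CI from three short records and parsing it back returns the same records, with the CIDF reporting where the free space starts and how long it is.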
The control interval size is specified in bytes. Valid values depend on the system and the dataset type. Common ranges are 512 to 8192 bytes in 512-byte increments, and 8192 to 32768 in larger increments (e.g. 2KB). A typical default is 4096 bytes. If you do not specify CISZ in DEFINE CLUSTER, the system may calculate a default based on record size and device characteristics. Once the cluster is defined, the CI size is fixed; you cannot change it with ALTER. To change it you would have to define a new cluster with the desired CISZ and copy the data (e.g. with REPRO).
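As a rough illustration of the size rule described above, the following Python sketch checks whether a value falls on one of the allowed boundaries. It assumes the ranges given in the text (multiples of 512 up to 8192, multiples of 2048 from there to 32768); the helper name `is_valid_cisz` is hypothetical, and your system may apply further restrictions or round a requested value up.

```python
def is_valid_cisz(n):
    """Return True if n lies on a CI size boundary as described in the text."""
    if 512 <= n <= 8192:
        return n % 512 == 0   # 512-byte increments
    if 8192 < n <= 32768:
        return n % 2048 == 0  # 2KB increments
    return False


# Every size that passes the check, smallest first.
VALID_CISZ = [n for n in range(512, 32768 + 1, 512) if is_valid_cisz(n)]
```

Under these assumptions `is_valid_cisz(4608)` is true, while `is_valid_cisz(9216)` is false because sizes above 8192 must be multiples of 2048.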
The CI size must be at least large enough to hold one maximum-length record (unless you use spanned records, which can cross CI boundaries). So if your maximum record length is 500 bytes, the CI must be at least 507 bytes (500 plus a 3-byte RDF and the 4-byte CIDF), which rounds up to the valid size of 512. In practice, CIs are usually 4KB or more so that multiple records fit in one CI and I/O is efficient.
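The minimum-size rule can be worked through in code. The sketch below assumes non-spanned records, 7 bytes of control information for a single record (one 3-byte RDF plus the 4-byte CIDF), and the size boundaries described earlier; `min_cisz` is a hypothetical helper, not an IDCAMS function.

```python
RDF_LEN = 3   # one record descriptor field
CIDF_LEN = 4  # control interval descriptor field


def min_cisz(max_record_len):
    """Smallest CI size that can hold one non-spanned record of the given length."""
    needed = max_record_len + RDF_LEN + CIDF_LEN
    # Sizes up to 8192 move in 512-byte steps, larger ones in 2048-byte steps.
    step = 512 if needed <= 8192 else 2048
    size = max(512, -(-needed // step) * step)  # ceiling division, then scale
    if size > 32768:
        raise ValueError("record too long for a non-spanned CI")
    return size
```

For example, `min_cisz(500)` gives 512, while a 506-byte record needs 513 bytes in total and therefore a 1024-byte CI.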
For sequential processing, a larger CI means more records per I/O. If each CI holds 20 records and you read sequentially, each read brings 20 records into the buffer. That can reduce the number of I/Os and improve throughput. For random access (e.g. read by key), you need only one CI per record requested. A larger CI means you read more data than you need for that one record, which can waste buffer space and memory bandwidth. A smaller CI (e.g. 4KB) keeps each random read to a minimum. So the tradeoff is: larger CI for sequential, smaller CI for random. A common choice that works for many workloads is 4096 bytes. For heavily sequential batch jobs, 8192 or 16384 might be better; for online random access, 4096 is often used.
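To put numbers on the sequential side of this tradeoff, here is a small Python estimate. It is only a sketch: it assumes fixed-length records, 10 bytes of control information per CI (two RDFs plus the CIDF, passed as the hypothetical parameter `control_bytes`), no free space, and one I/O per CI, and it ignores buffering and index reads.

```python
import math


def records_per_ci(cisz, record_len, control_bytes=10):
    """How many fixed-length records fit in one CI (free space ignored)."""
    return (cisz - control_bytes) // record_len


def sequential_reads(total_records, cisz, record_len, control_bytes=10):
    """Estimated number of CI reads needed to scan total_records sequentially."""
    per_ci = records_per_ci(cisz, record_len, control_bytes)
    if per_ci == 0:
        raise ValueError("record does not fit in the CI")
    return math.ceil(total_records / per_ci)


# Scan 100,000 records of 200 bytes each at three CI sizes.
for cisz in (4096, 8192, 16384):
    print(cisz, records_per_ci(cisz, 200), sequential_reads(100_000, cisz, 200))
```

With these assumptions a 4096-byte CI holds 20 records and the scan takes 5000 reads; doubling the CI size roughly halves the read count, while a single random read always transfers the full CI regardless of how many of those records you need.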
In KSDS and variable-length RRDS, you specify FREESPACE(ci-percent ca-percent) at define time. The CI percentage reserves that proportion of each control interval for inserts. For example FREESPACE(20 10) reserves 20% of each CI (and 10% of each control area) for new records. When you insert a record, VSAM puts it in the correct CI in key order (KSDS) or by relative record number (variable-length RRDS), using the free space. When a new record no longer fits in the free space of its CI, VSAM splits the CI (and, if the control area has no free CI left, the control area as well), which can be expensive. So adequate free space reduces how often splits occur. The amount of free space in bytes is derived from the CI size: 20% of a 4096-byte CI is about 819 bytes. ESDS and LDS do not use free space in the same way (ESDS appends; LDS has no record structure), and a fixed-length RRDS uses preformatted slots rather than free space.
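The free-space arithmetic is easy to check in Python. This is a rough sketch under stated assumptions: the byte count is the CI size times the percentage, truncated to whole bytes, and the insert estimate ignores the extra RDF bytes that new records can require. Both function names are hypothetical.

```python
def free_space_bytes(cisz, ci_percent):
    """Bytes of each CI left free by FREESPACE(ci_percent ...)."""
    return cisz * ci_percent // 100


def inserts_before_split(cisz, ci_percent, record_len):
    """Rough count of records that fit in the reserved free space of one CI."""
    return free_space_bytes(cisz, ci_percent) // record_len


print(free_space_bytes(4096, 20))           # FREESPACE(20 10) on a 4KB CI
print(inserts_before_split(4096, 20, 200))  # 200-byte records into that space
```

For FREESPACE(20 10) on a 4096-byte CI this reports 819 bytes, enough for about four 200-byte inserts before that CI has to split.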
The index component (KSDS only) also has control intervals. Index CIs contain index records: sequence set entries (key + pointer to data CI) or index set entries (separators + pointers to the next level). The index CI size can be specified separately by coding CISZ under the INDEX subparameter of DEFINE CLUSTER; often it is left to default to a value (e.g. 4096) that works for the key length and the number of data CIs. The same idea applies: the index is read and written in whole index CIs.
In DEFINE CLUSTER you can specify CISZ(n) where n is the size in bytes (e.g. CISZ(4096) or CISZ(8192)). The value must be valid for your system and dataset type. Example:
```
 DEFINE CLUSTER ( -
     NAME(USERID.FILE.KSDS) -
     INDEXED -
     RECORDSIZE(200 300) -
     KEYS(10 0) -
     FREESPACE(15 10) -
     CISZ(8192) -
     CYLINDERS(10 5)) -
   DATA (NAME(USERID.FILE.KSDS.DATA)) -
   INDEX (NAME(USERID.FILE.KSDS.INDEX))
```
Here CISZ(8192) sets the data CI size to 8KB. Each data CI will hold more records than a 4KB CI, which can help sequential reads. The index component may use a default or separate CISZ depending on the utility.
Think of the CI as a box that the computer always carries in one piece. When you ask for one toy (record), the computer brings the whole box that has that toy in it. The size of the box (CI size) is fixed when the file is created. Bigger boxes mean more toys per trip (good for reading many in a row); smaller boxes mean you only carry what you need for one toy (good when you only want one). The free space is empty room in the box left for adding new toys later.
1. What is the unit of I/O in VSAM?
2. Where is the CI size (CISZ) set?
3. What does a larger CI typically improve?