What is a CI split in VSAM?

A CI (control interval) split occurs when a new record is inserted (or an existing record is lengthened) and there is no free space left in the CI where the record must go. VSAM moves approximately half of the records from that CI to a free CI in the same control area, then inserts the new record in key order. A CI split costs extra I/O and can slow down inserts.

What is a CA split in VSAM?

A CA (control area) split occurs when a CI split is needed but there is no free CI in the current control area. VSAM allocates a new control area, moves about half of the CIs from the full CA to the new CA, and updates the index (sequence set and possibly index set). CA splits are more expensive than CI splits because they involve more data movement and index updates.

When do VSAM splits occur?

Splits occur only when a record is inserted or when an existing record is updated and becomes longer, and only in a KSDS or a variable-length RRDS (VRRDS), which VSAM manages with an index much like a KSDS. A fixed-length RRDS has preformatted slots and no free space, so it does not split. An ESDS appends records at the end rather than inserting in place, so it does not have CI/CA splits either. FREESPACE reserves room in each CI and CA to reduce how often splits happen.

How do you reduce VSAM splits?

Specify FREESPACE(ci-percent ca-percent) when defining the cluster. The CI percentage leaves part of each control interval free for new records; the CA percentage leaves some CIs in each control area free so that when a CI split is needed, there is an empty CI available. Loading records in key order also helps. For files that grow a lot after load, use higher FREESPACE (e.g. 20–25% CI, 10–25% CA).

MainframeMaster

VSAM Split Operations

When you insert a record into a KSDS (or add data that expands a record), VSAM must place it in the correct control interval in key order. If that CI has no free space left, VSAM performs a control interval split: it moves about half of the records from the full CI to another CI in the same control area, then inserts the new record. If there is no free CI in the control area, VSAM must perform a control area split: allocate a new CA, move half the CIs from the full CA to the new one, and update the index. Splits cost extra I/O and can hurt performance. Understanding when splits happen and how FREESPACE reduces them helps you size and tune VSAM files. This page explains CI splits, CA splits, when they occur, and how to minimize them.

When Do Splits Occur?

Splits occur only as a result of inserting new records or of updating an existing record so that it becomes longer. They apply to key-sequenced data sets (KSDS) and to variable-length RRDS (VRRDS), which VSAM manages with an index much like a KSDS. A fixed-length RRDS has preformatted slots and no free space, so it does not split. In a KSDS, every insert must go into the correct CI in key order. When that CI is full (no free space left from FREESPACE), VSAM cannot simply append the record; it must create room, and it does that by splitting the CI. So the trigger is always: "need to put a record (or more data) in a CI that has no room." Deletes and reads, including sequential reads, do not cause splits. Only inserts and record expansion do.

Entry-sequenced data sets (ESDS) do not have splits in this sense. Records are appended at the end of the file; there is no "insert in the middle" by key. So ESDS never performs CI or CA splits. Linear data sets (LDS) have no record structure, so there are no VSAM record inserts and no splits.

Control Interval (CI) Split

A CI split happens when the control interval where the new record belongs is full. VSAM finds a free CI in the same control area (typically one that was reserved by the FREESPACE CA percentage). It then moves approximately half of the records from the full CI to that free CI, in key order. After the move, both CIs have room, and VSAM inserts the new record into whichever of the two now covers its key. The sequence set (and possibly the index set) must be updated so that the index points to both CIs with the correct key ranges. So one insert that triggers a CI split results in several I/O operations: read the full CI, write the full CI (after moving some records out), write the other CI (with the moved records), and update the index. That is why CI splits are expensive and why FREESPACE in the CI is important: it gives room for many inserts before a split is needed.

Example: Suppose CI-1 in CA-1 holds records with keys 10, 20, 30, 40, 50 and has no free space left. You insert a record with key 45 (KSDS keys are unique, so the new key must differ from every existing key). VSAM must put it in CI-1 in key order, but CI-1 is full. So VSAM does a CI split. It might move records with keys 40 and 50 to a free CI (e.g. CI-2) in the same CA. Now CI-1 holds keys 10, 20, 30 and has room; CI-2 holds 40, 50 and the new record with key 45. The sequence set is updated to show the new key ranges for CI-1 and CI-2.
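The CI split described above can be sketched as a small simulation. This is a simplified model, not VSAM's actual implementation: it assumes a CI holds a fixed number of records (real CIs are sized in bytes), represents a free CI as an empty list, and uses hypothetical keys 10 through 50 with a new key 45. The function name and CI_CAPACITY are made up for illustration.

```python
from bisect import insort

CI_CAPACITY = 5  # hypothetical: records per CI (real CIs are sized in bytes)


def insert_with_ci_split(ca, ci_index, key):
    """Insert key into ca[ci_index], splitting into a free CI of the CA if full.

    ca is a list of CIs; each CI is a sorted list of keys and [] is a free CI.
    Returns the index of the CI that received the new key.
    """
    ci = ca[ci_index]
    if len(ci) < CI_CAPACITY:
        insort(ci, key)  # room available: no split needed
        return ci_index

    # Target CI is full: look for a free CI in the same CA.
    free_index = next((i for i, c in enumerate(ca) if not c), None)
    if free_index is None:
        raise RuntimeError("no free CI in this CA: a CA split would be needed")

    # Move the upper portion of the records (about half) to the free CI.
    keep = (len(ci) + 1) // 2
    ca[free_index] = ci[keep:]
    del ci[keep:]

    # Insert the new key into whichever CI now covers its key range.
    target = free_index if key > ci[-1] else ci_index
    insort(ca[target], key)
    return target


ca1 = [[10, 20, 30, 40, 50], []]  # CI-1 is full, CI-2 is free
where = insert_with_ci_split(ca1, 0, 45)
print(ca1, where)
```

In this model the new record lands in the second CI because its key is higher than every key left in the first; a key such as 15 would stay in the original CI instead.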

Control Area (CA) Split

A CA split occurs when a CI split is needed but there is no free control interval in the current control area. All CIs in the CA are in use (either full of records or already used in prior splits). VSAM cannot do a CI split without a free CI. So it allocates a new control area (from the space allocated to the cluster). It then moves approximately half of the control intervals from the full CA to the new CA. After the move, the original CA has a free CI that can be used for the CI split. VSAM then performs the CI split as described above. The sequence set and index set must be updated to include the new CA and the new key distribution. A CA split is more expensive than a CI split because it involves moving many CIs (each of which may need read and write), allocating new space, and updating more index structure.

Example: CA-3 has CIs all in use; no free CI. You insert a record that belongs in CA-3. VSAM needs to do a CI split but there is no free CI in CA-3. So VSAM does a CA split first: allocate a new CA (e.g. CA-5), move half of the CIs from CA-3 to CA-5, update the index. Now CA-3 has free CIs. VSAM then does the CI split within CA-3 and inserts your record.
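A CA split can be sketched in the same toy style. The model below is an assumption-laden illustration: each CA is a list of CIs, each CI is a list of keys, a free CI is an empty list, and the list of CAs is kept in logical key order (on disk, the new CA is simply allocated from available space and the index supplies the ordering). The function name split_ca is hypothetical.

```python
def split_ca(cas, ca_index):
    """Move the upper half of the CIs in cas[ca_index] to a new CA.

    cas is a list of CAs in logical key order; each CA is a list of CIs.
    The vacated positions in the old CA become free CIs ([]), and the new CA
    is padded with free CIs to the same size. Returns the new CA's index.
    """
    old_ca = cas[ca_index]
    size = len(old_ca)
    keep = (size + 1) // 2

    # CIs holding the higher keys move to the new CA.
    moved = old_ca[keep:]
    new_ca = moved + [[] for _ in range(size - len(moved))]

    # The positions they vacated in the old CA are now free CIs.
    old_ca[keep:] = [[] for _ in range(size - keep)]

    # Model the index update by placing the new CA next in key order.
    cas.insert(ca_index + 1, new_ca)
    return ca_index + 1


cas = [[[10, 20], [30, 40], [50, 60], [70, 80]]]  # one CA, no free CI
new_index = split_ca(cas, 0)
print(cas, new_index)
```

After the call, both CAs contain free CIs, so the pending CI split can proceed in whichever CA now holds the target CI.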

CI Split vs CA Split

Comparing CI split and CA split
Aspect	CI split	CA split
Trigger	Insert or lengthen record; no free space in the target CI	CI split needed but no free CI in the current CA
What moves	About half the records in the full CI move to another CI in the same CA	About half the CIs in the full CA move to a newly allocated CA
Index updated	Sequence set (and possibly index set) updated for the new CI	Sequence set and index set updated; new CA gets index entries
Cost	Multiple I/O: read full CI, write two CIs, update index	Higher: new CA allocation, move many CIs, more index updates

How FREESPACE Reduces Splits

When you define a cluster with FREESPACE(ci-percent ca-percent), you reserve space for future inserts. The CI percentage reserves that proportion of each control interval for new or expanded records. So when you insert, there is room in the CI for many records before it becomes full. The CA percentage reserves that proportion of each control area as free CIs. So when a CI does become full and a CI split is needed, there is an empty CI in the same CA to receive the records that are moved. If you use FREESPACE(0 0), every CI is filled to capacity at load time, so the first insert in the middle of the file needs a CI split immediately, and because no free CIs were reserved, that usually means a CA split as well. For files that will grow after load, typical values are FREESPACE(20 10) or (25 15), enough to absorb a lot of inserts before splits occur.
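To make the two percentages concrete, here is a rough arithmetic sketch of what a load leaves free. It assumes fixed-length records, 10 bytes of control information per CI, percentages applied with rounding down, and at least one record per CI and one used CI per CA. The function name, the 4096-byte CI, the 200-byte record, and the 180 CIs per CA are illustrative assumptions, not values from any particular data set.

```python
def freespace_layout(ci_size, record_len, cis_per_ca, ci_pct, ca_pct,
                     control_bytes=10):
    """Estimate records loaded per CI and free CIs per CA (simplified model)."""
    usable = ci_size - control_bytes        # bytes available for records
    reserved = ci_size * ci_pct // 100      # bytes held back by the CI percentage
    capacity = usable // record_len         # records that fit with no free space
    # At least one record is always loaded into a CI.
    loaded = max(1, min(capacity, (usable - reserved) // record_len))
    # At least one CI per CA is always used for data.
    free_cis = min(cis_per_ca - 1, cis_per_ca * ca_pct // 100)
    return {
        "records_loaded_per_ci": loaded,
        "insert_room_per_ci": capacity - loaded,
        "free_cis_per_ca": free_cis,
    }


print(freespace_layout(4096, 200, 180, ci_pct=20, ca_pct=10))
print(freespace_layout(4096, 200, 180, ci_pct=0, ca_pct=0))
```

With these assumed figures, FREESPACE(20 10) loads 16 of a possible 20 records per CI and leaves 18 of 180 CIs free in each CA, while FREESPACE(0 0) leaves no insert room and no free CIs.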

FREESPACE does not eliminate splits forever. As you keep inserting, the free space is consumed. Once it is gone, the next insert that would go in a full CI will cause a split. So FREESPACE delays and reduces splits; it does not remove the need for them if the file keeps growing. For very dynamic files, you may also need to reorganize periodically (REPRO to a new cluster with the same or higher FREESPACE) to regain free space.
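The way free space delays splits without preventing them can be illustrated with a toy simulation. It is only a sketch under strong assumptions: CIs are modelled as record counts, every insert lands in a uniformly random CI, a free CI is always available (so CA splits are ignored), and the function name and parameters are hypothetical.

```python
import random


def count_ci_splits(num_cis, capacity, loaded, inserts, seed=1):
    """Count CI splits caused by random inserts (toy model, CA splits ignored).

    Each CI starts with `loaded` records out of `capacity`.
    """
    rng = random.Random(seed)
    cis = [loaded] * num_cis  # record count per CI
    splits = 0
    for _ in range(inserts):
        i = rng.randrange(len(cis))
        if cis[i] == capacity:
            # Full CI: move about half of its records to a new CI.
            half = capacity // 2
            cis[i] = capacity - half
            cis.append(half)
            splits += 1
            i = len(cis) - 1  # place the new record in the new CI
        cis[i] += 1
    return splits


# Same insert workload against three load densities (10, 8, 5 records of 10).
for loaded in (10, 8, 5):
    print(loaded, count_ci_splits(100, 10, loaded, inserts=200))
```

Running the loop with a larger insert count shows the point of the paragraph above: the less densely loaded files split later, but they still split once their free space is used up.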

Performance Impact

Each CI split costs multiple I/Os: reading the full CI, writing two CIs (the split CI and the CI that receives the moved records), and updating the index. A CA split costs more: reading and writing many CIs, allocating the new CA, and updating the index at multiple levels. So applications that do many inserts (especially inserts scattered randomly across the key range) can see noticeable slowdowns if splits are frequent. Monitoring split activity (e.g. the SPLITS-CI and SPLITS-CA counts in LISTCAT output, or SMF data) and tuning FREESPACE and load order can help. For batch loads, loading in key order and using adequate FREESPACE for later online inserts is a common approach.

Key Takeaways

A CI split occurs when you insert (or expand) a record and the target CI has no free space. VSAM moves about half the records to a free CI in the same CA and inserts the new record.
A CA split occurs when a CI split is needed but there is no free CI in the current CA. VSAM allocates a new CA, moves half the CIs to it, then performs the CI split.
Splits happen only on insert or record expansion, and only in KSDS and variable-length RRDS (VRRDS). ESDS, fixed-length RRDS, and LDS do not have these splits.
FREESPACE(ci-percent ca-percent) reserves room in each CI and CA so that many inserts can be absorbed before a split is needed. Higher FREESPACE reduces split frequency but uses more DASD.

Explain Like I'm Five

Imagine a row of boxes (CIs) in a shelf (CA). When you add a new toy (record) to a box that is full, you have to take half the toys out and put them in an empty box on the same shelf—that's a CI split. If the shelf has no empty box, you get a new shelf, move half the boxes there, then put half the toys from the full box into one of the empty boxes—that's a CA split. Leaving some space in each box and some empty boxes on each shelf (FREESPACE) means you can add lots of toys before you have to do that work.

Test Your Knowledge

1. When does a CI split occur?

When you delete a record
When you read a record
When you insert a record and the target CI has no free space
When you open the file

2. Which VSAM type has CI and CA splits?

ESDS only
KSDS (and variable-length RRDS)
LDS only
All types

3. How does FREESPACE help avoid splits?

It deletes old records
It reserves space in each CI and CA for new records and CI splits
It increases CI size
It compresses the file
