VSAM Index Updates

In a Key-Sequenced Data Set (KSDS), the index component (the sequence set and the index set) is what allows VSAM to find records by key and to read them in key order. When a CI split or CA split occurs, the layout of the data component changes: new control intervals (CIs) come into use, key ranges move between CIs, and sometimes a whole new control area (CA) is added. For keyed and sequential access to keep working, the index must be updated to reflect that new layout. The sequence set, which holds one entry per data CI, must get new entries for new CIs and updated high keys for CIs that changed. The index set, the upper levels that point to the sequence set, may need to grow when the sequence set grows. This page explains how the index is updated when CI splits and CA splits occur, what stays the same, and why these updates are essential for correctness and performance.

Why the Index Must Be Updated

The index component has two parts. The sequence set is the lowest level; it holds one entry per data control interval, consisting of a high key and a pointer to that CI. The index set is the one or more levels above it, holding separators and pointers that narrow the search to the right part of the sequence set. For a random read by key, VSAM uses the index set to find the right sequence set entry, then follows that entry's pointer to read the data CI. For a sequential read (READ NEXT), VSAM walks the sequence set in key order and reads the corresponding data CIs. The index must therefore always describe accurately which keys are in which data CIs. After a CI split or CA split, some keys have moved to a different CI or to a new CA. If the index were not updated, VSAM would look in the wrong CI and either miss the record or return the wrong one. Every split is therefore followed by index updates, so that the sequence set (and the index set, if needed) again maps keys to data CIs correctly.
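The lookup described above can be sketched with a toy model. This is a minimal illustration, not VSAM's on-disk format: the list `sequence_set`, the function `find_data_ci`, and the single-letter keys are all hypothetical names and data chosen for the example, and the index set levels are left out so that only the high-key search is shown.

```python
from bisect import bisect_left

# Toy model (assumption, not VSAM's real record layout): each sequence set
# entry is (high_key, data_ci_id), kept in ascending high-key order.
sequence_set = [("G", 0), ("M", 1), ("Z", 2)]

def find_data_ci(entries, key):
    """Return the id of the data CI whose key range covers `key`, or None."""
    high_keys = [high for high, _ in entries]
    pos = bisect_left(high_keys, key)   # first entry with high_key >= key
    if pos == len(entries):
        return None                     # key is above every high key
    return entries[pos][1]

print(find_data_ci(sequence_set, "H"))   # 1: "H" falls in the range ending at "M"
print(find_data_ci(sequence_set, "ZZ"))  # None: no CI covers this key
```

The search only works if the high keys really match what is stored in each data CI, which is why a split that moves keys must also rewrite these entries.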

Index Updates on a CI Split

When a CI split occurs, one full data CI becomes two: the original CI keeps about half the records (and gains free space), and a new data CI in the same control area receives the other half. The sequence set must change accordingly, from one entry (pointing to the original CI with its old high key) to two entries: one for the original CI with its new high key (the highest key that remained in that CI), and one for the new CI with its own high key and pointer. VSAM updates the sequence set so that both entries are in key order and point to the correct CIs. Sequence set entries are themselves stored in index CIs. If the index CI that holds the entries for this control area has room for one more entry, only that sequence set index CI is updated. If that index CI is full, VSAM may have to split the sequence set index CI as well, which can cause the index set to be updated (a new index set record or a new level) so that the index set still points to all sequence set CIs correctly. In the common case, a CI split results in a sequence set update only; the index set is updated only when the sequence set structure grows.

Index components updated on a CI split

Component | Change
Sequence set | One new entry for the new data CI (high key + pointer); existing entry for original CI updated with new high key.
Index set | Updated only if the sequence set CI that gets the new entry is full and must split; then new index set record(s) or level(s) as needed.
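The sequence set change on a CI split can be sketched as follows. This is a toy model under stated assumptions: `data_cis`, `sequence_set`, and `split_ci` are hypothetical names, each CI is just a sorted list of keys, the lower half of the records stays in the original CI, and the free CI that receives the upper half is assumed to already exist in the same control area.

```python
def split_ci(data_cis, sequence_set, full_ci, free_ci):
    """Move the upper half of `full_ci` into `free_ci` and fix the sequence set.

    data_cis:     dict of CI id -> sorted list of keys (toy data component)
    sequence_set: list of (high_key, ci_id) in ascending high-key order
    """
    keys = data_cis[full_ci]
    mid = len(keys) // 2
    data_cis[full_ci], data_cis[free_ci] = keys[:mid], keys[mid:]

    # Locate the one entry that described the CI before the split.
    pos = next(i for i, (_, ci) in enumerate(sequence_set) if ci == full_ci)
    old_high = sequence_set[pos][0]
    # The original CI gets a lower high key; the new CI takes over the old one.
    sequence_set[pos] = (data_cis[full_ci][-1], full_ci)
    sequence_set.insert(pos + 1, (old_high, free_ci))

data_cis = {0: ["A", "C", "E", "H", "K", "M"], 1: ["N", "T", "Z"], 2: []}
sequence_set = [("M", 0), ("Z", 1)]
split_ci(data_cis, sequence_set, full_ci=0, free_ci=2)
print(sequence_set)  # [('E', 0), ('M', 2), ('Z', 1)]
```

One entry became two, the entries are still in ascending key order, and the physical CI ids no longer follow key order, which is exactly why sequential access has to go through the sequence set.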

Index Updates on a CA Split

A CA split allocates a new control area and moves about half of the CIs from the full CA to the new CA. So you now have many new data CIs in the new CA, and the original CA has fewer CIs (and free CIs for the CI split that triggered the CA split). The sequence set must be updated in a bigger way: new entries are added for every data CI in the new CA (each with its high key and pointer), and the entries for the original CA are updated so that they only describe the CIs that remain there (key ranges and pointers). All these entries must remain in key order so that sequential access and key search still work. Because a CA split adds many new data CIs at once, it often adds many new sequence set entries. That may require new sequence set index CIs. When the sequence set grows that way, the index set must be updated: new index set records may be added to point to the new sequence set CIs, or a new level may be added to the index set so that the tree still balances. So a CA split typically involves both sequence set and index set updates. The exact steps (order of writes, buffering) are handled by VSAM to keep the index consistent and recoverable.

Index components updated on a CA split

Component | Change
Sequence set | New entries for every data CI in the new CA; existing entries for the original CA updated so key ranges match the CIs that remain.
Index set | New index record(s) or level(s) to point to the new sequence set entries; structure may grow (new level) so key ranges still route correctly.
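A CA split touches both index levels, which a small sketch can make concrete. The model below is an assumption made for illustration: it keeps one sequence set record per control area (`seq_records`), a single index set level (`index_set`) with one entry per sequence set record, and a hypothetical function `split_ca`. CI ids in the new CA are renumbered from zero to stand in for new pointers.

```python
def split_ca(seq_records, index_set, full_ca, new_ca):
    """Move the upper half of a CA's CIs to `new_ca` and update both index levels.

    seq_records: dict of CA id -> list of (high_key, ci_id), ascending
    index_set:   list of (high_key, ca_id), ascending; one entry per
                 sequence set record (toy single-level index set)
    """
    entries = seq_records[full_ca]
    mid = len(entries) // 2
    seq_records[full_ca] = entries[:mid]
    # Every CI moved to the new CA gets a fresh entry with a new pointer.
    seq_records[new_ca] = [(high, i) for i, (high, _) in enumerate(entries[mid:])]

    pos = next(i for i, (_, ca) in enumerate(index_set) if ca == full_ca)
    old_high = index_set[pos][0]
    # The index set now routes the lower key range to the old CA
    # and the upper key range to the new one.
    index_set[pos] = (seq_records[full_ca][-1][0], full_ca)
    index_set.insert(pos + 1, (old_high, new_ca))

seq_records = {0: [("C", 0), ("F", 1), ("J", 2), ("M", 3)], 1: [("Z", 0)]}
index_set = [("M", 0), ("Z", 1)]
split_ca(seq_records, index_set, full_ca=0, new_ca=2)
print(index_set)       # [('F', 0), ('M', 2), ('Z', 1)]
print(seq_records[2])  # [('J', 0), ('M', 1)]
```

Compared with the CI split, more entries move at once and the level above the sequence set has to change as well, which matches the higher cost described above.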

Sequence Set: One Entry per Data CI

The rule that the sequence set has one entry per data CI still holds after any number of splits. After a CI split, you have one more data CI in use, so one more sequence set entry. After a CA split, you have many new data CIs in the new CA, so many new sequence set entries. Each entry contains the high key for that CI and the pointer from which VSAM locates that CI. VSAM keeps these entries in ascending key order, so the sequence set always reflects the current set of data CIs and their key ranges. When you insert a record and a split occurs, the new or updated sequence set entries are written so that the next read by key or READ NEXT uses the correct CI. Sequence set entries are stored in index CIs; if one of those index CIs becomes full when an entry is added, VSAM may split that index CI too (similar in concept to a data CI split), which can propagate updates into the index set. In other words, the index set grows when the sequence set can no longer fit in its current index structure.
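The invariant in this section can be written as a small checker. It is a sketch over a toy model, not a VSAM utility: `check_sequence_set` is a hypothetical function, CIs are plain lists of keys, and only CIs that currently hold records are counted as needing an entry.

```python
def check_sequence_set(data_cis, sequence_set):
    """Return a list of problems; an empty list means the toy invariant holds."""
    problems = []
    used = [ci for ci, keys in data_cis.items() if keys]
    indexed = [ci for _, ci in sequence_set]
    if sorted(used) != sorted(indexed):
        problems.append("entries do not match the set of data CIs in use")
    highs = [high for high, _ in sequence_set]
    if highs != sorted(set(highs)):
        problems.append("high keys are not strictly ascending")
    low = None  # high key of the previous entry, i.e. the exclusive lower bound
    for high, ci in sequence_set:
        for key in data_cis.get(ci, []):
            if key > high or (low is not None and key <= low):
                problems.append(f"key {key!r} is outside the range of CI {ci}")
        low = high
    return problems

data_cis = {0: ["A", "C", "E"], 2: ["H", "K", "M"], 1: ["N", "T", "Z"]}
updated = [("E", 0), ("M", 2), ("Z", 1)]   # sequence set after the split
stale = [("M", 0), ("Z", 1)]               # sequence set left as it was before
print(check_sequence_set(data_cis, updated))  # []
print(check_sequence_set(data_cis, stale))    # one problem: CI 2 has no entry
```

The stale case is the situation the index update prevents: the records in CI 2 exist in the data component but cannot be reached by key.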

Index Set Growth

The index set has one or more levels. Each level contains index records (in index CIs) with separators and pointers. The bottom of the index set points to the sequence set; the sequence set points to the data CIs. When the sequence set grows (more data CIs, hence more sequence set entries), it may need more index CIs to store those entries. When that happens, the index set may need a new record to point to the new sequence set CI, or the index set may need to split an index CI and add a new level (a new root) so that the tree stays balanced. VSAM manages this automatically. From the application's point of view, splits and index updates are transparent; the only visible effect may be slightly higher I/O or latency when a split (and thus index update) occurs. Understanding that the index set grows when the sequence set grows helps explain why CA splits are more expensive: they add many sequence set entries at once and are more likely to trigger index set growth.
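How a growing sequence set forces a new index set level can be shown with simple arithmetic. The function below is a rough sketch under two assumptions that real data sets will not meet exactly: every index set record points to at most `fanout` records on the level below, and records are fully packed. Both `index_set_levels` and the fanout of 4 are hypothetical, chosen to keep the numbers small.

```python
def index_set_levels(seq_records, fanout):
    """Levels needed above the sequence set when each index set record
    can point to at most `fanout` records on the level below."""
    levels = 0
    records = seq_records
    while records > 1:
        records = -(-records // fanout)   # ceiling division
        levels += 1
    return levels

for n in (1, 4, 5, 16, 17):
    print(n, index_set_levels(n, fanout=4))
# 1 0
# 4 1
# 5 2
# 16 2
# 17 3
```

Going from 4 to 5 sequence set records, or from 16 to 17, adds a level in this model. Most additions leave the number of levels unchanged, but the more sequence set records a single operation adds, the more likely it is to cross one of these thresholds.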

Consistency and Recovery

VSAM ensures that after a split and its index updates, the index consistently describes the data. If a failure interrupts a split, IDCAMS utilities help with recovery: VERIFY corrects end-of-data information, and EXAMINE checks the structural consistency of the index and data components so that a damaged data set can be identified and rebuilt. In normal operation, the sequence set and index set are updated in a way that preserves the invariant: every key maps to exactly one data CI, and the sequence set is in key order. Index updates are therefore not optional; they are required for correctness. They are also part of the cost of a split: in addition to moving data, VSAM must read and write index CIs. That is one reason why minimizing splits (via FREESPACE) improves performance: fewer splits mean fewer index updates and less index I/O.

Key Takeaways

  • After a CI split, the sequence set gets a new entry for the new data CI and the original CI's entry is updated with its new high key. The index set is updated only if the sequence set index CI needs to grow or split.
  • After a CA split, many new sequence set entries are added for the new CA, and the original CA's entries are updated. The index set is often updated because the sequence set grows by many entries.
  • The index must always correctly map keys to data CIs so that key-based and sequential access work. Index updates are part of every split.
  • Index updates add I/O and sometimes latency; minimizing splits with FREESPACE also minimizes index update cost.

Explain Like I'm Five

Imagine a table of contents (the index) that says "page 1 has keys A–M, page 2 has N–Z." When you add so much to page 1 that you have to split it into page 1 and page 1b, you have to fix the table of contents: now it should say "page 1 has A–G, page 1b has H–M, page 2 has N–Z." Updating that table of contents is like updating the sequence set. If your table of contents gets so long it needs its own index (chapter titles pointing to sections), that higher-level index has to be updated too when you add new pages—that's the index set. VSAM does the same: when data CIs split or new CAs appear, it updates the "table of contents" (sequence set and index set) so you can still find everything.

Test Your Knowledge

1. What gets a new entry when a CI split occurs?

  • Only the index set
  • The sequence set (new entry for the new data CI)
  • The catalog
  • The data component only

2. When does the index set usually change?

  • On every CI split
  • Only when the sequence set grows (e.g. new sequence set CI or new CA)
  • Never
  • Only on delete

3. Why must the index be updated after a split?

  • To free space
  • So key-based and sequential access still find the correct data CIs
  • To compress the file
  • To update the catalog

Author: MainframeMaster
Reviewed by: MainframeMaster team
Verified: IBM z/OS 2.5 documentation
Sources: IBM DFSMS Access Method Services, z/OS VSAM documentation
Applies to: z/OS 2.5