The sequence set is the lowest level of the index component in a Key-Sequenced Data Set (KSDS). It sits between the index set (the upper levels of the index) and the data component (where the actual records are stored). The sequence set has one entry for each control interval in the data component: each entry typically holds the highest key value in that data CI and a pointer—usually a relative byte address (RBA)—to that data CI. VSAM uses the sequence set to find which data CI contains a given key when you do a random read, and to walk the data in key order when you do sequential access. Understanding the sequence set helps you see how key-based and sequential access work and how the index is organized. This page explains what the sequence set is, what it contains, how it relates to the data CIs and the index set, and how VSAM uses it for random and sequential access.
The sequence set is the bottom level of the KSDS index. The index component as a whole has two kinds of levels: the index set (one or more levels at the top) and the sequence set (one level at the bottom). The sequence set is the only level that points directly to the data component. Each entry in the sequence set corresponds to one control interval in the data component. So if your data component has 500 data CIs, the sequence set has 500 entries. Each entry tells VSAM: “For this range of keys (up to this high key), the records are in the data CI at this RBA.” So the sequence set is the bridge from “key value” to “which data CI to read.” You never see or manipulate the sequence set directly; the access method uses it when you do READ by key, READ NEXT, or insert/delete/update.
Each sequence set entry typically contains: (1) a key value—often the highest key in the data CI that this entry represents (the “high key” or separator); and (2) a pointer to that data CI—usually the relative byte address (RBA) of the CI in the data component. So when VSAM is looking for a record with key K, it searches the sequence set (or uses the index set to narrow down which part of the sequence set to search) to find the entry whose key range includes K. That entry’s pointer gives the RBA of the data CI; VSAM then reads that CI and searches within it for the record with key K. The sequence set entries are stored in ascending key order. So the first sequence set entry points to the data CI with the lowest keys, the next entry to the next CI, and so on. That order is what allows sequential access: VSAM can read the sequence set in order and thus know the order of the data CIs by key.
| Item | Detail |
|---|---|
| High key (or key range) | The highest key value in the data CI that this entry represents. Used to decide whether a search key falls in this CI. |
| Pointer (RBA) | Relative byte address of the data CI in the data component. VSAM uses this to read the correct CI. |
| Order | Sequence set entries are in ascending key order. So the sequence set reflects the key order of the data CIs. |
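
To make the table concrete, here is a minimal sketch of a sequence set as an in-memory list. It is only a model, not VSAM's on-disk index format: the `SeqSetEntry` class, the single-letter keys, and the 4 KB CI size behind the RBAs are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeqSetEntry:
    """Hypothetical model of one sequence set entry (not the real index layout)."""
    high_key: str  # highest key stored in the data CI this entry represents
    rba: int       # relative byte address of that data CI


# Three entries in ascending key order. The RBAs are invented and assume
# 4 KB data CIs; note that they do not have to be ascending themselves.
sequence_set = [
    SeqSetEntry("C", 0),
    SeqSetEntry("F", 8192),
    SeqSetEntry("K", 4096),
]


def is_in_key_order(entries):
    """Return True if every entry's high key is lower than the next one's."""
    return all(a.high_key < b.high_key for a, b in zip(entries, entries[1:]))
```

In this model the key order lives in the list, not in the RBAs: the entry for keys up to "K" points to a CI that sits physically before the CI for keys up to "F".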
There is a one-to-one correspondence between sequence set entries and data control intervals. Every data CI has exactly one sequence set entry that points to it, so the number of sequence set entries equals the number of data CIs in the data component. When you add records and VSAM performs a CI split (moving some of the records into a new data CI), a sequence set entry is added for that new CI. The sequence set itself is stored in index CIs; typically there is one sequence set index CI per control area in the data component, or the sequence set is packed into index CIs in key order. The exact packing depends on the implementation, but the important point is that the sequence set has one entry per data CI, and those entries are in key order so that the “sequence” of data CIs by key is known.
When you do a READ by key (random read), VSAM must find the data CI that contains the record with that key. It does not scan all data CIs; it uses the index. It starts at the index set (top of the index), compares the search key to the separators in the index set records, and follows the appropriate pointer down, continuing until it reaches the sequence set. In the sequence set it finds the entry whose key range includes the search key: the first entry whose high key is greater than or equal to the search key (so the previous entry’s high key, if any, is less than the search key). That entry’s pointer is the RBA of the data CI. VSAM then reads that one data CI (if it is not already in a buffer), searches within the CI for the record with the exact key, and returns the record (or “not found”). The sequence set is therefore the final step in the index search: it identifies the one data CI that might contain the key.
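The final step of that search can be sketched with a binary search over the high keys. This is an illustrative model only: the `SEQUENCE_SET` list, the `DATA_CIS` dictionary keyed by RBA, and the function names are assumptions, and the index set levels above the sequence set are left out.

```python
from bisect import bisect_left

# Hypothetical sequence set: (high_key, rba) pairs in ascending key order.
SEQUENCE_SET = [("C", 0), ("F", 8192), ("K", 4096)]

# Hypothetical data component: RBA -> records stored in that data CI.
DATA_CIS = {
    0: {"A": "rec-A", "C": "rec-C"},
    8192: {"D": "rec-D", "F": "rec-F"},
    4096: {"H": "rec-H", "K": "rec-K"},
}


def find_ci_rba(search_key):
    """Return the RBA of the only data CI that could hold search_key."""
    high_keys = [high_key for high_key, _ in SEQUENCE_SET]
    # Index of the first entry whose high key is >= search_key.
    i = bisect_left(high_keys, search_key)
    if i == len(SEQUENCE_SET):
        return None  # key is above the highest key in the data set
    return SEQUENCE_SET[i][1]


def read_by_key(search_key):
    """Read one data CI and look for the exact key; None means 'not found'."""
    rba = find_ci_rba(search_key)
    if rba is None:
        return None
    return DATA_CIS[rba].get(search_key)
```

A key such as "E" still resolves to a CI (the one with high key "F"), but the search inside that CI finds no exact match, so the read reports "not found".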
For sequential access (e.g. READ NEXT after a START, or a full sequential read), VSAM must return records in ascending key order. The data component stores records in key order within each CI, but the CIs themselves are not necessarily in key order on DASD, for example after CI splits. The sequence set is in key order: the first entry points to the CI with the lowest keys, the next to the next CI, and so on. VSAM can therefore “walk” the sequence set from the beginning (or from the entry that corresponds to the current position) and, for each entry, read the corresponding data CI and return the records from that CI in order. That way the application sees records in key order without VSAM having to sort them. The sequence set thus not only supports the random “which CI?” lookup but also defines the key order of the data CIs for sequential processing.
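The walk can be sketched as a generator that follows the sequence set rather than the physical layout. Everything here is a simplified assumption: the `(high_key, rba)` tuples, the `DATA_CIS` dictionary, and the optional `start_key` argument standing in for a START request.

```python
# Hypothetical sequence set in key order; the RBAs are deliberately not ascending.
SEQUENCE_SET = [("C", 0), ("F", 8192), ("K", 4096)]

# Hypothetical data CIs by RBA; records are in key order inside each CI.
DATA_CIS = {
    0: [("A", "rec-A"), ("C", "rec-C")],
    4096: [("H", "rec-H"), ("K", "rec-K")],
    8192: [("D", "rec-D"), ("F", "rec-F")],
}


def read_sequential(start_key=None):
    """Yield (key, record) pairs in key order by walking the sequence set."""
    for high_key, rba in SEQUENCE_SET:
        if start_key is not None and high_key < start_key:
            continue  # every key in this CI is below the start position
        for key, record in DATA_CIS[rba]:
            if start_key is None or key >= start_key:
                yield key, record
```

Reading by ascending RBA (0, 4096, 8192) would return H and K before D and F; following the sequence set order returns them correctly without any sorting.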
The data component is organized into control areas (CAs). Each CA contains a number of control intervals. The sequence set entries for the data CIs in one CA are often stored together—for example in one or more index CIs that “belong” to that CA. So you may hear that there is “one sequence set CI per control area” or that the sequence set is organized by CA. The idea is that the sequence set is partitioned or grouped so that the index for a given CA’s data CIs is localized. That can help with buffer usage and I/O when you are accessing a range of keys that fall in one CA. The exact layout is implementation-dependent, but the principle holds: the sequence set covers all data CIs and is in key order.
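As a rough illustration of grouping by control area, the sketch below assigns each sequence set entry to a CA by dividing its RBA by the CA size. The CI size, the number of CIs per CA, and the assumption that CAs are laid out back to back from RBA 0 are all invented for the example and are not taken from any particular data set definition.

```python
from collections import defaultdict

# Assumed geometry for illustration only.
CI_SIZE = 4096                   # bytes per data CI
CIS_PER_CA = 2                   # data CIs per control area
CA_SIZE = CI_SIZE * CIS_PER_CA   # bytes per control area

# Hypothetical sequence set: (high_key, rba) pairs in key order.
SEQUENCE_SET = [("C", 0), ("F", 4096), ("K", 8192), ("P", 12288)]


def group_by_ca(entries):
    """Group sequence set entries by the control area their data CI falls in."""
    groups = defaultdict(list)
    for high_key, rba in entries:
        groups[rba // CA_SIZE].append((high_key, rba))
    return dict(groups)
```

With these numbers, the entries for "C" and "F" land in CA 0 and those for "K" and "P" in CA 1, which mirrors the idea of one group of sequence set entries per control area.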
The sequence set is updated by VSAM whenever the data component structure changes. When you insert records and a CI split occurs, a new data CI is created and the keys are split between the old and new CI. VSAM then adds a new sequence set entry for the new CI and updates the high key (or key range) in the affected entries so that the sequence set still correctly reflects which keys are in which CI. When you delete records, the sequence set usually does not change unless a CI is freed or merged (which depends on the implementation). The sequence set is maintained automatically; you never add or remove sequence set entries yourself. The access method keeps the index (including the sequence set) consistent with the data.
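The bookkeeping for a CI split can be sketched as follows. This is a toy model under stated assumptions: the split point is simply the middle of the CI, `new_rba` is a hypothetical free CI supplied by the caller, and the `split_ci` helper is invented for illustration; how VSAM actually chooses the split point and the target CI is not modeled.

```python
def split_ci(sequence_set, data_cis, index, new_rba):
    """Split the data CI behind sequence_set[index] and fix up the entries.

    sequence_set: list of (high_key, rba) in key order (modified in place)
    data_cis:     dict of rba -> sorted list of (key, record) (modified in place)
    new_rba:      RBA of a free data CI that receives the upper half
    """
    high_key, old_rba = sequence_set[index]
    records = data_cis[old_rba]
    if len(records) < 2:
        raise ValueError("need at least two records to split a CI")
    mid = len(records) // 2
    lower, upper = records[:mid], records[mid:]
    data_cis[old_rba] = lower
    data_cis[new_rba] = upper
    # The old CI now ends at a lower key, so its entry gets a new high key.
    sequence_set[index] = (lower[-1][0], old_rba)
    # The new CI takes over the old high key; its entry goes right after.
    sequence_set.insert(index + 1, (high_key, new_rba))


# Example: the first CI is split, and its upper half moves to RBA 8192.
sequence_set = [("F", 0), ("K", 4096)]
data_cis = {
    0: [("A", "rec-A"), ("C", "rec-C"), ("D", "rec-D"), ("F", "rec-F")],
    4096: [("H", "rec-H"), ("K", "rec-K")],
}
split_ci(sequence_set, data_cis, 0, 8192)
```

After the call, the sequence set holds three entries in key order, and the entry count again matches the number of data CIs in use.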
The index set is the upper level(s) of the index. It does not point to the data component; it points to the sequence set (or to lower index set levels). The index set’s job is to quickly narrow down which part of the sequence set to search. When a data set has many data CIs, the sequence set has many entries, and without the index set, finding the right entry would require scanning many of them. The index set groups sequence set entries (or groups of sequence set entries) and stores separators and pointers, so that with one or a few index reads you can get to the right part of the sequence set. In short: index set = navigation to the right part of the sequence set; sequence set = direct pointer to each data CI. Together they form the B-tree-like index that makes key access efficient.
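A two-level version of the lookup shows how the index set narrows the search before the sequence set is consulted. The structures are assumptions for illustration: `SEQ_RECORDS` models the sequence set as two separate index records, and `INDEX_SET` holds one (high key, record number) pair per sequence set record; a real index can have more levels and a different entry format.

```python
from bisect import bisect_left

# Hypothetical sequence set, stored as two index records of (high_key, rba).
SEQ_RECORDS = [
    [("C", 0), ("F", 4096)],
    [("K", 8192), ("P", 12288)],
]

# Hypothetical single-level index set: one entry per sequence set record,
# holding that record's highest key and its position in SEQ_RECORDS.
INDEX_SET = [(record[-1][0], i) for i, record in enumerate(SEQ_RECORDS)]


def first_at_or_above(entries, key):
    """Return the pointer of the first entry whose high key is >= key."""
    high_keys = [high_key for high_key, _ in entries]
    i = bisect_left(high_keys, key)
    return entries[i][1] if i < len(entries) else None


def locate_ci(search_key):
    """Index set picks the sequence set record; that record picks the data CI."""
    record_no = first_at_or_above(INDEX_SET, search_key)
    if record_no is None:
        return None  # key is above every key in the data set
    return first_at_or_above(SEQ_RECORDS[record_no], search_key)
```

Each level only examines its own small list of separators, which is the property that keeps the number of index reads low as the data set grows.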
Imagine the sequence set as a list of “labels” in order. Each label says “the next box of toys has keys up to this letter/number” and “that box is at this place.” When you ask for one toy by its key, the computer looks at the list to find which label covers your key, then goes to that box. When you want to read all toys in order, the computer goes down the list from the first label to the last and opens each box in order. So the list (sequence set) is both a map (“which box?”) and the order for reading boxes one after another.
1. What does the sequence set point to?
2. How many sequence set entries are there for a given data component?
3. What is the sequence set used for?