In a Key-Sequenced Data Set (KSDS), the primary key is the field that identifies each record and controls the order of records. When you define the cluster with DEFINE CLUSTER, you specify the key using KEYS(length offset). The first value is the key length in bytes; the second value is the key offset—the byte position within each record where the key starts. The key offset tells VSAM where to find the key in every record so it can compare keys, maintain order, and perform key-based reads and inserts. This page explains what key offset is, how it is specified (including zero-based numbering), how it affects your record layout, and how to choose and validate it.
The key offset is the byte position of the first byte of the primary key, relative to the start of the logical record. It is specified as the second number in the KEYS parameter: KEYS(length offset). VSAM uses zero-based byte positions: offset 0 means the key starts at the very first byte of the record; offset 1 means the key starts at the second byte; offset 20 means the key starts at the 21st byte. So for KEYS(10 0), the key is bytes 0 through 9 (the first 10 bytes). For KEYS(8 20), the key is bytes 20 through 27. The offset is set when you define the cluster and is effectively fixed once the data set holds records; to move the key to a different position you define a new cluster and reload the data. Every record in the file must have its key at this same position; otherwise VSAM would read the wrong bytes when comparing or storing keys.
VSAM does not store the key in a separate structure from the record—the key is part of the record. When VSAM needs to compare two records (for example, to maintain ascending key order or to find a record by key), it reads the key bytes from each record at the offset you specified. If you tell VSAM the key is at offset 0 with length 10, it will always interpret the first 10 bytes of each record as the key. If your program actually puts the key at bytes 20–29, VSAM would be comparing the wrong data and the file would not behave correctly. So the offset you specify in DEFINE CLUSTER must match the layout of the records your program reads and writes. Getting the offset wrong leads to incorrect ordering, failed reads, or data corruption.
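The effect of a mismatched offset can be illustrated outside VSAM. The following Python sketch is only a model, not VSAM code: `extract_key` and the sample records are hypothetical, and the byte-by-byte comparison of Python `bytes` stands in for VSAM's key comparison. Real mainframe data is usually EBCDIC, so character order may differ from the ASCII used here.

```python
def extract_key(record: bytes, length: int, offset: int) -> bytes:
    """Return the bytes a KEYS(length offset) definition would treat as the key."""
    if length < 1 or offset < 0:
        raise ValueError("length must be >= 1 and offset >= 0")
    if offset + length > len(record):
        raise ValueError("key does not fit inside this record")
    return record[offset:offset + length]


# Hypothetical 30-byte records: a 20-byte header followed by a 10-byte key.
records = [
    b"HDR-ZZZ".ljust(20) + b"0000000002",
    b"HDR-AAA".ljust(20) + b"0000000003",
    b"HDR-MMM".ljust(20) + b"0000000001",
]

# Ordering with the offset that matches the layout: KEYS(10 20).
by_correct_offset = sorted(records, key=lambda r: extract_key(r, 10, 20))

# Ordering with a mismatched offset: KEYS(10 0) compares header bytes instead.
by_wrong_offset = sorted(records, key=lambda r: extract_key(r, 10, 0))

print([extract_key(r, 10, 20).decode() for r in by_correct_offset])
print([extract_key(r, 10, 20).decode() for r in by_wrong_offset])
```

With the matching offset the real keys come out in ascending order; with offset 0 the records are ordered by their header bytes, so the real keys appear out of sequence.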
VSAM uses zero-based byte positions for the key offset: the first byte of the record is byte 0, not byte 1. This is easy to get wrong if you are used to one-based positions, such as COBOL reference modification or the column numbers in many record-layout documents. In KEYS(length offset), offset 0 is the first byte, so KEYS(5 0) means the key is in positions 1–5 in "human" terms (bytes 0–4 in zero-based terms). Similarly, if the key starts at "position 21" in a one-based sense, that is byte 20 in zero-based terms, so you would use offset 20. In general, the offset is the one-based starting position minus 1.
| Offset value | Meaning |
|---|---|
| 0 | Key starts at the first byte of the record (byte 0). Most common when the key is the first field. |
| 10 | Key starts at byte 10 (11th byte). Use when the first 10 bytes are a prefix or header. |
| 20 | Key starts at byte 20. Typical when a fixed header (e.g. record type, timestamp) precedes the key. |
| Any valid | Offset can be any value such that offset + key length ≤ record length. Must be consistent for every record. |
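The conversion between one-based positions and zero-based offsets is simple arithmetic, but it is a common source of off-by-one errors. The helpers below are a minimal sketch with hypothetical names; they only restate the rule described above.

```python
from typing import Tuple


def offset_from_position(position: int) -> int:
    """Convert a one-based starting position to a zero-based KEYS offset."""
    if position < 1:
        raise ValueError("one-based positions start at 1")
    return position - 1


def key_byte_range(length: int, offset: int) -> Tuple[int, int]:
    """Zero-based first and last byte of the key for KEYS(length offset)."""
    return offset, offset + length - 1


def key_positions(length: int, offset: int) -> Tuple[int, int]:
    """One-based first and last position of the key, as a layout document would show it."""
    first, last = key_byte_range(length, offset)
    return first + 1, last + 1


print(offset_from_position(21))  # key at "position 21" -> offset to code in KEYS
print(key_byte_range(8, 20))     # zero-based bytes covered by KEYS(8 20)
print(key_positions(5, 0))       # one-based positions covered by KEYS(5 0)
```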
The key must fit entirely within the record, so offset + key length must be less than or equal to the record length. For fixed-length records, if the record length is 200 bytes, then offset + length ≤ 200. For example, KEYS(10 195) would be invalid because 195 + 10 = 205 > 200. For variable-length records, note that RECORDSIZE specifies the average and maximum record lengths, not a minimum, so the check has to be made against the shortest record you will ever write: offset + length must not exceed that length, so that every record can hold a complete key. The offset is typically 0 or a small number; larger offsets are used when a fixed header (record type, date, source system) precedes the key.
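The fit rule can be checked before you run DEFINE CLUSTER. The function below is a sketch, not an IDCAMS feature: `check_keys` is a hypothetical name, `shortest_record` is the length of the shortest record you expect to write, and the 1 to 255 byte limit on key length is the documented VSAM limit for KSDS keys.

```python
from typing import List


def check_keys(length: int, offset: int, shortest_record: int) -> List[str]:
    """Return a list of problems with KEYS(length offset); empty means it looks valid."""
    problems = []
    if not 1 <= length <= 255:
        problems.append(f"key length {length} is outside 1-255")
    if offset < 0:
        problems.append(f"offset {offset} is negative")
    if offset + length > shortest_record:
        problems.append(
            f"offset {offset} + length {length} = {offset + length} "
            f"exceeds record length {shortest_record}"
        )
    return problems


print(check_keys(10, 195, 200))  # the invalid example from the text
print(check_keys(10, 190, 200))  # key ends exactly at the last byte
```

Returning a list of messages rather than raising on the first error lets a job-preparation script report every problem at once.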
Many KSDS designs put the key at offset 0—the first bytes of the record. That is the simplest case: KEYS(length 0). In COBOL, the key is then the first field in the 01-level record. Benefits include: the key is easy to define in the program (no FILLER before it), the record layout is easy to document, and there is no risk of miscounting bytes when computing the offset. If your record has no leading header or prefix, use offset 0.
Sometimes the record has a fixed structure where the first few bytes are not the key. For example, bytes 0–1 might be a record type code, bytes 2–5 a timestamp, and bytes 6–15 the actual key. In that case you would specify KEYS(10 6) so that the key is the 10 bytes starting at byte 6. In your program you must ensure the key field is defined at the same position. In COBOL you would use a FILLER or other fields for bytes 0–5 so that the key field starts at the correct byte offset. If the key position in the program does not match the offset in DEFINE CLUSTER, VSAM will use the wrong bytes as the key.
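When the key follows other fields, the offset is the sum of the lengths of everything before it. The sketch below derives the KEYS values from a field list; the field names and the `keys_for` helper are hypothetical, and the lengths are the ones from the example above (2-byte record type, 4-byte timestamp, 10-byte key).

```python
from typing import List, Tuple

# Hypothetical layout: (field name, length in bytes), in record order.
LAYOUT: List[Tuple[str, int]] = [
    ("REC-TYPE", 2),
    ("TIMESTAMP", 4),
    ("REC-KEY", 10),
]


def keys_for(layout: List[Tuple[str, int]], key_field: str) -> Tuple[int, int]:
    """Return (length, offset) for the named key field."""
    offset = 0
    for name, size in layout:
        if name == key_field:
            return size, offset
        offset += size  # every field before the key pushes the offset right
    raise KeyError(f"{key_field} is not in the layout")


length, offset = keys_for(LAYOUT, "REC-KEY")
print(f"KEYS({length} {offset})")
```

Deriving the offset from the same field list that documents the record keeps the DEFINE CLUSTER statement and the program layout from drifting apart.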
Your application program (e.g. COBOL) must define the record so that the key occupies exactly the bytes indicated by KEYS(length offset). For KEYS(12 0), the first 12 bytes of the record must be the key. For KEYS(8 20), bytes 20–27 must be the key; the first 20 bytes can be other data or FILLER. Example for KEYS(10 20) with a 100-byte record:
    01  RECORD-AREA.
        05  FILLER      PIC X(20).   *> Bytes 0-19: header / unused
        05  REC-KEY     PIC X(10).   *> Bytes 20-29: key (offset 20)
        05  REC-DATA    PIC X(70).   *> Bytes 30-99: rest of record
Here the key field REC-KEY starts after 20 bytes of FILLER, so it lines up with offset 20. When you do a READ by key, you move the key value to REC-KEY and issue READ; VSAM finds the record and returns the full 100 bytes. When you REWRITE, the key at bytes 20–29 must be unchanged (you cannot change the primary key on update).
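The same 100-byte layout can be modelled outside COBOL to confirm where the key lands. This is an illustrative sketch only: `build_record` and `key_unchanged` are hypothetical helpers, and the check in `key_unchanged` merely mimics the rule that a REWRITE must leave the primary key bytes untouched.

```python
import struct

# 20-byte header, 10-byte key, 70-byte data: matches KEYS(10 20), RECORDSIZE(100 100).
RECORD = struct.Struct("20s10s70s")
KEY_LENGTH, KEY_OFFSET = 10, 20


def build_record(key: bytes, data: bytes, header: bytes = b"") -> bytes:
    """Assemble a fixed-length record with space-padded header and data."""
    if len(key) != KEY_LENGTH:
        raise ValueError(f"key must be exactly {KEY_LENGTH} bytes")
    if len(header) > 20 or len(data) > 70:
        raise ValueError("header or data is too long for the layout")
    return RECORD.pack(header.ljust(20), key, data.ljust(70))


def key_unchanged(old: bytes, new: bytes) -> bool:
    """True if the key bytes at the defined offset are identical in both records."""
    end = KEY_OFFSET + KEY_LENGTH
    return old[KEY_OFFSET:end] == new[KEY_OFFSET:end]


original = build_record(b"CUST000042", b"Alice")
updated = build_record(b"CUST000042", b"Alice Smith")
print(len(original), original[20:30])
print(key_unchanged(original, updated))
```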
In DEFINE CLUSTER you specify the offset as the second argument of KEYS. Only KSDS uses KEYS; ESDS, RRDS, and LDS do not have a primary key in this sense. Example with key at offset 0 and with key after a 20-byte header:
     /* Key at start of record (offset 0) */
     DEFINE CLUSTER ( -
       NAME(USERID.CUST.KSDS) -
       INDEXED -
       RECORDSIZE(80 80) -
       KEYS(12 0) -
       FREESPACE(10 5) -
       CYLINDERS(5 2)) -
       DATA (NAME(USERID.CUST.KSDS.DATA)) -
       INDEX (NAME(USERID.CUST.KSDS.INDEX))

     /* Key after 20-byte header (offset 20) */
     DEFINE CLUSTER ( -
       NAME(USERID.TXN.KSDS) -
       INDEXED -
       RECORDSIZE(200 200) -
       KEYS(16 20) -
       FREESPACE(15 10) -
       CYLINDERS(10 5)) -
       DATA (NAME(USERID.TXN.KSDS.DATA)) -
       INDEX (NAME(USERID.TXN.KSDS.INDEX))
In the first example, the key is the first 12 bytes of each 80-byte record. In the second, the key is 16 bytes starting at byte 20, so bytes 0–19 are available for a header. The record length (200) is greater than 20 + 16 = 36, so the key fits.
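A quick way to catch a bad offset before submitting the job is to read KEYS and RECORDSIZE back out of the control statements and repeat the arithmetic. The sketch below is a hypothetical pre-submit check, not part of IDCAMS; it assumes the two values in each parameter are separated by blanks or a comma, and it only handles fixed-length definitions where the average and maximum sizes are equal.

```python
import re
from typing import Tuple

_PAIR = r"\(\s*(\d+)[\s,]+(\d+)\s*\)"

SAMPLE = """
 DEFINE CLUSTER ( -
   NAME(USERID.TXN.KSDS) -
   INDEXED -
   RECORDSIZE(200 200) -
   KEYS(16 20) -
   FREESPACE(15 10) -
   CYLINDERS(10 5))
"""


def read_keys_and_recordsize(define_text: str) -> Tuple[int, int, int, int]:
    """Return (key length, key offset, average size, maximum size)."""
    keys = re.search(r"\bKEYS\s*" + _PAIR, define_text)
    size = re.search(r"\bRECORDSIZE\s*" + _PAIR, define_text)
    if keys is None or size is None:
        raise ValueError("KEYS or RECORDSIZE not found")
    length, offset = map(int, keys.groups())
    average, maximum = map(int, size.groups())
    return length, offset, average, maximum


def key_fits_fixed(define_text: str) -> bool:
    """Check offset + length against the record length of a fixed-length definition."""
    length, offset, average, maximum = read_keys_and_recordsize(define_text)
    if average != maximum:
        raise ValueError("variable-length definition: check against the shortest record")
    return offset + length <= maximum


print(read_keys_and_recordsize(SAMPLE))
print(key_fits_fixed(SAMPLE))
```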
Imagine each record is a row of boxes. The key is the box (or boxes) that tell the computer which row it is. The “offset” is how many boxes to skip before the key starts. If the offset is 0, the key is in the first box(es). If the offset is 5, you skip 5 boxes and the key is in the next box(es). You have to tell the computer “skip this many, then the key is here” when you create the file, and your program must put the key in that same place in every row.
1. What does the second value in KEYS(12 5) mean?
2. Is key offset zero-based or one-based in VSAM?
3. Can you change the key offset with ALTER?