Inserting a record into a VSAM Key Sequenced Data Set (KSDS) means adding a new record in key order. You do not choose the physical position—VSAM does. You open the file for output (or I-O), move the record including its key into the record area, and issue WRITE. VSAM finds the correct control interval for that key, places the record there in key order, and updates the index. If there is not enough free space in that CI, VSAM performs a control interval split (or, if necessary, a control area split) to make room. This page explains how dynamic insert works, how key order is maintained, the role of free space and splits, and how to tune FREESPACE to balance insert performance and storage use.
"Dynamic" here means you can add records at any time, in any key order. You are not limited to appending at the end. Each WRITE supplies a new record with a key; VSAM inserts it into the logical key sequence. The file remains sorted by key at all times. So you might WRITE a record with key 500, then key 100, then key 300; after each WRITE, the file still has records in key order (100, 300, 500, …). This is different from an ESDS, where records are stored in entry order (append-only) and there is no key-based insert.
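To insert, the file must be opened in a mode that allows writing: OUTPUT (for an initial load or rebuild), EXTEND (to add records beyond the current highest key), or I-O (input-output, for mixed read/insert/update/delete). You cannot insert when the file is opened INPUT only. To insert in arbitrary key order, open the file I-O with ACCESS MODE RANDOM or DYNAMIC; under sequential access, records must be written in ascending key order. After OPEN, you move the record to the FD record area (the key field is part of the record and must be the field named in RECORD KEY) and execute WRITE record-name. The primary key of a KSDS must always be unique; duplicates are possible only on an alternate key declared WITH DUPLICATES (an alternate index defined as non-unique). If you WRITE a record whose primary key already exists, the WRITE fails with file status 22.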
When you WRITE a record, VSAM uses the key to determine which control interval (CI) should hold it. The index component maps key ranges to data CIs. VSAM looks up the CI that contains the key range for your new key, reads that CI (if not already in a buffer), and checks whether there is enough free space in the CI to hold the new record. If yes, VSAM inserts the record in key order within the CI (shifting existing records or using the free space), updates the CI control information (RDFs, CIDF), and writes the CI back. The index already points to that CI; if the high key of the CI changed, the index entry may be updated. If there is not enough free space, VSAM performs a CI split (and possibly a CA split) before inserting.
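The lookup-and-insert path described above can be sketched with a toy model in Python. This is an illustration under stated assumptions, not VSAM's actual implementation: `CI_CAPACITY`, `find_ci`, and `try_insert` are hypothetical names, each CI is modeled as a sorted list of keys, and the index is reduced to a list holding each CI's high key.

```python
from bisect import bisect_left, insort

CI_CAPACITY = 4  # assumed number of records per CI, for illustration only


def find_ci(high_keys, key):
    """Return the position of the first CI whose high key is >= key."""
    pos = bisect_left(high_keys, key)
    # A key above every high key belongs in the last CI.
    return min(pos, len(high_keys) - 1)


def try_insert(cis, high_keys, key):
    """Insert key in order if the target CI has room; report the outcome."""
    pos = find_ci(high_keys, key)
    ci = cis[pos]
    if key in ci:
        return "duplicate"
    if len(ci) >= CI_CAPACITY:
        return "split needed"
    insort(ci, key)  # keeps the CI in key order
    high_keys[pos] = max(high_keys[pos], key)  # index entry may change
    return "inserted"


cis = [[100, 200], [300, 400, 500, 600], [700]]
high_keys = [200, 600, 700]
print(try_insert(cis, high_keys, 150), cis[0])  # inserted [100, 150, 200]
print(try_insert(cis, high_keys, 450))          # split needed
print(try_insert(cis, high_keys, 900), cis[2])  # inserted [700, 900]
```

The "split needed" outcome is where a real KSDS would go on to perform a CI split rather than reject the record.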
When you define the cluster with DEFINE CLUSTER, you can specify FREESPACE(ci-percent ca-percent). The first value is the percentage of each control interval to leave free for inserts; the second is the percentage of each control area to leave free (as empty CIs that can be used when a CI in that CA splits). For example, FREESPACE(20 10) reserves 20% of each CI and 10% of each CA. VSAM leaves that space empty during the initial load so that it is available for later inserts. If you insert records with keys that fall in many different CIs (random insert pattern), having more free space (e.g. 20–25% per CI) reduces how often a CI is full when you need to insert. If you mostly append in key order (e.g. ascending keys), inserts go to the current end of the file and free space in the middle matters less. So FREESPACE(0 0) might be acceptable for append-only files, while something like FREESPACE(20 25) is common for files with random inserts.
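To get a feel for what a FREESPACE setting means in records, the arithmetic can be approximated as below. This is a rough sketch, not VSAM's exact algorithm. The function names are hypothetical, and the model assumes fixed-length records, 10 bytes of control information per CI (one 4-byte CIDF plus two 3-byte RDFs), and percentages that round down. The CI size, record length, and CIs-per-CA figures in the example are made up.

```python
def records_loaded_per_ci(ci_size, record_len, ci_free_pct, control_bytes=10):
    """Estimate how many fixed-length records an initial load puts in one CI."""
    reserved = ci_size * ci_free_pct // 100        # bytes left free, rounded down
    usable = ci_size - control_bytes - reserved
    return max(1, usable // record_len)            # at least one record per CI


def free_cis_per_ca(cis_per_ca, ca_free_pct):
    """Estimate how many CIs in each CA an initial load leaves empty."""
    reserved = cis_per_ca * ca_free_pct // 100     # rounded down
    return min(reserved, cis_per_ca - 1)           # at least one CI is loaded


# Example: 4096-byte CIs, 200-byte records, FREESPACE(20 10),
# and an assumed 180 CIs per CA.
print(records_loaded_per_ci(4096, 200, 20))  # 16
print(records_loaded_per_ci(4096, 200, 0))   # 20
print(free_cis_per_ca(180, 10))              # 18
```

Under these assumptions, FREESPACE(20 10) loads 16 records per CI instead of 20, so each CI can absorb about four inserts before it has to split.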
When the target CI has no free space, VSAM cannot simply put the new record there. It performs a control interval split. Roughly half the records in that CI are moved to a free CI in the same control area (or VSAM uses the free space in the CA to create a new CI). The new record is then inserted in key order in the appropriate CI. The index (sequence set) is updated so that both CIs are correctly referenced. A CI split involves multiple I/Os: read the full CI, possibly read a free CI or allocate one, write two CIs, update the index. So splits are more expensive than inserts that fit in existing free space.
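The split itself can be illustrated with a small Python model. It is a sketch only: `insert_with_split` and `CI_CAPACITY` are hypothetical names, a CI is modeled as a sorted list of keys, and the free CI is assumed to be empty and in the same CA. Index maintenance and I/O are left out.

```python
from bisect import insort

CI_CAPACITY = 4  # assumed number of records per CI, for illustration only


def insert_with_split(ci, free_ci, key):
    """Insert key into ci; if ci is full, first move its upper half to free_ci.

    Returns True when a split happened.
    """
    split = len(ci) >= CI_CAPACITY
    if split:
        mid = len(ci) // 2
        free_ci.extend(ci[mid:])   # roughly half the records move out
        del ci[mid:]
    # After a split, the new key goes to whichever half covers its range.
    target = free_ci if split and key > ci[-1] else ci
    insort(target, key)
    return split


full_ci = [300, 400, 500, 600]
new_ci = []
print(insert_with_split(full_ci, new_ci, 450))  # True
print(full_ci, new_ci)                          # [300, 400] [450, 500, 600]
```

After the split both CIs have room again, so the next few inserts in that key range are cheap.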
If there is no free CI in the control area, VSAM must perform a control area split: it allocates or uses another CA, moves some CIs (or their contents) to the new CA, and updates the index. That is even more expensive. FREESPACE with a positive CA percentage reserves whole CIs (or CA space) so that new CIs can be created without a CA split as often.
| Split type | When it happens |
|---|---|
| CI split | No free space in the CI that should hold the new record; ~half the records move to another CI in the same CA |
| CA split | No free CI in the CA; ~half the CIs move to a new CA; index updated |
After opening the file (I-O or OUTPUT), move the record to the record area and issue WRITE. The key field (RECORD KEY) must contain the key value. Example:

           MOVE 'CUST000099' TO CUST-ID.
           MOVE 'NEW CUSTOMER NAME ' TO CUST-NAME.
           MOVE 0 TO CUST-BAL.
      *> VSAM places the record by key; no position is specified.
           WRITE CUST-REC
               INVALID KEY
                   DISPLAY 'Write failed, key may be duplicate: '
                           CUST-ID
               NOT INVALID KEY
                   DISPLAY 'Inserted: ' CUST-ID
           END-WRITE.

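INVALID KEY is raised for key-related failures: most often a duplicate primary key (file status 22), but also an out-of-sequence key under sequential access (21) or a boundary violation such as running out of space (24). Other I/O errors do not raise INVALID KEY, so it is good practice to declare a FILE STATUS field and check it after each WRITE. NOT INVALID KEY means the record was inserted. After a successful WRITE, the record is in the file in key order and can be read by key (random READ) or in sequence (READ NEXT).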
If you load a KSDS with records in ascending key order (sequential insert), each new record goes into the same CI as the previous one or into the next CI. That minimizes splits because you are filling CIs in order. If you insert in random key order, inserts hit many different CIs, free space is used up more quickly, and splits occur more often. So for initial load, sorting the input by key and writing in order is usually more efficient. For online or batch inserts after load, adequate FREESPACE is the main lever to reduce splits. You can change FREESPACE on an existing cluster with IDCAMS ALTER, but the new values apply only to CIs and CAs that VSAM fills afterward; space already in use is not redistributed. To spread free space across the whole file again, reorganize it: unload the data, define a new cluster (or reuse the existing one) with the FREESPACE you want, and REPRO the data back.
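The effect of free space on random inserts can be sketched with a toy simulation in Python. It is a simplified model under stated assumptions, not a VSAM benchmark: `load` and `insert_all` are hypothetical names, every CI holds a fixed number of records, CAs are ignored, and a full CI always splits in half.

```python
import random
from bisect import bisect_left, insort


def load(keys, capacity, ci_free_pct):
    """Initial load in key order, leaving ci_free_pct of each CI empty."""
    per_ci = max(1, capacity * (100 - ci_free_pct) // 100)
    keys = sorted(keys)
    return [keys[i:i + per_ci] for i in range(0, len(keys), per_ci)]


def insert_all(cis, new_keys, capacity):
    """Insert keys one at a time; return the number of CI splits."""
    splits = 0
    for key in new_keys:
        highs = [ci[-1] for ci in cis]             # stand-in for the index
        pos = min(bisect_left(highs, key), len(cis) - 1)
        ci = cis[pos]
        if len(ci) >= capacity:                    # no room: split in half
            mid = len(ci) // 2
            cis.insert(pos + 1, ci[mid:])
            del ci[mid:]
            splits += 1
            if key > ci[-1]:
                ci = cis[pos + 1]
        insort(ci, key)
    return splits


existing = range(0, 2000, 2)                       # 1000 records, even keys
inserts = random.Random(42).sample(range(1, 2000, 2), 200)  # random odd keys

splits_without_free = insert_all(load(existing, 10, 0), inserts, 10)
splits_with_free = insert_all(load(existing, 10, 20), inserts, 10)
print(splits_without_free, splits_with_free)
```

In this model the file loaded with 20% CI free space takes noticeably fewer splits for the same 200 random inserts, at the cost of more CIs after the initial load.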
Imagine a row of boxes in number order. When you add a new box with a number, you don't just put it at the end—you find where it belongs in the row and squeeze it in. If that spot is full, you have to move some boxes to a new row (split) to make room. The "free space" is empty spots you left in advance so you can squeeze in new boxes without moving others so often.
1. What COBOL verb inserts a new record into a KSDS?
2. When does a CI split occur?
3. What does FREESPACE(20 10) do?