In VSAM, “duplicate keys” means more than one record sharing the same key value. For the primary key of a Key-Sequenced Data Set (KSDS), duplicates are not allowed—each record must have a unique primary key, and an insert with a duplicate key fails. For alternate indexes, you can allow duplicates by defining the alternate index with NONUNIQUEKEY; then multiple base records can have the same alternate key value (e.g. many orders with the same customer ID). Understanding when duplicates are allowed and how to handle duplicate-key errors is important for designing files and writing robust programs. This page explains primary key uniqueness, alternate index UNIQUEKEY vs NONUNIQUEKEY, what happens when a duplicate key is written, and how to handle or avoid duplicates.
The primary key of a KSDS uniquely identifies each record. No two records can have the same primary key value. If your program issues a WRITE (or equivalent) to add a record whose key already exists in the file, VSAM does not add the record. Instead it returns a condition indicating duplicate key (often file status 22 in COBOL or an equivalent return code). The new record is not written; the existing record with that key is unchanged. There is no “replace if exists” or “ignore duplicate” option for the primary key at the VSAM level. The application must ensure that every key it writes is unique, or it must handle the duplicate-key return (e.g. skip the record, log an error, or read the existing record and update it instead).
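The insert-only behavior described above can be modeled with a few lines of Python. This is a toy in-memory sketch, not a VSAM interface: the `ToyKsds` class, its method names, and the sample keys are all hypothetical, and the status strings simply mirror the COBOL file status values mentioned in the text.

```python
# Toy in-memory stand-in for a KSDS (hypothetical; no real VSAM API is used).
STATUS_OK = "00"
STATUS_DUPLICATE = "22"


class ToyKsds:
    def __init__(self):
        self._records = {}  # primary key -> record data

    def write(self, key, data):
        """Insert only: an existing key is rejected and left untouched."""
        if key in self._records:
            return STATUS_DUPLICATE
        self._records[key] = data
        return STATUS_OK

    def read(self, key):
        """Return the record for a key, or None if the key is absent."""
        return self._records.get(key)


orders = ToyKsds()
first = orders.write("ORD0001", "10 widgets")   # key is new
second = orders.write("ORD0001", "99 gadgets")  # same key again
```

In this model the second write is rejected and the record stored by the first write is unchanged, which is the behavior the application has to plan for.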
A KSDS index locates records by key. The index set leads to a sequence-set record, which points to the control interval that holds the key, and records within a control interval are stored in key order. This lookup assumes that a key value identifies exactly one record. If two records had the same primary key, a keyed read could not tell which record to return, and an insert could not tell where the new record belongs. So the design of KSDS requires unique primary keys. If your data naturally has a non-unique field (e.g. a customer ID that appears on many order records), that field should not be the primary key. Use a unique key (e.g. order ID) as the primary key and create an alternate index on customer ID if you need to access records by customer; the alternate index can allow duplicates (NONUNIQUEKEY).
| Key type | Duplicates allowed? |
|---|---|
| Primary key (KSDS) | Not allowed. Every record must have a unique primary key. |
| Alternate index (UNIQUEKEY) | Not allowed. Each alternate key value can appear at most once. |
| Alternate index (NONUNIQUEKEY) | Allowed. Multiple records can share the same alternate key value. |
An alternate index is a secondary access path. It is built over a base cluster (KSDS or ESDS) and allows you to access records by a different key (the alternate key). When you define the alternate index, you specify whether the alternate key must be unique or can have duplicates. UNIQUEKEY means each alternate key value can appear at most once—similar to the primary key. NONUNIQUEKEY means multiple base records can have the same alternate key value. For example, if the base cluster is a KSDS of orders with primary key order-ID, you can define an alternate index on customer-ID with NONUNIQUEKEY so that one customer ID maps to many orders. A path defined over that alternate index lets you read all orders for a given customer. So “duplicate keys” in the sense of multiple records with the same key value are allowed only for alternate keys when you use NONUNIQUEKEY.
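The difference between UNIQUEKEY and NONUNIQUEKEY can be illustrated with a small Python model. It is only a sketch under stated assumptions: the `build_alternate_index` function, the `unique` flag, and the sample orders are hypothetical, and a plain dictionary stands in for the alternate index, mapping each alternate key to the list of primary keys that carry it.

```python
from collections import defaultdict

# Base cluster model: primary key (order ID) -> record. Sample data is made up.
base = {
    "ORD0001": {"customer": "CUST01", "item": "widget"},
    "ORD0002": {"customer": "CUST02", "item": "gadget"},
    "ORD0003": {"customer": "CUST01", "item": "sprocket"},
}


def build_alternate_index(base, field, unique):
    """Map each alternate key value to the primary keys that share it.

    unique=True models UNIQUEKEY: a repeated alternate key is an error.
    unique=False models NONUNIQUEKEY: repeated alternate keys are collected.
    """
    index = defaultdict(list)
    for primary_key in sorted(base):
        alt_key = base[primary_key][field]
        if unique and index[alt_key]:
            raise ValueError(f"duplicate alternate key {alt_key!r}")
        index[alt_key].append(primary_key)
    return dict(index)


by_customer = build_alternate_index(base, "customer", unique=False)
orders_for_cust01 = [base[pk]["item"] for pk in by_customer["CUST01"]]
```

With `unique=False`, `CUST01` maps to both of its orders. Calling the same function with `unique=True` on this data raises an error, because the customer ID is not unique across orders.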
When you WRITE a record to a KSDS and the primary key already exists, the WRITE fails. VSAM returns a condition code or file status (e.g. 22 for duplicate key in COBOL). The record is not written. Your program should check the file status after each WRITE and branch on duplicate key: for example, display an error, write the key to a report, skip the record, or read the existing record and perform an update instead of an insert. In batch jobs that load from a sequential file, a common approach is to sort the input by key and remove duplicates before writing to the KSDS, or to build the key so it is unique (e.g. add a sequence number or timestamp to the key). That way you avoid duplicate-key errors during the load.
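The check-and-branch logic described above can be sketched in Python. This is a model of the control flow only: `write_record`, `load`, and the `on_duplicate` parameter are hypothetical names, a dictionary stands in for the KSDS, and the "update" branch stands in for a READ followed by a REWRITE.

```python
def write_record(store, key, data):
    """Toy WRITE: returns '22' and changes nothing if the key already exists."""
    if key in store:
        return "22"
    store[key] = data
    return "00"


def load(store, incoming, on_duplicate="skip"):
    """Write each (key, data) pair, branching on the duplicate-key status.

    on_duplicate="skip"   -> collect the rejected record and continue
    on_duplicate="update" -> replace the existing record's data instead
    """
    rejected = []
    for key, data in incoming:
        status = write_record(store, key, data)
        if status == "22":
            if on_duplicate == "update":
                store[key] = data  # stands in for READ + REWRITE
            else:
                rejected.append((key, data))  # log and skip
    return rejected


incoming = [("A", "1"), ("B", "2"), ("A", "3")]

skip_store = {}
skipped = load(skip_store, incoming, on_duplicate="skip")

update_store = {}
updated_rejects = load(update_store, incoming, on_duplicate="update")
```

With "skip", the first record for key `A` survives and the later one ends up in the rejected list; with "update", the later record wins and nothing is rejected. Which behavior is correct is a business decision, not a VSAM one.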
If the source data can contain duplicate keys, you have several options:

1. Deduplicate: sort the input by key and keep only the first (or last) record per key before writing to the KSDS.
2. Make the key unique: add a field to the key so that each record has a unique key (e.g. line number, sequence number, or timestamp).
3. Update instead of insert: when you get a duplicate-key status, read the existing record by key and REWRITE with the new data (only if that matches your business logic).
4. Log and skip: write duplicate keys to an error file or report and continue.

The right choice depends on whether duplicates are errors or expected and how you want to resolve them.
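Options (1) and (2) are preprocessing steps that run before the load, so they can be sketched independently of VSAM. The Python below is a minimal illustration, assuming records are `(key, data)` pairs; the function names `dedupe_by_key` and `with_sequence` and the four-digit suffix format are hypothetical choices, not a prescribed layout.

```python
from itertools import groupby
from operator import itemgetter


def dedupe_by_key(records, keep="first"):
    """Sort by key and keep one record per key (first or last seen)."""
    # sorted() is stable, so records with equal keys keep their input order.
    ordered = sorted(records, key=itemgetter(0))
    result = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        result.append(group[0] if keep == "first" else group[-1])
    return result


def with_sequence(records):
    """Make every key unique by appending a per-key sequence number."""
    seen = {}
    result = []
    for key, data in records:
        seen[key] = seen.get(key, 0) + 1
        result.append((f"{key}-{seen[key]:04d}", data))
    return result


records = [("K2", "b1"), ("K1", "a1"), ("K2", "b2")]
keep_first = dedupe_by_key(records, keep="first")
keep_last = dedupe_by_key(records, keep="last")
sequenced = with_sequence(records)
```

Note that extending the key as in `with_sequence` changes the key length, so the key definition of the cluster would have to allow for the longer key.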
Some applications use a “logical delete” pattern: instead of physically deleting a record, they mark it as deleted (e.g. with a flag byte) and may later want to add a “new” record with the same key. A KSDS cannot hold two records with the same primary key, so a WRITE with the key of a logically deleted record still fails as a duplicate: the flagged record is still in the file. To reuse the key, either physically delete the old record and then write the new one, or read the flagged record and REWRITE it with the new data and the flag cleared. Alternatively, make the key itself unique, e.g. by including a version number. There is no in-place “replace key” for the primary key: a REWRITE cannot change the primary key of a record.
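The interaction between logical deletes and key uniqueness can be modeled in a few lines of Python. This is a toy sketch: the dictionary store, the `deleted` flag, and the function names are hypothetical, and it shows only the delete-then-insert route for reusing a key.

```python
def write(store, key, data):
    """Toy WRITE: any existing record, flagged or not, blocks the insert."""
    if key in store:
        return "22"
    store[key] = {"deleted": False, "data": data}
    return "00"


def logical_delete(store, key):
    """Mark the record as deleted; it still occupies its key."""
    store[key]["deleted"] = True


def physical_delete(store, key):
    """Remove the record so the key becomes free again."""
    del store[key]


store = {}
write(store, "CUST01", "old address")
logical_delete(store, "CUST01")
blocked = write(store, "CUST01", "new address")  # key is still occupied
physical_delete(store, "CUST01")
reused = write(store, "CUST01", "new address")   # key is free now
```

The flag is invisible to the uniqueness check: only removing the record frees the key for a new insert.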
How a duplicate key is reported depends on the interface. In COBOL, file status 22 after a WRITE means the record was rejected because its key already exists. In CICS, a WRITE command with an existing key raises the DUPREC condition instead of setting a COBOL file status. Other environments report it in their own way, so check your compiler or runtime documentation. After every WRITE to a KSDS you should test for this condition and handle it. Example pattern:
    *> After each WRITE to the KSDS
    EVALUATE FILE-STATUS
        WHEN '00'
            *> Success
            CONTINUE
        WHEN '22'
            *> Duplicate key: skip, log, or update instead
            CONTINUE
        WHEN OTHER
            *> Any other error
            CONTINUE
    END-EVALUATE
Think of the file as a cabinet of drawers, where each drawer holds one record. The main key (primary key) is like a unique ID for each drawer: no two drawers can have the same ID. If you try to add a second drawer with the same ID, the system says “that ID is already used” and doesn’t add it. But you can have a second kind of label (alternate key) where many drawers can share the same label, like “all drawers for Customer A.” So “duplicate” main IDs are not allowed; “duplicate” second labels are allowed when you set up the file that way.
1. Can the primary key of a KSDS have duplicate values?
2. What does NONUNIQUEKEY allow?
3. What typically happens when you WRITE a record with a duplicate primary key?