A lookup table is a reference dataset—usually relatively small—that is keyed by a code or ID (e.g. product code, state code, account type). In batch processing you often need to enrich a main file (transactions, orders, detail records) with descriptive data from such a table: for example, add product name to each order line by matching product code. In DFSORT you do this with JOINKEYS: join the main file with the lookup on the key and use REFORMAT to pull the lookup fields into the output. This page explains how to set up a lookup join, when to use inner vs left join, and how to avoid duplicate rows when the lookup has duplicate keys.
A lookup table (or reference file) is a dataset that holds rows of the form (key, value1, value2, …)—for example, product code and product name, or state code and state full name. The main file you are processing has the key (e.g. product code on each order line) but not the description. The goal is to produce an output record that contains both the main record data and the looked-up fields (e.g. order line plus product name). In DFSORT there is no separate "lookup" verb; you use a join. The main file is one input (typically F1) and the lookup table is the other (typically F2). You match on the key and use REFORMAT to define the output as main-record fields plus lookup fields.
By convention the main file (the one you want to keep every record of) is the first file (F1), and the lookup is the second file (F2). In JCL you allocate the main file to SORTJNF1 and the lookup to SORTJNF2. In control statements you code two JOINKEYS (one for each file) with FIELDS= specifying the join key position and format for that file. The key must be in the same order in both files (e.g. ascending character). Then REFORMAT FIELDS=(F1:pos,len,...,F2:pos,len,...) defines the output: which segments from the main record and which from the lookup record go into the joined row. After the join you usually use SORT FIELDS=COPY because the order of the joined output is already determined by the merge; if you need a different order you can specify SORT FIELDS=.
If you use only JOINKEYS and REFORMAT without JOIN UNPAIRED, you get an inner join: only main records that have a matching key in the lookup appear in the output. Main records whose key is not in the lookup are dropped. That is useful when you only want rows with valid lookup data (e.g. exclude invalid product codes). If you want to keep every main record and still add lookup data when available, use JOIN UNPAIRED,F1 (left join). Then every F1 (main) record is written; when the key matches an F2 (lookup) row, the REFORMAT output includes the lookup fields; when there is no match, the F2 portion of the REFORMAT is blank-filled (or low values for numeric fields). So you get one output row per main record, with lookup fields blank when the code is not in the lookup table.
| Join type | When to use |
|---|---|
| Inner join (no JOIN UNPAIRED) | Only main records that have a matching key in the lookup appear; unknown codes are dropped. |
| Left join (JOIN UNPAIRED,F1) | Every main record appears; lookup fields are filled when the key matches, blank when it does not. |
Main file (orders): positions 1–8 order ID, 9–16 product code (CH), 17–24 quantity. Lookup: 1–8 product code (CH), 9–48 product name (CH). Key is product code. We want one output row per order with order ID, product code, quantity, and product name. Use left join so orders with unknown product codes still appear with a blank product name.
12345JOINKEYS FIELDS=(9,8,CH,A) JOINKEYS FIELDS=(1,8,CH,A) REFORMAT FIELDS=(F1:1,24,F2:9,40) JOIN UNPAIRED,F1 SORT FIELDS=COPY
F1 (main) key is 9–8 (product code). F2 (lookup) key is 1–8. REFORMAT takes F1 bytes 1–24 (order ID, product code, quantity) and F2 bytes 9–48 (product name, 40 bytes). JOIN UNPAIRED,F1 keeps every order; when product code is in the lookup, product name is added; when not, the 40-byte product name area is blank. Allocate main file to SORTJNF1 and lookup to SORTJNF2 (or use F2= if you use a different DD name for the lookup).
JOINKEYS matches every F1 record with every F2 record that has the same key. So if the lookup has two rows for the same product code, one order line with that code will produce two output rows. For a true "lookup" you usually want at most one row per key in the lookup. To achieve that, deduplicate the lookup before the join: run a prior step that reads the lookup, sorts by the key, and uses SUM FIELDS=NONE (or keeps one record per key with INREC/OUTREC if you need to choose which duplicate to keep). Write the result to a temporary dataset and use that as SORTJNF2. Then each main record will match at most one lookup row and you get one output row per main record (for left join) or one per matched pair (for inner join).
DFSORT JOINKEYS joins two files per step. To enrich with two lookup tables (e.g. product name and warehouse name), run two join steps. Step 1: main file + first lookup (e.g. product) → output1. Step 2: output1 + second lookup (e.g. warehouse) on the appropriate key → final output. The key in step 2 is the key in the first join output (e.g. warehouse code), and REFORMAT in step 2 includes the existing fields from output1 plus the new lookup fields. So lookup tables are applied one at a time by chaining join steps.
You have a list of orders that say "product 7" or "product 12" and a small list that says "7 = Apples" and "12 = Oranges". A lookup is like having a friend read your order list and the small list: for each order they write down the order and then look up the number in the small list and write "Apples" or "Oranges" next to it. If an order has a number that is not in the small list, they can either skip that order (inner join) or still write the order and leave the name blank (left join). DFSORT does that by joining the two lists on the number and then building a new line that has the order plus the name from the small list.
1. What is a "lookup table" in the context of DFSORT?
2. You have a main file (orders) and a lookup (product code, product name). You want every order row to show the product name. Some orders may have a product code not in the lookup. What join type do you use?
3. For a lookup table join, which file is usually F1 and which is F2?
4. What if the lookup table has duplicate keys (e.g. two rows for the same product code)?
5. Why is it often better to put the smaller file (the lookup) in memory or to have it as F2?