Range filtering in DFSORT means keeping or omitting records where a field's value falls between two values—for example, "keep records where amount is between 100 and 500" or "omit records where the code is outside A100–A999." There is no separate "BETWEEN" operator; you build a range by combining two comparisons with AND: one that the field is greater than or equal to the low value, and one that it is less than or equal to the high value. So "between 100 and 500 inclusive" is field >= 100 AND field <= 500. In COND= you write that as two 5-tuples (start, length, format, operator, value) connected by AND. You can use the same idea for exclusive ranges (strictly between, excluding the boundaries) by using GT and LT instead of GE and LE. This page explains inclusive and exclusive ranges for both numeric and character fields, how to omit records inside or outside a range, and how to combine ranges with other conditions.
An inclusive range means the boundary values are included. For example, "between 100 and 500 inclusive" includes 100 and 500. In COND= you express that as:
Example: keep records where a 4-byte packed-decimal field at position 20 is between 100 and 500 inclusive:
1INCLUDE COND=(20,4,PD,GE,100,AND,20,4,PD,LE,500)
The first condition (20,4,PD,GE,100) keeps records with value >= 100. The second (20,4,PD,LE,500) keeps records with value <= 500. AND means both must hold, so the result is 100 <= value <= 500. The same (start,length,format) is repeated because you are testing the same field twice. For a character field—for example a 4-byte code at position 1—you would use CH and character constants:
1INCLUDE COND=(1,4,CH,GE,C'A100',AND,1,4,CH,LE,C'A999')
This keeps records where the character field collates between A100 and A999 (inclusive). Character order follows the EBCDIC collating sequence (or AC for ASCII if specified).
An exclusive range means the boundary values are excluded. "Strictly between 0 and 100" would include 1–99 but not 0 or 100. Use GT (greater than) for the low bound and LT (less than) for the high bound:
1INCLUDE COND=(20,4,PD,GT,0,AND,20,4,PD,LT,100)
So only values 1 through 99 are kept. This is useful when 0 or 100 have special meaning (e.g. missing or invalid) and you want to exclude them. GT and LT never include the comparison value; GE and LE do.
Sometimes you want one boundary included and the other excluded. For example, "from 100 up to but not including 500" is a half-open interval [100, 500). Use GE for the low and LT for the high:
1INCLUDE COND=(20,4,PD,GE,100,AND,20,4,PD,LT,500)
That keeps 100, 101, … 499 but not 500. Conversely, (20,4,PD,GT,100,AND,20,4,PD,LE,500) keeps 101–500 but not 100. Choosing GE/LE vs GT/LT lets you control exactly which endpoints are in the range.
| Range type | Operators | Effect |
|---|---|---|
| Inclusive [low, high] | GE low, LE high | Boundaries included |
| Exclusive (low, high) | GT low, LT high | Boundaries excluded |
| Half-open [low, high) | GE low, LT high | Low included, high excluded |
| Half-open (low, high] | GT low, LE high | Low excluded, high included |
To omit records that fall inside a range (and keep everything else), use OMIT with the same range condition:
1OMIT COND=(20,4,PD,GE,100,AND,20,4,PD,LE,500)
This drops every record where the field is between 100 and 500 inclusive. The output contains only records with value < 100 or value > 500. So OMIT with a range is the complement of INCLUDE with that range.
If you want to keep only records that are outside a range (below the low or above the high), you have two options. Option 1: use OMIT for the range (as above); the remaining records are outside the range. Option 2: use INCLUDE with OR: (field < low OR field > high). In COND= that is:
1INCLUDE COND=(20,4,PD,LT,100,OR,20,4,PD,GT,500)
So you keep records where the value is less than 100 or greater than 500. Both approaches yield the same set; choose the one that reads more clearly for your case (OMIT range vs INCLUDE outside range).
For numeric fields (PD, ZD, BI, etc.), the comparison is numeric: 99 is less than 100, and 501 is greater than 500. For character (CH) fields, the comparison is byte-by-byte collating order. So the "range" is in character order, not necessarily numeric. For example, with CH, "9" can collate higher than "10" depending on encoding, so numeric-looking character fields should often be converted to a numeric format (or normalized in INREC) before range filtering if you want numeric semantics. For truly numeric data stored as PD or ZD, use PD or ZD in COND= so the range is numeric.
You can combine a range with additional conditions using AND or OR. For example, keep records where amount is between 100 and 500 and region is "NY":
1INCLUDE COND=(20,4,PD,GE,100,AND,20,4,PD,LE,500,AND,5,2,CH,EQ,C'NY')
The range is one logical condition (GE low AND LE high); the third condition (region = NY) is ANDed with it. For more complex logic (e.g. (range AND code=A) OR (range AND code=B)), use parentheses as allowed by your DFSORT product; see the AND/OR and complex conditional expressions pages.
Imagine a number line. You want to keep only the numbers between 100 and 500. You put a "fence" at 100 and one at 500. "Between" can mean two things: (1) including the fences—so 100 and 500 are allowed (inclusive). (2) not including the fences—so only 101, 102, … 499 (exclusive). In DFSORT we don't have one word for "between"; we say "at least 100" (GE 100) and "at most 500" (LE 500), and we connect them with AND. If we want to leave out the fences, we say "more than 100" (GT 100) and "less than 500" (LT 500). Same idea for letters or codes: we say "from A100 up to A999" by using GE and LE (or GT and LT) on that field.
1. How do you keep records where a numeric field at 20, length 4 (PD) is between 100 and 500 inclusive?
2. What is the difference between inclusive and exclusive range in COND=?
3. For a character field at 1–8 (e.g. a code), how do you omit records where the code is outside the range 'A100' through 'A999'?
4. Can you specify a range using two different fields (e.g. keep when field A >= 10 and field B <= 20)?
5. Why might you use GT/LT instead of GE/LE for a numeric range?