The web version only has simple instructions since chapter 04, while the full book has detailed explanations and background info.

0405: Range Query

KV Range Query

The Seek() function only supports open-ended range queries (key >= 123):

func (kv *KV) Seek(key []byte) (*KVIterator, error)

Now add a close-ended range query API:

func (kv *KV) Range(start, stop []byte, desc bool) (*RangedKVIter, error)

The desc flag controls the traversal order and can be used for ORDER BY:

RangedKVIter just wraps the original KVIterator:

type RangedKVIter struct {
    iter KVIterator
    stop []byte
    desc bool
}

func (iter *RangedKVIter) Key() []byte {
    return iter.iter.Key()
}

func (iter *RangedKVIter) Val() []byte {
    return iter.iter.Val()
}

With a stop condition during iteration:

func (iter *RangedKVIter) Valid() bool {
    if !iter.iter.Valid() {
        return false
    }

    r := bytes.Compare(iter.iter.Key(), iter.stop)
    if iter.desc && r < 0 {
        return false
    } else if !iter.desc && r > 0 {
        return false
    }
    return true
}

Next() must check the traversal order:

func (iter *RangedKVIter) Next() error {
    if !iter.Valid() {
        return nil
    }
    if iter.desc {
        return iter.iter.Prev()
    } else {
        return iter.iter.Next()
    }
}

Range() also wraps Seek(). Implement it yourself. Note that you may need to adjust the initial iterator position, since Seek() is >=, while Range() can be >= or <=.

func (kv *KV) Range(start, stop []byte, desc bool) (*RangedKVIter, error) {
    iter, err := kv.Seek(start)
    if err != nil {
        return nil, err
    }
    // TODO
}

Prefix Comparison

Now add a similar Range() API to DB. A new issue appears: a primary key or index can consist of multiple columns, and a query may use only a prefix. For example, if the primary key is (a, b, c), these queries are valid:

If the prefix (123, 4) is encoded and used for K ≤ (123, 4) in KV, the actual result is (a, b, c) < (123, 4). So we introduce infinity and pad the prefix to a full key. All 4 prefix comparisons can be converted to full key comparisons:

Prefix Full
(a, b) < (1, 2) (a, b, c) ≤ (1, 2, −∞)
(a, b) ≤ (1, 2) (a, b, c) ≤ (1, 2, +∞)
(a, b) > (1, 2) (a, b, c) ≥ (1, 2, +∞)
(a, b) ≥ (1, 2) (a, b, c) ≥ (1, 2, −∞)

If we include infinity in KV key encoding, prefix comparison becomes possible. Add a function to encode key prefixes:

func EncodeKeyPrefix(schema *Schema, prefix []Cell, positive bool) []byte

The positive flag decides whether the suffix is +∞ or −∞. Encode infinity as follows:

func (row Row) EncodeKey(schema *Schema) []byte {
    key := append([]byte(schema.Table), 0x00)
    for _, idx := range schema.PKey {
        cell := row[idx]
        key = append(key, byte(cell.Type))  // avoid 0xff
        key = cell.EncodeKey(key)
    }
    return append(key, 0x00)                // > -infinity
}

Since −∞ is encoded as an empty string, we need to avoid conflicts between (a, b) and (a, b, −∞) by appending a trailing 0x00 byte, which is greater than −∞.

func EncodeKeyPrefix(schema *Schema, prefix []Cell, positive bool) []byte {
    key := append([]byte(schema.Table), 0x00)
    for i, cell := range prefix {
        key = append(key, byte(cell.Type))  // avoid 0xff
        key = cell.EncodeKey(key)
    }
    if positive {
        key = append(key, 0xff)             // +infinity
    }                                       // -infinity
    return key
}

DB Range Query

Add a close-ended range query API that supports 4 comparison operators:

type ExprOp uint8

const (
    OP_LE ExprOp = 12 // <=
    OP_GE ExprOp = 13 // >=
    OP_LT ExprOp = 14 // <
    OP_GT ExprOp = 15 // >
)

type RangeReq struct {
    StartCmp ExprOp // <= >= < >
    StopCmp  ExprOp
    Start    []Cell
    Stop     []Cell
}

func (db *DB) Range(schema *Schema, req *RangeReq) (*RowIterator, error)

The traversal direction is decided by StartCmp: >= and > mean ascending, otherwise descending. Start and Stop can be full primary keys or prefixes, and their lengths do not need to match.

Replace the KVIterator in RowIterator with RangedKVIter:

type RowIterator struct {
    schema *Schema
    iter   *RangedKVIter // replaced
    valid  bool
    row    Row
}

The rest is left to the reader:

func (db *DB) Range(schema *Schema, req *RangeReq) (*RowIterator, error) {
    start := EncodeKeyPrefix(schema, req.Start, suffixPositive(req.StartCmp))
    stop := EncodeKeyPrefix(schema, req.Stop, suffixPositive(req.StopCmp))
    desc := isDescending(req.StartCmp)
    iter, err := db.KV.Range(start, stop, desc)
    if err != nil {
        return nil, err
    }
    // TODO
}

CodeCrafters.io has similar courses in many programming languages, including build your own Redis, SQLite, Docker, etc. It’s worth checking out.