⚠
The web version only has simple instructions since chapter 04, while the full book has detailed explanations and background info.
0605: Log + SSTable
We’ll add SSTables in this step. Queries must merge results from the MemTable and the SSTable. We introduce KV.Compact(), which merges the MemTable (mirroring the log) and the SSTable to produce a new SSTable.
type KV struct {
log Log
mem SortedArray
main SortedFile // added
}
func (file *SortedFile) Open() error
func (kv *KV) Compact() errorDeleted Flag
Now that there are 2 levels of data structures, deleted keys must be recorded in the upper level; otherwise, queries would return keys from the lower level.
type SortedArray struct {
keys [][]byte
vals [][]byte
deleted []bool // added
}
type SortedArrayIter struct {
keys [][]byte
vals [][]byte
deleted []bool // added
pos int
}
func (iter *SortedArrayIter) Deleted() bool { return iter.deleted[iter.pos] }This flag currently exists only in the top-level MemTable; the bottom-level SSTable does not need it. Later, when implementing an LSM-Tree with multiple SSTable levels, intermediate levels will also need this flag.
Related functions in SortedArray need to be modified, including del and set.
func (arr *SortedArray) Clear()
func (arr *SortedArray) Push(key []byte, val []byte, deleted bool)
func (arr *SortedArray) Pop()
func (arr *SortedArray) Del(key []byte) (deleted bool, err error)
func (arr *SortedArray) Set(key []byte, val []byte) (updated bool, err error)Merge Range Queries
Following MergedSortedKV.Scan(), add MergedSortedKV.Seek(), which initializes the iterator with each level’s Seek():
func (m MergedSortedKV) Seek(key []byte) (iter SortedKVIter, err error)The merged range query:
func (kv *KV) Seek(key []byte) (SortedKVIter, error) {
m := MergedSortedKV{&kv.mem, &kv.main}
iter, err := m.Seek(key)
if err != nil {
return nil, err
}
return filterDeleted(iter)
}Output must filter out deleted keys. We can wrap the iterator to do this:
func filterDeleted(iter SortedKVIter) (SortedKVIter, error) {
for iter.Valid() && iter.Deleted() {
if err := iter.Next(); err != nil {
return nil, err
}
}
return NoDeletedIter{iter}, nil
}
type NoDeletedIter struct {
SortedKVIter // inherits all methods
}
// override Next() and Prev()
func (iter NoDeletedIter) Next() (err error)Modify the SSTable Format
When building an SSTable in one pass, we need know the key count to allocate the offset array. However, after merging, duplicate keys and deleted keys are eliminated, so the final key count is smaller than the sum of key counts in all levels. Thus, the offset array size is only an estimate, but it still works as long as we write the true key count after iteration. We need to slightly modify SortedFile.CreateFromSorted(). Also, SortedKV.Size() is renamed to a more fitting name.
type SortedKV interface {
EstimatedSize() int // renamed
Iter() (SortedKVIter, error)
Seek(key []byte) (SortedKVIter, error) // added
}
func (m MergedSortedKV) EstimatedSize() (total int) {
for _, sub := range m {
total += sub.EstimatedSize()
}
return total
}Over-allocating the offset array creates an unused gap in the file:
[ n keys | offset1 | offset2 | ... | offsetn | gap | KV1 | KV2 | ... | KVn ]
Move the Log to an SSTable
KV.Compact() merges the MemTable into the SSTable in 3 steps:
func (kv *KV) Compact() error {
// 1. Merge MemTable and SSTable, output to a temporary file.
path := randomTempPath()
defer os.Remove(path)
file := SortedFile{FileName: path}
m := MergedSortedKV{&kv.mem, &kv.main}
if err := file.CreateFromSorted(m); err != nil {
return err
}
// 2. Replace the original SSTable (atomic operation).
if err := renameSync(file.FileName, kv.main.FileName); err != nil {
_ = file.Close()
return err
}
file.FileName = kv.main.FileName
_ = kv.main.Close()
kv.main = file
// 3. Drop the MemTable and the log.
kv.mem.Clear()
return kv.log.Truncate()
}In case of a power failure, Linux’s rename() syscall guarantees atomicity for the target file. However, to make it durable, you must also fsync the parent directory (see step 0104).
func syncDir(file string) error
func renameSync(src string, dst string) error {
if err := os.Rename(src, dst); err != nil {
return err
}
return syncDir(dst)
}ⓘ
CodeCrafters.io has similar courses in many programming languages, including build your own Redis, SQLite, Docker, etc. It’s worth checking out.