The web version only has simple instructions since chapter 04, while the full book has detailed explanations and background info.

0701: Atomic Store

Double Buffering

The previous step had only 1 SSTable level, with a fixed file name, replaced via rename(). An LSM-Tree has multiple SSTable levels and multiple files, so fixed names do not work. The list of file names of levels must be recorded by the DB.

Since the number of levels does not exceed log N, the list of file names does not need incremental updates. We can use the double buffering from 0600 to achieve atomic updates. That is, keep 2 copies with version and checksum, and update them in turn. After power loss, we can recover to the good version.

slot 0 -> [ version 124 | data yyy | crc32 ]
slot 1 -> [ version 123 | data zzz | crc32 ]

Metadata

These 2 records can be 2 files. Define the structs as follows:

type KVMetaStore struct {
    slots [2]KVMetaItem
}
type KVMetaItem struct {
    FileName string
    fp       *os.File
    data     KVMetaData
}
type KVMetaData struct {
    Version uint64
    SSTable string  // file name
}
func (meta *KVMetaStore) Open() error
func (meta *KVMetaStore) Close() error
func (meta *KVMetaStore) Get() KVMetaData
func (meta *KVMetaStore) Set(data KVMetaData) error

KVMetaData is the data to store, which has an monotonic version and an SSTable file name. Write it to the file using any serialization method (JSON is fine). Requirements:

CodeCrafters.io has similar courses in many programming languages, including build your own Redis, SQLite, Docker, etc. It’s worth checking out.