
Systems Mini-DB / Cache / KV Store

A collection of storage and caching components: LRU cache, log-structured KV, and B+Tree variants.

2024-08-20
C, Python, Linux

A collection of from-scratch storage system implementations built to understand the fundamentals of databases and caching layers.

What I Built

Three storage components of increasing complexity:

1. LRU Cache

A fixed-size in-memory cache with O(1) get and put, built from a hash map combined with a doubly linked list. Implemented in both C (to understand memory layout) and Python (for rapid prototyping and comparison).
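
A minimal sketch of the structure (illustrative, not the project's exact code); Python's OrderedDict plays the role of the hash map plus linked list that the C version wires up by hand:

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()  # insertion order doubles as recency order

        def get(self, key):
            if key not in self.entries:
                return None
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]

        def put(self, key, value):
            if key in self.entries:
                self.entries.move_to_end(key)
            self.entries[key] = value
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used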

2. Log-Structured Key-Value Store

A simple LSM-tree-inspired storage engine:

  • Append-only write log for durability
  • In-memory memtable (sorted map) for recent writes
  • Background compaction to merge and garbage-collect old entries
  • Bloom filters to avoid unnecessary disk reads (sketched below)
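
To make the Bloom filter concrete, here is a toy version; the bit-array size, hash count, and SHA-256-based probing are illustrative choices, not the project's actual parameters:

    import hashlib

    class BloomFilter:
        def __init__(self, num_bits=8192, num_hashes=4):
            self.num_bits = num_bits
            self.num_hashes = num_hashes
            self.bits = bytearray(num_bits // 8)

        def _positions(self, key):
            # Derive k probe positions by hashing the key with k salts.
            for i in range(self.num_hashes):
                digest = hashlib.sha256(bytes([i]) + key).digest()
                yield int.from_bytes(digest[:8], "big") % self.num_bits

        def add(self, key):
            for pos in self._positions(key):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def might_contain(self, key):
            # False means definitely absent; True may be a false positive.
            return all(self.bits[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(key))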

3. B+Tree Index

A disk-oriented B+Tree implementation:

  • Page-based storage with configurable page size
  • Leaf nodes form a linked list for range scans (see the sketch after this list)
  • Interior nodes for logarithmic search
  • Simple buffer pool for caching hot pages
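
A sketch of why the leaf linked list makes range scans cheap; these are illustrative in-memory node shapes, whereas the real implementation serializes nodes onto fixed-size pages:

    import bisect

    class Interior:
        def __init__(self, keys, children):
            self.keys, self.children = keys, children  # children[i] holds keys < keys[i]

    class Leaf:
        def __init__(self, keys, values, next_leaf=None):
            self.keys, self.values, self.next = keys, values, next_leaf

    def find_leaf(node, key):
        # Logarithmic descent through interior nodes.
        while isinstance(node, Interior):
            node = node.children[bisect.bisect_right(node.keys, key)]
        return node

    def range_scan(root, lo, hi):
        # Locate the starting leaf, then follow next pointers.
        leaf = find_leaf(root, lo)
        while leaf is not None:
            for i in range(bisect.bisect_left(leaf.keys, lo), len(leaf.keys)):
                if leaf.keys[i] > hi:
                    return
                yield leaf.keys[i], leaf.values[i]
            leaf = leaf.next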

Technical Details

Write Path (LSM)

Writes follow the standard LSM pattern:

  1. Append to write-ahead log (WAL) for crash recovery
  2. Insert into memtable
  3. When memtable reaches threshold, flush to sorted run on disk
  4. Background compaction merges sorted runs

The implementation handles the awkward edge cases: partial writes, recovery after a crash, and concurrent reads during compaction.
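
A condensed sketch of steps 1-3; file names and formats are illustrative, and the compaction, checksumming, and recovery logic (where the real complexity lives) are omitted:

    import json, os

    class MiniLSM:
        def __init__(self, wal_path="wal.log", memtable_limit=1024):
            self.wal = open(wal_path, "a", encoding="utf-8")
            self.memtable = {}  # stand-in for a sorted map
            self.memtable_limit = memtable_limit
            self.run_count = 0

        def put(self, key, value):
            # 1. Append to the WAL and fsync so the write survives a crash.
            self.wal.write(json.dumps({"k": key, "v": value}) + "\n")
            self.wal.flush()
            os.fsync(self.wal.fileno())
            # 2. Insert into the memtable.
            self.memtable[key] = value
            # 3. Flush a full memtable to a sorted run on disk.
            if len(self.memtable) >= self.memtable_limit:
                self._flush()

        def _flush(self):
            path = f"run-{self.run_count:04d}.json"
            with open(path, "w", encoding="utf-8") as f:
                json.dump(sorted(self.memtable.items()), f)
            self.run_count += 1
            self.memtable.clear()
            self.wal.truncate(0)  # entries are now durable in the run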

Read Path

Reads check (in order):

  1. Memtable (most recent writes)
  2. Bloom filter for each sorted run
  3. Binary search within matching runs (see the sketch below)
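
The same order of checks in code, assuming each sorted run is a list of (key, value) pairs paired with a Bloom filter over its keys (the might_contain interface from the sketch above):

    import bisect

    def read(key, memtable, runs):
        # 1. Most recent writes live in the memtable.
        if key in memtable:
            return memtable[key]
        # 2. Skip any run whose Bloom filter rules the key out...
        for bloom, pairs in runs:  # runs ordered newest to oldest
            if not bloom.might_contain(key.encode()):
                continue
            # 3. ...and binary-search the runs that may contain it.
            i = bisect.bisect_left(pairs, (key,))
            if i < len(pairs) and pairs[i][0] == key:
                return pairs[i][1]
        return None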

Benchmarking

Built a simple benchmarking harness to measure performance characteristics:

  • Throughput under different read/write ratios (harness sketched below)
  • Latency characteristics for reads and writes
  • Space amplification from compaction lag
  • Memory usage vs. hit rate trade-offs (cache)
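
The harness boils down to something like this (names and parameters are illustrative; it assumes a store exposing get and put):

    import random, time

    def run_benchmark(store, num_ops=100_000, read_ratio=0.9):
        keys = [f"key-{i}" for i in range(10_000)]
        start = time.perf_counter()
        for _ in range(num_ops):
            key = random.choice(keys)
            if random.random() < read_ratio:
                store.get(key)             # read path
            else:
                store.put(key, "x" * 100)  # 100-byte values
        elapsed = time.perf_counter() - start
        print(f"{read_ratio:.0%} reads: {num_ops / elapsed:,.0f} ops/s")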

Results / Learnings

Running benchmarks on different workloads revealed:

  • Write-heavy: LSM excels—sequential I/O is fast
  • Read-heavy with locality: LRU cache + B+Tree wins
  • Mixed random: Trade-offs depend heavily on memory budget

Key insights:

  • The gap between memory and disk speed spans orders of magnitude (DRAM reads in ~100 ns vs. tens of microseconds for SSDs and milliseconds for spinning disks), so caching matters
  • Compaction scheduling is surprisingly complex for real workloads
  • Understanding memory layout and cache lines significantly improved C code performance
  • Simple systems are easier to debug and reason about; start simple, optimize with data