Distributed Cache System Design (Sharding, Eviction & Scale)

Scenario

You’re building an in-memory cluster that feels like Redis or Memcached at scale: sub-millisecond reads, horizontal add/remove of nodes, and explicit policies for eviction, replication, and what “consistent” means when failures and partitions happen. The interview is less about memorizing RDB/AOF flags and more about partitioning, hot keys, and failure behavior.

Design a distributed caching system like Redis or Memcached that can store key-value pairs and serve millions of requests per second.

Constraints

Functional

Set/Get, TTL, eviction when full, atomic ops (e.g. increment), optional persistence

Non-functional

Millions of ops/s, under 1 ms latency, 99.9% availability, horizontal scaling

Scale

10M ops/s, 100 GB cache, 100M keys, ~1 KB average value

Stages ahead

1Requirement Analysis
2API Design
3High-Level Design
4HLD Extensions
5Trade-offs