Key-Value Stores
Master key-value stores like Redis for caching and high-performance data access. Essential for system design interviews.
Key-value stores are the simplest form of NoSQL databases, storing data as key-value pairs with fast lookup capabilities.
Basic Concept
A key-value store is like a hash table or dictionary:
Key: "user:123"
Value: {"name": "John", "email": "john@example.com"}
Key: "session:abc123"
Value: {"user_id": 123, "expires_at": "2024-01-20T10:00:00Z"}
Key: "cache:product:456"
Value: {"name": "Widget", "price": 29.99, "stock": 100}
Operations:
- GET key - Retrieve value
- SET key value - Store value
- DELETE key - Remove key
- EXISTS key - Check if key exists
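A minimal sketch of these four operations in Python, assuming the redis-py client (any key-value client exposes the same verbs):
# Basic key-value operations with redis-py (assumed client; pip install redis)
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("user:123", '{"name": "John"}')   # SET
value = r.get("user:123")               # GET    -> '{"name": "John"}'
exists = r.exists("user:123")           # EXISTS -> 1
r.delete("user:123")                    # DELETE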
Characteristics
Simplicity
- Minimal data model
- Fast operations (O(1) lookup)
- Easy to understand and use
Performance
- Extremely fast reads and writes
- Low latency
- High throughput
Limitations
- No complex queries (no WHERE clauses, JOINs)
- No relationships between keys
- Value is opaque (database doesn't understand structure)
Popular Key-Value Stores
Redis
In-memory data structure store, supports various data types.
# Strings
SET user:123:name "John"
GET user:123:name
# Hashes
HSET user:123 name "John" email "john@example.com"
HGETALL user:123
# Lists
LPUSH notifications:123 "New message"
LRANGE notifications:123 0 -1
# Sets
SADD tags:post:1 "tech" "programming"
SMEMBERS tags:post:1
# Sorted Sets
ZADD leaderboard 100 "player1"
ZRANGE leaderboard 0 -1 WITHSCORES
Features:
- Persistence options (RDB, AOF)
- Pub/Sub messaging
- Lua scripting
- Expiration (TTL)
Memcached
Simple in-memory caching system.
# Python example
import memcache
mc = memcache.Client(['127.0.0.1:11211'])
mc.set('user:123', {'name': 'John'})
value = mc.get('user:123')
Characteristics:
- Pure caching (no persistence)
- Distributed
- Simple protocol
DynamoDB (AWS)
Managed key-value and document database.
// Put item
await dynamodb.put({
  TableName: 'Users',
  Item: {
    userId: '123',
    name: 'John',
    email: 'john@example.com'
  }
});

// Get item
const result = await dynamodb.get({
  TableName: 'Users',
  Key: { userId: '123' }
});
Features:
- Fully managed
- Auto-scaling
- Global tables (multi-region)
- Streams for change data capture
etcd
Distributed key-value store for configuration and service discovery.
Use cases:
- Kubernetes (stores cluster state)
- Service discovery
- Distributed locking
- Configuration management
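A small sketch of the configuration-management use case, assuming the third-party python-etcd3 client (etcd itself exposes a language-agnostic gRPC API):
# Configuration lookup with python-etcd3 (assumed client; pip install etcd3)
import etcd3

client = etcd3.client(host="localhost", port=2379)

# Store a config value under a hierarchical key
client.put("/config/service-a/max_connections", "100")

# get() returns a (value_bytes, metadata) tuple
value, _meta = client.get("/config/service-a/max_connections")
print(value.decode())  # "100"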
Use Cases
Caching
Store frequently accessed data to reduce database load.
Key: "cache:user:123"
Value: User object (JSON)
TTL: 3600 seconds
Benefits:
- Reduce database queries
- Faster response times
- Lower database load
Session Storage
Store user session data.
Key: "session:abc123def456"
Value: {"user_id": 123, "login_time": "2024-01-15T10:00:00Z"}
TTL: 1800 seconds (30 minutes)
Rate Limiting
Track API request counts.
Key: "ratelimit:api:user:123"
Value: 42 (request count)
TTL: 60 seconds (resets every minute)
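A minimal fixed-window version of this counter, sketched with redis-py (assumed); the sliding-window variant in the interview section below handles window boundaries more smoothly.
# Fixed-window rate limiter sketch (r is an assumed redis-py client)
def allow_request(r, user_id, limit=100, window=60):
    key = f"ratelimit:api:user:{user_id}"
    count = r.incr(key)          # atomic increment
    if count == 1:
        r.expire(key, window)    # start the window on the first request
    return count <= limit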
Leaderboards
Real-time rankings.
ZADD leaderboard 1000 "player1"
ZADD leaderboard 950 "player2"
ZADD leaderboard 1100 "player3"
ZREVRANGE leaderboard 0 9 WITHSCORES # Top 10
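The same leaderboard from Python, assuming redis-py (zadd takes a mapping of member to score):
# Leaderboard with a sorted set (redis-py, assumed)
import redis
r = redis.Redis(decode_responses=True)

r.zadd("leaderboard", {"player1": 1000, "player2": 950, "player3": 1100})

# Top 10, highest score first, with scores
top10 = r.zrevrange("leaderboard", 0, 9, withscores=True)
# e.g. [('player3', 1100.0), ('player1', 1000.0), ('player2', 950.0)]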
Real-Time Features
- Counters: Like counts, view counts
- Presence: Who's online
- Queues: Task queues, message queues
- Pub/Sub: Real-time notifications
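Counters and Pub/Sub from this list, sketched with redis-py (assumed): a counter is a single atomic INCR, and Pub/Sub is fire-and-forget messaging between a publisher and any number of subscribers.
# Counters and Pub/Sub with redis-py (assumed)
import redis
r = redis.Redis(decode_responses=True)

# Counter: atomic increment for like/view counts
views = r.incr("views:article:123")

# Pub/Sub: publisher side
r.publish("notifications:123", "New message")

# Pub/Sub: subscriber side (listen() blocks and yields messages as they arrive)
sub = r.pubsub()
sub.subscribe("notifications:123")
for message in sub.listen():
    if message["type"] == "message":
        print(message["data"])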
Data Modeling Patterns
Namespacing
Use prefixes to organize keys:
user:123:profile
user:123:settings
user:123:preferences
session:abc123
session:def456
cache:product:789
cache:product:790
Composite Keys
Combine multiple values into a key:
order:123:item:456 # Order 123, item 456
user:123:friend:789 # User 123's friend 789
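A tiny hypothetical helper keeps namespaced and composite keys consistent across a codebase:
# Hypothetical key-builder helper for consistent key naming
def make_key(*parts):
    return ":".join(str(p) for p in parts)

make_key("user", 123, "profile")       # "user:123:profile"
make_key("order", 123, "item", 456)    # "order:123:item:456"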
Serialization
Store complex data by serializing:
# JSON
import json
value = json.dumps({"name": "John", "age": 30})
redis.set("user:123", value)
# MessagePack (more efficient)
import msgpack
value = msgpack.packb({"name": "John", "age": 30})
redis.set("user:123", value)
Advanced Features
Expiration (TTL)
Automatically delete keys after a time period.
SET session:abc123 "data" EX 3600 # Expires in 3600 seconds
SET session:def456 "data" PX 3600000 # Expires in 3600000 milliseconds
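The same expirations from a client, assuming redis-py (EX and PX map to the ex and px keyword arguments):
# Expiration from redis-py (assumed)
import redis
r = redis.Redis(decode_responses=True)

r.set("session:abc123", "data", ex=3600)      # expires in 3600 seconds
r.set("session:def456", "data", px=3600000)   # expires in 3600000 milliseconds
r.ttl("session:abc123")                       # remaining lifetime in seconds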
Atomic Operations
# Increment
INCR page_views:article:123
INCRBY counter:123 5
# Get-and-set (atomically returns the old value while setting a new one)
SET key "old_value"
GETSET key "new_value" # Returns old value, sets new
# Conditional set
SETNX key "value" # Only set if key doesn't exist
Transactions
MULTI
SET key1 "value1"
SET key2 "value2"
INCR counter
EXEC # Execute all commands atomically
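With redis-py (assumed), a pipeline created with transaction=True queues the commands and sends them as a single MULTI/EXEC block:
# MULTI/EXEC from redis-py (assumed)
import redis
r = redis.Redis(decode_responses=True)

pipe = r.pipeline(transaction=True)   # transaction=True is the default
pipe.set("key1", "value1")
pipe.set("key2", "value2")
pipe.incr("counter")
results = pipe.execute()              # sends MULTI ... EXEC in one round trip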
When to Use Key-Value Stores
Good Fit
- Caching: Reduce database load
- Session storage: Fast session lookups
- Real-time data: Counters, leaderboards
- Simple lookups: By ID or known key
- Temporary data: Data with expiration
Not a Good Fit
- Complex queries: Need WHERE, JOIN, GROUP BY
- Relationships: Data with foreign keys
- Analytics: Complex aggregations
- Structured queries: Ad-hoc reporting
Best Practices
- Use appropriate TTL: Set expiration for temporary data
- Namespace keys: Organize with prefixes
- Monitor memory: In-memory stores have size limits
- Handle failures: Key-value stores can be ephemeral
- Choose serialization: JSON is readable, MessagePack is efficient
Common Patterns
Cache-Aside Pattern
def get_user(user_id):
    # Try cache first
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)

    # Cache miss: query database
    user = db.query_user(user_id)

    # Store in cache
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))
    return user
Write-Through Pattern
def update_user(user_id, data):
    # Update database
    db.update_user(user_id, data)

    # Update cache
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
Distributed Locking
def acquire_lock(lock_key, timeout=10):
    lock_value = str(uuid.uuid4())
    # SET ... NX EX: set only if the key doesn't already exist, with an expiry
    if redis.set(lock_key, lock_value, nx=True, ex=timeout):
        return lock_value
    return None

def release_lock(lock_key, lock_value):
    # Only release if we own the lock
    # (do this check-and-delete atomically with a Lua script in production)
    if redis.get(lock_key) == lock_value:
        redis.delete(lock_key)
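Typical usage of the helpers above, releasing in a finally block so the lock is not leaked on errors (a sketch; the critical-section function is hypothetical, and production systems often reach for a library implementation such as Redlock):
# Usage sketch for the lock helpers above
lock_value = acquire_lock("lock:report:daily", timeout=10)
if lock_value:
    try:
        generate_daily_report()   # hypothetical critical section
    finally:
        release_lock("lock:report:daily", lock_value)
else:
    # someone else holds the lock: retry later, or skip this run
    pass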
Interview Questions
1. Beginner Question
Q: What is a key-value store, and what are its main use cases?
A: A key-value store is the simplest NoSQL database model, storing data as key-value pairs with fast lookup capabilities.
Main use cases:
- Caching: Store frequently accessed data to reduce database load
- Session storage: Fast session lookups for web applications
- Real-time data: Counters, leaderboards, presence indicators
- Rate limiting: Track API request counts
- Simple lookups: By ID or known key
Example:
# Caching user data
redis.set("user:123", json.dumps({"name": "John", "email": "john@example.com"}))
user = json.loads(redis.get("user:123"))
Why use it: Extremely fast (O(1) lookup), low latency, high throughput.
2. Intermediate Question
Q: Explain the cache-aside pattern and when to use it vs. write-through.
A:
Cache-Aside (Lazy Loading):
def get_user(user_id):
    # 1. Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)

    # 2. Cache miss: query database
    user = db.query_user(user_id)

    # 3. Store in cache
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))
    return user
Pros: Simple, only caches accessed data
Cons: Cache miss penalty, possible stale data
Write-Through:
def update_user(user_id, data):
    # 1. Update database
    db.update_user(user_id, data)

    # 2. Update cache
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
Pros: Cache always consistent with database
Cons: Write penalty (updates both cache and DB)
When to use:
- Cache-aside: Read-heavy workloads, can tolerate stale data
- Write-through: Write-heavy, need strong consistency
3. Senior-Level System Question
Q: Design a distributed rate limiting system using Redis that can handle 10M requests/second across 100 servers. How would you prevent a single user from exceeding their rate limit?
A:
Solution: Sliding window log with a Redis sorted set, using a pipeline:
def check_rate_limit(user_id, limit=100, window=60):
    key = f"ratelimit:{user_id}"
    now = time.time()

    # Pipeline the commands into one round trip (redis-py wraps them in MULTI/EXEC by default)
    pipe = redis.pipeline()
    # Remove entries older than the window
    pipe.zremrangebyscore(key, 0, now - window)
    # Count requests currently in the window
    pipe.zcard(key)
    # Record the current request
    pipe.zadd(key, {str(now): now})
    # Expire the key along with the window
    pipe.expire(key, window)
    results = pipe.execute()

    current_count = results[1]
    if current_count >= limit:
        return False, 0  # Rate limited
    return True, limit - current_count - 1  # Allowed, remaining quota
Alternative: the same sliding-window check without a pipeline (simpler, but the check and the add are separate round trips, so they are not atomic):
def check_rate_limit_sliding(user_id, limit=100, window=60):
    key = f"ratelimit:{user_id}"
    now = time.time()

    # Remove requests outside window
    redis.zremrangebyscore(key, 0, now - window)

    # Count requests in window
    count = redis.zcard(key)
    if count >= limit:
        return False

    # Add current request
    redis.zadd(key, {str(now): now})
    redis.expire(key, window)
    return True
Distributed approach (multiple servers):
# Shard rate-limit keys across multiple Redis instances
# (modulo hashing shown; consistent hashing avoids remapping keys when shards change)
def check_rate_limit_distributed(user_id, limit=100, window=60):
    # Hash user_id to determine the Redis shard
    # (use a stable hash such as hashlib in production; Python's hash() is randomized per process)
    shard = hash(user_id) % num_redis_shards
    redis_client = redis_shards[shard]

    # Run the same sliding-window check against that shard
    # (check_rate_limit must accept the client as an extra parameter)
    return check_rate_limit(user_id, limit, window, redis_client)
Optimizations:
- Lua scripts: Execute the check atomically on the Redis server (see the sketch after this list)
- Sharding: Distribute load across multiple Redis instances
- Local cache: Cache rate limit status locally to reduce Redis calls
- Batch updates: Batch multiple checks in pipeline
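Expanding on the Lua-script item above, a hedged sketch that moves the whole check-and-add into one atomic server-side script, assuming redis-py's register_script:
# Atomic sliding-window check as a Lua script (redis-py assumed)
import time
import redis

r = redis.Redis(decode_responses=True)

RATE_LIMIT_LUA = """
local key    = KEYS[1]
local now    = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit  = tonumber(ARGV[3])
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
local count = redis.call('ZCARD', key)
if count >= limit then
    return 0
end
redis.call('ZADD', key, now, now)
redis.call('EXPIRE', key, window)
return 1
"""

check_script = r.register_script(RATE_LIMIT_LUA)

def allow(user_id, limit=100, window=60):
    # One round trip; the check and the add cannot interleave with other clients
    return check_script(keys=[f"ratelimit:{user_id}"], args=[time.time(), window, limit]) == 1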
Monitoring:
- Track rate limit hits/misses
- Monitor Redis memory usage
- Alert on high rate limit violations
Key Takeaways
- Key-value stores are the simplest NoSQL model—extremely fast O(1) lookups
- Primary use cases: Caching, session storage, real-time data, rate limiting
- Cache-aside pattern loads data on cache miss, good for read-heavy workloads
- Write-through pattern updates cache and DB together, ensures consistency
- Redis is the most popular key-value store, supports many data structures
- TTL (time-to-live) is essential for temporary data (sessions, cache)
- Namespace keys with prefixes (e.g., "user:123") for organization
- Distributed locking using Redis SETNX for coordination across servers
- Not for complex queries—use for simple lookups, not JOINs or aggregations
- Memory management is critical—monitor memory usage and set eviction policies
- Persistence options in Redis (RDB snapshots, AOF) for durability
- Atomic operations (INCR, SETNX) are powerful for counters and locks