Caching Strategies (Backend Fundamentals)

Learn caching basics: cache-aside vs write-through, TTLs, invalidation, and when caches help or hurt.

Why Engineers Care About This

Caching is the art of remembering expensive computations so you don't have to do them again. When done right, caching makes systems fast. When done wrong, caching makes systems inconsistent, complex, and hard to debug. Many performance problems can be eased with caching, but most caching problems come from not understanding when and how to cache.

When your API is slow, or your database is overloaded, or users see stale data, you're hitting caching problems (or missing caching opportunities). These problems compound. A slow API gets more requests (users retry), which makes it slower, which gets more requests. Caching breaks this cycle by serving responses from memory instead of recomputing them. But caching adds complexity—you must decide what to cache, when to invalidate, and how to handle cache misses.

In interviews, when someone asks "How would you improve this system's performance?", they're really asking: "Do you understand caching? Do you know when to cache, what to cache, and how to invalidate caches?" Most engineers don't. They add caching without understanding trade-offs, or avoid caching because it's "too complex," missing performance gains.

Core Intuitions You Must Build

  • Caching is about trade-offs, not free performance. Caching improves performance but adds complexity. You must decide what to cache (frequently accessed data), when to invalidate (when data changes), and how to handle cache misses (recompute or return stale data). Also, caches use memory, which costs money. Don't cache everything—cache the things that provide the most benefit (hot data, expensive computations).

  • Cache invalidation is the hard part. Deciding what to cache is easy. Deciding when to invalidate is hard. If you invalidate too early, you throw away entries that were still valid and pay for extra misses and recomputation. If you invalidate too late, users see stale data. Common strategies: time-based expiration (TTL), event-based invalidation (invalidate when data changes), and version-based invalidation (include version in cache key). Each has trade-offs. Time-based is simple but can serve stale data. Event-based is accurate but requires tracking dependencies. Version-based avoids explicit deletes, but old entries linger until they expire or are evicted.

  • Different cache layers solve different problems. Browser cache (for static assets), CDN cache (for global content), application cache (for computed results), database cache (for query results). Each layer has different characteristics (size, latency, invalidation). Use multiple layers—browser cache for static assets (long TTL), CDN for global content (medium TTL), application cache for computed results (short TTL or event-based invalidation).

  • Cache consistency is about guarantees, not perfection. Perfect cache consistency (cache always matches source) is expensive and often unnecessary. Eventual consistency (cache will match source eventually) is often sufficient. For read-heavy workloads, serving slightly stale data (seconds old) is acceptable if it improves performance. For write-heavy workloads, you might need stronger consistency (invalidate immediately on writes). Understand your consistency requirements and choose caching strategies accordingly.

  • Cache-aside is simple, write-through is consistent. Cache-aside: application checks cache, if miss, fetches from source and populates cache. Simple, but concurrent misses duplicate work (two requests miss cache, both fetch from source), and a slow reader can repopulate the cache with a value that a concurrent write has already replaced. Write-through: application writes to cache and source as part of the same write, which completes only when both are updated. More consistent but slower writes. Write-behind: application writes to cache immediately, writes to source asynchronously. Fastest but can lose data if the cache fails before pending writes are flushed. Choose based on your consistency and performance requirements.

  • Cache keys matter more than you think. Cache keys determine what gets cached and when it gets invalidated. Good cache keys are specific (include all parameters that affect the result) but not too specific (avoid cache fragmentation). Include user ID in keys for user-specific data, but don't include timestamps unless necessary (a key that changes on every request never produces a hit). Also, cache keys should be deterministic (same input = same key) so you can predict cache hits.
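The cache-key point can be made concrete with a small sketch. The helper name `make_cache_key`, the `namespace` argument, and the parameter names are all hypothetical, and the sketch assumes every parameter is JSON-serializable. It shows one way to make keys deterministic: serialize the parameters in a canonical order before hashing.

```python
import hashlib
import json


def make_cache_key(namespace: str, **params) -> str:
    """Build a deterministic cache key from every parameter that affects the result.

    Hypothetical helper: `namespace` identifies the cached computation,
    `params` are the inputs that change its output.
    """
    # Sort keys so the same inputs serialize identically,
    # regardless of the order the caller passed them in.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    # Hash to keep keys short; 16 hex chars is an arbitrary illustrative length.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{namespace}:{digest}"
```

Calling `make_cache_key("profile", user_id=42, locale="en")` and `make_cache_key("profile", locale="en", user_id=42)` yields the same key, while adding a per-request timestamp to `params` would produce a new key on every call and defeat the cache.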

Subtopics (Taught Through Real Scenarios)

Cache Invalidation Strategies

What people usually get wrong:

Engineers often think "just set a TTL and forget about it." But TTL-based invalidation can serve stale data (data changes but cache hasn't expired) or waste work (data hasn't changed, but the entry expires and must be recomputed). Event-based invalidation (invalidate when data changes) is more accurate but requires tracking dependencies. Version-based invalidation (include version in cache key) is simple and effective: when data changes, the version changes, and old cache entries are never read again. They still occupy memory until they expire or are evicted, and every read needs to know the current version.
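Version-based invalidation is easier to see in code. The following is a minimal sketch, not a production design: `VersionedCache`, its method names, and the `profile:` key prefix are hypothetical, and plain dictionaries stand in for a real cache and for wherever the current version would be stored.

```python
class VersionedCache:
    """Version-based invalidation: the entity's version is part of the cache key."""

    def __init__(self):
        self._store = {}     # stands in for a real cache such as Redis
        self._versions = {}  # entity_id -> current version (assumed to live with the data)

    def _key(self, entity_id):
        version = self._versions.get(entity_id, 0)
        return f"profile:{entity_id}:v{version}"

    def get(self, entity_id):
        return self._store.get(self._key(entity_id))

    def put(self, entity_id, value):
        self._store[self._key(entity_id)] = value

    def bump_version(self, entity_id):
        # Called when the underlying data changes. Nothing is deleted:
        # reads simply start using a new key, so the old entry is never hit again.
        self._versions[entity_id] = self._versions.get(entity_id, 0) + 1
```

After `bump_version("u1")`, a `get("u1")` misses and the caller reloads from the source. The old entry stays in `_store` in this sketch; a real cache would rely on a TTL or an eviction policy to reclaim it.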

How this breaks systems in the real world:

An API cached user profiles with a 1-hour TTL. A user updated their profile. For up to 1 hour, API responses showed the old profile (cache hadn't expired). Users complained about stale data. The fix? Use event-based invalidation—when a user updates their profile, invalidate their cache entry immediately. But the real lesson is: TTL-based invalidation is simple but can serve stale data. Event-based invalidation is more accurate but requires tracking what to invalidate.
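The profile scenario can be sketched as a TTL cache with an explicit invalidation hook. Everything here is illustrative: `TTLCache`, `update_profile`, the `profile:` key prefix, and the dictionary standing in for the database are assumptions, not part of any particular library.

```python
import time


class TTLCache:
    """In-memory cache with time-based expiry plus explicit invalidation."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._entries = {}    # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Time-based invalidation: the entry is too old to trust.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def invalidate(self, key):
        # Event-based invalidation: drop the entry as soon as the data changes.
        self._entries.pop(key, None)


def update_profile(db, cache, user_id, new_profile):
    """Write to the source of truth, then invalidate the cached copy."""
    db[user_id] = new_profile
    cache.invalidate(f"profile:{user_id}")
```

With only the TTL, a reader could see the old profile for up to `ttl_seconds` after an update. With `update_profile`, the next read misses and reloads the fresh value, and the TTL remains as a safety net for any change that bypasses this code path.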

What interviewers are really listening for:

They want to hear you talk about different invalidation strategies (TTL, event-based, version-based) and their trade-offs. Junior engineers say "just set a TTL." Senior engineers say "TTL is simple but can serve stale data, event-based is accurate but requires dependency tracking, version-based is simple and effective." They're testing whether you understand that cache invalidation is a trade-off between simplicity and accuracy.

Multi-Layer Caching

What people usually get wrong:

Engineers often think "one cache is enough." But different cache layers solve different problems. Browser cache (for static assets) has high hit rates but limited control. CDN cache (for global content) reduces latency but requires cache invalidation across edge locations. Application cache (for computed results) is fast but uses server memory. Database cache (for query results) reduces database load but requires careful invalidation. Use multiple layers—each layer caches what it's good at caching.

How this breaks systems in the real world:

A service used only application-level caching (Redis). This worked for frequently accessed data, but static assets (images, CSS, JS) weren't cached, causing unnecessary requests. Also, the service had global users, but all cache lookups went to a single Redis instance, adding latency for distant users. The fix? Add CDN caching for static assets (long TTL, global distribution) and browser caching (Cache-Control headers). But the real lesson is: different cache layers solve different problems. Use multiple layers for maximum benefit.
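Browser and CDN caching are mostly driven by the `Cache-Control` response header. The sketch below shows one possible policy; the function name, the path conventions, and the specific lifetimes are assumptions chosen for illustration, and the year-long `immutable` lifetime is only safe if static asset filenames change whenever their content changes (for example, a content hash in the filename).

```python
def cache_control_for(path: str) -> str:
    """Pick a Cache-Control header value for a response (hypothetical policy)."""
    static_suffixes = (".css", ".js", ".png", ".jpg", ".svg", ".woff2")
    if path.endswith(static_suffixes):
        # Static assets: browsers and CDNs may keep them for a year.
        # Assumes fingerprinted filenames, so a new version gets a new URL.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/"):
        # Per-user API responses: keep them out of shared caches (CDNs)
        # and make the browser revalidate before reusing them.
        return "private, no-cache"
    # Everything else (e.g. HTML pages): short browser lifetime,
    # slightly longer lifetime in shared caches via s-maxage.
    return "public, max-age=60, s-maxage=300"
```

Responses under `/api/` would still be cached in the application layer (Redis in the scenario above); the header only controls what browsers and CDNs are allowed to keep.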

What interviewers are really listening for:

They want to hear you talk about different cache layers (browser, CDN, application, database) and when to use each. Junior engineers say "just use Redis." Senior engineers say "use browser cache for static assets, CDN for global content, application cache for computed results, and database cache for query results—each layer solves different problems." They're testing whether you understand that caching is about multiple layers, not just one cache.

Cache Consistency Models

What people usually get wrong:

Engineers often think "cache must always match source." But perfect consistency is expensive and often unnecessary. For read-heavy workloads, serving slightly stale data (seconds old) is acceptable if it improves performance. For write-heavy workloads, you might need stronger consistency (invalidate immediately on writes). Understand your consistency requirements and choose caching strategies accordingly. Also, different cache layers have different consistency guarantees—browser cache is eventually consistent (users might see stale data), CDN cache is eventually consistent (propagation delay), application cache can be strongly consistent (invalidate on writes).

How this breaks systems in the real world:

A service cached product prices with a 5-minute TTL. Prices changed frequently (sales, promotions). Users sometimes saw old prices (cache hadn't expired), leading to confusion and support tickets. The fix? For price data, use event-based invalidation (invalidate immediately when prices change) or shorter TTL (30 seconds). But the real lesson is: consistency requirements vary by use case. Price data needs strong consistency, but product descriptions can be eventually consistent.

What interviewers are really listening for:

They want to hear you talk about consistency requirements and trade-offs. Junior engineers say "cache must always match source." Senior engineers say "consistency requirements vary by use case—price data needs strong consistency, but product descriptions can be eventually consistent, and you should choose caching strategies based on these requirements." They're testing whether you understand that caching is about trade-offs between consistency and performance.

Cache-Aside vs Write-Through

What people usually get wrong:

Engineers often use cache-aside (check cache, if miss fetch from source) for everything. But cache-aside duplicates work under concurrency (two requests miss cache, both fetch from source), can repopulate the cache with a value that a concurrent write has already replaced, and requires application code to manage the cache. Write-through (write to cache and source as part of the same write) is more consistent but has slower writes. Write-behind (write to cache immediately, write to source asynchronously) is fastest but can lose data if the cache fails before pending writes are flushed. Choose based on your consistency and performance requirements.
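The three patterns differ mainly in who writes where, and when. Here is a minimal single-threaded sketch under stated assumptions: `CachedStore` and its method names are hypothetical, a dictionary stands in for the database, and the write-behind queue is flushed manually rather than by a background worker.

```python
from collections import deque


class CachedStore:
    """Toy illustration of cache-aside reads, write-through and write-behind writes."""

    def __init__(self, source):
        self.source = source    # dict standing in for the database
        self.cache = {}
        self.pending = deque()  # writes not yet applied to the source (write-behind)

    def read_cache_aside(self, key):
        # Cache-aside: check the cache, fall back to the source, populate on miss.
        if key in self.cache:
            return self.cache[key]
        value = self.source.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write_through(self, key, value):
        # Write-through: the write is done only when source and cache are both updated.
        self.source[key] = value
        self.cache[key] = value

    def write_behind(self, key, value):
        # Write-behind: update the cache now, queue the source write for later.
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self):
        # Apply queued writes. Anything still queued when the cache process
        # dies is lost, which is the durability risk of write-behind.
        while self.pending:
            key, value = self.pending.popleft()
            self.source[key] = value
```

After `write_behind("c", 3)` the value is readable from the cache but absent from the source until `flush()` runs, which is exactly the window in which a cache failure loses data.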

How this breaks systems in the real world:

A service used cache-aside for user sessions. Under normal load, this worked. But during traffic spikes, many requests missed cache (sessions expired or were evicted), causing thundering herd (many requests fetching from database simultaneously). The database became overloaded. The fix? Use write-through for sessions (write to cache and database together, so active sessions are already cached when they are read) or implement cache warming (pre-populate cache with frequently accessed sessions). Coalescing concurrent misses for the same key into a single database fetch also limits the herd. But the real lesson is: cache-aside is simple but can cause thundering herd. Write-through is more consistent but slower.
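One common way to blunt a thundering herd in a cache-aside setup is to let only one caller per key go to the database while the others wait for its result. This technique is not named in the scenario above, so treat the following as an illustrative sketch: `SingleFlightCache` and `loader` are hypothetical names, the cache is an in-process dictionary, and there is no TTL or eviction.

```python
import threading


class SingleFlightCache:
    """Cache-aside where concurrent misses for the same key share one load."""

    def __init__(self, loader):
        self._loader = loader   # hypothetical function: key -> value (e.g. a DB query)
        self._cache = {}
        self._lock = threading.Lock()
        self._in_flight = {}    # key -> Event that is set when the load finishes

    def get(self, key):
        while True:
            with self._lock:
                if key in self._cache:
                    return self._cache[key]
                event = self._in_flight.get(key)
                leader = event is None
                if leader:
                    # First caller to miss becomes the leader and performs the load.
                    event = threading.Event()
                    self._in_flight[key] = event
            if leader:
                try:
                    value = self._loader(key)
                    with self._lock:
                        self._cache[key] = value
                    return value
                finally:
                    # Wake followers whether the load succeeded or raised.
                    with self._lock:
                        del self._in_flight[key]
                    event.set()
            # Followers wait for the leader, then loop to re-check the cache.
            event.wait()
```

If the leader's load raises, the followers wake up, find the cache still empty, and one of them becomes the new leader, so a failed fetch is retried by one caller at a time rather than by all of them at once.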

What interviewers are really listening for:

They want to hear you talk about different caching patterns (cache-aside, write-through, write-behind) and their trade-offs. Junior engineers say "just use cache-aside." Senior engineers say "cache-aside is simple but can have race conditions, write-through is consistent but slower, write-behind is fast but can lose data—choose based on consistency and performance requirements." They're testing whether you understand that caching patterns have trade-offs.


Key Takeaways

Caching is about trade-offs—improves performance but adds complexity

Cache invalidation is the hard part—choose strategy based on consistency requirements

Use multiple cache layers—browser, CDN, application, database each solve different problems

Consistency requirements vary—price data needs strong consistency, descriptions can be eventually consistent

Cache keys matter—include all parameters that affect results, but avoid unnecessary specificity

Cache-aside is simple but can cause thundering herd—write-through is more consistent but slower

Monitor cache hit rates. A low hit rate means the cache isn't helping much; a high hit rate means it is absorbing load, though it says nothing about whether the cached data is fresh.
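Hit rate is simply hits divided by total lookups. A tiny sketch of tracking it follows; `HitRateCounter` is a hypothetical name, and in a real service these counts would typically be exported to a metrics system rather than held in memory.

```python
class HitRateCounter:
    """Tracks cache hits and misses and reports the hit rate."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        # Report 0.0 before any lookups to avoid dividing by zero.
        return self.hits / total if total else 0.0
```

Three hits and one miss give a `hit_rate` of 0.75, meaning one lookup in four still reaches the source.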


About the author

InterviewCrafted helps you master system design with patience. We believe in curiosity-led engineering, reflective writing, and designing systems that make future changes feel calm.