The fundamental goal of a distributed cache is to reduce latency and database load by storing frequently accessed data in RAM, but it can also improve availability: cached data can keep serving reads even when your primary data store falters.

Let’s watch a simple key-value lookup in action. Imagine a web application needing a user’s profile.

  1. App Server Request: The application code tries to get user:123 from the cache.
  2. Cache Hit: If user:123 is in the cache (e.g., Redis), the cache server immediately returns the profile data. This is lightning-fast, measured in microseconds.
  3. Cache Miss: If user:123 isn’t in the cache, the lookup returns nothing and the application falls back to the database.
  4. Database Fetch: The app queries the primary database for user:123. This takes milliseconds.
  5. Cache Set: The app receives the profile data from the database, stores it in the cache with a Time-To-Live (TTL) of, say, 300 seconds, and then returns it to the client.
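The steps above can be sketched as follows (a minimal cache-aside loop using a plain dict in place of Redis and a stub database fetch; `db_fetch_user` and the key format are illustrative):

```python
import time

CACHE = {}          # stand-in for Redis: key -> (value, expires_at)
TTL_SECONDS = 300

def db_fetch_user(user_id):
    # stand-in for the primary database query (milliseconds in practice)
    return {"id": user_id, "name": "Ada"}

def get_user(user_id):
    key = f"user:{user_id}"
    entry = CACHE.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:   # cache hit: return immediately
            return value
        del CACHE[key]                      # expired entry: treat as a miss
    value = db_fetch_user(user_id)          # cache miss: query the database
    CACHE[key] = (value, time.monotonic() + TTL_SECONDS)  # cache set with TTL
    return value
```

With a real Redis client the set step would be a single `SET key value EX 300`, which handles expiry server-side instead of in application code.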

This process is repeated for every critical read operation, significantly offloading the database.

Redis and Memcached are the titans here, each with distinct strengths. Memcached is pure simplicity: a fast, distributed, in-memory key-value store optimized for raw speed and massive concurrency. Its eviction policy is LRU (Least Recently Used), meaning it evicts the least recently accessed items, not simply the oldest, when it runs out of space. Redis, on the other hand, is a more feature-rich data structure server. It supports not just strings but also lists, sets, sorted sets, hashes, and more complex operations like pub/sub and Lua scripting. Redis also offers persistence options (RDB snapshots and AOF logs) and more sophisticated eviction policies beyond LRU.

The real power comes when you layer caching patterns on top of these tools.

Cache-Aside (Lazy Loading): This is what we just saw. The application code is responsible for checking the cache, fetching from the DB on a miss, and populating the cache. It’s simple to implement and ensures only data that’s actually requested gets cached.

Read-Through: The cache itself handles fetching from the underlying data store. The application asks the cache for data. If it’s not there, the cache queries the database, stores the result, and returns it. This moves the cache-loading logic out of the application.
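A read-through cache can be sketched as a wrapper that owns the loading logic, so callers never talk to the database directly (the `loader` callable and class name here are illustrative):

```python
import time

class ReadThroughCache:
    """Cache that fetches missing keys itself, hiding the DB from callers."""

    def __init__(self, loader, ttl_seconds=300):
        self._loader = loader          # e.g. a database fetch function
        self._ttl = ttl_seconds
        self._store = {}               # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and time.monotonic() < entry[1]:
            return entry[0]            # hit: serve from memory
        value = self._loader(key)      # miss: the cache loads the data itself
        self._store[key] = (value, time.monotonic() + self._ttl)
        return value

profiles = ReadThroughCache(loader=lambda key: {"key": key, "name": "Ada"})
```

The application just calls `profiles.get("user:123")`; whether that was a hit or a database round-trip is invisible to it.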

Write-Through: When the application writes data, it writes to the cache and the database simultaneously. The write is considered successful only when both operations complete. This guarantees data consistency between the cache and the database but can increase write latency.
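A minimal write-through sketch, with dicts standing in for the cache and database (in practice the two writes would need to be ordered or transactional, which is glossed over here):

```python
CACHE = {}
DB = {}

def write_through(key, value):
    # Write to the durable store first, then the cache; the write only
    # counts as successful once both complete.
    DB[key] = value        # in practice: a transactional database write
    CACHE[key] = value     # in practice: SET with a TTL
    return value
```

Ordering matters: writing the database first means a crash between the two steps leaves the cache stale (and eventually expired) rather than holding data the database never saw.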

Write-Behind (Write-Back): The application writes only to the cache. The cache then asynchronously writes the data to the database in the background. This offers the lowest write latency but introduces a window where data might be lost if the cache fails before writing to the database.
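Write-behind can be sketched with a queue and a background worker draining it to the database (a single-process toy; real implementations batch writes and must survive cache failure):

```python
import queue
import threading

CACHE = {}
DB = {}
pending = queue.Queue()

def write_behind(key, value):
    CACHE[key] = value           # fast path: only the cache is updated
    pending.put((key, value))    # queue the durable write for later

def flush_worker():
    while True:
        key, value = pending.get()
        if key is None:          # shutdown sentinel
            pending.task_done()
            break
        DB[key] = value          # asynchronous database write
        pending.task_done()

worker = threading.Thread(target=flush_worker, daemon=True)
worker.start()
```

Anything sitting in `pending` when the process dies is lost, which is exactly the durability window the pattern trades away for write latency.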

Write-Around: The application writes directly to the database. Cache reads will fetch from the database and then populate the cache. This avoids caching data that’s written but never read, but it means newly written data won’t be immediately available in the cache.
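Write-around is the simplest of the write patterns to sketch; writes bypass the cache entirely and reads populate it lazily (dicts again stand in for the real stores):

```python
CACHE = {}
DB = {}

def write_around(key, value):
    DB[key] = value            # writes go straight to the database

def read(key):
    if key in CACHE:
        return CACHE[key]      # hit
    value = DB[key]            # miss: reads populate the cache lazily
    CACHE[key] = value
    return value
```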

When you’re scaling, especially with Redis, you’ll often see it deployed in a cluster. Redis Cluster shards your data across multiple Redis instances, each responsible for a subset of the 16,384 hash slots that partition the keyspace. When an application needs to access a key, the client calculates which hash slot the key belongs to and directs the request to the instance owning that slot. This allows horizontal scaling of both memory capacity and throughput. For high availability, Redis Cluster handles failover itself: each master can have replicas, and the cluster promotes a replica if a master becomes unreachable. (Redis Sentinel provides similar monitoring and automatic failover for non-clustered Redis deployments.)
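The slot calculation clients perform is `CRC16(key) mod 16384`, using the XMODEM CRC16 variant, and it honors hash tags: if the key contains a non-empty `{...}` section, only that substring is hashed, so related keys can be forced onto the same slot. A minimal sketch:

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC16 variant used by Redis Cluster (polynomial 0x1021, initial value 0)
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    # Hash-tag rule: only the substring inside the first non-empty {...}
    # counts, so {user:123}:profile and {user:123}:orders land on one slot.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384
```

Co-locating keys with hash tags is what makes multi-key operations (like `MGET` or transactions) possible in a cluster, since those only work within a single slot.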

A common pitfall is treating the cache as a true data store. While Redis can persist data, it’s not a replacement for a durable database. Relying on Redis persistence (like RDB snapshots) for your sole source of truth is risky; these are primarily for faster restarts and disaster recovery, not ACID compliance. The real strength is its ephemeral nature – data can and will be lost or evicted. Embrace that.

The next step after mastering these patterns is understanding how to effectively invalidate cache entries when the underlying data changes.
