
Caching

KASH-ing

Storing a copy of data in a faster location so repeated requests do not hit the slower original source.

Caching stores frequently accessed data in a faster location so you do not have to fetch it from the original source every time. Your database query takes 200ms. Your cache returns the same result in 2ms. For data that does not change every second, caching is one of the simplest and most effective performance improvements you can make.
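A minimal sketch of the idea: a dict stands in for the cache and a counter-instrumented function stands in for the slow database query. All names here are illustrative, not from any real system.

```python
db_calls = 0

def query_database(key):
    """Stand-in for the slow original source (the 200ms query)."""
    global db_calls
    db_calls += 1
    return f"result-for-{key}"

cache = {}

def get(key):
    if key in cache:             # hit: the fast path
        return cache[key]
    value = query_database(key)  # miss: fall back to the slow source
    cache[key] = value           # store a copy for the next request
    return value

get("product:42")  # first request misses and queries the database
get("product:42")  # repeat request is served from the cache
```

After both calls, the database has been queried exactly once; every further request for the same key is a cache hit.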

Caches exist at every layer of a system. Browser caches store static assets locally. CDN caches store content at the network edge. Application caches (Redis, Memcached) store computed results in memory. Database query caches store the results of frequently run queries. Each layer reduces load on the layer behind it. A well-cached system might serve 95% of requests without ever touching the database.

The hard part of caching is invalidation. When the underlying data changes, the cache needs to reflect the update. Cache-aside (the application checks the cache first, falls back to the database, then populates the cache) is the most common pattern. TTL (time-to-live) expiration is the simplest strategy: cached data expires after a set duration. The right TTL depends on how stale your data can be. User profile data might tolerate a 5-minute cache. Stock prices cannot tolerate any cache at all.

Examples

An engineering team adds Redis caching to reduce database load.

The product catalog API makes the same database query 50,000 times per hour. Each query takes 150ms. The team adds a Redis cache with a 60-second TTL. Cache hit rate reaches 98%. Average response time drops from 150ms to 4ms. Database CPU usage drops from 80% to 15%. The team delays a costly database upgrade by a year.

A cache invalidation bug causes stale data.

Users update their display names but the old names keep showing for other users. The profile cache has a 30-minute TTL and no invalidation on write. The team adds cache invalidation: when a user updates their profile, the cache entry is deleted immediately. The next request populates a fresh cache entry. Stale data disappears.
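The fix in that scenario can be sketched in a few lines, with dicts standing in for the database and the profile cache (the names are illustrative): deleting the cache entry on every write means the next read repopulates it with fresh data.

```python
profiles_db = {"user:1": "Alice"}  # stand-in for the source of truth
profile_cache = {}

def get_display_name(user_id):
    if user_id in profile_cache:
        return profile_cache[user_id]
    name = profiles_db[user_id]      # slow path: the real database
    profile_cache[user_id] = name
    return name

def update_display_name(user_id, new_name):
    profiles_db[user_id] = new_name
    profile_cache.pop(user_id, None)  # invalidate immediately on write

get_display_name("user:1")                # populates the cache with "Alice"
update_display_name("user:1", "Alicia")   # write-through plus invalidation
```

Without the `pop`, readers would keep seeing "Alice" until the TTL expired; with it, the very next read returns "Alicia".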

A team uses multi-layer caching for a content platform.

Blog posts are cached at three layers: CDN (1-hour TTL for static HTML), application cache in Redis (5-minute TTL for API responses), and database query cache (30-second TTL for content queries). A new post publishes and is visible within 30 seconds via the API. The static HTML updates within an hour or when the team manually purges the CDN cache.
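The three-layer lookup described above can be sketched as a fall-through, assuming each layer is a dict of key to (value, expires_at) with its own TTL; a miss falls through to the next layer and every missed layer is repopulated on the way back. Layer names and TTLs mirror the example; everything else is illustrative.

```python
import time

LAYERS = [("cdn", 3600), ("redis", 300), ("query_cache", 30)]
stores = {name: {} for name, _ in LAYERS}

def render_from_database(key):
    """Stand-in for the real content query."""
    return f"content-for-{key}"

def get_content(key, now=None):
    now = time.time() if now is None else now
    missed = []
    for name, ttl in LAYERS:
        entry = stores[name].get(key)
        if entry and now < entry[1]:
            value = entry[0]        # hit at this layer: stop falling through
            break
        missed.append((name, ttl))
    else:
        value = render_from_database(key)  # missed every layer
    for name, ttl in missed:        # repopulate each layer that missed
        stores[name][key] = (value, now + ttl)
    return value

get_content("post:1", now=0.0)  # miss everywhere: renders and fills all three
```

Note the asymmetric TTLs: a fresh publish propagates through the 30-second query cache first, while the CDN copy can lag up to an hour unless purged, matching the behavior described above.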

Frequently asked questions

What does 'cache invalidation is hard' mean?

It comes from a famous line attributed to Phil Karlton: "There are only two hard things in computer science: cache invalidation and naming things." The problem: when your source data changes, your cache still holds the old version. Invalidating too aggressively defeats the purpose of caching. Invalidating too slowly serves stale data. Getting it exactly right requires understanding your data's update frequency, your users' tolerance for staleness, and every code path that modifies the source data. Most caching bugs are invalidation bugs.

When should you not use caching?

Do not cache data that must always be current (financial transactions, inventory counts near zero). Do not cache data that is unique per request (search results with many filter combinations). Do not add caching before you understand your performance bottleneck. Caching the wrong thing adds complexity without improving performance. Measure first, cache second.
