On the Powers and Dangers of Caching

Posted: July 19, 2011
Operating systems, the internet, video games, social media – none of them would exist without significant hardware and software infrastructure devoted to caching. The same is true for us: We make enormous demands on a host of distributed caching solutions, some open source, some commercial, each carefully optimized for a particular workload.
A second line of defense is our web-tier content cache, which serves content to authenticated/connected users. This is a large but, all in all, straightforward caching system.
More interesting is our query result cache system, a multi-level distributed caching system that lazily replicates cache data from centralized memcached servers to in-memory caches on our web tiers. This enables us to handle cache system failures gracefully, so that cache restarts do not result in massive load spikes on our database clusters. Indeed, this is one of the main dangers of caching: When caching servers go down, rewarming those caches can drive back-breaking load on back-end systems, and so managing cache hydration and persistence is a key task for systems with 24×7 uptime.
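The multi-level lookup described above can be sketched as a read-through cache that checks a local in-memory tier before falling back to the remote tier and, only then, the database. This is a minimal illustration, not the actual implementation: the class and parameter names are invented, and `remote` stands in for a memcached-style client with `get`/`set` methods.

```python
import time

class TwoLevelCache:
    """Hedged sketch of a two-level read-through cache that lazily
    replicates values from a centralized remote cache into a small
    per-process cache. All names here are illustrative."""

    def __init__(self, remote, load_from_db, local_ttl=5.0):
        self.local = {}            # in-process tier: key -> (value, expires_at)
        self.remote = remote       # stands in for a memcached-style client
        self.load_from_db = load_from_db
        self.local_ttl = local_ttl

    def get(self, key):
        # 1. Local in-memory cache: no network round trip at all.
        hit = self.local.get(key)
        if hit is not None and hit[1] > time.time():
            return hit[0]
        # 2. Centralized remote cache.
        value = self.remote.get(key)
        if value is None:
            # 3. Miss everywhere: fall through to the database and
            # repopulate the remote tier for other web servers.
            value = self.load_from_db(key)
            self.remote.set(key, value)
        # Lazily replicate into the local tier on the way back out.
        self.local[key] = (value, time.time() + self.local_ttl)
        return value
```

Because the local tier absorbs repeated reads, a remote cache restart only sends one load per key per web server to the database, rather than one per request – which is the load-spike protection the paragraph above describes.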
Also interesting is our approach to cache locality: Since modern databases can serve single-row queries on clustered-index data incredibly quickly, improving object fetch performance is best done by avoiding network round trips, rather than query overhead. Therefore, we store query results in remote caches, but never objects, which, if they are cached at all, are always cached on the web tiers.
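One way to picture this locality policy is a pair of fetch paths: query results (say, lists of row ids) go through the remote cache, while individual objects are fetched by primary key and cached only in-process. The function and field names below are hypothetical, and the database stub is assumed to expose a fast single-row lookup.

```python
# Hedged sketch of the locality split: query results are cached
# remotely; objects are cached (if at all) only on the web tier.

_local_objects = {}  # web-tier object cache; never replicated remotely

def get_recent_post_ids(remote_cache, db, user_id):
    # Query *results* live in the remote cache, because re-running the
    # query is the expensive part.
    key = "recent-posts:%d" % user_id
    ids = remote_cache.get(key)
    if ids is None:
        ids = db.query_recent_post_ids(user_id)  # hypothetical query
        remote_cache.set(key, ids)
    return ids

def get_post(db, post_id):
    # A single-row clustered-index lookup is fast; the dominant cost is
    # the network round trip, so a remote cache buys little here.
    post = _local_objects.get(post_id)
    if post is None:
        post = db.fetch_post(post_id)  # hypothetical primary-key fetch
        _local_objects[post_id] = post
    return post
```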
Using these approaches, as well as many others, such as sophisticated ETag exchanges, we enable our infrastructure to scale to the tens of thousands of requests per second that our customer base drives at peak hours.
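As a closing illustration, an ETag exchange lets a client skip re-downloading an unchanged response entirely. The sketch below uses one common scheme – a strong ETag derived from a hash of the body – which is an assumption; the post does not say how its ETags are computed.

```python
import hashlib

def respond(content, if_none_match):
    """Minimal sketch of an ETag-based conditional response.

    content: response body as bytes.
    if_none_match: the client's If-None-Match header value, or None.
    Returns (status_code, etag, body)."""
    # One common scheme: a strong ETag from a hash of the body
    # (an assumption here, not the article's stated method).
    etag = '"' + hashlib.sha1(content).hexdigest() + '"'
    if if_none_match == etag:
        # Client's cached copy is still fresh: 304 Not Modified, no body.
        return 304, etag, b""
    return 200, etag, content
```

On the second request the client echoes the ETag back in If-None-Match, and the server answers with a bodiless 304, saving the bandwidth of the full response.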