On the Powers and Dangers of Caching
Posted: July 19, 2011 Filed under: Engineering
Operating systems, the internet, video games, social media – none of them would exist without significant hardware and software infrastructure devoted to caching. The same is true for us: we make enormous demands on a host of distributed caching solutions, some open source, some commercial, all carefully optimized for a particular workload.
Our front line of defense is our content distribution system, which replicates statically cacheable content out to network edges for speedy delivery. Automated tasks in our system determine which pieces of content are globally usable and push them out to the edges. Our JavaScript infrastructure does the same in browsers, pulling in data from the correct places at the right times for the best performance. This allows us to handle enormous traffic spikes.
A second line of defense is our web-tier content cache, which serves content to authenticated/connected users. This is a large but, all in all, straightforward caching system.
More interesting is our query result cache system, a multi-level distributed caching system that lazily replicates cache data from centralized memcached servers to in-memory caches on our web tiers. This enables us to handle cache system failures gracefully, so that cache restarts do not result in massive load spikes on our database clusters. Indeed, this is one of the main dangers of caching: When caching servers go down, rewarming those caches can drive back-breaking load on back-end systems, and so managing cache hydration and persistence is a key task for systems with 24×7 uptime.
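To make the multi-level idea concrete, here is a minimal sketch of a read-through lookup that checks an in-process cache first, then a remote cache, and only then the database. It is an illustration under stated assumptions, not our production code: the class name `TwoLevelCache`, the `local_ttl` parameter, and the plain dict standing in for a memcached client are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TwoLevelCache:
    """Read-through lookup: in-process cache -> remote cache -> database."""

    def __init__(
        self,
        remote: Dict[str, Any],
        load_from_db: Callable[[str], Any],
        local_ttl: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.remote = remote              # stand-in for a memcached client (assumption)
        self.load_from_db = load_from_db  # the expensive back-end query
        self.local_ttl = local_ttl        # seconds a local copy stays valid (assumption)
        self.clock = clock
        self.local: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Any:
        now = self.clock()
        entry = self.local.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]               # served without any network round trip
        if key in self.remote:
            value = self.remote[key]      # lazily replicate remote -> local
        else:
            value = self.load_from_db(key)
            self.remote[key] = value      # repopulate the remote tier
        self.local[key] = (now + self.local_ttl, value)
        return value


if __name__ == "__main__":
    db_calls = []

    def load(key: str) -> str:
        db_calls.append(key)
        return key.upper()

    remote_store: Dict[str, Any] = {}
    cache = TwoLevelCache(remote_store, load)
    cache.get("q1")
    remote_store.clear()                  # simulate a remote cache restart
    cache.get("q1")                       # still answered from the local copy
    print(len(db_calls))
```

In this sketch, wiping the remote store does not send the second lookup to the database, because the web-tier copy is still valid. That is the property that keeps a cache restart from turning into a load spike.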
Also interesting is our approach to cache locality: Since modern databases can serve single-row queries on clustered-index data incredibly quickly, improving object fetch performance is best done by avoiding network round trips rather than by reducing query overhead. Therefore, we store query results in remote caches but never objects; objects, if they are cached at all, are always cached on the web tiers.
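The split described above can be sketched roughly as follows. This is only an illustration, and it assumes that a cached query result is a list of primary keys, which the text does not state. The names `LocalityAwareFetcher`, `run_query`, and `fetch_row` are hypothetical, and a dict again stands in for the remote cache client.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class LocalityAwareFetcher:
    """Query results live in the remote cache; objects live only on this web node."""

    def __init__(
        self,
        remote_query_cache: Dict[str, List[int]],
        run_query: Callable[[str], List[int]],
        fetch_row: Callable[[int], Row],
    ) -> None:
        self.remote_query_cache = remote_query_cache  # shared across web nodes (assumption)
        self.run_query = run_query                    # expensive query returning ids
        self.fetch_row = fetch_row                    # cheap single-row lookup by primary key
        self.local_objects: Dict[int, Row] = {}       # per-node object cache

    def query(self, query_key: str) -> List[Row]:
        ids = self.remote_query_cache.get(query_key)
        if ids is None:
            ids = self.run_query(query_key)
            self.remote_query_cache[query_key] = ids  # only ids are stored remotely
        return [self.get_object(object_id) for object_id in ids]

    def get_object(self, object_id: int) -> Row:
        row = self.local_objects.get(object_id)
        if row is None:
            row = self.fetch_row(object_id)           # one fast clustered-index read
            self.local_objects[object_id] = row       # never written to the remote cache
        return row
```

The design choice this reflects: an object served from the web tier costs no network round trip at all, while an object in a remote cache costs about as many round trips as asking the database for the row directly.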
Using these approaches, as well as many others, such as sophisticated ETag exchanges, we enable our infrastructure to scale to the tens of thousands of requests per second that our customer base drives at peak hours.
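For readers unfamiliar with ETags, the basic exchange looks roughly like the sketch below. It shows only the simplest case, an exact match against a single `If-None-Match` value, and is not a description of our actual, more involved exchanges. The function names `make_etag` and `respond` and the hash-based tag are assumptions made for illustration.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong validator from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET, honoring If-None-Match.

    Simplification: a real server must also handle comma-separated lists,
    weak validators (W/"..."), and the "*" wildcard.
    """
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""   # client copy is current, send no body
    return 200, {"ETag": etag}, body


if __name__ == "__main__":
    status, headers, _ = respond(b"hello", {})
    revalidated, _, payload = respond(b"hello", {"If-None-Match": headers["ETag"]})
    print(status, revalidated, len(payload))
```

The first request returns the full body with a tag. A repeat request that presents the same tag gets an empty 304 response, which saves bandwidth and rendering work even when the request itself cannot be avoided.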

