Book: System Design Pocket Guide: Fundamentals Also by me: Database Playbook My project: Hermes IDE | GitHub — an IDE for developers who ship with Claude Code and other AI coding tools Me: xgabriel.com | GitHub A team I talked to last quarter had a Tuesday outage that lasted eight minutes. The storefront looked fine until checkout, which started returning 503s. Database CPU sat at 98%. The cache was healthy. The cache was, in fact, the problem. A single product page (the one being pushed in an email blast that morning) held a cached entry that expired at 09:00:00 sharp, and 14,000 concurrent requests proceeded to hammer Postgres to rebuild it. This is the canonical shape of a cache stampede . You have probably seen at least one of these. Most teams have all four somewhere in their stack and do not know it until the page goes red. Hole 1: Stampede on hot key invalidation The shape. A cache entry expires. N concurrent requests miss simultaneously. All N attempt to rebuild it from the underlying store.…