At 2:14 AM on a Tuesday, our FastAPI 0.115 API serving 4.2k requests per second (RPS) collapsed under a flash traffic spike. P99 latency spiked to 11.7 seconds, error rates hit 34%, and our on-call engineer was 3 hours into a 12-hour shift. We had 48 hours to rebuild the core request pipeline to handle 10k RPS sustained, or we’d breach our SLA with a Fortune 500 client and lose $2.1M in annual recurring revenue. This is how we did it with Redis 8, AWS Graviton4 instances, and zero downtime. 📡 Hacker News Top Stories Right Now Where the goblins came from (647 points) Noctua releases official 3D CAD models for its cooling fans (253 points) Zed 1.0 (1866 points) The Zig project's rationale for their anti-AI contribution policy (298 points) Mozilla's Opposition to Chrome's Prompt API (82 points) Key Insights Graviton4 (c7g.2xlarge) instances delivered 37% higher throughput per dollar than equivalent x86 (c6i.2xlarge) nodes for FastAPI workloads.…