Menu

Post image 1
Post image 2
1 / 2
0

Postmortem: How a PostgreSQL 17 VACUUM Misconfiguration Crashed Our 2026 Prime Day Checkout

DEV Community·ANKUSH CHOUDHARY JOHAL·28 days ago
#J0VQU3ne
Reading 0:00
15s threshold

At 14:07 UTC on July 15, 2026, our checkout service’s p99 latency spiked from 112ms to 14.2 seconds, then flatlined into a total outage that cost us $4.7M in lost revenue during the first 2 hours of Prime Day. The root cause? A single misconfigured PostgreSQL 17 VACUUM parameter that we’d shipped in a routine maintenance window 3 days prior. 📡 Hacker News Top Stories Right Now Async Rust never left the MVP state (226 points) Should I Run Plain Docker Compose in Production in 2026? (86 points) When everyone has AI and the company still learns nothing (52 points) Bun is being ported from Zig to Rust (571 points) Empty Screenings – Finds AMC movie screenings with few or no tickets sold (176 points) Key Insights PostgreSQL 17’s new autovacuum_vacuum_insert_threshold=0 default caused 14x higher VACUUM throughput on write-heavy tables We reproduced the crash using the official https://github.com/postgres/postgres benchmark suite with 10k concurrent checkout transactions Misconfiguration cost $4.7M in direct…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Read More