
War Story: Scaling Our PostgreSQL 17 Cluster to 10TB for 100M+ Users

DEV Community·ANKUSH CHOUDHARY JOHAL·about 1 month ago

At 3:17 AM on a Tuesday in October 2024, our PostgreSQL 17 primary node hit 100% CPU, p99 API latency spiked to 4.8 seconds, and 12% of requests for our 100M+ active users started returning 503 errors. We had 10TB of data, 400k writes per second, and a team of 5 backend engineers who hadn’t slept in 36 hours. Here’s how we fixed it, scaled to handle 2x traffic, and cut our cloud bill by $42k/month.

Key Insights

PostgreSQL 17’s native columnar storage reduces analytical query latency by 78% for 10TB+ datasets compared to PG15
pgBouncer 1.23 with transaction pooling cuts idle connection overhead by 62% for 400k+ writes/sec workloads…
