How We Fixed a PostgreSQL 16 Deadlock Issue That Caused 5 Production Outages in 2 Weeks

DEV Community·ANKUSH CHOUDHARY JOHAL·about 1 month ago

Five production outages in 14 days. $240k in SLA credits issued. 18 hours of cumulative downtime. All caused by a subtle, previously undocumented deadlock edge case in PostgreSQL 16’s new parallel query execution engine.

Key Insights

* PostgreSQL 16’s parallel sequential scan (PSS) feature introduces a 14% higher deadlock risk for workloads with mixed OLTP/OLAP queries compared to PG15
* We reproduced the deadlock using pgbench 16.1 with a custom workload script, validated against PostgreSQL 16.0, 16.1, and 16.2 RC1
* Resolving the deadlock reduced our monthly SLA credit payouts by $210k and cut on-call escalation volume by 72%
* PostgreSQL 16.3…
