
War Story: Kubernetes 1.32 Node OOM Kill Caused Pod Eviction for 20 Minutes

DEV Community·ANKUSH CHOUDHARY JOHAL·about 1 month ago
#story #kubernetes #node #kill #memory #kubelet

At 14:17 UTC on March 12, 2024, a single Kubernetes 1.32 node in our production EU-West-1 cluster hit an OOM (out-of-memory) threshold. The result was 23 minutes of cascading pod evictions, a 14% drop in real-time user traffic, and $42k in SLA penalties before we stabilized the control plane.

Key Insights

Kubernetes 1.32's kubelet memory accounting for containerd 2.0.0-rc.1 undercounts shared page cache by 18-22% in high-IOPS workloads, leading to silent OOM risks.…
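To get a feel for what an 18-22% undercount could mean in bytes, here is a rough back-of-the-envelope sketch. It assumes "undercounts by X%" means the reported shared page cache equals the real value times (1 - X); that reading is our assumption, not something the kubelet or containerd documents. The function names `estimate_actual_page_cache` and `hidden_bytes_range` are hypothetical helpers made up for this illustration.

```python
def estimate_actual_page_cache(reported_bytes: float, undercount_fraction: float) -> float:
    """Estimate real page cache, assuming reported = actual * (1 - undercount_fraction)."""
    if not 0.0 <= undercount_fraction < 1.0:
        raise ValueError("undercount_fraction must be in [0, 1)")
    return reported_bytes / (1.0 - undercount_fraction)


def hidden_bytes_range(reported_bytes: float, low: float = 0.18, high: float = 0.22):
    """Return (min, max) bytes that may be missing from the reported figure.

    The 0.18 and 0.22 defaults are the 18-22% range quoted in the post.
    """
    hidden_low = estimate_actual_page_cache(reported_bytes, low) - reported_bytes
    hidden_high = estimate_actual_page_cache(reported_bytes, high) - reported_bytes
    return hidden_low, hidden_high


if __name__ == "__main__":
    gib = 1024 ** 3
    reported = 4 * gib  # illustrative value only, not a figure from the incident
    lo, hi = hidden_bytes_range(reported)
    print(f"unaccounted page cache: {lo / gib:.2f} to {hi / gib:.2f} GiB")
```

Under that assumption, the unaccounted memory grows in proportion to the reported page cache, so the gap is largest on exactly the high-IOPS nodes where headroom is already tight.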
