Channels Aren't Message Passing — How Parked Goroutines OOM-Killed a Pod

DEV Community·Harrison Guo·19 days ago

#go #concurrency #goroutine #goroutines #queue #buffer

It's 3am. The Kafka consumer pod that's been running cleanly for six weeks gets OOM-killed. Kubernetes restarts it. Five minutes later: OOM-killed again. Restart. OOM-killed a third time. By the fourth restart I've shelved the dashboard and started reading runtime/chan.go.

The code that died fit on one line:

    events := make(chan Event)

I want to tell you that line is the bug. It isn't. An unbuffered channel will happily backpressure a single producer: every send rendezvouses with a receiver, so the producer cannot run ahead. The channel did exactly what it was designed to do. What I had built around it didn't.

The Kafka consumer loop wrapped events <- parseEvent(msg) inside a go func(msg) { ... }(msg), spawning a fresh goroutine per inbound message. Every one of those goroutines blocked on send, parked on the channel's sendq list, and kept its stack and the parsed event alive in memory. The channel was the gravestone.…
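The difference between the two shapes described above can be reproduced in a few lines. What follows is a minimal sketch, not the author's code: the Event fields and the function names spawnPerMessage and sendInline are hypothetical, and no Kafka client is involved. It only illustrates that a goroutine per message pins one goroutine (plus its payload) for every unreceived send, while sending from the loop itself lets the unbuffered channel hold the producer back.

```go
package main

import (
	"fmt"
	"runtime"
)

// Event is a hypothetical stand-in for the article's parsed event.
type Event struct {
	Payload []byte
}

// spawnPerMessage mimics the pattern from the post: one goroutine per
// inbound message, each sending on an unbuffered channel that nobody is
// receiving from. It returns how many goroutines were added and the
// channel they are blocked on, so the caller can drain it.
func spawnPerMessage(n int) (int, chan Event) {
	events := make(chan Event)
	before := runtime.NumGoroutine()
	for i := 0; i < n; i++ {
		go func() {
			// Blocks until a receiver shows up; the goroutine's stack
			// and the payload stay reachable in the meantime.
			events <- Event{Payload: make([]byte, 1024)}
		}()
	}
	return runtime.NumGoroutine() - before, events
}

// sendInline sends from the producing loop itself, so each send waits
// for the single consumer. It returns the peak number of goroutines
// added while the loop was running.
func sendInline(n int) int {
	events := make(chan Event)
	done := make(chan struct{})
	go func() {
		defer close(done)
		for range events { // single consumer, discards events
		}
	}()
	before := runtime.NumGoroutine()
	peak := 0
	for i := 0; i < n; i++ {
		events <- Event{Payload: make([]byte, 1024)}
		if extra := runtime.NumGoroutine() - before; extra > peak {
			peak = extra
		}
	}
	close(events)
	<-done
	return peak
}

func main() {
	const n = 1000
	held, events := spawnPerMessage(n)
	fmt.Println("goroutines held by per-message sends:", held)
	for i := 0; i < held; i++ {
		<-events // drain so the senders can exit
	}
	fmt.Println("extra goroutines with inline sends:", sendInline(n))
}
```

In the first function the count grows linearly with the number of messages because nothing limits how many senders can be waiting at once. In the second, the loop can never be more than one event ahead of the consumer, which is the single-producer backpressure the post describes.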
