It's 3am. The Kafka consumer pod that's been running cleanly for six weeks gets OOM-killed. Kubernetes restarts it. Five minutes later: OOM-killed again. Restart. OOM-killed a third time. By the fourth restart I've shelved the dashboard and started reading runtime/chan.go.

The code that died fit on one line:

    events := make(chan Event)

I want to tell you that line is the bug. It isn't. An unbuffered channel will happily backpressure a single producer: every send is a rendezvous with a receiver, so the producer cannot run ahead. The channel did exactly what it was designed to do. What I had built around it didn't.

The Kafka consumer loop wrapped events <- parseEvent(msg) inside a go func(msg) { ... }(msg), spawning a fresh goroutine per inbound message. Every one of those goroutines blocked on send, parked on the channel's sendq list, and kept its stack and the parsed event alive in memory. The channel was the gravestone.…
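
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the production code: the `Payload` field on `Event`, the body of `parseEvent`, and the helpers `parkedSenders` and `inlinePeak` are all hypothetical, and a plain loop stands in for the Kafka consumer. The first helper spawns one goroutine per message with nobody receiving and counts how many end up stuck on the send. The second sends from the loop itself, which is the single-producer case where the unbuffered channel really does apply backpressure.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Event is a hypothetical stand-in for the parsed message.
type Event struct{ Payload []byte }

// parseEvent is a hypothetical parser; it just copies the bytes.
func parseEvent(msg []byte) Event {
	return Event{Payload: append([]byte(nil), msg...)}
}

// parkedSenders spawns one goroutine per message, each sending on an
// unbuffered channel that has no receiver yet. It returns how many
// goroutines are in flight once every one of them has started.
func parkedSenders(n int) int64 {
	events := make(chan Event) // unbuffered
	var inFlight int64
	var started sync.WaitGroup
	started.Add(n)
	for i := 0; i < n; i++ {
		msg := []byte("payload")
		go func(msg []byte) {
			atomic.AddInt64(&inFlight, 1)
			started.Done()
			events <- parseEvent(msg) // blocks: nobody is receiving
			atomic.AddInt64(&inFlight, -1)
		}(msg)
	}
	started.Wait() // every goroutine exists and holds its parsed event
	stuck := atomic.LoadInt64(&inFlight)
	for i := 0; i < n; i++ {
		<-events // drain so the sketch does not leak goroutines itself
	}
	return stuck
}

// inlinePeak sends from the producer loop directly and reports the
// largest number of parsed-but-undelivered events it ever held.
func inlinePeak(n int) int {
	events := make(chan Event)
	done := make(chan int)
	go func() {
		pending, peak := 0, 0
		for i := 0; i < n; i++ {
			ev := parseEvent([]byte("payload"))
			pending++
			if pending > peak {
				peak = pending
			}
			events <- ev // the loop cannot continue until a receiver takes it
			pending--
		}
		close(events)
		done <- peak
	}()
	for range events {
	}
	return <-done
}

func main() {
	fmt.Println("goroutines parked on send:", parkedSenders(1000)) // prints 1000
	fmt.Println("peak pending with inline send:", inlinePeak(1000)) // prints 1
}
```

In the first helper the number of stuck goroutines equals the number of messages, so memory grows with the inbound rate rather than with the consumer's pace. In the second, the producer never holds more than one undelivered event, no matter how many messages arrive.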