Kafka on Kubernetes: Performance Lessons for Any Disk-Heavy Data Service


DEV Community·Jacob Amar·23 days ago


#why #kafka #fix #memory #reclaim #disk


We recently started migrating Kafka clusters from EC2 to EKS using Strimzi. The goal was not to chase new features, but to reduce the operational overhead of running large stateful clusters by hand. Upgrades, configuration changes, instance-family replacements, and failure recovery all required too much manual coordination on EC2.

We wanted a model that gave us:

- Declarative configuration
- Simpler upgrades
- Easier infrastructure changes
- Better self-healing
- Less day-to-day operational toil

That part worked. What did not work was the performance profile after the migration. As soon as we moved the first cluster, we saw persistent disk reads across the brokers and higher latency than we expected on comparable hardware.

Why That Was a Problem

For Kafka, disk reads are not just a storage detail. In normal operation, when consumers stay near the head of the log and the brokers have enough memory, hot data should usually be served from page cache instead of disk.…
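One rough way to check whether a broker host has room to serve hot data from page cache is to look at how much memory the kernel is currently using as cache. The sketch below is not taken from the article: it is a minimal illustration that parses Linux `/proc/meminfo` text and reports `Cached` as a share of `MemTotal`. The helper names and the sample values are hypothetical, and inside a container `/proc/meminfo` typically reflects the node rather than the pod's cgroup, so treat the result as a node-level hint only.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}; most values are in kB."""
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name: <integer> [unit]" lines.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_cache_share(fields: dict[str, int]) -> float:
    """Fraction of total memory currently used as page cache (Cached / MemTotal)."""
    return fields["Cached"] / fields["MemTotal"]


# Hypothetical sample used when /proc/meminfo is unavailable (e.g. on macOS).
SAMPLE = """\
MemTotal:       65536000 kB
MemFree:         2048000 kB
Cached:         49152000 kB
HugePages_Total:       0
"""

if __name__ == "__main__":
    path = Path("/proc/meminfo")
    text = path.read_text() if path.exists() else SAMPLE
    share = page_cache_share(parse_meminfo(text))
    print(f"page cache share of total memory: {share:.2%}")
```

A low or shrinking share on a broker that should be serving consumers near the head of the log is one signal worth comparing against observed disk read rates. It does not by itself explain why the cache is small.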
