Why Your Binary Protocol Should Care About CPU Cache Lines

DEV Community · speed engineer · about 1 month ago

If you've ever designed a custom binary protocol for a hot path (a game server, a market-data feed, an internal RPC), you've probably obsessed over byte layout, alignment, and zero-copy parsing. There's one detail most tutorials skip that can quietly cost you 2-5x throughput: cache line alignment.

The 64-byte secret

Modern CPUs don't read memory one byte at a time. They read in chunks called cache lines, typically 64 bytes on x86_64 and most ARM cores. Every load that misses L1 pulls in a full cache line. Every store that has to be visible to other cores invalidates their copies of that cache line.

If your protocol's "hot fields" (the bits the receiver reads first and most often) straddle the boundary between two cache lines, every read of them touches two lines instead of one, and you just doubled your memory traffic for free.

A worked example

Picture a naive market-data tick struct: a uint8_t type tag, a uint64_t timestamp, a uint32_t symbol id, an 8-byte price, an 8-byte sequence number, and an 8-bit flags field.…
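To make the arithmetic behind that struct concrete, here is a minimal sketch in Python. It assumes 64-byte cache lines, the usual C layout rule (each scalar aligned to its own size, the struct padded to a multiple of its widest member), and an array of records that starts on a line boundary. The field names and the helpers natural_layout, straddles and count_straddlers are hypothetical, invented for illustration. Padding each record to a full line is shown only as one common remedy.

```python
CACHE_LINE = 64  # assumption: 64-byte lines, as on most x86_64 parts

# (name, size in bytes), in the order the fields are listed above.
# Names are hypothetical; only the types and sizes come from the text.
NAIVE_TICK = [
    ("type", 1),
    ("timestamp", 8),
    ("symbol_id", 4),
    ("price", 8),
    ("sequence", 8),
    ("flags", 1),
]


def natural_layout(fields):
    """Return ({name: offset}, total_size) under the usual C rule:
    each scalar is aligned to its own size, and the struct is padded
    to a multiple of its widest member."""
    offsets = {}
    cursor = 0
    widest = 1
    for name, size in fields:
        cursor = -(-cursor // size) * size  # round up to a multiple of size
        offsets[name] = cursor
        cursor += size
        widest = max(widest, size)
    total = -(-cursor // widest) * widest  # trailing padding
    return offsets, total


def straddles(start, size, line=CACHE_LINE):
    """True if bytes [start, start + size) touch more than one cache line."""
    return start // line != (start + size - 1) // line


def count_straddlers(record_size, count, line=CACHE_LINE):
    """How many of `count` back-to-back records cross a line boundary,
    assuming the array itself starts on a line boundary."""
    return sum(straddles(i * record_size, record_size, line) for i in range(count))


offsets, size = natural_layout(NAIVE_TICK)
packed_size = sum(field_size for _, field_size in NAIVE_TICK)

print("offsets:", offsets)
print("natural size:", size, "payload bytes:", packed_size)
print("straddlers per 1000, natural layout:", count_straddlers(size, 1000))
print("straddlers per 1000, padded to one line:", count_straddlers(CACHE_LINE, 1000))
```

Under these assumptions the struct occupies 48 bytes for 30 bytes of payload, and half of the records in a contiguous array cross a cache line boundary. Padding each record to 64 bytes removes the straddling, at the cost of more memory per record.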
