Structure Before Bytes: How Metarc Beats tar+zstd on Real Code

DEV Community·arhuman·27 days ago

tar+zstd is very hard to beat. So I stopped trying to beat it as a byte compressor. Instead, I tried something else: compress the structure first, then compress the bytes.

That is the idea behind Metarc, a small experimental archiver written in Go. It explores what I call metacompression: reducing structural and semantic redundancy across a source tree before applying a standard compressor such as zstd.

And on my current source-code benchmark corpus, Metarc is now smaller than tar+zstd on every tested repository.

The problem: tar sees a stream, not a project

Traditional archive pipelines are usually built around a simple idea:

    directory tree → tar stream → compressor

This is robust, portable, simple, and battle-tested. But it also means that by the time compression starts, a rich source tree has already been flattened into a byte stream.

A source-code repository is not just bytes.…
