TL;DR: The thing that broke my mental model first wasn't slow queries. It was watching disk I/O climb to 95% utilization on NVMe drives while average query latency jumped from 12ms to 340ms on a corpus I'd carefully tuned for months. We were running Elasticsearch 8.

📖 Reading time: ~41 min

What's in this article

The Problem I Kept Running Into: Index Bloat at Scale
Quick Primer: What an Inverted Index Actually Stores
The Core Algorithms You'll Actually Encounter
How Lucene 9.x Actually Picks a Codec
Elasticsearch 8.x: Configuring Compression in Practice
Apache Solr: Where the Controls Are More Exposed
Tantivy (Rust): A Different Approach Worth Knowing
Benchmarking Compression Tradeoffs: What I Actually Measured

The Problem I Kept Running Into: Index Bloat at Scale

The thing that broke my mental model first wasn't slow queries. It was watching disk I/O climb to 95% utilization on NVMe drives while average query latency jumped from 12ms to 340ms on a corpus I'd carefully tuned for months.…