Everyone benchmarks models. Sonnet beats Haiku on reasoning. Opus beats Sonnet. Haiku is fastest. These things are all true. But benchmarking and deploying are different games.

At scale, the difference between Haiku at $0.80/million tokens and Sonnet at $3/million tokens isn't academic. It's $400+ monthly on a mid-size application. The trap is paying for capability you don't actually need because you never measured what you do need.

I built a router to answer one question: which tasks in my actual workflow could run on the cheapest model without failing? The answer surprised me. And I learned that the real value isn't the savings. It's the forcing function. You can't implement routing without auditing exactly where your complexity lives.

3 Things I Learned

1. Your Intuition About Task Complexity Is Backwards

You think something needs Sonnet. Your gut says: "this requires reasoning, obviously expensive model." So I measured. Content classification? Haiku handles 95% of real requests. Writing summaries? 88%.…
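To make the idea concrete, here is a minimal sketch of routing on measured pass rates rather than on gut feeling. It is not the router described in this post: the `TaskStats` type, the `pick_model` and `monthly_savings` helpers, the 0.90 threshold, and the task names are all hypothetical. Only the two per-million-token prices and the two pass rates come from the text above.

```python
from dataclasses import dataclass

# Per-million-token prices quoted in the post. Everything else here is illustrative.
PRICE_PER_MTOK = {"haiku": 0.80, "sonnet": 3.00}


@dataclass
class TaskStats:
    name: str
    haiku_pass_rate: float  # fraction of sampled real requests Haiku handled acceptably


def pick_model(stats: TaskStats, min_pass_rate: float = 0.90) -> str:
    """Use the cheap model only when its measured pass rate clears the bar.

    The 0.90 default is an assumed threshold, not a figure from the post.
    """
    return "haiku" if stats.haiku_pass_rate >= min_pass_rate else "sonnet"


def monthly_savings(tokens_per_month: int) -> float:
    """Dollars saved per month by moving this many tokens from Sonnet to Haiku."""
    delta = PRICE_PER_MTOK["sonnet"] - PRICE_PER_MTOK["haiku"]
    return tokens_per_month / 1_000_000 * delta


# Pass rates taken from the measurements mentioned above.
tasks = [
    TaskStats("content_classification", 0.95),
    TaskStats("summaries", 0.88),
]
routes = {t.name: pick_model(t) for t in tasks}
```

With the assumed 0.90 bar, classification goes to Haiku and summaries stay on Sonnet; a different bar gives a different split, which is exactly the decision the measurement forces you to make. As a sanity check on the pricing, `monthly_savings(200_000_000)` comes out to roughly $440, so the "$400+ monthly" figure corresponds to something on the order of 200 million tokens a month moved to the cheaper model.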