If you run a multi-agent workflow (LangChain with fallbacks, CrewAI with different models per agent, AutoGen, or anything where someone, maybe past-you, configured model routing), this post is for you.

Here's what the logs showed:

    [agent] Starting document analysis...
    [llm] Response received (200 OK)
    [llm] Response received (200 OK)
    [llm] Response received (200 OK)
    [llm] Response received (200 OK)
    [llm] Response received (200 OK)
    [llm] Response received (200 OK)
    [llm] Response received (200 OK)
    [llm] Response received (200 OK)
    [agent] Task complete.

Eight successes. Nothing to investigate.

Here's what actually happened:

    Model               Calls  Cost (USD)
    ─────────────────────────────────────
    claude-opus-4-6         3     $3.2325
    claude-sonnet-4-6       3     $0.2775
    claude-haiku-4-5        2     $0.0092
    ─────────────────────────────────────
    Total                         $3.5192

Three calls to Opus. 92% of the bill. The model= config said Haiku.…
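To make the arithmetic behind that 92% figure concrete, here is a minimal sketch that takes the per-model totals from the table above, computes each model's share of the bill, and flags any model that is not the configured one. The `ModelUsage` type, the `cost_report` helper, and the assumption that the configured model string is exactly `claude-haiku-4-5` are illustrative, not part of any framework's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelUsage:
    """Per-model totals, as in the cost table above (hypothetical type)."""
    model: str
    calls: int
    cost_usd: float


# Assumption: the model= config resolved to this exact identifier.
CONFIGURED_MODEL = "claude-haiku-4-5"

# Figures copied from the table in the post.
USAGE = [
    ModelUsage("claude-opus-4-6", 3, 3.2325),
    ModelUsage("claude-sonnet-4-6", 3, 0.2775),
    ModelUsage("claude-haiku-4-5", 2, 0.0092),
]


def cost_report(usage, configured_model):
    """Return (total_cost, rows), rows sorted by cost, most expensive first."""
    total = sum(u.cost_usd for u in usage)
    rows = []
    for u in sorted(usage, key=lambda u: u.cost_usd, reverse=True):
        rows.append({
            "model": u.model,
            "calls": u.calls,
            "cost_usd": u.cost_usd,
            # Fraction of the total bill attributable to this model.
            "share": u.cost_usd / total if total else 0.0,
            # True when the model that ran is not the one in the config.
            "unexpected": u.model != configured_model,
        })
    return total, rows


if __name__ == "__main__":
    total, rows = cost_report(USAGE, CONFIGURED_MODEL)
    for r in rows:
        flag = "  <-- not the configured model" if r["unexpected"] else ""
        print(f'{r["model"]:<20}{r["calls"]:>5}  ${r["cost_usd"]:.4f}  {r["share"]:6.1%}{flag}')
    print(f"Total: ${total:.4f}")
```

The point of the sketch is that a status code alone cannot surface this: the breakdown only appears once usage is grouped by the model that actually served each call, however your stack exposes that.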