"What's the best model?" used to be a meaningful question. Today it has the wrong shape. The useful version is: best for which workload? In the pipeline I migrated last month, three workloads ended up on three different models. None of those would have been my answer if you'd asked "what's the best model overall." Each one is the right answer to a more specific question. Field extraction wants reliability We pull structured fields from customer support tickets and route them into a workflow tool. Accuracy matters more than anything because a wrong field silently corrupts a downstream system. Speed barely matters; cost matters somewhat. Landed on GPT-4o-mini . Boring, reliable, structured-output friendly. Cheap enough that we don't think about it. Trying to use a smarter model here would be a category error — the failure mode isn't "the model isn't smart enough," it's "the model occasionally hallucinates a field name that breaks the schema validator." Smaller, more boring models hallucinate less.…