I remember the first time I tried OpenAI’s o1. I asked it a gnarly infrastructure question: “Design a multi‑region, strongly consistent queue that survives a full AWS region outage.” It paused for ten seconds. Then it gave me a brilliant, cautious, self‑corrected answer. I was blown away.

Then I saw the price. And the rate limits. And the fact that I couldn’t see why it rejected certain paths.

That’s when a thought hit me – not a breakthrough, but an old, boring, beautiful cloud pattern: Map‑Reduce. Because here’s the secret no AI lab will tell you: reasoning is just search. And search loves parallelism.

You don’t need o1. You need 50 cheap LLMs running in parallel, one judge, and AWS Step Functions.

Let me show you exactly how we built a “bring‑your‑own‑o1” engine. It costs 25 cents per hard question and runs in under 15 seconds.

The “Aha” Moment: Why One Model Fails

A single LLM is a brilliant guesser, but it only gets one shot.…
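
To make the contrast with a single shot concrete, here is a minimal sketch of the fan-out-and-judge pattern: many candidates generated in parallel (the map step), then one judge picking a winner (the reduce step). Everything in it is an assumption for illustration. `generate_candidate` and `judge_score` are hypothetical stubs standing in for real model calls, and a local thread pool stands in for the AWS Step Functions fan-out, so this shows the shape of the idea and not the actual engine.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Candidate:
    seed: int   # which parallel attempt produced this answer
    text: str   # the answer text


def generate_candidate(question: str, seed: int) -> Candidate:
    """Hypothetical stub for one cheap LLM call (map step).

    A real version would call a model with a nonzero temperature
    so that each attempt explores a different path.
    """
    return Candidate(seed=seed, text=f"attempt {seed} at: {question}")


def judge_score(question: str, candidate: Candidate) -> float:
    """Hypothetical stub for the judge model.

    Returns a deterministic pseudo-score derived from the seed; a real
    judge would read the candidate text and grade it against the question.
    """
    return random.Random(candidate.seed).random()


def solve(question: str, n_candidates: int = 50, max_workers: int = 16) -> Candidate:
    """Fan out n_candidates attempts in parallel, then keep the judge's favorite."""
    # Map: run the cheap generators concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        candidates = list(
            pool.map(lambda s: generate_candidate(question, s), range(n_candidates))
        )
    # Reduce: a single judge ranks all candidates and picks one.
    return max(candidates, key=lambda c: judge_score(question, c))


if __name__ == "__main__":
    best = solve("Design a multi-region, strongly consistent queue")
    print(best.text)
```

Swapping the stubs for real model calls leaves `solve` unchanged: the parallel attempts are independent of each other, which is why the pattern maps cleanly onto a fan-out orchestrator.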