Last post we stood up Ollama on the RTX 5090, pulled a stack of models, and wired them into our coding workflow. The whole time there was an obvious question hanging over it: are local models actually good enough?

Not good enough in the abstract benchmarks-on-a-leaderboard sense. Good enough for the thing we're journaling: vibe coding. Specifically, can a model running on consumer hardware in my homelab produce code that's as correct, as fast, and as complete as what comes back from Anthropic's cloud?

We built a benchmark to find out.

The Setup

Six models, one prompt, no second chances.…