Two days ago, Gemma 4 topped our local model benchmark: 167 tokens per second, a perfect code quality score, and the smallest download. Faster than Sonnet. Faster than Opus. The blog post said "Gemma 4 is the new default." Today we tested whether that's actually true.

The Experiment

Instead of another toy benchmark, we pulled a real item off the vibescoder.dev backlog: public-facing search across all blog posts. It is a multi-file feature that requires architectural decisions and design system integration, with no specification beyond "make search work."

Two models. Same prompt. Same codebase. Same workspace template. One shot: no follow-up instructions, no hand-holding. Walk away and see what happens.

| | Gemma 4 27B | Opus 4.6 |
| --- | --- | --- |
| Provider | Ollama (local, RTX 5090) | Anthropic API (cloud) |
| Benchmark speed | 167.1 tok/s | 74.3 tok/s |
| Benchmark score | 100/100 | 100/100 |
| Cost | $0 | Per-token pricing |

The prompt was deliberately vague on implementation details:

Add public-facing search to vibescoder.dev.…