Most model experiments start with a notebook, a benchmark script, or a quick API call. This one started with a production-shaped question:

Can I swap out an entire model family that is currently serving the default paths through my actual local AI gateway?

Not a side demo. Not a one-off curl. Not "look, it runs." I mean the real route: the gateway that agents, background jobs, app surfaces, benchmark harnesses, and my own tools already call.

That is the experiment I started with Gemma 4.

This post is the beginning of that story, not the final verdict. I am writing it while the platform is still in the trial window. The follow-up will be more interesting: what stayed stable, what broke under real load, what got rolled back, and what I would keep after a week or two of actual use.

For now, this is the setup: what I changed, why I changed it, and what failed immediately.

The Platform Before The Swap

My local AI stack is built around a gateway I call Forge.…