Submission for the Gemma 4 DEV Challenge.

Gemma 4 ships in four sizes: 2B and 4B for edge and mobile, a 26B Mixture-of-Experts model, and a 31B dense model for servers. The big two are great. The small two are interesting for a different reason: they run on a laptop with no API key and no rate limit.

The catch with small open models is reliability. Pointing a 2B model at a real task and asking for clean JSON, well-formed tool calls, and bounded behavior is where most demos fall apart. The model wraps JSON in "Sure, here you go". It hallucinates a number where a string was wanted. It decides to fetch a URL you never wrote down.

I spent the last couple of weeks shipping a stack of five tiny, zero-dependency Node libraries that fix exactly these failure modes around any LLM. The challenge gave me a reason to wire them up around Gemma 4's edge 2B model (gemma4:e2b) running on Ollama and see how far a small model goes when the surrounding scaffolding is right. Spoiler: pretty far.…
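To make the first two failure modes concrete, here is a minimal sketch in plain Node with no dependencies. It is not the API of any of the five libraries; `extractJson` and `checkShape` are hypothetical names made up for this illustration, and the sample reply is invented. The idea is to pull the first balanced JSON object out of a chatty reply, then flag fields whose type is not what was asked for.

```javascript
// Hypothetical helpers for illustration only; not taken from any library in this post.

// Return the first balanced {...} span in `text`, parsed as JSON, or null.
function extractJson(text) {
  const start = text.indexOf("{");
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      // Braces inside string values must not affect the depth counter.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) {
        try {
          return JSON.parse(text.slice(start, i + 1));
        } catch {
          return null; // balanced braces, but not valid JSON
        }
      }
    }
  }
  return null; // braces never balanced
}

// Compare each field's typeof against the expected type name in `shape`.
function checkShape(obj, shape) {
  if (obj === null || typeof obj !== "object") return ["not an object"];
  const errors = [];
  for (const [key, type] of Object.entries(shape)) {
    if (typeof obj[key] !== type) {
      errors.push(`${key}: expected ${type}, got ${typeof obj[key]}`);
    }
  }
  return errors;
}

// Invented sample reply: JSON wrapped in chatter, with a number where a string was wanted.
const reply = 'Sure, here you go: {"city": "Oslo", "zip": 150} Hope that helps!';
const parsed = extractJson(reply);
const errors = checkShape(parsed, { city: "string", zip: "string" });
console.log(parsed, errors);
```

A non-empty `errors` array is the signal to retry or reject the reply instead of passing a wrongly typed value downstream. The sketch only handles a top-level object and flat fields; nested schemas and arrays would need more than this.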