1-bit, 545 megabytes, zero API keys: local AI that beats GPT-5.4

By Vilius Vystartas | May 2026

I ran the same 10 agent coding tasks against 8 locally-running models on my Mac. No cloud, no API keys, no per-token billing. The results surprised me enough that I ran them twice.

The leaderboard

| Model | Bits | Size | Score | Time |
| --- | --- | --- | --- | --- |
| Qwen 3.5 9B | 4-bit | ~5GB | 83% | 190s |
| AgenticQwen 8B | 4-bit | ~5GB | 82% | 189s |
| Bonsai 4B | 1-bit | 545MB | 80% | 18s |
| Ternary Bonsai 1.7B | 2-bit | 442MB | 80% | 10s |
| Bonsai 8B | 1-bit | 1.1GB | 80% | 15s |
| Ternary Bonsai 4B | 2-bit | 1.0GB | 80% | 20s |
| Ternary Bonsai 8B | 2-bit | 2.1GB | 78% | 22s |
| Bonsai 1.7B | 1-bit | 237MB | 73% | 8s |

A 545MB model beats GPT-5.4

Bonsai 4B at 1-bit quantization scores 80% on the same tasks where GPT-5.4 scored 75%. Half a gigabyte. No data center. Your laptop processes every request locally, with no network round trip. By the Time column it is also roughly 10x faster than the Qwen models (18s against about 190s), because there's less to compute.

4-bit controls tie Claude

The 4-bit Qwen models at ~5GB score 82-83%, matching Claude Sonnet 4's cloud performance. On a Mac. These aren't toys.…
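
The size and speed trade-offs in the leaderboard are easy to check by hand. The following is a minimal sketch, not the benchmark harness itself: it copies the leaderboard rows into a list and derives two illustrative ratios, the speedup relative to Qwen 3.5 9B and the score points per gigabyte. The names `ROWS` and `summarize` are hypothetical, and the "~5GB" sizes are assumed to be exactly 5000 MB for the sake of the arithmetic.

```python
# Leaderboard rows copied from the article: (model, size in MB, score %, time in s).
# Assumption: sizes listed as "~5GB" are treated as 5000 MB, and 1 GB = 1000 MB.
ROWS = [
    ("Qwen 3.5 9B", 5000, 83, 190),
    ("AgenticQwen 8B", 5000, 82, 189),
    ("Bonsai 4B", 545, 80, 18),
    ("Ternary Bonsai 1.7B", 442, 80, 10),
    ("Bonsai 8B", 1100, 80, 15),
    ("Ternary Bonsai 4B", 1000, 80, 20),
    ("Ternary Bonsai 8B", 2100, 78, 22),
    ("Bonsai 1.7B", 237, 73, 8),
]


def summarize(rows, baseline="Qwen 3.5 9B"):
    """Return per-model speedup vs. the baseline and score points per GB."""
    # Look up the baseline's wall-clock time from the Time column.
    base_time = next(time_s for name, _, _, time_s in rows if name == baseline)
    summary = []
    for name, size_mb, score, time_s in rows:
        summary.append({
            "model": name,
            # How many times faster than the baseline this model finished.
            "speedup": round(base_time / time_s, 1),
            # Score percentage points per gigabyte of model size.
            "points_per_gb": round(score / (size_mb / 1000), 1),
        })
    return summary


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(f'{row["model"]:<20} {row["speedup"]:>5}x  {row["points_per_gb"]:>6} pts/GB')
```

Points per gigabyte is only one way to compare the models; it rewards small models heavily and says nothing about which tasks each model failed.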