Every week or two, a model drops that makes the local AI community lose its collective mind. This week it was three at once: DeepSeek V4-Pro, DeepSeek V4-Flash, and Zyphra ZAYA1-8B. All three are genuinely impressive. All three are models I wanted to benchmark on our homelab.

And after doing the research, I'm not testing any of them. Not because I don't want to. Because I physically can't, or can't yet.

This post isn't a benchmark. It's the research that happens before the benchmark, where you figure out which models are even candidates for your hardware. If you're building or considering a local inference setup, the reasons these three models don't work are more instructive than any leaderboard score.

The Rig

Quick refresher on what we're working with:

| Resource | Spec |
|---|---|
| GPU | NVIDIA RTX 5090, 32 GB VRAM |
| RAM | 64 GB DDR5 |
| CPU | AMD Ryzen 9 9950X3D, 16 cores / 32 threads |
| Disk | 1.8 TB NVMe |
| Inference | llama.cpp on the GPU |

This is a strong homelab by any measure.…
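
The "is this model even a candidate" check can be roughed out in a few lines before downloading anything. The sketch below is a back-of-the-envelope estimate, not a measurement: it counts only the weights (parameters times bits per weight), adds a flat overhead allowance for the KV cache and runtime buffers, and compares the result against the 32 GB of VRAM and 64 GB of RAM listed above. The function names, the 2 GiB overhead figure, and the parameter counts in the usage lines are all hypothetical placeholders chosen for illustration; they are not the specs of any model named in this post.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def placement(
    n_params: float,
    bits_per_weight: float,
    vram_gib: float = 32.0,     # RTX 5090 from the rig table (GB treated as GiB)
    ram_gib: float = 64.0,      # system RAM from the rig table
    overhead_gib: float = 2.0,  # assumed allowance for KV cache and buffers
) -> str:
    """Classify a model as GPU-only, offload-only, or out of reach."""
    need = weight_footprint_gib(n_params, bits_per_weight) + overhead_gib
    if need <= vram_gib:
        return "fits in VRAM"
    if need <= vram_gib + ram_gib:
        return "needs CPU offload"
    return "does not fit"


if __name__ == "__main__":
    # Hypothetical sizes for illustration only.
    for params, bits in [(8e9, 4), (70e9, 8), (700e9, 4)]:
        size = weight_footprint_gib(params, bits)
        print(f"{params:.0e} params @ {bits}-bit: {size:.1f} GiB -> {placement(params, bits)}")
```

Real usage will differ: context length changes the KV cache size considerably, and a model that only fits with CPU offload will run far slower than one that sits entirely in VRAM. Treat the output as a filter for what is worth trying, not as a guarantee.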