Post 3 ended with three numbers: 97.4% test accuracy, 100% precision, 0 false positives. All of it measured on a MacBook. That was enough to prove the model could learn first crack. It did not prove the model could run where I needed it: next to a coffee roaster, on a Raspberry Pi 5, listening through a USB microphone during a live roast.

The first Pi run was "it kind of works". Export the PyTorch checkpoint to ONNX FP32, copy the 345 MB file across, run inference. The result was 9.4 seconds per 10-second audio window. That was not a bug. That was the cost of running an 86M-parameter transformer on one ARM Cortex-A76 core at full floating-point precision.

The next attempt exposed a different problem. More threads made the model faster, but the Pi crashed. Oz was already in the SSH session, so the next command was `vcgencmd get_throttled`. The answer came back as `0x50000`: under-voltage and throttling.…
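To show how `0x50000` reads as "under-voltage and throttling", here is a minimal sketch that decodes the bitmask. The bit meanings follow the Raspberry Pi documentation for `vcgencmd get_throttled`. The helper names `decode_throttled` and `parse_throttled_output` are my own for illustration and are not part of any tool.

```python
# Bit positions in the value reported by `vcgencmd get_throttled`.
# Bits 0-3 describe the current state; bits 16-19 record that the
# condition has occurred at some point since boot.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def parse_throttled_output(line: str) -> int:
    """Accept either 'throttled=0x50000' or a bare '0x50000' (hypothetical helper)."""
    text = line.strip()
    if "=" in text:
        text = text.split("=", 1)[1]
    return int(text, 16)


def decode_throttled(value: int) -> list[str]:
    """Return the label of every flag bit that is set in the bitmask."""
    return [label for bit, label in THROTTLE_FLAGS.items() if value & (1 << bit)]


if __name__ == "__main__":
    value = parse_throttled_output("throttled=0x50000")
    for flag in decode_throttled(value):
        print(flag)
```

In `0x50000` only bits 16 and 18 are set, so both conditions had occurred since boot, while the low "right now" bits were clear at the moment the command ran.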