I've been building dictate.app, a Windows dictation tool, and the biggest decision early on was which Whisper API to use. I ran both Groq and OpenAI through real-world testing. Here's what the numbers actually look like.

Why Whisper APIs, Not Local Models

Local Whisper (running on your machine) is free but slow unless you have a GPU. For a dictation tool where latency is everything, you want a hosted API.

The two main options in 2026 are OpenAI's Whisper endpoint and Groq's Whisper endpoint. Both run the same underlying model family (Whisper large-v3). The difference is infrastructure.

Latency: The Real-World Numbers

I tested with audio clips of varying lengths (5 seconds, 15 seconds, 30 seconds, and 60 seconds) and measured round-trip time from sending the request to receiving the transcription.

| Clip Length | Groq   | OpenAI  |
|-------------|--------|---------|
| 5 seconds   | ~180ms | ~750ms  |
| 15 seconds  | ~210ms | ~820ms  |
| 30 seconds  | ~260ms | ~1100ms |
| 60 seconds  | ~380ms | ~1800ms |

Groq is consistently 4-5x faster.…
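
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a round-trip timer in Python. It is not the harness behind the numbers above. The `transcribe` callable is a hypothetical stand-in for whatever function uploads a clip to a provider and returns the text, and taking the median of several runs is my assumption rather than something the test description states. The `speedup` helper just divides the approximate OpenAI figure by the Groq figure for each row of the table.

```python
import statistics
import time
from typing import Callable


def measure_round_trip_ms(
    transcribe: Callable[[bytes], str], audio: bytes, runs: int = 5
) -> float:
    """Return the median wall-clock time, in milliseconds, of transcribe(audio).

    `transcribe` is a hypothetical callable: in real use it would send the
    clip to a hosted Whisper endpoint and block until the text comes back.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(audio)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Approximate latencies copied from the table above, in milliseconds,
# keyed by clip length in seconds.
LATENCY_MS = {
    5: {"groq": 180, "openai": 750},
    15: {"groq": 210, "openai": 820},
    30: {"groq": 260, "openai": 1100},
    60: {"groq": 380, "openai": 1800},
}


def speedup(clip_seconds: int) -> float:
    """Ratio of OpenAI latency to Groq latency for one clip length."""
    row = LATENCY_MS[clip_seconds]
    return row["openai"] / row["groq"]


if __name__ == "__main__":
    # Stand-in transcriber so the sketch runs without network access.
    def fake_transcribe(audio: bytes) -> str:
        time.sleep(0.01)
        return "hello world"

    print(f"fake round trip: {measure_round_trip_ms(fake_transcribe, b'', runs=3):.1f} ms")
    for seconds in LATENCY_MS:
        print(f"{seconds:>2}s clip: {speedup(seconds):.1f}x")
```

Passing the transcriber in as a callable keeps the timing logic identical for both providers, so any difference in the result comes from the API call itself and not from the harness.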