The 800ms Barrier: Architecting Interruptible Voice Agents (Lessons from Sarvam AI x Swiggy)

The Signal: The 800ms Latency Barrier

In a research lab, a 3-second delay is an "optimization ticket." In a live call with a hungry customer on the Swiggy app, 3 seconds is a churn event.

The partnership between Sarvam AI and Swiggy represents a shift in the "Boss Level" of agentic AI. Most developers build voice agents using a Cascaded Pipeline: STT -> LLM -> TTS. The result? A cumulative lag that makes the agent feel like a slow walkie-talkie. To build for the next billion users, you have to architect for Native Audio Streaming and sub-second response times.

Phase 1: The Architectural Bet

We are moving from Request-Response to Streaming State Machines. The Vendor Trap is relying on general-purpose, text-centric models for a multilingual, audio-first market. If you have to translate "Hinglish" to English just to understand an order, you’ve already lost the latency battle.…
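
To make the move from Request-Response to a Streaming State Machine concrete, here is a minimal Python sketch of an interruptible turn-taking loop. It is an illustration under stated assumptions, not the Sarvam AI or Swiggy implementation: the names `State`, `Event`, `TRANSITIONS`, and `VoiceAgent` are hypothetical, and real audio capture, speech detection, and synthesis are replaced by plain events. The point is that caller speech is a first-class event in every state, so a reply can be cancelled mid-stream (barge-in) instead of the caller waiting for the pipeline to finish.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()  # receiving caller audio
    THINKING = auto()   # utterance ended, waiting for the first reply chunk
    SPEAKING = auto()   # streaming synthesized audio back to the caller


class Event(Enum):
    END_OF_UTTERANCE = auto()   # caller stopped talking
    FIRST_AUDIO_CHUNK = auto()  # first chunk of synthesized audio is ready
    PLAYBACK_DONE = auto()      # agent finished its reply uninterrupted
    USER_SPEECH = auto()        # caller started talking again


# Hypothetical transition table (an assumption for illustration).
# Any (state, event) pair not listed here leaves the state unchanged.
TRANSITIONS = {
    (State.LISTENING, Event.END_OF_UTTERANCE): State.THINKING,
    (State.THINKING, Event.FIRST_AUDIO_CHUNK): State.SPEAKING,
    (State.SPEAKING, Event.PLAYBACK_DONE): State.LISTENING,
    # Barge-in: caller speech always wins and returns the agent to listening.
    (State.THINKING, Event.USER_SPEECH): State.LISTENING,
    (State.SPEAKING, Event.USER_SPEECH): State.LISTENING,
}


class VoiceAgent:
    """Tracks the conversational state and counts interrupted replies."""

    def __init__(self):
        self.state = State.LISTENING
        self.cancelled_replies = 0

    def handle(self, event):
        # Count a cancellation when the caller speaks while a reply is
        # being prepared or played. A real agent would also stop the
        # in-flight generation and audio playback at this point.
        if event is Event.USER_SPEECH and self.state is not State.LISTENING:
            self.cancelled_replies += 1
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


def run(events):
    """Feed a sequence of events through a fresh agent and record the states."""
    agent = VoiceAgent()
    visited = [agent.handle(event).name for event in events]
    return visited, agent.cancelled_replies
```

As a usage example, `run([Event.END_OF_UTTERANCE, Event.FIRST_AUDIO_CHUNK, Event.USER_SPEECH])` walks the agent through THINKING and SPEAKING and then back to LISTENING, recording one cancelled reply. Keeping the transitions in a table rather than nested conditionals is a design choice that makes the interrupt paths easy to audit.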