How llm0 gets to 3 ms p50 cache-hit latency on a 4 vCPU droplet — three Redis Lua scripts, a two-tier cache, and pgvector instead of a separate vector DB.
Track and analyze AI costs across models, providers, and users with Custom Reporting for Vercel AI Gateway. Tag requests, query spend by any dimension, and get full visibility into your AI usage, with no extra tools needed.
AI Gateway is now in Beta, giving you a single endpoint to access a wide range of AI models across providers, with better uptime, faster responses, and no lock-in.
You can now access Qwen3 Coder, a model from QwenLM, an Alibaba Cloud company, designed to handle complex, multi-step coding workflows, using Vercel's AI Gateway with no other provider accounts required.
You can now access GLM-4.5 and GLM-4.5 Air, new flagship models from Z.ai designed to unify frontier reasoning, coding, and agentic capabilities, using Vercel's AI Gateway with no other provider accounts required.
You can now access gpt-oss by OpenAI, an open-weight reasoning model designed to push the open model frontier, using Vercel's AI Gateway with no other provider accounts required.
You can now access the gpt-5 models by OpenAI, their most advanced models pushing the frontier of reasoning and domain expertise, using Vercel's AI Gateway with no other provider accounts required.
Vercel AI Gateway now supports optional automatic credit recharging (top-ups), which refills your balance before it runs out so your apps keep running without interruption.
AI Gateway is now generally available, providing a single interface to access hundreds of AI models with transparent pricing and built-in observability.