Menu

Faster Agents with Automatic Prompt Caching
📰
0

Faster Agents with Automatic Prompt Caching

Heroku·Anush DSouza, Mandeep Bal·about 1 month ago
#qElnm4pd
Reading 0:00
15s threshold

Heroku is launching automatic prompt caching starting December 18, 2025. Prompt caching delivers a notable, zero-effort performance increase for Heroku Managed Inference and Agents . Enabled by default, this feature is designed to deliver significantly faster responses for common workloads. We have taken a pragmatic approach and currently only enabled this to cache system prompts and tool definition, and not user messages or conversation history. You can disable caching for any request by setting X-Heroku-Prompt-Caching: false . What is prompt caching and how does it speed up AI inference? Prompt caching is an optimization that speeds up inference by securely caching and reusing the processed components of your requests for system prompts and tool definitions. For applications involving agents, a large portion of the request remains static. Instead of reprocessing this content on every call, Heroku can now reuse the processed result from a secure cache.…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Read More