How I built a Go proxy that keeps your LLM conversation alive when cloud quota runs out

DEV Community·Shouvik Palit·30 days ago
#opensource#llm#ai#go#context#trooper

Introduction

If you've ever been mid-conversation with Claude or GPT, hit a quota limit, and switched to a local Ollama model, you know the pain. The local model has zero context. It's like walking into a meeting 45 minutes late and nobody catches you up. I got frustrated enough to build something about it. That something is Trooper.

What is Trooper

Trooper is a lightweight Go proxy (~850 lines, two files) that sits between your application and your LLM providers. When a cloud provider returns a quota error (429, 402, 529), Trooper automatically falls back to a local Ollama instance without dropping the conversation context. Single binary. Zero dependencies. Easy to audit since it sits in front of your API keys.

The real problem: context loss on fallback

Most fallback proxies solve the routing problem but ignore the context problem. They either pass the raw message history as-is (which blows up the local model's context window) or they truncate the oldest turns (which kills continuity).…
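To make the routing half of this concrete, here is a minimal sketch of the quota-triggered fallback described above. It is not Trooper's actual code: the names `isQuotaStatus` and `sendWithFallback` are hypothetical, and the two endpoints are stand-in test servers rather than real provider URLs. It also replays the payload unchanged, which is exactly the naive "pass the raw history as-is" behavior criticized above, so it illustrates the routing problem only, not the context handling.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// isQuotaStatus reports whether a status code is one of the quota
// errors named in the article: 429, 402, or 529.
// (Hypothetical helper, not taken from Trooper's source.)
func isQuotaStatus(code int) bool {
	switch code {
	case http.StatusTooManyRequests, http.StatusPaymentRequired, 529:
		return true
	}
	return false
}

// sendWithFallback posts the payload to the cloud endpoint and, if the
// response carries a quota status, replays the same payload against the
// local endpoint. It returns which side answered plus the reply body.
// (Hypothetical function; the payload is forwarded unmodified.)
func sendWithFallback(client *http.Client, cloudURL, localURL string, payload []byte) (source string, reply []byte, err error) {
	resp, err := client.Post(cloudURL, "application/json", bytes.NewReader(payload))
	if err != nil {
		return "", nil, err
	}
	defer resp.Body.Close()

	if !isQuotaStatus(resp.StatusCode) {
		reply, err = io.ReadAll(resp.Body)
		return "cloud", reply, err
	}

	// Quota hit: retry the identical request against the local model.
	local, err := client.Post(localURL, "application/json", bytes.NewReader(payload))
	if err != nil {
		return "", nil, err
	}
	defer local.Body.Close()

	reply, err = io.ReadAll(local.Body)
	return "local", reply, err
}

func main() {
	// Stand-in for a cloud provider that has run out of quota.
	cloud := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusTooManyRequests)
	}))
	defer cloud.Close()

	// Stand-in for a local model endpoint.
	local := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "local reply")
	}))
	defer local.Close()

	source, reply, err := sendWithFallback(http.DefaultClient, cloud.URL, local.URL, []byte(`{"messages":[]}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %s\n", source, reply)
}
```

Keeping the status check in its own small function means the list of codes that count as "quota" can be audited or extended in one place, which fits the article's emphasis on a proxy that is easy to read end to end.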
