I tracked every Claude Code call for 30 days. Here's the cost breakdown that justified switching to Gemma.

DEV Community·CoherenceDaddy·about 1 month ago

A month ago I switched my Claude Code setup so that the terminal calls go through a local Ollama model instead of hitting Anthropic. The claim I made publicly at the time was that this cuts Claude Code spending by roughly 90%. A few people in the comments asked, very fairly, "where does that number come from?"

So I started tracking. Every Claude Code session for the next 30 days got logged: what kind of task it was, which engine handled it, how long it took, and (for the Anthropic-side calls) what it cost in tokens at published per-million pricing. I also rated quality 1 to 5 on every task: did the model actually do the job, or did I have to bounce it back to Sonnet?

The takeaway is more nuanced than the headline. The 90% number basically held, but not because Gemma is "as good as Sonnet"; it isn't. The savings are real because a surprising fraction of what you actually ask Claude Code to do is mechanical, and mechanical work doesn't need a frontier model. Here's the breakdown.…
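For anyone who wants to run the same kind of tally, here is a minimal sketch of the arithmetic, not my actual logging script. The `Call` record, the field names, and the prices passed in are all hypothetical placeholders; plug in the published per-million rates for whichever model you use. It also assumes a locally handled call would have used the same number of tokens had it gone to Anthropic, which is only an approximation.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One logged session (hypothetical schema, for illustration only)."""
    task: str           # e.g. "rename", "refactor", "debug"
    engine: str         # "local" or "anthropic"
    seconds: float      # wall-clock duration
    input_tokens: int
    output_tokens: int
    quality: int        # self-rated, 1 to 5


def api_cost(input_tokens: int, output_tokens: int,
             in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost of one call at per-million-token pricing."""
    return (input_tokens / 1_000_000 * in_price_per_m
            + output_tokens / 1_000_000 * out_price_per_m)


def savings_fraction(calls: list[Call],
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Share of spend avoided versus sending every call to the paid API.

    Assumption: local calls would have cost the same tokens on the API.
    """
    baseline = sum(
        api_cost(c.input_tokens, c.output_tokens, in_price_per_m, out_price_per_m)
        for c in calls
    )
    actual = sum(
        api_cost(c.input_tokens, c.output_tokens, in_price_per_m, out_price_per_m)
        for c in calls
        if c.engine == "anthropic"
    )
    return 0.0 if baseline == 0 else 1 - actual / baseline
```

The savings figure depends on how many tokens stay local, not on how many calls do, so a handful of large frontier-model sessions can move the number a lot.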
