I A/B tested an MCP server that cut my Claude Code token cost


DEV Community: ai·BasilSkyWalk·about 14 hours ago


#dev #file #parecode #files #agent #article


Most "I cut my token usage by X%" posts hand-wave the number. This one shows the method, the repos, and the case where the tool does basically nothing. I'd rather you trust the result than be impressed by it.

The problem: agents read whole files to see three lines

Watch a coding agent work on a large codebase and you'll see the same loop over and over:

grep -rn "handlePayment" src/ → a dozen file:line hits.
Read four of those files in full, hundreds of lines each, just to see the ~10 lines around each hit.
Repeat for the next symbol.

Each whole-file read is hundreds to thousands of tokens of context the model didn't need, and each one is another round trip. On a small repo it's invisible. On a real codebase it compounds into a slow, expensive session, and eventually a context window stuffed with files the agent only glanced at.

The native tools aren't wrong; they're just coarse. Grep finds lines, Read returns files, and the agent is left to staple them together at full token cost.…
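To make the whole-file versus nearby-lines gap concrete, here is a minimal Python sketch. It is not the MCP server's implementation. The function names, the 5-line radius, and the rough "4 characters per token" heuristic are all assumptions chosen for illustration. It merges small windows around each grep hit and compares an estimated token count against reading the full file.

```python
def context_windows(hit_lines, total_lines, radius=5):
    """Merge +/- radius windows around 1-based hit lines into (start, end) ranges."""
    windows = []
    for hit in sorted(set(hit_lines)):
        start = max(1, hit - radius)
        end = min(total_lines, hit + radius)
        if windows and start <= windows[-1][1] + 1:
            # Overlapping or adjacent to the previous window: extend it.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def estimate_tokens(text):
    # Crude heuristic (assumption): roughly 4 characters per token.
    return len(text) // 4


def compare_costs(lines, hit_lines, radius=5):
    """Return (whole_file_tokens, windowed_tokens) for one file."""
    whole = "\n".join(lines)
    windows = context_windows(hit_lines, len(lines), radius)
    snippet = "\n".join(
        "\n".join(lines[start - 1:end]) for start, end in windows
    )
    return estimate_tokens(whole), estimate_tokens(snippet)


if __name__ == "__main__":
    # Hypothetical 400-line file with three grep hits.
    fake_file = [f"line {i}: some code here" for i in range(1, 401)]
    whole, snippet = compare_costs(fake_file, hit_lines=[120, 123, 310])
    print(f"whole file: ~{whole} tokens, windows only: ~{snippet} tokens")
```

The estimate is only as good as the heuristic. A real measurement would count tokens with the model's own tokenizer and include the cost of the extra round trips.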
