Seven days into the month, I'd burned through 75% of my AI API budget. Nothing had changed about how I was working: same codebase, same questions, same tools. But the token meter was spinning like I'd left a garden hose running.

I dug in. The culprit wasn't my prompts. It was the context.

The Problem With File-Based Retrieval

When you ask Claude or GPT "how does my auth middleware work?", most tools respond by grabbing the entire auth.ts file and stuffing it into the prompt. Sometimes two or three files. That's 300–800 lines of code when you probably needed 30.

I call this the Confusion Tax: you're paying for tokens that actively make the AI worse. More irrelevant code means more noise, more hallucinations, and a higher bill.

Traditional RAG treats code like a document. It doesn't understand that validateToken() calls checkExpiry(), which imports from crypto/utils.ts. It just sees text.

Code Isn't Text: It's a Graph

Every codebase is a directed graph. Functions call other functions.…
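
To make the graph idea concrete, here is a minimal sketch of retrieving context by following call edges instead of grabbing whole files. It assumes the call graph has already been extracted somehow; the `CallGraph` type, the `reachable` function, the depth limit, and the `unrelatedHelper` node are all hypothetical names made up for illustration, and only the validateToken → checkExpiry → crypto/utils.ts chain comes from the example above.

```typescript
// Hypothetical call graph: each key is a symbol, each value lists what it depends on.
// Only the validateToken -> checkExpiry -> crypto/utils.ts chain comes from the text;
// "unrelatedHelper" is an invented stand-in for other code living in the same file.
type CallGraph = Map<string, string[]>;

const graph: CallGraph = new Map<string, string[]>([
  ["validateToken", ["checkExpiry"]],
  ["checkExpiry", ["crypto/utils.ts"]],
  ["crypto/utils.ts", []],
  ["unrelatedHelper", []],
]);

// Breadth-first walk from `start`, following outgoing edges at most `maxDepth` hops.
// Returns the start node plus every dependency discovered, in discovery order.
function reachable(graph: CallGraph, start: string, maxDepth: number): string[] {
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const dep of graph.get(node) ?? []) {
        if (!seen.has(dep)) {
          seen.add(dep);
          next.push(dep);
        }
      }
    }
    frontier = next;
  }
  return Array.from(seen);
}

// Only the symbols on the call chain are selected; unrelatedHelper is never visited.
const context = reachable(graph, "validateToken", 2);
console.log(context);
```

The depth limit is a design choice rather than a requirement: a smaller value keeps the prompt tight, while a larger one pulls in more of the dependency chain at a higher token cost.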