A Rails app starts calling OpenAI or Anthropic. A few months later someone in finance asks "who's burning $X a month on this and on what?" The answer requires per-user, per-feature, per-tenant attribution — and the obvious solutions all want you to give up something I wasn't willing to give up. This is the design rationale behind llm_cost_tracker , a Rails Engine I've been building. It's not the only way to solve this problem; it's the way that fit the constraints I cared about. The constraint set Three non-negotiables shaped every other choice: No new infra. A Rails app already has a database, a request lifecycle, an authentication layer, a dashboard pattern. Anything I bolt on should reuse those, not duplicate them. No prompt storage. Prompt content is regulated data in a lot of contexts — PII, customer transcripts, medical, legal. The tracker has no business holding it. No traffic redirection. Direct calls to OpenAI / Anthropic / Gemini are the simplest path and the one with fewest failure modes.…