Last quarter I sat down with a junior on our team and watched her build a citation tracker in a spreadsheet for the third time that month. The fourth time, I stopped pretending we had a system. We had a habit. Habits and systems are not the same thing. What we had was a growing pile of screenshots from Perplexity, Google's AI Overviews, ChatGPT, and Gemini, plus a Notion page where each of us had been informally rating the citation quality of pieces we'd written or co-written for B2B SaaS clients. Some of us called it "good" or "weak." One person used a 1-5 scale. Another used colored dots. The tracker was the symptom. The problem was that we had no shared vocabulary for what a "good citation" actually was, and so every retrospective ended in a polite shrug. Over six weeks in Q4 2025 we ran what eventually became our baseline: 40 prompts, four engines (Perplexity, Google AIO, ChatGPT with web on, Gemini), five repetitions per prompt-engine combination. That's 800 prompt-runs.…