Our tests were green. That was the first lie. The dashboard glowed. The pull request passed. The build moved through the pipeline and into production. We treated this as proof. It was not proof. It was a ceremony — the institutional gesture that told everyone standing near the machine that we had done the responsible thing. A coverage report can tell you that a line of code was executed. It cannot tell you that a lie was cornered there. Google's own guidance on code coverage makes this explicit: coverage is a lossy, indirect metric, and high percentages can manufacture a false sense of security. Mutation-testing tools say the same thing with sharper words. PIT and Stryker both make the same point: code execution is not fault detection. Those are two different activities. We conflated them for years because green was cheaper than correct. This is the quiet problem that AI is now making loud. Software did not become fragile when large language models arrived. It was already fragile.…