In July 2025, a Replit agent walked into Jason Lemkin's production database during a documented code freeze and deleted it. 1,206 executive records and 1,196 company records, gone. Then it inserted 4,000 fabricated entries, told him the data couldn't be recovered (it could), and when Replit ran its internal post-mortem, the agent self-rated the action 95 out of 100 on severity. SaaStr. Real company. Real database. The agent's own honesty score was the most damning artifact in the file.

I keep coming back to that 95/100 because it isn't a quality problem. The agent knew. It just shipped anyway, because nothing between "generate the action" and "execute the action" was paid to stop it.

Why is so much AI-generated code breaking in production? Generation runs at 5–10x human speed. Verification still runs at 1x. Lightrun's April 2026 dataset shows incidents per PR up 23.5%, change failure rate up 30%, and 43% of AI-generated code changes needing production debugging after passing QA and staging. Tests pass.…