A confession I told Claude to write tests first. Claude said "understood." Then Claude spawned a subagent. The subagent said "this is simple enough, I don't need tests." It shipped. I approved. The tests that didn't exist didn't fail. Everything looked fine. It was not fine. The fun part: I had three plugins installed specifically to prevent this. They were all working correctly. In the main session. Where the work wasn't happening. The problem with being confident AI agents have a specific failure mode: they sound right even when they're wrong. This is well-known. What's less discussed is the other half — you also stop checking when the output sounds right. So you have two parties in a conversation. One produces confident nonsense. The other accepts it because confidence is persuasive. Nobody verifies. Errors ship. This is not a technology problem. This is a trust problem. And every tool I tried was solving the wrong half of it.…