I shipped 3 bugs after 'looks good to me' AI code review last quarter

I had one of those quarters where every PR went through an AI reviewer, every PR got a friendly "LGTM with minor suggestions", and three of those PRs still managed to wedge production. One was an N+1 query that only appeared when a customer hit a specific endpoint with more than 50 items. One was a missing await that the AI cheerfully ignored because the code "looked async-ish". One was a permission check we removed and nobody, human or model, flagged it.

After the third one I stopped blaming the model and started blaming my setup. A single AI reviewer running once on a diff is not a code review. It is a vibe check.

What actually fixed it was splitting review into three layers, with three different jobs, and never letting any one of them pretend to be the others. This post is that setup.

Why a single AI reviewer falls over

I once spent a Sunday tagging every comment on our PRs for a month. The split was uncomfortable.…