I've been skeptical of AI testing tools for a while, but not for the reasons most people are. My problem isn't that AI can't drive a browser. It clearly can. My problem is what happens after.

Every tool I tried made the same implicit trade: the AI stays in the loop at runtime. Your "test" is really a prompt that gets re-evaluated every time CI runs. The model drifts, the response changes slightly, your test starts flaking, and you have no idea why because there's no diff to look at. You just have vibes and a red build.

I kept thinking: I don't want AI to run my tests. I want AI to write them.

The thing Playwright Codegen almost got right

Playwright's built-in codegen is underrated. You record your actions, it spits out a .spec.ts file, and that file is yours forever. No model, no API key, no ongoing cost. Just code.

The problem is it records mechanically. It doesn't think. It'll happily record click('[data-testid="btn-3"]') instead of getByRole('button', { name: 'Sign in' }).…
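
To make the locator point concrete, here is a rough sketch of what "thinking" about a locator could look like: given what was recorded about an element, prefer its role and accessible name, and only fall back to the test id. `RecordedElement` and `preferredLocator` are hypothetical names invented for this illustration and are not part of Playwright. The only real Playwright APIs referenced are the `getByRole` and `getByTestId` locator methods, and here they appear only inside the generated strings.

```typescript
// Hypothetical shape for one recorded interaction; not a Playwright type.
interface RecordedElement {
  role?: string;           // ARIA role, e.g. "button"
  accessibleName?: string; // accessible label, e.g. "Sign in"
  testId?: string;         // value of data-testid, e.g. "btn-3"
}

// Render the locator expression a generated spec file would contain.
// Prefers role + accessible name; falls back to the test id.
function preferredLocator(el: RecordedElement): string {
  if (el.role && el.accessibleName) {
    // Escape backslashes and single quotes so the emitted string literal stays valid.
    const name = el.accessibleName.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
    return `page.getByRole('${el.role}', { name: '${name}' })`;
  }
  if (el.testId) {
    return `page.getByTestId('${el.testId}')`;
  }
  throw new Error("no usable locator information recorded");
}

const signIn: RecordedElement = {
  role: "button",
  accessibleName: "Sign in",
  testId: "btn-3",
};

console.log(preferredLocator(signIn));
// prints: page.getByRole('button', { name: 'Sign in' })
```

The output is still just a line of code you can read and diff, which is the whole point: the judgment happens once, when the test is written, not on every CI run.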