There's a failure mode shared by every autonomous coding agent I've watched: the agent declares victory the moment its checks pass. Tests are green. Linter is happy. Build succeeds. Done.

Except the test was checking the wrong thing. The linter doesn't know about placeholders. The build doesn't care that the function returns null on the path that mattered. The agent rationalized those away because they didn't trigger any red.

A passing validator is necessary. It is not sufficient. This post is about the gap, and one pattern for closing it.

Two existing patterns

Two prior tools name the durable-autonomous-goal shape. They're worth knowing.

OpenAI Codex /goal. You give Codex an objective and a stopping condition; it works toward both across many turns. The contract is "complete X without stopping until Y." When Y matches, Codex stops. (docs)

The Ralph loop, by Geoffrey Huntley.…