AI Agents Don't Fail at Code. They Fail at Learning.


DEV Community·Nagarjuna Yelisetty·about 1 month ago


#ai #maintainability #engineeringleadership #code #commit #paper


Part 1 of 5 in The New Engineering Contract: what it means to lead engineers when AI is doing more of the coding.

SWE-CI tested 18 AI models across 71 consecutive commits. Most broke something on commit 47 they'd already broken on commit 1. That's not an intelligence problem. That's a learning system that isn't learning.

A paper made me uncomfortable this month. Not because of what it found about AI. Because of what it revealed about how I think about my own work.

The paper is SWE-CI, published March 4, 2026 by researchers at Sun Yat-sen University and Alibaba Group. It tested 18 AI models across 100 real codebases: not single bug fixes, but 71 consecutive commits of genuine evolution. The core finding: most state-of-the-art models have a zero-regression rate below 0.25. More than three out of four times, the agent fixed something and silently broke something else downstream.

I read that and thought: that's a learning problem, not a coding problem.…
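To make the 0.25 figure concrete, here is a minimal sketch of how a zero-regression rate could be computed. It assumes the metric means "the share of codebases where no previously passing test ever started failing across the commit sequence"; the paper's exact definition may differ. The function names, the repo names, and the sample data are all hypothetical and are not taken from SWE-CI.

```python
from typing import Dict, List, Set


def has_regression(passing_per_commit: List[Set[str]]) -> bool:
    """Return True if any test that passed after one commit fails after the next.

    Assumption: each set holds the names of the tests that pass after a commit.
    """
    for before, after in zip(passing_per_commit, passing_per_commit[1:]):
        if before - after:  # a test that passed before no longer passes
            return True
    return False


def zero_regression_rate(runs: Dict[str, List[Set[str]]]) -> float:
    """Fraction of codebases whose whole commit sequence had no regression."""
    if not runs:
        raise ValueError("need at least one codebase")
    clean = sum(1 for history in runs.values() if not has_regression(history))
    return clean / len(runs)


# Hypothetical data: four codebases, each with a short commit history.
runs = {
    "repo_a": [{"t1"}, {"t1", "t2"}, {"t1", "t2", "t3"}],  # never regresses
    "repo_b": [{"t1", "t2"}, {"t2", "t3"}],                # t1 breaks
    "repo_c": [{"t1"}, set()],                             # t1 breaks
    "repo_d": [{"t1", "t2"}, {"t1"}, {"t1", "t2"}],        # t2 breaks, later fixed
}

print(zero_regression_rate(runs))  # 0.25
```

Under this reading, a rate of 0.25 means only one codebase in four made it through the whole sequence without breaking something that used to work. Note that repo_d still counts as a regression even though the break was later repaired.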
