On March 3, 2026, I connected a custom agentic pipeline to our team GitHub repository. I gave it read access to pull requests, diff outputs, and our internal linting config. My goal was simple. I wanted to offload the tedious parts of code review so I could focus on system architecture and mentoring junior developers. I ran this exact setup until April 2. The data I collected completely changed how I think about developer automation. I expected the AI to catch minor formatting issues and maybe flag obvious null pointer exceptions. I did not expect it to rewrite our error handling strategy. I also did not expect it to confidently approve a race condition that took me three hours to reproduce locally. The experiment worked, but not in the way I originally planned. The Stack and Configuration I built the pipeline around a March 2026 release of an open source review framework. It pulls changes via GitHub webhooks and routes them through a local inference server.…