April 28, 2026 was a weird day for me

IBM shipped Bob. Thoughtworks published SPDD. Researchers at Fudan, Peking, and Shanghai AI Lab published Agentic Harness Engineering on arXiv. Microsoft shipped A2A v1, backed by AWS, Cisco, Google, IBM, Salesforce, SAP, and ServiceNow.

Four independent teams. Same day. Same problem: orchestrate AI across a software development workflow.

Every single one of them stopped at generation.

The question nobody answered

How do you know the output is actually good?

They all stop at generation. A human checks the checkpoint. A reviewer approves the step. The system moves on. That's supervision by convention, not by architecture.

I've been working on the answer for nine months.

Meet Pappy

Pappy is a QC role inside Orca that scores every pipeline output before it reaches the user. PASS, WARN, or FAIL with a confidence score. Failed runs trigger an automatic repair loop.…
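
To make the idea of a QC gate concrete, here is a rough sketch of the loop described above: score the output, return a PASS, WARN, or FAIL verdict with a confidence, and repair on FAIL. This is not Orca's or Pappy's actual code. Every name in it (`Verdict`, `Score`, `run_with_qc`, the toy scorer and repairer, the `max_repairs` limit) is hypothetical, and letting WARN pass through with its score attached is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple


class Verdict(Enum):
    PASS = "PASS"
    WARN = "WARN"
    FAIL = "FAIL"


@dataclass
class Score:
    verdict: Verdict
    confidence: float  # assumed to be in the range 0.0 to 1.0


def run_with_qc(
    generate: Callable[[], str],
    score: Callable[[str], Score],
    repair: Callable[[str, Score], str],
    max_repairs: int = 2,  # hypothetical limit so the loop always terminates
) -> Tuple[str, Score]:
    """Generate once, then score; on FAIL, repair and re-score up to max_repairs times."""
    output = generate()
    result = score(output)
    attempts = 0
    while result.verdict is Verdict.FAIL and attempts < max_repairs:
        output = repair(output, result)
        result = score(output)
        attempts += 1
    # PASS and WARN are returned as-is (assumption); a FAIL here means repairs ran out.
    return output, result


# Toy stand-ins for a real generator, scorer, and repairer (all hypothetical).
def toy_generate() -> str:
    return "def add(a, b): TODO"


def toy_score(text: str) -> Score:
    if "TODO" in text:
        return Score(Verdict.FAIL, 0.9)
    if "return" not in text:
        return Score(Verdict.WARN, 0.6)
    return Score(Verdict.PASS, 0.8)


def toy_repair(text: str, score: Score) -> str:
    return text.replace("TODO", "return a + b")


if __name__ == "__main__":
    output, result = run_with_qc(toy_generate, toy_score, toy_repair)
    print(result.verdict.value, result.confidence, output)
```

The point of the sketch is the shape: the verdict is produced by a step in the pipeline itself, so nothing reaches the caller without a score attached, and a FAIL cannot be skipped by convention.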