What happens when you put AI agent training on a blockchain, and what I wish someone had told me before I started. TL;DR: I built a Solana program that logs AI agent training episodes on-chain, making learning verifiable and tamper-proof. Two agents competed in a 10×10 grid. After 50 episodes, the Q-learning agent's average reward climbed from 0.10 to 6.50+. Every step is immutable on devnet. The primitive works. Here's how I got there and what broke along the way. Table of Contents The Problem Nobody Talks About What I Built Why Solana — Not a Database The Three Instructions The Agents Actually Learn The Dashboard What I Learned The Numbers What's Next Try It The Problem Nobody Talks About Here's something that bothered me for a while before I could articulate it clearly. When someone tells you their AI agent trained for 10,000 episodes and achieved superhuman performance, you have exactly two options: believe them, or don't. There's no third option. No audit trail. No verifiable history.…