Datadog published the State of AI Engineering 2026 report this week, drawing on real telemetry from over a thousand production environments. Read it. It is the most comprehensive look at AI in production available right now. I want to respond from the reliability engineering perspective, because the data reveals a problem the report names but doesn't fully resolve: agent sprawl is now a production reliability crisis, and the SRE discipline does not yet have governance frameworks for it.

What the Data Shows

Three findings stand out from an SRE perspective:

Framework adoption doubled year over year. Use of agentic frameworks (LangChain, LangGraph, Pydantic AI, Vercel AI SDK) rose from 9% of organizations in early 2025 to nearly 18% by 2026. Services using agentic frameworks more than doubled.

70%+ of organizations run three or more models. The share running more than six models nearly doubled. Teams are building model portfolios rather than committing to a single provider.

Teams add models faster than they retire them.…