On July 8, 2015, the New York Stock Exchange halted all trading for three and a half hours. United Airlines grounded its entire fleet the same morning. The Wall Street Journal's website went dark. By early afternoon, the U.S. Department of Homeland Security had confirmed that the three incidents were unrelated: each was a cascading software failure, not a coordinated attack. The markets suffered no catastrophic losses that day. But the near-miss exposed something the technology industry had quietly known for years and the policy world had barely begun to understand: the software systems underpinning American economic life are not managed like the critical infrastructure they actually are. That gap, between the operational maturity the nation's digital infrastructure requires and the practices most organisations actually apply, is precisely what Site Reliability Engineering exists to close.…