Why SLOs Break in Microservices

An SLO that works for a monolith often collapses when you distribute the same logic across 30 services. The math of availability is unforgiving. If every request to your service depends on 5 others, each at 99.9%, and their failures are independent, your realistic ceiling is 0.999^5 ≈ 99.5%. That 0.4-point gap below a 99.9% target is several times the 0.1% error budget the target allows, so the budget is spent before your own code even runs.

The Three Mistakes Teams Make

1. Copying the same SLO to every service

A 99.9% target on a payment service and the same target on a batch analytics service do not mean the same thing. Missing one costs revenue. Missing the other costs dashboards.

2. Measuring uptime instead of user experience

GET /health returning 200 is not an SLO. Users don't call /health. They check out, log in, and view pages. Measure those.

3. Ignoring fan-out

If a user request fans out to 8 downstream calls, all of them required, and one of them has a 99% SLO, your user-facing reliability is capped at 99% no matter how good your code is.…
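
The availability arithmetic above is easy to check by hand, but a small sketch makes the assumptions explicit. The Python below is illustrative only: the function names (`serial_availability`, `budget_multiple`) are hypothetical, and it assumes every dependency is required for the request to succeed and that failures are independent. Real systems with retries, caching, or correlated outages will behave differently.

```python
from math import prod


def serial_availability(own: float, dependencies: list[float]) -> float:
    """Best-case availability when every dependency must succeed.

    Assumes independent failures (an assumption, not a guarantee).
    """
    return own * prod(dependencies)


def budget_multiple(availability: float, slo: float) -> float:
    """How many times over the SLO's error budget the unavailability is."""
    return (1.0 - availability) / (1.0 - slo)


# Five dependencies at 99.9%, with our own code assumed perfect (1.0).
ceiling = serial_availability(1.0, [0.999] * 5)
print(f"ceiling: {ceiling:.4%}")  # ceiling: 99.5010%

# Dependencies alone overspend a 99.9% target's budget several times.
print(f"budget used: {budget_multiple(ceiling, 0.999):.1f}x")

# Fan-out: 8 required calls, seven at 99.99% and one at 99%.
fan_out = serial_availability(1.0, [0.9999] * 7 + [0.99])
print(f"fan-out ceiling: {fan_out:.4%}")  # cannot exceed the weakest call
```

Under these assumptions the weakest required call sets the upper bound, and every additional required call pulls the result slightly lower, which is why the 99% dependency dominates the fan-out case.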