It was a Monday morning. A routine playbook. One task: `state=latest`. Forty-seven minutes later the payments team had a P1 incident, 50 production web servers were running a version of Nginx nobody had approved, and the postmortem had a very uncomfortable finding: Ansible did exactly what it was told to do.

This article covers what happened, how idempotency actually works in production (not how tutorials describe it), and how to install and use Ansible without repeating this.

## The Incident: What Actually Happened

The task looked like this:

```yaml
- name: Ensure nginx is installed
  ansible.builtin.apt:
    name: nginx
    state: latest
    update_cache: yes
```

Symptom: Users started seeing `SSL handshake failed` errors immediately after the playbook completed. Payment gateway API calls timed out. Transaction success rate dropped below 2% within 90 seconds of the run completing.…
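
For contrast with the task shown earlier, here is a minimal sketch of a pinned variant. With `state: latest`, the `apt` module upgrades the package to the newest version the refreshed cache offers. With `state: present` and an explicit `name=version`, the task targets one named version instead. The version string below is a hypothetical placeholder, not the version from the incident, and this is one possible approach rather than the fix the team applied.

```yaml
# Sketch only: the version string is a hypothetical placeholder.
- name: Ensure nginx is installed at an approved version
  ansible.builtin.apt:
    name: nginx=1.24.0-1ubuntu1   # hypothetical pinned version
    state: present                # does not chase newer releases
    update_cache: yes
```

A pin like this makes a version change an explicit edit to the playbook, which can then go through review like any other change. If a newer version is already installed, `apt` will not downgrade it unless the module's `allow_downgrade` option is set.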