Moving past YAML templates to failure handling, security, and real tradeoffs

Before we start

This is a follow-up to Infrastructure as Code with AWS CloudFormation: From Fundamentals to Production Patterns. That article covered templates, stacks, nested stacks, CI/CD, and production best practices.

This article covers what happens when those best practices aren't enough. When things break in ways the documentation doesn't warn you about. When you're reading CloudFormation error messages at midnight and need answers.

Part 1: Stack deployment failures

Failure 1: "Resource handler returned message: 'Role does not exist'"

Symptoms:

- The IAM role creates successfully (status: CREATE_COMPLETE)
- A Lambda or EC2 resource that uses the role fails immediately afterwards
- Error: "The role named 'xxx' does not exist or is not authorized"

Root cause: IAM is eventually consistent. CloudFormation marks the role as complete as soon as the API call returns, but the role may take 5-10 seconds to propagate across AWS's replicated IAM infrastructure, so a dependent resource created in that window can be told the role does not exist. …
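To make the timing problem concrete, here is a minimal Python sketch of one generic way to tolerate a propagation delay: retrying the dependent call with exponential backoff. It is an illustration under stated assumptions, not necessarily the fix this article recommends. `RoleNotReadyError`, `retry_until_propagated`, and `make_flaky_create` are hypothetical names invented for this example; no AWS SDK call is made, and the "create" operation is simulated.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RoleNotReadyError(Exception):
    """Hypothetical stand-in for the 'role does not exist or is not authorized' error."""


def retry_until_propagated(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, backing off exponentially while the role is not yet visible."""
    for attempt in range(attempts):
        try:
            return operation()
        except RoleNotReadyError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
    raise ValueError("attempts must be at least 1")


def make_flaky_create(fail_times: int) -> Callable[[], str]:
    """Simulate a dependent resource whose creation fails until the role propagates."""
    calls = {"count": 0}

    def create_function() -> str:
        calls["count"] += 1
        if calls["count"] <= fail_times:
            raise RoleNotReadyError(
                "The role named 'xxx' does not exist or is not authorized"
            )
        return "function-created"

    return create_function
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. In a real deployment the same idea would wrap whatever call consumes the new role, and the exception type to catch would depend on the client library in use.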