Retrospective: 3 Years of Using AWS Lambda for Serverless APIs at 1M Request Scale

Three years ago, our team migrated our customer-facing API from a containerized ECS cluster to AWS Lambda, aiming to reduce operational overhead and align costs with traffic. Today, that API handles a steady 1 million requests per month (peaking at 5x that during seasonal events) with 99.99% uptime. This retrospective breaks down what worked, what didn’t, and the hard-won lessons we’d share with any team adopting Lambda for high-traffic APIs.

Why We Chose Lambda in the First Place

Our legacy ECS setup required constant capacity planning: we over-provisioned for peak traffic 80% of the time, and scaling events took 5+ minutes to spin up new tasks, leading to throttled requests during traffic spikes. Lambda promised three key wins:

Zero capacity planning: Automatic scaling to match traffic, with no idle resource costs.
Reduced ops burden: No patching, no load balancer config, no task health checks to manage.…