After returning from AWS Summit London 2026, I did some research on running AI/ML workloads on AWS EKS with Karpenter. With some assistance from Gemini, I turned my notes from various talks into this guide, which walks through the intricacies of deploying and scaling Generative AI (GenAI) workloads on AWS EKS with Karpenter.

Why GenAI/ML Infrastructure Sizing is Hard 📏

The initial challenge with GenAI/ML workloads often stems from translating business requirements into technical specifications. A request like "I have 10,000 users, and my LLM needs to respond fast" is a common starting point, but it provides insufficient detail for hardware selection: it says nothing about how many users are active at once, how long prompts and responses are, or what latency counts as "fast". We need to convert this into a concrete workload model.…
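
To make the idea of a workload model concrete, here is a minimal Python sketch of how the "10,000 users" statement could be turned into peak request and token rates. Only the 10,000 figure comes from the example request; the `WorkloadModel` class, its field names, and every other input (active fraction, request rate, token counts) are hypothetical placeholders that you would replace with numbers from your own traffic data.

```python
from dataclasses import dataclass


@dataclass
class WorkloadModel:
    """Hypothetical workload model; all fields except total_users are assumptions."""

    total_users: int                   # from the business requirement
    peak_active_fraction: float        # assumption: share of users active at peak
    requests_per_user_per_min: float   # assumption: request rate of one active user
    input_tokens: int                  # assumption: average prompt length in tokens
    output_tokens: int                 # assumption: average response length in tokens

    def peak_requests_per_second(self) -> float:
        # Active users at peak, times their per-minute rate, converted to per-second.
        active_users = self.total_users * self.peak_active_fraction
        return active_users * self.requests_per_user_per_min / 60.0

    def peak_input_tokens_per_second(self) -> float:
        # Prompt tokens the serving stack must ingest per second at peak.
        return self.peak_requests_per_second() * self.input_tokens

    def peak_output_tokens_per_second(self) -> float:
        # Tokens the serving stack must generate per second at peak.
        return self.peak_requests_per_second() * self.output_tokens


# Placeholder inputs for illustration only.
model = WorkloadModel(
    total_users=10_000,
    peak_active_fraction=0.05,
    requests_per_user_per_min=2,
    input_tokens=500,
    output_tokens=250,
)

print(f"Peak requests/s:      {model.peak_requests_per_second():.1f}")
print(f"Peak input tokens/s:  {model.peak_input_tokens_per_second():.0f}")
print(f"Peak output tokens/s: {model.peak_output_tokens_per_second():.0f}")
```

With these placeholder inputs, 10,000 users works out to roughly 17 requests per second and about 4,200 generated tokens per second at peak. Halve the active fraction and both figures halve too, which is why the headline user count alone is not enough to pick hardware.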