AI is evolving, and reasoning models are driving up token demand, placing new requirements on every layer of AI infrastructure. More than ever, compute must scale efficiently to maximize token production and improve productivity for model creators and users. Modern GPUs operate at peak capacity, pushing throughput higher every generation, but system performance is increasingly gated by the CPU-bound serial tasks within an agentic loop. This is a classic example of Amdahl’s law, the computer science principle that the serial portion of a workload caps the speedup available from accelerating the rest. The dynamic is especially visible in two classes of workloads: reinforcement learning (RL), which trains models in new specialized skills such as coding or engineering, and agentic actions, which enable AI agents to use tools like web browsers, databases, code interpreters, and other software to complete tasks in real environments or sandboxes. Both workloads combine two historically separate CPU characteristics.…
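To make the Amdahl's law point above concrete, here is a minimal Python sketch. The function name `amdahl_speedup` and the 10% serial fraction are illustrative assumptions, not measurements from any real agentic workload; the formula itself is the standard statement of Amdahl's law.

```python
def amdahl_speedup(serial_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only the non-serial part of a task is accelerated.

    serial_fraction:  share of the original runtime that stays serial (0..1),
                      e.g. CPU-bound tool calls inside an agentic loop.
    parallel_speedup: factor by which the remaining part is accelerated,
                      e.g. by a faster GPU.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if parallel_speedup <= 0:
        raise ValueError("parallel_speedup must be positive")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallel_speedup)


if __name__ == "__main__":
    # Hypothetical assumption: 10% of each loop iteration is CPU-bound serial work.
    serial = 0.10
    for gpu_speedup in (2, 10, 100):
        overall = amdahl_speedup(serial, gpu_speedup)
        print(f"GPU part {gpu_speedup:>3}x faster -> whole loop {overall:.2f}x faster")
```

Under that assumed 10% serial share, accelerating the GPU portion by 2x, 10x, and 100x yields roughly 1.82x, 5.26x, and 9.17x end to end, and no amount of GPU acceleration can exceed 10x. That ceiling is why the CPU-bound part of the loop comes to dominate as GPUs get faster.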