For a decade, the Kubernetes scaling playbook had one move: add another cluster. Need more capacity? Add a cluster. Need workload isolation? Add a cluster. Need regional separation? Add a cluster. Need a dedicated GPU pool? Add a cluster. The cluster became the unit of scale because the control plane could not scale far enough to avoid making it one.

At Google Cloud Next '26, Google made the opposite bet. A single Kubernetes-conformant control plane spanning 256,000 nodes across multiple regions, managing a million accelerators as a unified capacity reserve. Not bigger Kubernetes. A different architectural claim entirely.

The claim is this: the control plane is now the unit of scale. The cluster is not. Most platform architectures were not built around that assumption. They are still operating the old boundary, and that mismatch is what this post is actually about.

The Old Scaling Model Was Cluster Multiplication

The cluster-as-boundary model made sense when it emerged.…