We recently finished upgrading a production Airflow instance from 2.8 to 3.1 running on Amazon EKS. The whole thing took about 6 weeks from planning to production cutover. This post covers what we did, what changed in the DAG code, how we handled the data migration, and the Kubernetes manifests that make up the new deployment. No fluff, just what happened.

Why Upgrade at All

Airflow 2.8 worked. It was running production DAGs without issues. So why bother? A few reasons pushed us over the edge:

End of life. Airflow 2 reaches EOL in April 2026. No more security patches after that. For a system handling production data pipelines, that is not something we could ignore.

DAG Processor as a separate process. In Airflow 3, the dag-processor runs independently from the scheduler. This means a slow or broken DAG file does not block the scheduler from doing its job. We had hit this problem before where a DAG with a heavy top-level import would stall scheduling for everything else.

Native HA scheduler.…