Why this exists I've been running K8s troubleshooting workshops for two years. We have a 200-student program at IT Defined where we throw broken clusters at people. Patterns emerged. Most failures aren't novel. The same 25-30 failure modes account for 90% of real-world K8s incidents. If you can confidently debug these, you'll handle most production incidents. Here are the 10 most critical scenarios. Full 26 in the linked post. 1. CrashLoopBackOff Symptom: Pod restart count climbing. Diagnosis: kubectl describe pod POD_NAME kubectl logs POD_NAME --previous Enter fullscreen mode Exit fullscreen mode Likely causes: App crashes on startup (config error, missing env var, can't connect to DB), liveness probe too aggressive, command/args misconfigured. Fix: Read the previous container's logs. Reason is usually right there. If logs are empty, the container died before logging — check the entrypoint, command, and args. 2. ImagePullBackOff or ErrImagePull Diagnosis: kubectl describe pod , look at events at the bottom.…