The United States is in the grip of an opioid crisis. Between 2019 and 2023, fentanyl-related overdose deaths skyrocketed, but the impact is not uniform across the country. Are the hardest-hit states also the poorest? We built a full Big Data pipeline to answer that question.

The Data

We combined two official U.S. government sources:

CDC VSRR (Vital Statistics Rapid Release): state-level fentanyl overdose deaths per rolling 12-month period, from 2015 to 2023

U.S. Census Bureau ACS 5-Year: median household income, poverty rate, and unemployment rate for all 50 states + D.C.

The Architecture

    CDC API ──┐
              ├── Apache Spark ── Elasticsearch ── Kibana Dashboard
    Census ───┘        │
                       └── scikit-learn (ML)

Step 1 — Ingestion

Python scripts fetch both datasets via REST APIs and land them in a raw datalake (data/raw/), with UTC timestamps for traceability.…
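
To make the ingestion step concrete, here is a minimal sketch of how a script might land an API response in the raw datalake with a UTC timestamp. Only the data/raw/ folder, the use of REST APIs, and the UTC timestamps come from the description above. The function names (fetch_json, land_raw), the per-source subfolder, the file name pattern, and the envelope fields are assumptions made for illustration, and the endpoint URL is left to the caller rather than guessed.

```python
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Optional

RAW_DIR = Path("data/raw")  # raw datalake folder named in the post


def fetch_json(url: str, timeout: float = 30.0) -> Any:
    """Download and parse a JSON document from a REST endpoint.

    The URL is supplied by the caller; no specific CDC or Census
    endpoint is assumed here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def land_raw(
    records: Any,
    source: str,
    raw_dir: Path = RAW_DIR,
    now: Optional[datetime] = None,
) -> Path:
    """Write records, unmodified, to raw_dir/<source>/<source>_<UTC stamp>.json.

    The layout and file name pattern are hypothetical. `now` should be a
    timezone-aware datetime; it is normalised to UTC before formatting.
    """
    fetched_at = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = fetched_at.strftime("%Y%m%dT%H%M%SZ")

    target_dir = raw_dir / source
    target_dir.mkdir(parents=True, exist_ok=True)

    # Store the fetch time next to the payload so every raw file can be
    # traced back to the moment it was retrieved.
    envelope = {
        "source": source,
        "fetched_at_utc": fetched_at.isoformat(),
        "records": records,
    }
    path = target_dir / f"{source}_{stamp}.json"
    path.write_text(json.dumps(envelope), encoding="utf-8")
    return path
```

A script would call something like land_raw(fetch_json(url), "cdc") once per source. Because each run writes a new timestamped file instead of overwriting the previous one, earlier pulls stay available for comparison or reprocessing.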