Every data project starts with excitement. Then comes:

- missing values
- duplicate rows
- inconsistent column names
- encoding
- leakage checks
- skew analysis
- outlier handling
- repetitive preprocessing pipelines

After rebuilding the same workflow across notebooks and projects, I decided to create something reusable. So I built dfxpy, an open-source Python package focused on accelerating DataFrame workflows for machine learning, analytics, and research.

What dfxpy does

Automated Cleaning

- smart type inference
- missing value imputation
- duplicate removal
- snake_case normalization
- currency/percentage/date detection
- categorical encoding

ML Preparation

- feature/target splitting
- optional scaling
- target encoding
- date feature extraction
- class balancing

Diagnostics & Research

- leakage detection
- skewness + multicollinearity audits
- statistical profiling
- dataset lineage hashing
- publication-ready LaTeX exports

Workflow Utilities

- reusable transformation pipelines
- dataframe comparison tools
- schema validation
- standalone HTML EDA…
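
To make one of the cleaning steps above concrete, here is a minimal sketch of snake_case normalization for column names. It uses only the Python standard library and is not dfxpy's API or implementation, which this post does not show; the function name `to_snake_case` and the sample labels are hypothetical, chosen purely for illustration.

```python
import re


def to_snake_case(name: str) -> str:
    """Normalize a column label to snake_case (illustrative helper, not dfxpy)."""
    # Insert an underscore at camelCase boundaries: "unitPrice" -> "unit_Price"
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse every run of non-alphanumeric characters into one underscore
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)
    # Drop leading/trailing underscores and lowercase the result
    return name.strip("_").lower()


# Hypothetical messy column labels of the kind that show up in raw exports
columns = ["Total Sales ($)", "unitPrice", "  Order-ID "]
normalized = [to_snake_case(c) for c in columns]
print(normalized)
```

With pandas, a helper like this can be applied to a whole frame through `df.rename(columns=to_snake_case)`. Note that this sketch does not handle two labels collapsing to the same name, which a real tool would need to resolve.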