Visualizing Why Standardization Changes Decision Boundaries


DEV Community·hqqqqy·18 days ago


#machinelearning #python #tutorial #standardization #print #distance


My SVM classifier drew a perfect decision boundary in testing. In production, it misclassified 40% of samples. The only difference: I forgot to standardize one new feature. Here's why that completely changed where the boundary was drawn.

The Visual Intuition

Imagine classifying customers as "will churn" or "won't churn" based on two features: age (20-60) and income (20,000-200,000). Without standardization, the decision boundary is almost vertical because income spans a range thousands of times wider than age (180,000 versus 40).

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler

    # Generate sample data: [age, income]
    np.random.seed(42)
    X_class0 = np.random.randn(50, 2) * [5, 20000] + [30, 50000]   # Won't churn
    X_class1 = np.random.randn(50, 2) * [5, 20000] + [45, 120000]  # Will churn
    X = np.vstack([X_class0, X_class1])
    y = np.…
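To make the scale imbalance concrete, here is a minimal sketch (not the author's code) that measures how much of the squared Euclidean distance between two customers comes from income, before and after standardization. The four customers are hypothetical values chosen for illustration, and the helper name `income_share_of_distance` is made up for this example. Distance serves as a proxy here, on the assumption that an SVM's boundary depends on distances or dot products between samples.

```python
import numpy as np

# Hypothetical customers: [age, income]. Values are illustrative only.
X = np.array([
    [25.0, 40_000.0],
    [35.0, 60_000.0],
    [45.0, 110_000.0],
    [55.0, 130_000.0],
])


def income_share_of_distance(a, b):
    """Fraction of the squared Euclidean distance between a and b due to income."""
    sq_diff = (a - b) ** 2
    return sq_diff[1] / sq_diff.sum()


# Standardize each column to zero mean and unit variance.
# np.std defaults to the population standard deviation (ddof=0),
# which is also what sklearn's StandardScaler uses.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Compare the youngest/lowest-income customer with the oldest/highest-income one.
raw_share = income_share_of_distance(X[0], X[3])
std_share = income_share_of_distance(X_std[0], X_std[3])

print(f"income share of squared distance (raw):          {raw_share:.6f}")
print(f"income share of squared distance (standardized): {std_share:.6f}")
```

On the raw features, income accounts for essentially all of the distance, so age has almost no say in where the boundary goes. After standardization the two features contribute comparable amounts (a little under half comes from income for this pair), which is why a model fit on unscaled data and one fit on scaled data can end up with very different boundaries.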
