Visualizing Why Standardization Changes Decision Boundaries

DEV Community·hqqqqy·18 days ago

My SVM classifier drew a perfect decision boundary in testing. In production, it misclassified 40% of samples. The only difference: I forgot to standardize one new feature. Here's why that completely changed where the boundary was drawn.

The Visual Intuition

Imagine classifying customers as "will churn" or "won't churn" based on two features: age (20-60) and income (20,000-200,000). Without standardization, income dominates every distance the SVM computes, because its numeric range is thousands of times wider than that of age. The decision boundary therefore splits the data almost purely on income and barely reacts to age.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler

    # Generate sample data: [age, income]
    np.random.seed(42)
    X_class0 = np.random.randn(50, 2) * [5, 20000] + [30, 50000]   # Won't churn
    X_class1 = np.random.randn(50, 2) * [5, 20000] + [45, 120000]  # Will churn
    X = np.vstack([X_class0, X_class1])
    y = np.…
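The scale effect described above can be checked numerically. The following is a minimal sketch, not the author's code: it rebuilds similar synthetic [age, income] data, assumes labels of 0 for "won't churn" and 1 for "will churn", and uses a hypothetical helper `distance_share` to measure how much of the total pairwise squared distance each feature contributes before and after standardization. It also wraps the scaler and a linear SVM in a pipeline, which is one common way to avoid forgetting to scale a feature at prediction time.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic [age, income] data, modeled on the setup described above.
rng = np.random.RandomState(42)
X_class0 = rng.randn(50, 2) * [5, 20000] + [30, 50000]    # won't churn
X_class1 = rng.randn(50, 2) * [5, 20000] + [45, 120000]   # will churn
X = np.vstack([X_class0, X_class1])
y = np.array([0] * 50 + [1] * 50)  # assumed labels: 0 = won't churn, 1 = will churn


def distance_share(data):
    """Hypothetical helper: fraction of the total pairwise squared
    distance that each feature contributes."""
    diffs = data[:, None, :] - data[None, :, :]   # shape (n, n, n_features)
    per_feature = (diffs ** 2).sum(axis=(0, 1))
    return per_feature / per_feature.sum()


raw_share = distance_share(X)
scaled_share = distance_share(StandardScaler().fit_transform(X))
print("raw    [age, income] share:", raw_share)
print("scaled [age, income] share:", scaled_share)

# Keeping the scaler inside the pipeline means the same scaling is
# applied automatically at both fit and predict time.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X, y)
train_accuracy = model.score(X, y)
print("training accuracy:", train_accuracy)
```

On the raw data, income accounts for essentially all of the distance between customers, so any distance-based or margin-based model effectively ignores age. After standardization both features contribute equally, which is why the boundary moves.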
