The 26-Dimensional Feature Vector: How a Machine Learns to Recognise a Secret

DEV Community·Patience Mpofu·19 days ago

When my secrets detector evaluates a candidate string, it doesn't see code. It sees a vector of 26 numbers. That vector is the bridge between human intuition ("this looks like a secret") and machine classification. Every insight a security engineer uses when reading code to spot exposed credentials has been translated into a numerical feature that the Random Forest classifier can reason about.

This article is a complete walkthrough of those 26 features: what each one measures, why it matters, what it catches, and what it misses. By the end, you'll understand exactly what the model sees when it evaluates any candidate value, and why the combination of features catches things that no single signal could.

How Feature Extraction Works

Before the classifier sees anything, every candidate string goes through a feature extraction pipeline in features.py. The pipeline takes two inputs: the string value itself, and the name of the variable holding it.…
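
To make the two-input shape concrete, here is a minimal sketch of what such an extraction function might look like. It is not the contents of features.py: the function names, the keyword list, and the four features shown are all hypothetical stand-ins chosen for illustration, and the real pipeline produces 26 values rather than four.

```python
import math
from collections import Counter
from typing import List

# Hypothetical keyword hints; the article's actual name-based features are not shown here.
SUSPICIOUS_NAME_HINTS = ("secret", "token", "password", "passwd", "api_key", "apikey")


def shannon_entropy(value: str) -> float:
    """Shannon entropy in bits per character (0.0 for an empty string)."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(value: str, var_name: str) -> List[float]:
    """Turn a candidate value and its variable name into a numeric vector.

    Illustrative only: four assumed features, not the article's 26.
    """
    name = var_name.lower()
    digit_ratio = sum(c.isdigit() for c in value) / max(len(value), 1)
    return [
        float(len(value)),        # length of the candidate string
        shannon_entropy(value),   # randomness of its characters
        digit_ratio,              # fraction of characters that are digits
        1.0 if any(h in name for h in SUSPICIOUS_NAME_HINTS) else 0.0,  # name hint
    ]
```

The point of the sketch is the signature: the value and the variable name each contribute their own signals, and the classifier only ever sees the resulting list of numbers.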
