Why the Variable Name Is the Most Important Feature in Secrets Detection

DEV Community·Patience Mpofu·19 days ago

Here's a question that sounds trivial until you think about it carefully. Are these two lines of code equally dangerous?

    checksum = "d8e8fca2dc0f896fd7cb4cb0031ba249"
    password = "d8e8fca2dc0f896fd7cb4cb0031ba249"

The string value is identical. The entropy is identical. Every character-level feature is identical. A regex scanner treats them the same. A pure entropy scanner treats them the same. A human security engineer does not treat them the same, not even slightly. The first is almost certainly a file integrity hash. The second is almost certainly an exposed credential. The only difference is the variable name before the equals sign.

When I trained my secrets detector and examined the feature importances, the variable name risk score came out at 0.28, higher than Shannon entropy, higher than all character distribution features, higher than string length. The single most predictive signal for whether a string is a secret is not the string itself.…
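To make the comparison concrete, here is a minimal Python sketch of the two kinds of feature discussed above: a value-level feature (Shannon entropy) and a name-level feature. The keyword tables, their weights, and the helper names `name_risk_score` and `features` are hypothetical, chosen only for illustration; the post does not show how the author's variable name risk score is actually computed.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical keyword weights (assumption, not the author's feature).
RISKY_KEYWORDS = {"password": 1.0, "passwd": 1.0, "secret": 0.9,
                  "api_key": 0.9, "token": 0.8}
BENIGN_KEYWORDS = {"checksum": 0.0, "digest": 0.1, "hash": 0.1, "md5": 0.1}


def name_risk_score(variable_name: str) -> float:
    """Score a variable name from 0.0 (benign) to 1.0 (likely a secret)."""
    name = variable_name.lower()
    risky = [w for k, w in RISKY_KEYWORDS.items() if k in name]
    if risky:
        return max(risky)
    benign = [w for k, w in BENIGN_KEYWORDS.items() if k in name]
    if benign:
        return min(benign)
    return 0.5  # unknown name: neutral prior


def features(line: str) -> dict:
    """Split a simple `name = "value"` assignment and compute both features."""
    name, _, value = line.partition("=")
    value = value.strip().strip('"').strip()
    return {
        "name_risk": name_risk_score(name.strip()),
        "entropy": shannon_entropy(value),
    }


checksum_features = features('checksum = "d8e8fca2dc0f896fd7cb4cb0031ba249"')
password_features = features('password = "d8e8fca2dc0f896fd7cb4cb0031ba249"')
print(checksum_features)
print(password_features)
```

Both lines produce the same entropy because the value is the same string, so only the name feature separates them. A real detector would presumably learn how much weight to give each feature from labeled data rather than rely on a hand-written keyword table like this one.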
