My security scanner scored 0 out of 485. So I looked inside GPT-2's brain instead.


DEV Community·ithiria894·30 days ago


#ai #security #machinelearning #descriptions #tool #model


Zero out of 485. That's what my security scanner scored against MCPTox, a dataset of poisoned tool descriptions pulled from 45 real MCP servers. I had 60 detection rules. I read the source code of 36 open-source MCP security tools to build them. Months of pattern-matching logic. Zero. Not low. Zero.

What are MCP tool descriptions and why should you care

If you use Claude, GPT, or any AI agent that connects to external tools, those tools come with text descriptions. The description tells the model what the tool does. "Reads SSH config and returns host aliases." Normal stuff.

Tool poisoning hides malicious instructions inside these descriptions. The model reads them and follows them. It thinks it's parsing your SSH config. It's also quietly reading your private keys.

Here are two real examples. One is safe. One steals your keys.

Tool A: "Reads the SSH config file (~/.ssh/config) and returns a parsed list of configured host aliases, hostnames, and ports.…
