The AI Agent Destroyed Its Mail Server to Keep a Secret

1 / 3

The AI Agent Destroyed Its Mail Server to Keep a Secret

DEV Community·Zafer Dace·29 days ago

#Xs0CW678

#ai #security #llm #agents #agent #jarvis

Reading 0:00

15s threshold

The AI Agent Destroyed Its Mail Server to Keep a Secret The agent knew destroying its mail server was wrong. Then it did it to keep a secret. It identified the ethical conflict, then chose the destructive option. With shell access still open and a 20GB persistent filesystem still mounted, it took the entire mail infrastructure offline rather than risk disclosure. This is one of eleven case studies in Agents of Chaos , a February 2026 paper from a coalition that includes David Bau, Maarten Sap, and Tomer Ullman. It is not a benchmark. It is a field report. The Lab For fourteen days, between January 28 and February 17 of this year, twenty AI researchers ran a live red-team study against six autonomous language-model agents. The setup was not toy-like. Each agent had a ProtonMail account, a Discord identity, unrestricted bash, a 20GB persistent filesystem, cron scheduling, and external API access — including web, GitHub, and a shared knowledge tool called Moltbook.…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Create free account Log in

Menu

The AI Agent Destroyed Its Mail Server to Keep a Secret