Why A.I. Safety Controls Are Not Very Effective

www.nytimes.com·https://www.nytimes.com/by/cade-metz·19 days ago

Three years after the debut of ChatGPT, fooling A.I. systems into bad behavior is almost trivial.

A.I. systems from companies like OpenAI and Anthropic have guardrails that often prove to be porous. Credit: Jason Henry for The New York Times

May 14, 2026, 1:19 p.m. ET

When companies like Anthropic, Google and OpenAI build their artificial intelligence systems, they spend months adding ways to prevent people from using their technology to spread disinformation, build weapons or hack into computer networks.

But recently, researchers in Italy discovered that they could break through these protections with poetry. They used poetic language to trick 31 A.I. systems into ignoring internal safety controls.…
