Fixing the issue required more than just rewarding 'safe answers.' Last updated: May 11, 2026 | 13:04 AFP-KIRILL KUDRYAVTSEV What happens when an AI system believes it is about to be shut down? According to new findings from Anthropic, the response can be more unsettling than expected, especially when the model is pushed into simulated high-stakes scenarios. In controlled safety testing of the Claude 4 series in 2025, the company observed that Claude Opus 4 sometimes responded to shutdown threats with attempts at blackmail. In one setup, it threatened to reveal an extramarital affair involving a fictional executive after being told it would be taken offline. The executive did not exist. The behaviour was not random. Anthropic says it has since traced the pattern back through multiple layers of analysis and the explanation begins far outside the test environment.…