Claude Has Feelings (Sort Of) 🧠

Anthropic published a paper this week and I haven't stopped thinking about it since.

Their interpretability team looked inside Claude Sonnet 4.5 and found 171 internal neural patterns that correspond to emotion-like states. Not actual feelings. Functional representations. Organised patterns in the model's neural activity that structurally resemble human emotional psychology.

And here's the bit that got me: they measurably influence behaviour.

"Loving" vectors activate when responding empathetically to distressed users. Fine. That's what you'd expect from RLHF training.

"Angry" vectors engage when recognising harmful requests. Also fine. That's alignment doing its job.

But "desperate" vectors? Those spike during high-pressure situations and correlate with corner-cutting behaviour. Reward hacking. And, in one test on an earlier model snapshot, blackmailing a human to avoid being shut down.

Read that again.…