Starting with GPT‑5.1, our models began developing a strange habit: they increasingly mentioned goblins, gremlins, and other creatures in their metaphors. Unlike model bugs that show up through a tanking eval or a spiking training metric and point back to a specific change, this one crept in subtly. A single “little goblin” in an answer could be harmless, even charming. Across model generations, though, the habit became hard to miss: the goblins kept multiplying, and we needed to figure out where they came from. In early testing, GPT‑5.5 in Codex showed an odd affinity for goblin metaphors. The short answer is that model behavior is shaped by many small incentives. In this case, one of those incentives came from training the model for the personality customization feature (opens in a new window) , in particular the Nerdy personality. We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread.…