Menu

Post image 1
Post image 2
Post image 3
Post image 4
Post image 5
Post image 6
Post image 7
Post image 8
Post image 9
1 / 9
0

Where the goblins came from

OpenAI·OpenAI·about 1 month ago
#FvnApJZ3
#solving#why#nerdy#model#personality#goblins
Reading 0:00
15s threshold

Starting with GPT‑5.1, our models began developing a strange habit: they increasingly mentioned goblins, gremlins, and other creatures in their metaphors. Unlike model bugs that show up through a tanking eval or a spiking training metric and point back to a specific change, this one crept in subtly. A single “little goblin” in an answer could be harmless, even charming. Across model generations, though, the habit became hard to miss: the goblins kept multiplying, and we needed to figure out where they came from. In early testing, GPT‑5.5 in Codex showed an odd affinity for goblin metaphors. The short answer is that model behavior is shaped by many small incentives. In this case, one of those incentives came from training the model for the personality customization feature ⁠ (opens in a new window) , in particular the Nerdy personality. We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread.…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Read More