Why hallucination in LLMs is mathematically inevitable (derivation + notes)

Reddit r/learnmachinelearning·u/Ok-Ear7580·about 1 month ago

#math #model #hallucination #notes #objective #article

I’ve been digging into the math behind LLM behavior recently, and one conclusion keeps coming up:

> Hallucination isn’t just a bug; it’s a consequence of the objective function.

At a high level, LLMs are trained with maximum likelihood to model the next-token distribution $P(x_t \mid x_{<t})$. That means:

* they optimize for *probability*, not *truth*
* the learned distribution reflects the training data (which is incomplete and inconsistent)
* softmax forces a normalized distribution, so the model must always pick something

So when the model is uncertain, it doesn’t abstain. It still generates a high-probability continuation, which can look confident but be wrong.…
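The softmax point can be checked numerically. Below is a minimal sketch in plain Python, not taken from the post: the logit values and the helper names (`softmax`, `entropy`, `greedy_pick`) are made up for illustration, and greedy decoding is assumed as the simplest decoding rule. It shows that a nearly flat set of logits still normalizes to 1 and still yields a chosen token; the uncertainty is only visible in the entropy, not in the output.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def greedy_pick(logits):
    """Index of the most probable token (greedy decoding)."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical logits over a 4-token vocabulary.
confident_logits = [8.0, 1.0, 0.5, 0.2]      # one clear winner
uncertain_logits = [1.02, 1.00, 0.99, 0.98]  # nearly flat: the model "doesn't know"

for name, logits in [("confident", confident_logits),
                     ("uncertain", uncertain_logits)]:
    probs = softmax(logits)
    print(name,
          "sum=%.6f" % sum(probs),
          "entropy=%.3f" % entropy(probs),
          "picked token", greedy_pick(logits))
```

In both cases a token is emitted. Nothing in the softmax or the decoding step expresses "abstain" unless a dedicated token or an external threshold (for example, on the entropy) is added.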
