The entropy of the very first content-bearing token already separates factual answers from hallucinations with an AUROC of 0.82. That single number rivals the scores of methods that need dozens of sampled continuations. The surprise is that nothing more than the first-token distribution of the greedy decode is required.

Hallucination detection has long relied on self-consistency: generate many answers, compare them, and flag low agreement as doubtful. Semantic self-consistency tightens the signal by clustering answers by meaning, but both approaches multiply decoding cost and need extra inference components. Practitioners therefore face a trade-off between reliability and latency.

The study introduces φ₁ₙₜ, the normalized entropy of the top-K logits at the first answer token. Across three 7–8B instruction-tuned models and two closed-book QA benchmarks, φ₁ₙₜ attains a mean AUROC of 0.820, surpassing semantic self-consistency (0.793) and surface-form self-consistency (0.791) [1]. …
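
To make the quantity concrete, here is a minimal sketch of a normalized top-K entropy score. The text above does not spell out the exact normalization, so this version makes several assumptions: it renormalizes with a softmax over only the K largest logits, divides the Shannon entropy by log K so the score falls in [0, 1], and defaults to K = 10. The function name `topk_normalized_entropy` is hypothetical and is not taken from the study.

```python
import math
from typing import Sequence


def topk_normalized_entropy(logits: Sequence[float], k: int = 10) -> float:
    """Entropy of a softmax over the k largest logits, scaled to [0, 1].

    Assumption: normalization divides by log(k); the study may define it
    differently. `logits` is the first answer token's logit vector.
    """
    if k < 2:
        raise ValueError("k must be at least 2 for the entropy to be normalizable")
    top = sorted(logits, reverse=True)[:k]
    if len(top) < 2:
        raise ValueError("need at least two logits")

    # Numerically stable softmax restricted to the top-k logits.
    max_logit = top[0]
    exps = [math.exp(x - max_logit) for x in top]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Shannon entropy in nats; terms with zero probability contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)

    # Divide by the maximum possible entropy for this many outcomes.
    return entropy / math.log(len(top))


if __name__ == "__main__":
    confident = [12.0, 1.0, 0.5, 0.2, 0.1]   # one token dominates
    uncertain = [2.0, 1.9, 1.8, 1.7, 1.6]    # mass spread across tokens
    print(topk_normalized_entropy(confident, k=5))
    print(topk_normalized_entropy(uncertain, k=5))
```

Under these assumptions, a score near 0 means the model puts almost all of its top-K mass on one first token, and a score near 1 means the mass is spread evenly. Higher scores would then be read as a sign of a likely hallucination. Any decision threshold would have to be tuned on labeled data.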