At my company we recently began an internal project to benchmark LLMs for hallucinations. We are building internal tools and tools for clients. I am curious if anybody has experience or can point me to papers or tools that help measure a hallucination. I am currently reading this https://arxiv.org/html/2512.22416v2 but wondering what experiences people have in the wild.