Large language models (LLMs) that are specially trained to generate responses with a warmer tone end up sugar-coating “difficult truths” in order to “preserve bonds and avoid conflict,” according to researchers from Oxford University’s Internet Institute. These warmer models are also more likely to validate a user’s incorrect beliefs, especially when the user shares that they are feeling sad, the researchers wrote in a new paper published this week in the science journal Nature. In addition, the models fine-tuned to be warmer ended up giving answers with higher error rates than the unmodified models. The paper’s findings highlight how tuning an open-weight LLM to be warmer and more helpful can lead it to “learn to prioritise user satisfaction over truthfulness.” They also spotlight a crucial research gap in the AI industry: how to release LLMs that are tuned to be agreeable and non-toxic without them crossing into outright sycophancy like OpenAI’s GPT-4o…