Claude’s new model is more ‘honest’ when it messes up

1 / 8

Claude’s new model is more ‘honest’ when it messes up

The Verge·Jay Peters·3 days ago

#lbiZdWI7

#theverge #opus #claude #anthropic #company #claims

Reading 0:00

15s threshold

Jay Peters is a senior reporter covering technology, gaming, and more. He joined The Verge in 2019 after nearly two years at Techmeme. Anthropic is releasing Claude Opus 4.8 on Thursday, and the company is touting the model’s “honesty.” According to Anthropic , it trains “all [its] models to be honest — for instance, to avoid making claims that they can’t support.” But it notes that “a general problem with AI models is that they sometimes jump to conclusions, confidently presenting their work as making progress despite thin evidence.” The AI lab claims that early testers have found that Opus 4.8 “is more likely to flag uncertainties about its work and less likely to make unsupported claims.” In the company’s evaluations, Opus 4.8 is “around 4x less likely than its predecessor to allow flaws in code it’s written to pass unremarked.” In addition to the honesty improvements, with Opus 4.8, users can direct the amount of effort Claude puts into a task.…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Create free account Log in

Menu

Claude’s new model is more ‘honest’ when it messes up