o1 Outperforms Human Doctors on Medical Benchmarks & ER Cases

DEV Community·gentic news·30 days ago

o1 beat human physicians on medical benchmarks and real ER cases, per a new paper. Authors urge prospective trials.

A new paper tests OpenAI's o1 against physicians on medical benchmarks and real ER cases. o1 outperformed both human doctors and older models across all scenarios.

Key facts
o1 outperformed human physicians on medical benchmarks.
Study included real ER cases, not just synthetic exams.
Authors urge prospective clinical trials.
Model outperformed both humans and older AI models.
Paper does not disclose exact benchmark scores.

A new preprint evaluates OpenAI's o1 reasoning model against human physicians on medical benchmarks and real emergency room cases. According to the paper shared by @emollick, "across a variety of scenarios and applications, the large language model outperformed both human physicians and older models." The results span multiple medical domains, including diagnostic accuracy, treatment recommendations, and clinical reasoning tasks drawn from actual ER encounters.…
