
GPT-5.5 Matches Heavily Hyped Mythos Preview In New Cybersecurity Tests - Slashdot

it.slashdot.org · about 1 month ago

An anonymous reader quotes a report from Ars Technica: Last month, Anthropic made a big deal about the supposedly outsize cybersecurity threat represented by its Mythos Preview model, leading the company to restrict the initial release to "critical industry partners." But new research from the UK's AI Security Institute (AISI) suggests that OpenAI's GPT-5.5, which launched publicly last week, reached "a similar level of performance on our cyber evaluations" as Mythos Preview, which the group evaluated last month. Since 2023, the AISI has run a variety of frontier AI models through 95 different Capture the Flag challenges designed to test capabilities on cybersecurity tasks, such as reverse engineering, web exploitation, and cryptography. On the highest-level "Expert" tasks, GPT-5.5 passed an average of 71.4 percent, slightly higher than the 68.6 percent achieved by Mythos Preview (though within the margin of error).…
