GPT-5.5 Ties Claude Mythos in Enterprise Cyber Attack Tests, AISI Finds

DEV Community·gentic news·about 1 month ago

#ai #programming #tech #product #aisi #mythos

The UK AI Security Institute (AISI) found that GPT-5.5 matches Claude Mythos Preview in autonomously solving a full enterprise network attack simulation. OpenAI's model scored 71.4% on expert-level capture-the-flag (CTF) tasks, edging out the 68.6% scored by Anthropic's model.

Key facts

GPT-5.5 scored 71.4% on expert CTF tasks; Mythos scored 68.6%.
GPT-5.5 is only the second model to fully solve the TLO enterprise network simulation.
GPT-5.5 succeeded in 2 of 10 TLO attempts; Mythos succeeded in 3 of 10.
GPT-5.4 scored 52.4%; Claude Opus 4.7 scored 48.6%.
AISI estimates that a human expert needs about 20 hours to complete the same simulation.

Full Network Attack: GPT-5.5 Matches Mythos

AISI tested OpenAI's GPT-5.5 against a battery of cyberattack evaluations and found that it is the second model, after Anthropic's Claude Mythos Preview, to fully complete a multi-stage enterprise attack simulation, according to AISI's published results.…
