GPT-5.5 is OpenAI's best model. But paying more for it makes no sense.

DEV Community·Rohan Sharma·27 days ago
#ai #agents #openai #skill #model #codex

We added OpenAI’s gpt-5.5 model to our eval suite the day it launched. We ran 1,742 tests overall, covering more than 45 task scenarios across 11 real engineering skills. Each was run 6 times, and this post reports the averaged results.

TL;DR

The gpt-5.5 model has the highest raw capability of any OpenAI model we've tested. With agent skills loaded on the same tasks, however, it essentially ties with gpt-5.4 on score while costing 63% more per run.

| Question | Answer |
| --- | --- |
| Best Codex model out of the box? | gpt-5.5: 75.6 avg baseline, highest in the family |
| Best Codex model with skills loaded? | gpt-5.4 and gpt-5.5 tie at 89.3 and 89.4 |
| Worth the 63% price premium over gpt-5.4? | With this data, we don’t think so |
| Any scenario where it wins? | Latency: 89.5s vs 135.4s for gpt-5.4 |
| Should you use gpt-5.3 instead? | No. Oddly enough, gpt-5.3 costs 47% more than gpt-5.4 for a worse result because of token bloat.… |
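To make the price-versus-score trade-off concrete, here is a minimal Python sketch of the arithmetic behind the TL;DR. It rests on a few assumptions that are not stated in the post: gpt-5.4's cost per run is normalized to 1.0, the tied scores are read in the order given (89.3 for gpt-5.4, 89.4 for gpt-5.5), and the names `score_per_cost` and `latency_reduction` are purely illustrative.

```python
# Figures taken from the TL;DR above.
# Assumptions: gpt-5.4 cost per run is normalized to 1.0, and the tied
# scores are read in the order listed (89.3 -> gpt-5.4, 89.4 -> gpt-5.5).
GPT_54 = {"score": 89.3, "relative_cost": 1.00, "latency_s": 135.4}
GPT_55 = {"score": 89.4, "relative_cost": 1.63, "latency_s": 89.5}


def score_per_cost(model: dict) -> float:
    """Score points obtained per unit of relative cost (hypothetical metric)."""
    return model["score"] / model["relative_cost"]


def latency_reduction(fast: dict, slow: dict) -> float:
    """Fraction by which `fast` is quicker than `slow`."""
    return 1 - fast["latency_s"] / slow["latency_s"]


if __name__ == "__main__":
    print(f"gpt-5.4 score per unit cost: {score_per_cost(GPT_54):.1f}")
    print(f"gpt-5.5 score per unit cost: {score_per_cost(GPT_55):.1f}")
    print(f"gpt-5.5 latency reduction:   {latency_reduction(GPT_55, GPT_54):.1%}")
```

Under these assumptions, a 0.1-point score difference does not offset a 63% higher cost per run, while the latency advantage is where gpt-5.5 clearly pulls ahead.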