Do Open Frontier Models Have A Chance Against Closed Models?

DEV Community·Jason Agostoni·19 days ago

Which of the new open-ish frontier models has the best chance of standing up to closed-source models on both cost and quality? I ran Ship-Bench against Kimi K2.6, Qwen 3.6 Plus, and DeepSeek v4 Pro to find out.

Hypothesis: All three models will live up to the hype and deliver good-enough output quality while destroying closed frontier models on price. Kimi is rumored to have "Opus-like" quality, while Qwen and DeepSeek are long-standing competitors.

Key Insights (TL;DR)

DeepSeek v4 Pro finished first with a 95.0 average and 5/5 gate passes, ahead of Kimi K2.6 at 93.9 with 5/5 passes and Qwen 3.6 Plus at 91.1 with 4/5 passes.

All three produced strong-looking apps and much better visual results than the earlier Gemini and Gemma runs.

Token usage is the clearest economic indicator: Kimi used an astounding 64.1 million tokens and Qwen a similar 63.3 million, while DeepSeek used "just" 26.3 million.

Qwen's planning left much to be desired, while Kimi and DeepSeek both cleared all five SDLC roles.…
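To put the token figures in proportion, here is a minimal sketch that expresses each model's total as a multiple of the lowest one. It uses only the three totals quoted above; the function name `relative_usage` is made up for illustration, and since per-token prices are not given in this excerpt, it compares token volume only, not actual cost.

```python
# Token totals in millions, copied from the figures quoted above.
tokens_millions = {
    "Kimi K2.6": 64.1,
    "Qwen 3.6 Plus": 63.3,
    "DeepSeek v4 Pro": 26.3,
}


def relative_usage(totals):
    """Return each model's token total as a multiple of the smallest total.

    Hypothetical helper for illustration; it says nothing about price,
    because per-token pricing differs by model and provider.
    """
    baseline = min(totals.values())
    return {model: total / baseline for model, total in totals.items()}


if __name__ == "__main__":
    ratios = relative_usage(tokens_millions)
    # Print from the most frugal model to the most token-hungry one.
    for model, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{model}: {ratio:.2f}x the lowest token total")
```

By this measure, Kimi and Qwen each consumed roughly 2.4 times as many tokens as DeepSeek for the same benchmark.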
