#Llmkubev072

62.8% on Aider Polyglot from a MacBook Pro. Then the other model we tried scored 4%. Here's what actually happened, with a working cost loop attached.

DEV Community·Christopher Maher·about 1 month ago

Qwen3.6-35B-A3B Q8 on a MacBook Pro M5 Max scored 62.8% on Aider Polyglot, beating Claude Sonnet 4 with 32k thinking. Then Devstral 2 scored 4% on the same harness but 81.7% on HumanEval+.…
