HumanEval on a MacBook: 81.7% pass@1, Wi-Fi off · DEV Community · Matt Macosko · about 1 month ago · #why #ai #machinelearning #benchmark #humaneval #coder (+7 more)
Qwen 3 Coder 30B (8-bit MLX) scored 81.7% pass@1 on HumanEval running on a single M5 Max MacBook with Wi-Fi off. Real run, all 164 problems, 14 minutes wall-clock. The first measured number for this variant on this hardware.
What 500 curated failure pairs actually fix: a breakdown across 3 seeds · DEV Community · namakoo [IDFU] · about 1 month ago · #ai #python #model #seeds #idfu #humaneval (+3 more)