
#Eval
18 posts

Why I spun my benchmark into its own repo (and why every dev tool with a benchmark should)
Comparing c1186abbdd...50b389dd0e · r/morph
Comparing 786d21d842...f89bca481c · r/morph
7 Platforms That Turn Agent Evals Into RL Training Data
Langfuse Experiments Rebuild: What LLM Devs Need to Know (2026)
I built a hiring platform that watches engineers work in a real CAD tool
Best LLM Observability Platforms for Anthropic and OpenAI Stacks (2026)