#scores_a

Why comparing average scores is the wrong way to evaluate LLM prompts (and what to do instead)

DEV Community·Aayush kumarsingh·25 days ago
