I Increased Retrieval From Top-5 to Top-20. My Answers Got Worse

DEV Community·Md Ayan Arshad·25 days ago

The standard advice for improving RAG retrieval quality is: retrieve more candidates, then filter down. Bigger pool, better reranker, better answers.

I followed that advice in my RAG System. On PDFs, going from top-5 to top-20 made my RAGAS scores drop. The answers got worse, not better.

Here's what actually happened and the experiment design that explained it.

TL;DR

PDFs (40 QA pairs, 5 technical documents):

| Condition | RAGAS SUM | Context Precision |
|---|---|---|
| top-5, no reranker (baseline) | 3.4330 | 0.8102 |
| top-20, no reranker | 3.4051 ↓ | 0.8118 |
| top-20 → Cohere rerank → top-5 | 3.4843 ↑ | 0.8368 |

GitHub code (50 QA pairs, encode/httpx repo):

| Condition | RAGAS SUM | Context Precision |
|---|---|---|
| top-5, no reranker (baseline) | 3.5680 | 0.7812 |
| top-20, no reranker | 3.5766 | 0.7812 ← identical |
| top-20 → Cohere rerank → top-5 | 3.7079 ↑ | 0.9335 |

On PDFs, more candidates without a quality filter made scores drop. On code, a 4x larger pool produced zero improvement in Context Precision i.e. 0.7812 versus 0.7812.…
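To make the three conditions concrete, here is a minimal sketch of the difference between plain top-k retrieval and the "wide pool, rerank, cut back down" shape. It is not the author's code: the scoring functions `token_overlap` and `density_rerank` are toy stand-ins invented for illustration (the article uses Cohere rerank as the second stage), and the corpus is made up.

```python
from typing import Callable, List

# A scorer takes (query, document) and returns a relevance score.
Scorer = Callable[[str, str], float]


def token_overlap(query: str, doc: str) -> float:
    """Toy first-stage score (assumption): share of query tokens found in the doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def density_rerank(query: str, doc: str) -> float:
    """Toy second-stage score (assumption): matched tokens per distinct doc token.

    Stands in for a real reranker; it is only here so the
    pipeline shape can be run end to end.
    """
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(d) if d else 0.0


def retrieve(query: str, corpus: List[str], k: int,
             score: Scorer = token_overlap) -> List[str]:
    """Conditions 1 and 2: return the k best documents by first-stage score."""
    ranked = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def retrieve_then_rerank(query: str, corpus: List[str], pool_k: int, final_k: int,
                         first_stage: Scorer = token_overlap,
                         reranker: Scorer = density_rerank) -> List[str]:
    """Condition 3: fetch a wide pool, reorder it, keep only final_k documents."""
    pool = retrieve(query, corpus, pool_k, first_stage)
    reranked = sorted(pool, key=lambda doc: reranker(query, doc), reverse=True)
    return reranked[:final_k]


corpus = [
    "timeout config for httpx client",
    "httpx client basics",
    "how to bake bread",
    "client timeout defaults explained",
    "unrelated notes",
]
query = "httpx client timeout"

baseline = retrieve(query, corpus, k=2)
reranked = retrieve_then_rerank(query, corpus, pool_k=4, final_k=2)
```

In this sketch the wider pool only changes what reaches the generator when a second stage reorders it and the list is cut back to `final_k`. Passing the whole pool through, as in the "top-20, no reranker" condition, corresponds to calling `retrieve` with a larger `k` and nothing else.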
