Most RAG systems get tested with a handful of happy-path questions. Someone asks "what is machine learning?", gets a reasonable answer, and calls it done. Then it goes to production and users find the edge cases, hallucinations on out-of-scope questions, failed refusals on adversarial prompts, latency that collapses under real concurrent load. RAG Pipeline Stress Tester is a battle-testing toolkit that finds these issues before deployment. What It Does Takes any HTTP RAG endpoint and hammers it with 7 categories of adversarial queries under configurable concurrent load. Tracks relevance, hallucination, refusal quality, and latency for every query sent. Scores everything into a composite health score from 0 to 100. Breaks results down by query category so you know exactly which failure modes are causing issues. Measures p50, p95, and p99 latency under realistic concurrent load, not just single-request response times. Produces an HTML report with interactive charts and a JSON report for CI/CD integration.β¦