Nitin Srivastava

Author Profile

0 karma, 0 posts, joined about 1 month ago

Production Reranker Layer for RAG in Python: Cross-Encoder, Cohere Fallback, and Reciprocal Rank Fusion (Runnable Code)

Source: dev.to (from Dev RSS Feed)

#python #rag #llm #reranker #cohere #list #candidates #query

22 days ago

Bulletproofing LLM Structured Output in Python: Healing Retries, Cost Caps, and Drift Detection (Runnable Code)

Source: dev.to (from Dev.to - ai)

#python #llm #ai #model #import #json #self #article

24 days ago

Building a Production LLM Evaluation Harness in Pytest: Cost-Bounded, Flake-Aware, CI-Gated (Runnable Python)

Source: dev.to (from Dev.to - ai)

#python #llm #testing #ai #tests #cost #test #runs

27 days ago

How We Cut API Response Time from 2.3s to 180ms Using Redis + Smart Caching

Source: dev.to

p95 latency dropped from 2.3 seconds to 180 milliseconds. Same hardware, same database, same traffic. The only thing that changed was how we cached — and I don't mean slapping @lru_cache on a function. I'm writing this because every Redis caching tutorial I read before this proje

#dev #class #code #cache #redis #article #ama #englishlanguage

about 1 month ago
