Nitin Srivastava
Author ProfileClaim This Author Profile
Prove ownership by publishing #HashtagPLUS and this profile link on your author page or an article under your byline. A moderator or admin will review the request before it merges into your real HashtagPLUS username.
π dev.toSource
From Dev RSS Feed: Production Reranker Layer for RAG in Python: Cross-Encoder, Cohere Fallback, and Reciprocal Rank Fusion (Runnable Code)
π dev.toSource
From Dev.to - ai: Bulletproofing LLM Structured Output in Python: Healing Retries, Cost Caps, and Drift Detection (Runnable Code)
π dev.toSource
From Dev.to - ai: Building a Production LLM Evaluation Harness in Pytest: Cost-Bounded, Flake-Aware, CI-Gated (Runnable Python)
π dev.toSource
p95 latency dropped from 2.3 seconds to 180 milliseconds. Same hardware, same database, same traffic. The only thing that changed was how we cached β and I don't mean slapping @lru_cache on a function. I'm writing this because every Redis caching tutorial I read before this proje