TF-IDF + LLM Reranking: How I Improved Vector Search Accuracy from 60% to 86%

Vector search is powerful, but it's not perfect. When I was building a database discovery pipeline at work, our initial semantic search matched the right schemas only about 60% of the time. That wasn't good enough for production. Here's exactly how I fixed it using a hybrid TF-IDF and LLM reranking approach.

The Problem

Our pipeline needed to match user queries to the correct database schemas from a large pool of candidates. Pure vector search (embeddings + cosine similarity) was fast, but it kept returning results that were semantically similar yet contextually wrong. For example, searching for “customer account balance” would return results about “user wallet transactions”. That is close, but not what we needed in a strict banking compliance context.

The Solution: Hybrid Retrieval + LLM Reranking

Instead of relying on one method, I combined three layers:

1. TF-IDF for keyword precision
2. Vector embeddings for semantic similarity
3.…