I wanted to build a privacy-first RAG app, the kind where your documents never leave the browser. That means no API keys, no server, and no third-party vector database watching what you search for.

The architecture was obvious: embed documents client-side with something like Transformers.js, store the vectors locally, and search them with cosine similarity. Simple enough. Except the "search them" part fell apart at about 5,000 vectors.

Pure JavaScript vector search has a ceiling, and it's lower than you'd think. The math itself isn't complicated: cosine similarity is just a dot product divided by the product of two norms. But when you're doing it across 10,000 vectors, each with 1,536 dimensions (standard for OpenAI embeddings), you're running more than 15 million floating-point multiplications per query for the dot products alone. JavaScript's garbage collector doesn't care that you're in a hot loop. It will pause when it wants to.

I benchmarked every existing client-side library I could find.…
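
For reference, here is roughly what the naive approach looks like: a minimal sketch of brute-force cosine search in TypeScript. The names `cosineSimilarity`, `bruteForceSearch`, and `SearchHit` are made up for illustration and don't come from any library, and this isn't the exact code I benchmarked. It just shows where the work goes: one pass over every dimension of every stored vector, per query.

```typescript
// Hypothetical helper: cosine similarity = dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical result shape: position in the input array plus its score.
interface SearchHit {
  index: number;
  score: number;
}

// Scores the query against every vector, then keeps the k best matches.
function bruteForceSearch(
  query: Float32Array,
  vectors: Float32Array[],
  k: number
): SearchHit[] {
  const hits: SearchHit[] = vectors.map((v, index) => ({
    index,
    score: cosineSimilarity(query, v),
  }));
  hits.sort((x, y) => y.score - x.score); // highest similarity first
  return hits.slice(0, k);
}
```

Note that this version recomputes the norm of every stored vector on each query and allocates one result object per vector. Normalizing vectors once at insert time would remove the first cost, but the dot products still scale with vectors times dimensions.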