Vector DB and ANN vs PHE conflict, is there a practical workaround? [D]

Reddit r/MachineLearning·u/XPERT_GAMING·about 1 month ago

Hey everyone, I have been digging into vector databases, ANN search, and privacy-preserving techniques (specifically PHE), and I have hit a design roadblock that I would love some input on.

The problem: Using a vector DB with ANN (HNSW, IVF, etc.) is great for fast similarity search at scale. But if we introduce Partially Homomorphic Encryption (PHE), we lose the ability to use ANN efficiently. Encrypted embeddings force us into a linear scan or exact computation, which makes the ANN index useless.

What I am considering: One workaround I thought of is to drop the vector DB entirely, store embeddings in a standard database as BLOBs, and use something like RFID or tag-based filtering to narrow down candidates before computing similarity. The idea is to reduce the search space first using metadata, then run similarity on a much smaller subset.

Concerns: Will this scale to millions of embeddings?…
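
To make the filter-then-score idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the `Record` type, the tag names, and the in-memory inverted index are hypothetical stand-ins for a real database table with an indexed tag column, and the embeddings are kept in plaintext so the control flow is easy to follow. In a PHE setup the `cosine` call would be replaced by a homomorphic computation over ciphertexts, which is not shown here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical row layout: an id, plaintext metadata tags, and the embedding.
    # The embedding is plaintext here only to keep the sketch readable.
    record_id: str
    tags: frozenset
    embedding: tuple


def cosine(a, b):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_tag_index(records):
    """Inverted index: tag -> set of record ids carrying that tag."""
    index = {}
    for rec in records:
        for tag in rec.tags:
            index.setdefault(tag, set()).add(rec.record_id)
    return index


def filtered_search(records_by_id, tag_index, required_tags, query, k=3):
    """Narrow candidates by tags first, then score only the survivors."""
    candidate_ids = None
    for tag in required_tags:
        ids = tag_index.get(tag, set())
        candidate_ids = set(ids) if candidate_ids is None else candidate_ids & ids
    if candidate_ids is None:
        # No tags given: this degenerates to a full linear scan.
        candidate_ids = set(records_by_id)

    # Exact similarity over the reduced candidate set only.
    scored = [(cosine(query, records_by_id[i].embedding), i) for i in candidate_ids]
    # Highest score first; ties broken by id so the output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


if __name__ == "__main__":
    rows = [
        Record("a", frozenset({"en", "legal"}), (1.0, 0.0)),
        Record("b", frozenset({"en", "legal"}), (0.0, 1.0)),
        Record("c", frozenset({"en", "medical"}), (1.0, 0.0)),
    ]
    by_id = {r.record_id: r for r in rows}
    index = build_tag_index(rows)
    print(filtered_search(by_id, index, {"en", "legal"}, (1.0, 0.0), k=2))
```

Two things stand out in this sketch. The similarity cost is linear in the number of candidates that survive the filter, so whether it scales depends on how selective the tags are, not on the total number of embeddings. And record "c" is never scored even though its embedding matches the query exactly, so recall is bounded by how well the metadata lines up with what the query is actually looking for.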
