RAG Re-Indexing Without Downtime: A Dual-Write Pattern for Embeddings


DEV Community·Gabriel Anhaia·28 days ago




Book: RAG Pocket Guide: Retrieval, Chunking, and Reranking Patterns for Production

Also by me: Thinking in Go (2-book series): Complete Guide to Go Programming + Hexagonal Architecture in Go

My project: Hermes IDE | GitHub, an IDE for developers who ship with Claude Code and other AI coding tools

Me: xgabriel.com | GitHub

You've shipped a RAG service. It works. Then imagine OpenAI ships a hypothetical text-embedding-next, your eval rig says it picks up 4 points of recall on the corpus you actually care about, and your VP wants the upgrade live by Friday.

The naive plan is: stop ingest, drop the old vectors, re-embed the corpus, write the new vectors back, resume traffic. On a small corpus this takes minutes. On 8 million chunks it takes a weekend, and your support bot returns nothing useful while the swap is mid-flight.

The right shape is a dual-write re-index. Writes land in both the old and the new index from the moment you start. Reads keep going to the old one.…
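The routing rule described so far (every write goes to both indexes, every read goes to the old one) can be sketched roughly as below. This is a minimal illustration, not the article's implementation: `InMemoryIndex`, `DualWriteRouter`, `old_embed`, and `new_embed` are hypothetical names, the embedders are toy functions standing in for real model calls, and the in-memory dict stands in for whatever vector store you actually run.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable

Vector = list[float]
Embedder = Callable[[str], Vector]


def old_embed(text: str) -> Vector:
    # Toy stand-in for the current embedding model (assumption).
    return [float(len(text)), 1.0]


def new_embed(text: str) -> Vector:
    # Toy stand-in for the new model; note the different dimensionality.
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(vowels), 1.0]


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for one vector store collection."""
    embed: Embedder
    vectors: dict[str, Vector] = field(default_factory=dict)

    async def upsert(self, doc_id: str, text: str) -> None:
        # Each index embeds with its own model, so vectors never mix.
        self.vectors[doc_id] = self.embed(text)

    async def search(self, query: str, k: int = 3) -> list[str]:
        q = self.embed(query)
        # Brute-force dot-product ranking; a real store would use ANN search.
        ranked = sorted(
            self.vectors.items(),
            key=lambda item: -sum(a * b for a, b in zip(q, item[1])),
        )
        return [doc_id for doc_id, _ in ranked[:k]]


@dataclass
class DualWriteRouter:
    old: InMemoryIndex
    new: InMemoryIndex

    async def write(self, doc_id: str, text: str) -> None:
        # Writes land in both the old and the new index.
        await asyncio.gather(
            self.old.upsert(doc_id, text),
            self.new.upsert(doc_id, text),
        )

    async def read(self, query: str, k: int = 3) -> list[str]:
        # Reads keep going to the old index while the new one fills up.
        return await self.old.search(query, k)


async def main() -> None:
    router = DualWriteRouter(InMemoryIndex(old_embed), InMemoryIndex(new_embed))
    await router.write("doc-1", "hello world")
    print(await router.read("hello"))


if __name__ == "__main__":
    asyncio.run(main())
```

One design question the sketch leaves open is partial failure: `asyncio.gather` raises if either upsert fails, but the other write may already have landed, so a production version would likely need retries or a reconciliation pass to keep the two indexes in step.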
