A RAG (Retrieval-Augmented Generation) chatbot answers questions based on your own documents, not just its training data. This guide builds one from scratch using Python, ChromaDB, and Claude.

Originally published at kalyna.pro

## What Is RAG?

RAG combines two things:

- **Retrieval**: search your documents for relevant chunks
- **Generation**: use an LLM to write an answer based on those chunks

Without RAG, Claude can only answer questions based on its training data. With RAG, you inject relevant context directly into the prompt.

## Architecture

1. **Indexing**: load docs → split into chunks → embed → store in vector DB
2. **Querying**: embed question → find similar chunks → send to Claude → return answer

## Step 1: Install Dependencies

```bash
pip install anthropic chromadb sentence-transformers pypdf2
```

## Step 2: Load and Chunk Documents

```python
import PyPDF2
from pathlib import Path


def load_pdf(path: str) -> str:
    with open(path, "rb") as f:
        reader = PyPDF2.PdfReader(f)
        return "\n".…
```
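
To make the Architecture flow (index, then query) concrete, here is a minimal, dependency-free sketch. It is an illustration under stated assumptions, not this guide's implementation: a toy word-count "embedding" stands in for a sentence-transformers model, a plain Python list stands in for ChromaDB, and the final step only builds the prompt text rather than calling Claude. All function names (`chunk_text`, `embed`, `build_index`, `retrieve`, `build_prompt`) and the default chunk sizes are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would use a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing: embed every chunk and keep it in memory (stand-in for a vector DB)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    """Querying: embed the question and return the k most similar chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Inject the retrieved chunks into the prompt that would be sent to the LLM."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    docs = [
        "ChromaDB stores embeddings in a vector database.",
        "Claude writes the final answer from retrieved chunks.",
        "PDF files are loaded with PyPDF2.",
    ]
    index = build_index(docs)
    question = "Which vector database stores embeddings?"
    print(build_prompt(question, retrieve(index, question, k=1)))
```

The shape is the same in a real pipeline: only `embed` and the storage behind `build_index` and `retrieve` change when a real embedding model and vector database are swapped in, and the string from `build_prompt` becomes the message sent to the LLM.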