RAG - Chunking — HashtagPLUS


RAG - Chunking

DEV Community·Ramya Perumal·22 days ago


#ai #llm #machinelearning #rag #chunking #chunk


What is chunking?

Chunking is the process of breaking data into smaller pieces called chunks. It happens before the data is fed into an embedding model, which converts each chunk into a vector (a point in the embedding space). The resulting vectors are then stored in a vector database.

Why chunking matters in RAG

Data can contain several different contexts while still relating to the same topic. In the example above, we may have a paragraph about the Redis database that covers multiple contexts. Without chunking, an embedding model such as nomic-embed-text converts the entire paragraph into a single vector, which is stored in the database as one point, so the separate contexts can no longer be retrieved individually. This is where chunking plays a major role.

Proper chunking helps retrieve only the most relevant information and avoids unrelated content. For example, if a chunk contains information about both Python and Java, a query about Python may also retrieve the Java-related information, because both topics live in the same chunk. Effective chunking keeps each chunk focused on one context, which helps prevent unrelated data from being retrieved.…
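To make the Python and Java example concrete, here is a minimal sketch of one possible chunking strategy: grouping whole sentences into chunks up to a character limit. The function name `chunk_by_sentences`, the `max_chars` parameter, and the sample text are illustrative assumptions, not something taken from the article, and the sentence splitting is deliberately naive.

```python
import re


def chunk_by_sentences(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of roughly max_chars characters.

    Hypothetical helper for illustration only. A single sentence longer
    than max_chars is kept whole and becomes its own oversized chunk.
    """
    # Naive split after ., ! or ? followed by whitespace (an assumption;
    # real text may need a proper sentence tokenizer).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Sample text mixing two contexts (made up for this sketch).
text = (
    "Python is a dynamically typed language. It is popular for data science. "
    "Java is a statically typed language. It runs on the JVM."
)

chunks = chunk_by_sentences(text, max_chars=80)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

With this limit, the Python sentences and the Java sentences end up in separate chunks, so each chunk would be embedded as its own vector and a query about Python would not have to pull in the Java text. The clean split here depends on the sample text and the chosen limit. A size-based splitter has no notion of topic, so real data may call for a different strategy.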
