How I Made My Vector Search Engine 16x Faster Without Changing the Algorithm

DEV Community·kartikay dubey·29 days ago

#vectorsearch #cpp #performance #vector #fullscreen #distance

I built a Vamana-based vector search engine in C++ called sembed-engine. Recently I made a pull request that sped up queries by 16x and builds by 9x. The algorithm stayed exactly the same. The recall stayed at 1.0. The number of visited nodes did not change. The speedup came from data layout.

The old design

The original code stored vectors as separate objects pointed to by shared_ptr:

    struct Record {
        int64_t id;
        std::shared_ptr<Vector> vector;
    };

This is clean C++. Every record has an id and a vector. The vector knows how to calculate distance. In the hot path, though, the CPU had to load the record, read the shared_ptr, follow the pointer, call virtual methods, and read each float through an abstraction layer. Millions of times per query.

The new layout

I replaced the object graph with a flat array.…
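The excerpt cuts off before showing the new layout, so the following is only a minimal sketch of what a flat array layout can look like, not the actual sembed-engine code. The names `FlatStore`, `row`, `squared_l2`, and `nearest_row` are hypothetical, and the brute-force scan stands in for the Vamana graph traversal purely to show the access pattern: every vector is a contiguous run of floats in one buffer, and the distance loop reads raw floats with no `shared_ptr` dereference or virtual call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat store: all vectors live in one contiguous float buffer.
struct FlatStore {
    std::size_t dim;                // number of floats per vector
    std::vector<std::int64_t> ids;  // ids[i] is the id of row i
    std::vector<float> data;        // row-major, size == ids.size() * dim

    explicit FlatStore(std::size_t d) : dim(d) {}

    // Append one vector; its floats are copied to the end of the buffer.
    void add(std::int64_t id, const std::vector<float>& v) {
        assert(v.size() == dim);
        ids.push_back(id);
        data.insert(data.end(), v.begin(), v.end());
    }

    // Pointer to the first float of row i; the row spans dim floats.
    const float* row(std::size_t i) const { return data.data() + i * dim; }

    std::size_t size() const { return ids.size(); }
};

// Squared Euclidean distance over two contiguous float runs.
// A plain loop over raw floats like this is easy for compilers to vectorize.
inline float squared_l2(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t k = 0; k < dim; ++k) {
        const float d = a[k] - b[k];
        sum += d * d;
    }
    return sum;
}

// Brute-force scan used only to illustrate the access pattern.
// Returns the row index closest to the query; the store must not be empty.
inline std::size_t nearest_row(const FlatStore& store, const float* query) {
    assert(store.size() > 0);
    std::size_t best = 0;
    float best_dist = squared_l2(store.row(0), query, store.dim);
    for (std::size_t i = 1; i < store.size(); ++i) {
        const float d = squared_l2(store.row(i), query, store.dim);
        if (d < best_dist) {
            best_dist = d;
            best = i;
        }
    }
    return best;
}
```

In this sketch a graph index would refer to vectors by row index rather than by pointer, so visiting a node costs one multiplication and one contiguous read instead of a chain of dependent loads.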
