I think the whole field of vector databases is mostly just one huge misunderstanding. Most of you are not Google or any other big tech company, so you won't have billions of embeddings.
It's crazy how people add bloat and complexity to their stuff just because they want to do medium-scale RAG with ca. 2 million embeddings.
Here comes the punchline: you do not need a fancy vector database in this case. I stumbled over https://github.com/sqliteai/sqlite-vector, which is a SQLite extension, and I wonder why no one else did this before. It simply implements a highly optimized brute-force search over the vectors, so you get sub-100 ms queries over millions of vectors with perfect recall. It uses dynamic runtime dispatch to pick whatever SIMD instructions your CPU supports. Turns out this might be all you need. No need for a memory-hungry search index (like HNSW) or for writing a huge index to disk (like DiskANN).
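To make "brute force with perfect recall" concrete, here is a minimal pure-Python sketch of exact top-k search by cosine similarity. It is not how sqlite-vector is implemented (that one is SIMD-optimized native code); the function names, the 8-dimensional toy corpus and the choice of cosine similarity are my own assumptions for illustration.

```python
import heapq
import math
import random


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def brute_force_top_k(query, vectors, k):
    """Exact top-k: score every single vector, keep the k best (score, index) pairs."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(v))), i)
        for i, v in enumerate(vectors)
    )
    return heapq.nlargest(k, scored)


# Toy corpus (hypothetical data): 1000 random 8-dimensional vectors.
random.seed(0)
corpus = [[random.gauss(0, 1) for _ in range(8)] for _ in range(1000)]

# Querying with a vector that is in the corpus must return that vector first.
hits = brute_force_top_k(corpus[42], corpus, k=3)
```

Because every vector is scored, nothing can be missed, which is where the perfect recall comes from. The cost is linear in the number of vectors, so the whole game is making that linear scan fast.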
Okay, bummer. No support for quantized datatypes yet, and I cannot see anything in the docs that mentions fast brute-force search. I personally don't need an index. But I see that https://github.com/unum-cloud/usearch, which is used by duckdb-vss, in turn uses https://github.com/ashvardanian/simsimd, which should make a really fast exact vector similarity search possible. Am I missing something here?
Oh right, DuckDB being columnar is the ultimate brrr factor for such a brute-force vector similarity search. But doesn't using an HNSW index completely forfeit this potential advantage?
I'd be cautious. The project seems abandoned, and I wouldn't say it's one of those cases where a piece of software is just finished and doesn't need any changes.