Immutable Data Structures in Qdrant

ThinkBeat · 2024-08-23T12:14:44 1724415284

I do not understand how you can pack 1536d into 192 bytes ??? That some entirely impossible?

> Vector search, on the other hand, requires reading a lot of small vectors, which might create a large overhead. It is especially noticeable if we use binary quantization, where the size of even large OpenAI 1536d vectors is compressed down to 192 bytes. Dataset size: 2M 768d vectors (~6Gb Raw data), binary quantization, 650Mb of RAM limit. All benchmarks are made with minimal RAM allocation to demonstrate disk cache efficiency.

wredue · 2024-08-21T21:26:54 1724275614

>Immutable data structures, while tricky to implement correctly, offer significant performance gains, especially for read-heavy systems like search engines. They allow us to take full advantage of hardware optimizations, reduce memory overhead, and improve cache performance.

Nonsense.

LtdJorge · 2024-08-21T23:54:08 1724284448

wredue · 2024-08-22T00:53:34 1724288014

Because the measurably do none of those things.

There is no performance benefit that immutable structures offer that cannot be had by mutable ones. It is asinine to assert otherwise, especially when literally every single measurement ever done demonstrates that immutable structures perform orders of magnitude slower.

You do not need immutability to create a slow moving cache. Immutability is not where any semblance of performance is coming from here.

If they didnt succumb to idiotic bullshit nonsense, they wouldn’t have even needed this post. If you see any of your senior architects reading medium, fire them immediately.

Immutable structure can, in a single use case, meet mutable ones for performance and that single case is:

-read speed on aligned, flattened data

But, that ignores that fact that getting to a point of aligned, flattened data with immutable structures is incredibly slow.

moab · 2024-08-22T02:04:20 1724292260

Your analysis sounds reasonable to the non-expert, but recent work on purely-functional trees suggests that the gap is smaller than you suggest ("orders of magnitude slower").

E.g., see the nice work on the PAM library (https://arxiv.org/abs/1612.05665). Ideas from this work were used to build lots of cool things (immutable graph data structures, segment trees, databases) that are very fast, and all immutable.

wredue · 2024-08-22T14:03:44 1724335424

This shit is all lies dude. In measurements like this, they constrain mutable structures to the same memory layouts and ideas as immutable ones and say “see! Sometimes kinda close!”

But this is just bog standard FP community lies.

moab · 2024-08-22T20:38:38 1724359118

> bog standard FP community lies > constrain mutable structures to the same memory layouts and ideas as immutable ones

What are you even talking about, "dude"? I don't think you have the background or knowledge you seem to think you have to argue about this space. It's OK, as you blithely pointed out in your earlier post, there is a place called medium where you'll find likeminded folks that will eat up your drivel.

wredue · 2024-08-22T21:59:45 1724363985

Interestingly, medium is the place pushing idiotic “always runtime immutable or ur dumb” nonsense, resulting in this blog post because people such as yourself are eating it up on the basis of moronic bullshit like “beautiful”.

>what are you even talking about

I am talking about your own posts, perpetuating more moronic FP lies

>background

Nobody needs background to see that needlessly copying data for no reason is more expensive than not doing that.

Unfortunately for FP programmers, the moronic levels of doing this is actually worse than it sounds, as needlessly copying data is basically completely antithetical to how every modern CPU wants to optimally work.

It isn’t “me” declaring that runtime immutability is moronic. It is your CPU telling you that.

j-pb · 2024-08-23T09:34:43 1724405683

The CPU is disagreeing with you:

https://ceur-ws.org/Vol-1291/ewili14_15.pdf

Given the age of your account my guess is that you're still young with too strong opinions and too little experience. That's ok, we all start there. But I can tell you from experience, that you'll need to outgrow that mindset in order to grow intellectually.

The difference between mutability and immutability is not as clear cut as you think, and might just be the difference between a `& mut` and a `&`.

j-pb · 2024-08-22T06:01:42 1724306502

You obviously didn't read the article and jumped to the conclusion that they are talking about immutably peristent data structures, which they are not.

And even if they were talking about those (which they don't) your critique doesn't sound as smart as you seem to think if we just called them "lockless copy-on-write" data structures.

Lockless datastructures have some obvious advantages, and copy on write is one way to achieve that.

itishappy · 2024-08-22T15:14:49 1724339689

> Immutable structure can, in a single use case, meet mutable ones for performance and that single case is:

> -read speed on aligned, flattened data

This is the case outlined in TFA.

> But, that ignores that fact that getting to a point of aligned, flattened data with immutable structures is incredibly slow.

It might, if that weren't the whole point of TFA.

wredue · 2024-08-23T00:54:35 1724374475

And my point was that immutability isn’t granting any performance here. It is, in fact, causing a performance detriment for no reason other than “a medium dudebro wrote an article that included the words beautiful and elegant, and our engineers are idiots that eats that brain dead nonsense up”.

The whole article is, hilariously, about how shitty immutability is, so they hack and slash and bend over backwards to make idiotic nonsense comport to their needs, when had they just fired all their medium reading seniors off the bat, they wouldn’t be in a position of spinning up 10x the CPUs to mangle their data around this bullshit.

wesnerm2 · 2024-08-22T16:50:13 1724345413

> There is no performance benefit that immutable structures offer that cannot be had by mutable ones. It is asinine to assert otherwise, especially when literally every single measurement ever done demonstrates that immutable structures perform orders of magnitude slower.

Copying is free. Comparisons and change detection are much faster. Data-sharing, thread-safety, content-addressing, versioning/persistence have faster and more efficient implementations--often for zero cost. Immutable data structures have more guarantees, which lend themselves to more optimizations.

Chart parsing uses immutable data structures and many other DP algorithms rely on immutability to take an algorithm from exponential running time and space to polynomial running time and space. Git uses content-addressing to implement zero-cost branches, which used to be inefficient in traditional version control systems, which were more imperative.