Meanwhile, the main resource in software development is man-hours. The faster you write software (add features, fix bugs, etc.), the cheaper it is.*
People have argued that developer time is the most precious resource since forever. In more recent times, people have also argued that pushing new features needs to happen as quickly as possible, particularly in the context of web and mobile apps.
I am rather sceptical about both claims.
Yes, developer time is important. Developer compensation is probably the largest single cost in most software development organisations and you want a good return on that investment.
Yes, the goal is acceptable performance rather than perfect optimality every time. The best is often the enemy of the good here.
And yes, that means it is foolish to invest weeks of developer time to implement an O(n) algorithm when you had an O(n²) algorithm that ran fast enough in practice.
However, what if you have a data processing job running on some cloud infrastructure that is charged according to usage, and carelessly using an O(n²) algorithm when you could have spent an extra day to write an O(n log n) one increased your AWS bill for that system by a factor of 10?
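To make that concrete, here's a hypothetical sketch (not from any real system) of the kind of job where the pair-wise approach quietly explodes, using duplicate detection as the stand-in task:

```python
def has_duplicates_quadratic(items):
    # O(n^2): compare every pair of items.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_sorted(items):
    # O(n log n): sort first, then any duplicates are adjacent.
    s = sorted(items)
    return any(a == b for a, b in zip(s, s[1:]))
```

At n = 10⁶ the quadratic version does on the order of 5×10¹¹ comparisons versus roughly 2×10⁷ for the sort-based one, and that kind of multiplier is exactly what shows up on a usage-billed cloud invoice.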
Performance can be important in other areas that sometimes get overlooked, too. In communities like HN, most of us probably enjoy the use of modern devices with relatively high specifications, but not everyone is so lucky. A classic “works for me” problem is developers running on high-spec equipment who don’t experience frustrations that their users with more modest equipment will run into, when maybe that “acceptable performance” shouldn’t be considered so acceptable after all.
And what about other types of software, such as embedded systems, where there are often tighter resource constraints and being able to use less powerful hardware components can have a significant impact on the overall cost to produce a device?
Meanwhile, I see little evidence that the relentless push to ship changes every five minutes is an automatic win. Yes, deploying changes like security updates and critical bug fixes quickly is important, but are users really happier, or, from a business perspective, willing to pay more for our software, because of the modern culture of continuous deployment and frequent updates of everything? That's less clear.
It is undeniable that a lot of customers will still pay for poor quality software, which means shipping poor quality software can be an attractive and lucrative business model, which means paying lots of money to developers who will only produce poor quality software can work, which reduces the incentives for developers to do better. This is unfortunately the world we live in. But it doesn’t mean some of us running software businesses or working in software development can’t try!
I agree with you in general, but O(n²) is always dangerous. Perhaps the example you give is a little too extreme.
Writing slow software is also a form of waste. Then it becomes a question of ranking waste and getting rid of the most wasteful, according to a cost/benefit ratio, but waste is never a good thing to tolerate in any cultural sense.
O(n²) is only dangerous if you’re scaling past the point where the n² behaviour outweighs any constant factors and lower order terms. When n is small, a fancy algorithm with lower big-O complexity could still be outperformed by a brute force O(n²) one in practical situations.
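A classic illustration of that crossover (a toy sketch, not benchmarked here): insertion sort is O(n²), yet production sort routines such as CPython's Timsort fall back to it for small runs, because its tiny constants and cache-friendly access pattern beat asymptotically better algorithms at small n.

```python
def insertion_sort(xs):
    # O(n^2) worst case, but with very small constants and
    # sequential memory access - which is why hybrid sorts
    # switch to it for short runs instead of recursing further.
    xs = list(xs)
    for i in range(1, len(xs)):
        key = xs[i]
        j = i - 1
        while j >= 0 and xs[j] > key:
            xs[j + 1] = xs[j]
            j -= 1
        xs[j + 1] = key
    return xs
```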
On the other hand, a poor choice for a critical algorithm working with a large data set could easily increase your costs by orders of magnitude, so if anything I’d say the example I gave (assuming you meant the AWS costs one) was conservative.
I agree that we shouldn’t be careless about being wasteful, but big-O complexity rarely tells the whole story when it comes to performance.
Sure, I guess there are cases where O(n²) is okay and sometimes even faster, but in my experience with those runtimes I've usually regretted it. It tends to come back to bite you, as your original comment made clear. I prefer O(n) to O(n²)!
Yes, definitely agreed about big-O complexity analysis vs mechanical sympathy.
In fact, I'm currently pair-programming on an Eytzinger layout-based binary search at work for TigerBeetleDB.
Eytzinger layouts are a case in point where big-O fails nicely: you have exactly the same log2(n) number of comparisons as standard binary search, but you get better memory locality by grouping the first few nodes of the binary tree together into a single cache line.
At first this looks like fewer cache misses, but then we exploit it even further by telling the memory subsystem to prefetch all 16 great-great-grandchildren of a node (whether we need them later or not). So now we're actually incurring more cache misses (!) but achieving lower memory latency, trading it off against memory bandwidth, and the result is something that's way faster than binary search.
The paper on which this is based is "Array Layouts for Comparison-Based Searching".
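For anyone curious, here's a minimal Python sketch of the layout itself (my own illustration, not TigerBeetle's code). It rebuilds a sorted array in BFS order, so node i has its children at 2i and 2i+1 and the top levels of the implicit tree sit next to each other in memory. The prefetch trick needs low-level control (e.g. compiler prefetch intrinsics) that this sketch doesn't attempt.

```python
def eytzinger(sorted_xs):
    # Re-order a sorted array into Eytzinger (BFS) layout.
    # out[0] is unused so the tree can be 1-indexed.
    n = len(sorted_xs)
    out = [None] * (n + 1)

    def fill(i, pos):
        # In-order traversal of the implicit tree assigns the
        # sorted values to their BFS positions.
        if i <= n:
            pos = fill(2 * i, pos)
            out[i] = sorted_xs[pos]
            pos += 1
            pos = fill(2 * i + 1, pos)
        return pos

    fill(1, 0)
    return out

def eytzinger_search(e, target):
    # Same log2(n) comparisons as binary search, but descending
    # 2i / 2i+1 instead of halving an interval.
    n = len(e) - 1
    i = 1
    while i <= n:
        if e[i] == target:
            return True
        i = 2 * i + (1 if e[i] < target else 0)
    return False
```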