So why would you want to talk about arbitrarily large input (where the input is an array that is stored in memory)?
As I understand it, this big-O notation is intended to have some real-world usefulness, is it not? Care to elaborate on what that usefulness is, exactly? Or is it just a purely fictional notion in the realm of ideas with no real-world application?
And if so, why bother studying it at all, except as a mathematical curiosity written in mathematical pseudo-code rather than as a programming or engineering challenge written in a real-world programming language?
Edit: s/pretending/intended/
Big-O analysis is about scaling behavior - its real-world implications lie in what it tells you about relative sizes, not absolute sizes.
E.g., if you need to run a task on 10M inputs, then knowing that your algorithm is O(N) doesn't tell you anything at all about how long your task will take. It also doesn't tell you whether that algorithm will be faster than some other algorithm that's O(N^2).
But it does tell you that if your task size doubles to 20M inputs, you can expect the time required for the first algorithm to double, and for the second to quadruple. And that knowledge isn't arcane or theoretical: it holds on real-world hardware and is really useful for modeling how your code will run as inputs scale up.
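To make that concrete, here's a rough sketch in Python. The two functions are toy stand-ins I made up (not anyone's actual workload): one does a linear pass, the other a quadratic pairwise pass, and each is timed at N and then 2N.

    import time

    def linear_scan(data):
        # O(N): touch each element once.
        total = 0
        for x in data:
            total += x
        return total

    def pairwise_count(data):
        # O(N^2): look at every ordered pair.
        count = 0
        for x in data:
            for y in data:
                if x < y:
                    count += 1
        return count

    def timed(fn, data):
        start = time.perf_counter()
        fn(data)
        return time.perf_counter() - start

    for n in (2_000, 4_000):  # double the input size
        data = list(range(n))
        print(n, timed(linear_scan, data), timed(pairwise_count, data))

The absolute times depend entirely on the machine and the constant factors; what big-O predicts is the ratio between the N and 2N runs - roughly 2x for the linear pass, roughly 4x for the quadratic one, noise aside.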
Thank you for this explanation! To me it looks like the algorithm sorts the whole array, just in groups of 5; the number of chunks should scale as O(N/5) = O(N), no? So how can you claim that just by chunking you can ignore the fact that you still sorted N elements? E.g., a selection sort would still perform N^2 comparisons in total.
Ah, so the issue here is the difference between quickSort and quickSelect. In both cases you pick a pivot and divide the data into two partitions - but in quickSort you then recurse on both partitions, and in quickSelect you determine which partition your target is in and recurse only on that one.
So you're right that dividing into chunks of 5 and iterating over them doesn't affect the scaling - you wind up with an array of (N/5) medians, and it's still O(N) to iterate over that array. But the next step isn't to sort that array; you only need to quickSelect on it, and that's why the final step is O(N) rather than O(N log N).
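If it helps, here's a toy sketch of that structural difference in Python (random pivots for brevity, not the deterministic median-of-medians pivot being discussed; the function names are just mine):

    import random

    def quicksort(a):
        # Recurses on BOTH partitions -> O(N log N) expected total work.
        if len(a) <= 1:
            return a
        pivot = random.choice(a)
        less = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        return quicksort(less) + equal + quicksort(greater)

    def quickselect(a, k):
        # Returns the k-th smallest element (k is 0-indexed).
        # Recurses on only ONE partition - the one containing the k-th
        # smallest - so the expected work is N + N/2 + N/4 + ... = O(N).
        pivot = random.choice(a)
        less = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        if k < len(less):
            return quickselect(less, k)
        if k < len(less) + len(equal):
            return pivot
        return quickselect(greater, k - len(less) - len(equal))

    values = [7, 1, 9, 4, 3, 8, 2]
    print(quickselect(values, len(values) // 2))  # median of the list

Median-of-medians is just a way of choosing that pivot deterministically, so the O(N) bound for selection holds in the worst case rather than only in expectation.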
> (where the input is an array that is stored in memory)?
If the input is an array that is stored in a piece of real-world memory, then the only possible complexity is O(1). It's just how big-O works. Big-O notation is an abstraction that is much, much closer to mathematics than to engineering.
> this big-O notation is pretending to have some real-world usefulness...
Big-O notation is not a person, so I'm not sure whether it can pretend anything. CS professors might exaggerate big-O notation's real-world usefulness so their students don't fall asleep too fast.
> fictional
Theoretical. Like other theoretical ideas, at best big-O notation gives some vague insights that help people solve real problems. It gives a very rough feeling for whether an algorithm is fast or not.
By the way, the Turing machine is in this category as well.
Except that there is no such thing as "arbitrarily large storage", as my link in the parent comment explained: https://hbfs.wordpress.com/2009/02/10/to-boil-the-oceans/