> At this scale, threads won’t cut it—while they’re pretty cheap, fire up a thread per connection and your computer will grind to a halt.
Maybe in the 2000s, but I feel this reasoning is no longer valid in 2023 and should be put to rest.
10k problem.. Wouldn't modern computing not work if my Linux box couldn't spin up 10k threads? Htop says I'm currently at 4,000 threads on an 8 core machine.
The context switch for threads remains very expensive. You have 4,000 threads, but that's lots of different processes spinning up their own threads. It's still more efficient to have one thread per core for a single computational problem, or at most one per hardware thread (often 2 per core now). You can test this by using something like rayon or GNU parallel with more threads than you have cores. It won't go faster, and after a certain point, it goes slower.
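To try the oversubscription test without pulling in rayon, here's a rough sketch using only `std::thread`. The workload (`busy_sum`), the iteration count, and the thread multipliers are made up for illustration; timings will vary a lot by machine and build profile, so treat it as a starting point rather than a proper benchmark.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// CPU-bound busy work (hypothetical workload): a cheap integer mix summed over a range.
fn busy_sum(start: u64, end: u64) -> u64 {
    let mut acc = 0u64;
    for i in start..end {
        acc = acc.wrapping_add(i.wrapping_mul(2654435761) ^ (i >> 3));
    }
    acc
}

/// Split `total` iterations across `n_threads` OS threads.
/// Returns the combined checksum and the elapsed wall-clock time.
fn run_with_threads(total: u64, n_threads: u64) -> (u64, Duration) {
    let started = Instant::now();
    let chunk = total / n_threads;
    let sum = thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = (0..n_threads)
            .map(|t| {
                let lo = t * chunk;
                // The last thread picks up any remainder.
                let hi = if t == n_threads - 1 { total } else { lo + chunk };
                s.spawn(move || busy_sum(lo, hi))
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .fold(0u64, |a, b| a.wrapping_add(b))
    });
    (sum, started.elapsed())
}

fn main() {
    let cores = thread::available_parallelism()
        .map(|n| n.get() as u64)
        .unwrap_or(1);
    let total = 400_000_000u64;
    // Compare 1 thread, one per hardware thread, and two oversubscribed settings.
    for n in [1, cores, cores * 4, cores * 64] {
        let (sum, elapsed) = run_with_threads(total, n);
        println!("{n:>5} threads: {elapsed:?} (checksum {sum})");
    }
}
```

The checksum should be identical on every row, since the same range is covered each time; only the elapsed time is expected to change.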
The async case is suited to situations where you're blocking for things like network requests. In that case the thread will be doing nothing, so we want to hand off the work to another task of some kind that is active. Green threads mean you can do that without a context switch.
> The context switch for threads remains very expensive
It got even more expensive in recent years after all the speculative execution vulnerabilities in CPUs, so now there is additional logic on every context switch when the mitigations are enabled in the kernel.
Since that time, context switching changed from an O(log(n)) operation to an O(1) one.
I have no doubt that having a thread per core and managing the data with only non-blocking operations is much faster. But I'm pretty sure current machines can manage a thousand or so threads that are blocked almost the entire time just fine.
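If anyone wants to check the "thousand or so mostly blocked threads" claim on their own box, something like the following should do. It's a sketch under assumptions: the 64 KiB stack size and the barrier-based blocking are my choices for illustration, and whether the spawn succeeds depends on your `ulimit -u` and memory limits.

```rust
use std::sync::{Arc, Barrier};
use std::thread;
use std::time::Instant;

/// Spawn `n` OS threads that each block on a shared barrier, then release
/// and join them. Returns how many threads finished without panicking.
fn spawn_blocked_threads(n: usize, stack_bytes: usize) -> usize {
    // n workers plus the calling thread must arrive before anyone proceeds.
    let barrier = Arc::new(Barrier::new(n + 1));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            thread::Builder::new()
                .stack_size(stack_bytes) // smaller than the default stack
                .spawn(move || {
                    barrier.wait(); // blocked until the main thread arrives
                })
                .expect("thread spawn failed (check ulimit -u and memory)")
        })
        .collect();
    // All n threads exist now; those that reached the barrier are blocked on it.
    barrier.wait();
    handles.into_iter().map(|h| h.join().is_ok() as usize).sum()
}

fn main() {
    let n = 1000;
    let started = Instant::now();
    let finished = spawn_blocked_threads(n, 64 * 1024);
    println!("{finished}/{n} threads spawned, blocked, and joined in {:?}", started.elapsed());
}
```

This only shows that the threads can exist and sit idle; it says nothing about what happens once they all start doing work at the same time.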
> Since that time, context switching changed from an O(log(n)) operation to an O(1) one.
I'm not sure how that's relevant here. If, for example, something takes 1 ms and I do it 1000 times a second, I'm using 1000 ms of CPU time, versus zero if I don't do it at all.
So if you want to use big-O notation in this context, it should be O(n), where n is the number of context switches. You are not comparing the algorithms used to switch between threads; you are comparing doing a context switch against not doing one at all.
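The arithmetic is just cost per operation times operations per second. A tiny sketch to make it concrete; the 5 µs per-switch cost and the 100,000 switches per second in the second call are hypothetical numbers picked for illustration, not measurements.

```rust
/// CPU time in milliseconds consumed per wall-clock second by an operation
/// that costs `cost_us` microseconds and runs `per_second` times a second.
fn cpu_ms_per_second(cost_us: f64, per_second: f64) -> f64 {
    cost_us * per_second / 1000.0
}

fn main() {
    // The example above: 1 ms (1000 µs) per operation, 1000 times a second.
    println!("{} ms of CPU per second", cpu_ms_per_second(1000.0, 1000.0));
    // Hypothetical: 5 µs per context switch, 100,000 switches a second.
    println!("{} ms of CPU per second", cpu_ms_per_second(5.0, 100_000.0));
}
```

The total grows linearly with the number of switches no matter how cheap each individual switch is.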
> Maybe in the 2000s, but I feel this reasoning is no longer valid in 2023 and should be put to rest.
So do we discard existing ways of making software more efficient because we can be more wasteful on more recent hardware? What if we could develop our software such that 2000s computers are still useful, rather than letting those computers become e-waste?
> The numbers reported here paint an interesting picture on the state of Linux multi-threaded performance in 2018. I would say that the limits still exist - running a million threads is probably not going to make sense; however, the limits have definitely shifted since the past, and a lot of folklore from the early 2000s doesn't apply today. On a beefy multi-core machine with lots of RAM we can easily run 10,000 threads in a single process today, in production. As I've mentioned above, it's highly recommended to watch Google's talk on fibers; through careful tuning of the kernel (and setting smaller default stacks) Google is able to run an order of magnitude more threads in parallel.
So, in that benchmark, a context switch is comparable to copying 64K of memory, which is kind of significant. I run a heavily loaded database with a few hundred threads, and I see it doing 100k context switches per second at times.
> 10k problem.. Wouldn't modern computing not work if my Linux box couldn't spin up 10k threads? Htop says I'm currently at 4,000 threads on an 8 core machine.
By the 2010s the problem had been updated to C10M. The people discussing it (well, perhaps some) aren't idiots and understand that the threshold changes as hardware changes.
Also, the issue isn't creating 10k threads; it's dealing with 10k concurrent users (or, again, a much higher number today).