Well, sure, in practice work stealing makes correct distribution difficult, but in theory, work stealing exists to repair an incorrect work distribution, right?
If every CPU is 100% utilized without needing a context switch (and is running the right number of worker threads without switching those), then work stealing is not required.
But my comment is solidly a "rule of thumb". I claim no theoretical basis other than "Giving fewer, longer tasks to fewer threads (with the task count still >= the number of worker threads) is better than giving more, shorter ones."
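To make the rule of thumb concrete, here is a rough Python sketch of what I mean by "fewer, longer tasks": split the input into about one chunk per worker instead of submitting one task per item. The names `split_into_chunks`, `process_chunk`, and `run_coarse` are made up for illustration, and the sum-of-squares workload is just a placeholder. It only shows the shape of the distribution and is not a benchmark.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Split items into contiguous slices of near-equal size (hypothetical helper)."""
    # Never create more chunks than items, and always create at least one.
    num_chunks = max(1, min(num_chunks, len(items)))
    base, extra = divmod(len(items), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def process_chunk(chunk):
    # Placeholder workload: one long task handles a whole slice.
    return sum(x * x for x in chunk)


def run_coarse(items, workers=None, chunks_per_worker=1):
    """Submit roughly one long task per worker rather than one task per item."""
    workers = workers or os.cpu_count() or 1
    chunks = split_into_chunks(items, workers * chunks_per_worker)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    print(run_coarse(list(range(1_000_000))))
```

With `chunks_per_worker=1` the task count equals the worker count, which is the lower bound in the rule above. Raising it gives the scheduler some slack if the chunks turn out to be uneven, at the cost of more, shorter tasks.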