> Instead of trying to speculatively make one thread fast to mask IO stalls, run a large pool of threads that can stall frequently but still keep the execution units and memory channels busy.
Isn't that essentially what GPUs already do? They can hide stalls this way because their workloads supply a massive number of threads to switch between. On CPUs, the number of runnable threads at any given moment tends to be much lower.
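To put rough numbers on why the thread count matters, here is a back-of-the-envelope sketch. It assumes an idealized model in which every thread alternates between a fixed burst of compute and a fixed stall, and switching threads is free. The function names and the cycle counts are made up for illustration, not measurements of any real chip.

```python
import math


def utilization(threads: int, compute_cycles: float, stall_cycles: float) -> float:
    """Fraction of time one execution unit stays busy (idealized model).

    Each thread computes for `compute_cycles`, then stalls for `stall_cycles`.
    Assumes zero-cost switching between threads, which real hardware only
    approximates.
    """
    if threads <= 0:
        return 0.0
    return min(1.0, threads * compute_cycles / (compute_cycles + stall_cycles))


def threads_to_saturate(compute_cycles: float, stall_cycles: float) -> int:
    """Smallest thread count that keeps the unit busy through every stall."""
    return math.ceil((compute_cycles + stall_cycles) / compute_cycles)


# Hypothetical numbers: 20 cycles of work between stalls of 400 cycles.
needed = threads_to_saturate(20, 400)
two_way = utilization(2, 20, 400)  # e.g. a 2-way SMT core in this toy model
print(f"threads needed: {needed}, utilization with 2 threads: {two_way:.1%}")
```

Under these assumed numbers, a couple of hardware threads per core leaves the unit idle most of the time, while a design that can keep dozens of threads resident per unit can cover the stalls. That only helps if the workload actually has that many runnable threads.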