Yes. The GPU is doing most of the work in a lot of modern games.
It isn't great at everything though, and there are limitations due to its architecture being structured almost solely for the purpose of computing massively parallel instructions.
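To make that concrete (my own minimal sketch, not from the thread): this is the shape of work the architecture is built for, the same tiny program launched across thousands of threads, one per element, with no dependencies between them. The kernel name and sizes are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per element: the massively parallel shape of work GPUs are built for.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;                      // independent per-element work
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *h = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d;
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);   // CPU -> GPU copy
    scale<<<(n + 255) / 256, 256>>>(d, 2.0f, n);       // 4096 blocks of 256 threads
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);   // GPU -> CPU copy
    printf("h[0] = %.1f\n", h[0]);                     // 2.0

    cudaFree(d);
    free(h);
    return 0;
}
```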
> when do we start to wonder if the GPU can “offload” anything to the whole computer that is hanging off of it
The main bottleneck for speed on most teams is not having enough "GPU devs" to move stuff off the CPU and onto the GPU. Many games suffer in performance due to folks not knowing how to use the GPU properly.
Because of this, nVidia/AMD invest heavily in making general purpose compute easier and easier on the GPU. The successes they have had in doing this over the last decade are nothing less than staggering.
Ultimately, the way it's looking, GPUs are trying to become good at everything the CPU does and then some. We already have modern cloud server architectures that are 90% GPU and 10% CPU as a complete SoC.
Eventually, the CPU may cease to exist entirely as its fundamental design becomes obsolete. This is usually called a GPGPU in modern server infrastructure.
I’m pretty sure CPUs destroy GPUs at sequential programming and most programs are written in a sequential style. Not sure where the 90/10 claim comes from but there’s plenty of cloud servers with no GPU installed whatsoever and 0 servers without a CPU.
Yup, and until we get a truly general purpose compute GPU that can handle both styles of instruction with automated multi-threading and state management, this will continue.
What I've seen shows me that nVidia is working very hard to eliminate this gap though. General purpose computing on the GPU has never been easier, and it gets better every year.
In my opinion, it's only a matter of time before we can run anything we want on the GPU and realize various speed gains.
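For a rough sense of how low the barrier has gotten (my example, not a benchmark): with Thrust, which ships with the CUDA toolkit, a parallel reduction runs on the GPU with no hand-written kernel at all.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>

int main() {
    thrust::device_vector<long long> v(1 << 20);        // lives in GPU memory
    thrust::sequence(v.begin(), v.end());                // fills 0, 1, 2, ...
    long long sum = thrust::reduce(v.begin(), v.end());  // parallel reduction on the device
    printf("sum = %lld\n", sum);                         // (n-1)*n/2
    return 0;
}
```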
As for where the 90/10 comes from, it's from the emerging architectures for advanced AI/graphics compute like the DGX H100 [0].
AI is different. Those servers are set up to run AI jobs & nothing else. That's still a small fraction of overall cloud machines at the moment. Even if they overtake in volume, that's just because the huge surge in demand for AI, multiplied by the compute requirements that come with it, eclipses the compute requirements of the "traditional" cloud compute that keeps businesses running.

I don't think you'll see GPUs running things like databases or the Linux kernel. GPUs may even come with embedded ARM CPUs to run the kernel & only run AI tasks as part of the package as a cost reduction, but I think that'll take a very long time because you have to figure out how to do co-tenancy. It'll depend on whether the CPU remains a huge unnecessary cost for AI servers.

I doubt that GPUs will get much better at sequential tasks because it's an essential programming tradeoff (e.g. it's the same reason you don't see everything written in SIMD, as SIMD is much closer to GPU-style programming than the more general sequential style).
> Eventually, the CPU may cease to exist as its fundamental design becomes obsolete. This is usually called a GPGPU in modern server infrastructure.
There’s no reason yet to think CPU designs are becoming obsolete. SISD (Single Instruction, Single Data) is the CPU core model and it’s easier to program and does lots of things that you don’t want to use SIMD for. SISD is good for heterogenous workloads, and SIMD is good for homogeneous workloads.
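A small, hypothetical CUDA sketch of why heterogeneous work fits SISD better: when threads in a warp want different work, the SIMT hardware runs both paths in lockstep with the untaken lanes masked off, whereas a CPU core just runs the one path each element needs.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Data-dependent branching: within a 32-thread warp, both paths execute with
// the untaken lanes masked off (warp divergence).
__global__ void mixed_work(const int *kind, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (kind[i] == 0)
        out[i] = expf(out[i]);   // task A
    else
        out[i] = sqrtf(out[i]);  // task B
    // A SISD core would branch-predict and run exactly one path per element.
}

int main() {
    const int n = 1 << 16;
    int *kind; float *out;
    cudaMallocManaged(&kind, n * sizeof(int));    // unified memory keeps the sketch short
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) { kind[i] = i % 2; out[i] = 1.0f + i; }

    mixed_work<<<(n + 255) / 256, 256>>>(kind, out, n);
    cudaDeviceSynchronize();
    printf("out[0]=%.2f out[1]=%.2f\n", out[0], out[1]);  // exp(1), sqrt(2)

    cudaFree(kind); cudaFree(out);
    return 0;
}
```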
I thought GPGPU was waning these days. That term was used a lot during the period when people were ‘hacking’ GPUs to do general compute when the APIs like OpenGL didn’t offer general programmable computation. Today, with CUDA and compute shaders in every major API, it’s a given that GPUs are for general purpose computation, and it’s even becoming an anachronism that the G in GPU stands for graphics. My soft prediction is that GPU might get a new name & acronym soon that doesn’t have “graphics” in it.
This is somewhat reassuring. A decade ago when clock frequencies had stopped increasing and core count started to increase I predicted that the future was massively multicore.
Then the core count stopped increasing too -- or so it seems, if you're looking in the wrong place! It stalled in CPUs, but the cores moved to GPUs.
SIMD parallelism has been improving on the CPU too – although the number of lanes hasn’t increased that much since the SSE days (128 to 512 bits), the spectrum of available vector instructions has grown a lot. And being able to do eight or sixteen operations at the price of one is certainly nothing to scoff at. Autovectorization is a really hard problem, though, and manual vectorization is still something of a dark art, especially due to the scarcity of good abstractions. Programming with cryptically named, architecture-specific intrinsics and doing manual feature detection is not fun.
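For a flavour of what that looks like (a hypothetical sketch in plain host-side C++/x86, using GCC/Clang-specific builtins): an AVX2 horizontal sum with the runtime feature check done by hand.

```cuda
#include <cstdio>
#include <immintrin.h>

// Only this function is compiled with AVX2 enabled; callers must check support at runtime.
__attribute__((target("avx2")))
float sum_avx2(const float *x, int n) {
    __m256 acc = _mm256_setzero_ps();                       // 8 float lanes, all zero
    int i = 0;
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(x + i));   // 8 adds per instruction
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float s = lanes[0] + lanes[1] + lanes[2] + lanes[3]
            + lanes[4] + lanes[5] + lanes[6] + lanes[7];
    for (; i < n; ++i) s += x[i];                            // scalar tail
    return s;
}

int main() {
    float x[19];
    for (int i = 0; i < 19; ++i) x[i] = 1.0f;
    // Manual feature detection: fall back if the CPU lacks AVX2.
    if (__builtin_cpu_supports("avx2"))
        printf("avx2 sum = %.1f\n", sum_avx2(x, 19));        // 19.0
    else
        printf("no AVX2 on this CPU\n");
    return 0;
}
```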