Yes. The GPU is doing most of the work in a lot of modern games.
It isn't great at everything though, and there are limitations due to its architecture being structured almost solely for the purpose of computing massively parallel instructions.
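To make that concrete (my own minimal sketch, not from the thread): this is the shape of work the architecture is built for, the same tiny program launched across thousands of threads, one per element, with no dependencies between them. The kernel name and sizes are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per element: the massively parallel shape of work GPUs are built for.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;                      // independent per-element work
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *h = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d;
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);   // CPU -> GPU copy
    scale<<<(n + 255) / 256, 256>>>(d, 2.0f, n);       // 4096 blocks of 256 threads
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);   // GPU -> CPU copy
    printf("h[0] = %.1f\n", h[0]);                     // 2.0

    cudaFree(d);
    free(h);
    return 0;
}
```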
> when do we start to wonder if the GPU can “offload” anything to the whole computer that is hanging off of it
The main bottleneck for speed on most teams is not having enough "GPU devs" to move stuff off the CPU and onto the GPU. Many games suffer in performance due to folks not knowing how to use the GPU properly.
Because of this, nVidia/AMD invest heavily in making general purpose compute easier and easier on the GPU. The successes they have had in doing this over the last decade are nothing less than staggering.
Ultimately, the way it's looking, GPUs are trying to become good at everything the CPU does and then some. We already have modern cloud server architectures that are 90% GPU and 10% CPU as a complete SoC.
Eventually, the CPU may cease to exist entirely as its fundamental design becomes obsolete. This is usually called a GPGPU in modern server infrastructure.
I’m pretty sure CPUs destroy GPUs at sequential programming and most programs are written in a sequential style. Not sure where the 90/10 claim comes from but there’s plenty of cloud servers with no GPU installed whatsoever and 0 servers without a CPU.
Yup, and until we get a truly general purpose compute GPU that can handle both styles of instruction with automated multi-threading and state management, this will continue.
What I've seen shows me that nVidia is working very hard to eliminate this gap though. General purpose computing on the GPU has never been easier, and it gets better every year.
In my opinion, it's only a matter of time before we can run anything we want on the GPU and realize various speed gains.
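For a rough sense of how low the barrier has gotten (my example, not a benchmark): with Thrust, which ships with the CUDA toolkit, a parallel reduction runs on the GPU with no hand-written kernel at all.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>

int main() {
    thrust::device_vector<long long> v(1 << 20);        // lives in GPU memory
    thrust::sequence(v.begin(), v.end());                // fills 0, 1, 2, ...
    long long sum = thrust::reduce(v.begin(), v.end());  // parallel reduction on the device
    printf("sum = %lld\n", sum);                         // (n-1)*n/2
    return 0;
}
```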
As for where the 90/10 comes from, it's from the emerging architectures for advanced AI/graphics compute like the DGX H100 [0].
AI is different. Those servers are set up to run AI jobs & nothing else. That's still a small fraction of overall cloud machines at the moment. Even if they overtake in volume, that's just because the huge surge in demand for AI, multiplied by the compute requirements that come with it, eclipses the compute requirements of the "traditional" cloud compute that keeps businesses running.

I don't think you'll see GPUs running things like databases or the Linux kernel. GPUs may even come with embedded ARM CPUs to run the kernel & only run AI tasks as part of the package as a cost reduction, but I think that'll take a very long time because you have to figure out how to do co-tenancy. It'll depend on whether the CPU remains a huge unnecessary cost for AI servers.

I doubt that GPUs will get much better at sequential tasks because it's an essential programming tradeoff (e.g. it's the same reason you don't see everything written in SIMD, as SIMD is much closer to GPU-style programming than the more general sequential style).
> Eventually, the CPU may cease to exist as its fundamental design becomes obsolete. This is usually called a GPGPU in modern server infrastructure.
There’s no reason yet to think CPU designs are becoming obsolete. SISD (Single Instruction, Single Data) is the CPU core model and it’s easier to program and does lots of things that you don’t want to use SIMD for. SISD is good for heterogenous workloads, and SIMD is good for homogeneous workloads.
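A small, hypothetical CUDA sketch of why heterogeneous work fits SISD better: when threads in a warp want different work, the SIMT hardware runs both paths in lockstep with the untaken lanes masked off, whereas a CPU core just runs the one path each element needs.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Data-dependent branching: within a 32-thread warp, both paths execute with
// the untaken lanes masked off (warp divergence).
__global__ void mixed_work(const int *kind, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (kind[i] == 0)
        out[i] = expf(out[i]);   // task A
    else
        out[i] = sqrtf(out[i]);  // task B
    // A SISD core would branch-predict and run exactly one path per element.
}

int main() {
    const int n = 1 << 16;
    int *kind; float *out;
    cudaMallocManaged(&kind, n * sizeof(int));    // unified memory keeps the sketch short
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) { kind[i] = i % 2; out[i] = 1.0f + i; }

    mixed_work<<<(n + 255) / 256, 256>>>(kind, out, n);
    cudaDeviceSynchronize();
    printf("out[0]=%.2f out[1]=%.2f\n", out[0], out[1]);  // exp(1), sqrt(2)

    cudaFree(kind); cudaFree(out);
    return 0;
}
```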
I thought GPGPU was waning these days. That term was used a lot during the period when people were ‘hacking’ GPUs to do general compute when the APIs like OpenGL didn’t offer general programmable computation. Today, with CUDA and compute shaders in every major API, it’s a given that GPUs are for general purpose computation, and it’s even becoming an anachronism that the G in GPU stands for graphics. My soft prediction is that GPU might get a new name & acronym soon that doesn’t have “graphics” in it.
This is somewhat reassuring. A decade ago when clock frequencies had stopped increasing and core count started to increase I predicted that the future was massively multicore.
Then the core count stopped increasing too -- or so it seems, if you're looking in the wrong place! It stalled in CPUs, but the cores moved to GPUs.
SIMD parallelism has been improving on the CPU too – although the number of lanes hasn’t increased that much since the SSE days (128 to 512 bits), the spectrum of available vector instructions has grown a lot. And being able to do eight or sixteen operations at the price of one is certainly nothing to scoff at. Autovectorization is a really hard problem, though, and manual vectorization is still something of a dark art, especially due to the scarcity of good abstractions. Programming with cryptically named, architecture-specific intrinsics and doing manual feature detection is not fun.
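For a flavour of what that looks like (a hypothetical sketch in plain host-side C++/x86, using GCC/Clang-specific builtins): an AVX2 horizontal sum with the runtime feature check done by hand.

```cuda
#include <cstdio>
#include <immintrin.h>

// Only this function is compiled with AVX2 enabled; callers must check support at runtime.
__attribute__((target("avx2")))
float sum_avx2(const float *x, int n) {
    __m256 acc = _mm256_setzero_ps();                       // 8 float lanes, all zero
    int i = 0;
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(x + i));   // 8 adds per instruction
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float s = lanes[0] + lanes[1] + lanes[2] + lanes[3]
            + lanes[4] + lanes[5] + lanes[6] + lanes[7];
    for (; i < n; ++i) s += x[i];                            // scalar tail
    return s;
}

int main() {
    float x[19];
    for (int i = 0; i < 19; ++i) x[i] = 1.0f;
    // Manual feature detection: fall back if the CPU lacks AVX2.
    if (__builtin_cpu_supports("avx2"))
        printf("avx2 sum = %.1f\n", sum_avx2(x, 19));        // 19.0
    else
        printf("no AVX2 on this CPU\n");
    return 0;
}
```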