They mention simulating fire and smoke for games, and doing fluid simulations on the GPU. Something I’ve never understood: if these effects are to run in a game, isn’t the GPU already busy? It seems like running a CFD problem and rendering at the same time is a lot.
Can this stuff run on an iGPU while the dGPU is doing more rendering-related tasks? Or are iGPUs just too weak, better to fall all the way down to the CPU.
> Something I’ve never understood: if these effects are to run in a game, isn’t the GPU already busy?
Short answer: No, it's not "already busy". GPUs are so powerful now that you can do physics, fancy render passes, fluid sims, "Game AI" unit pathing, and more, at 100+ FPS.
Long answer: You have a "frame budget", which is the amount of time between frames of the super fast "slide show" you're rendering at 60+ FPS. This gives you roughly 4 to 17 ms (240 FPS down to 60 FPS) to do a bunch of computation to get the results you need to update state and render the next frame.
That could be moving units around a map, calculating fire physics, blitting terrain textures, rendering verts with materials. In many game engines, you will see a GPU doing dozens of these separate computations per frame.
GPUs are basically just a secondary computer attached to your main computer. You give it a bunch of jobs to do every frame and it outputs the results. You combine the results into something that looks like a game.
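A minimal sketch of that loop (the function names are made up for illustration, not from any particular engine): time the per-frame work and notice when you blow the budget.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical per-frame work; in a real engine these would record and
// submit GPU jobs (physics, pathing, render passes) and consume results.
void update_simulation(double dt) { /* move units, run fire sim, ... */ }
void render_frame()               { /* build and submit command buffers */ }

int main() {
    using clock = std::chrono::steady_clock;
    // Target 60 FPS -> 1000 ms / 60 ≈ 16.7 ms of budget per frame.
    constexpr double kFrameBudgetMs = 1000.0 / 60.0;
    const double dt = kFrameBudgetMs / 1000.0;

    for (int frame = 0; frame < 600; ++frame) {
        const auto start = clock::now();

        update_simulation(dt);
        render_frame();

        const double elapsedMs =
            std::chrono::duration<double, std::milli>(clock::now() - start).count();
        if (elapsedMs > kFrameBudgetMs)
            std::printf("frame %d blew the budget: %.2f ms\n", frame, elapsedMs);
        // Otherwise: wait for vsync / sleep until the next frame starts.
    }
    return 0;
}
```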
> Can this stuff run on an iGPU while the dGPU is doing more rendering-related tasks?
Almost no one is using the iGPU for anything. It's completely ignored because it's usually completely useless compared to your main discrete GPU.
In theory it's perfectly possible to do everything you describe in 8 ms (i.e. VR render times). In reality we're
1. still far from properly utilizing modern graphics APIs as is. Some of the largest studios are close, but that knowledge is tightly guarded in the industry.
2. even when those top studios can and do, they choose to spend more of the budget on higher render resolution rather than on more logic or simulation. It makes for superficially better-looking games that help sales.
3. and of course there are other expensive features getting more attention right now, like ray-traced lighting, which can only be optimized so much on current hardware.
I'd really love to see what the AA or maybe even indie market can do with such techniques one day. I don't have much faith that AAA studios will ever prioritize simulation.
If you enjoy simulation on that level, you might find Star Citizen [0] or the latest updates to No Man's Sky [1] interesting.
You're right that simulation is often an afterthought. Most games prioritize narrative, combat, the artist's vision, and high-fidelity environments over complex systems you can interact with. There are a few outliers though.
Alternatively, you can go so far into the simulation side of things, with stuff like Dwarf Fortress [2], that all visual fidelity is thrown away for the sake of simulation complexity!
> No one is using the iGPU for anything. It's completely ignored because it's usually completely useless compared to your main discrete GPU.
My understanding is that modern iGPUs are actually quite powerful. I think the reason no one does this is that the software model isn’t really there: it isn’t standardized or able to work cross-vendor, since the iGPU and the discrete card are typically from different vendors. There’s also little motivation to do it, because not everyone has an iGPU, which dilutes the economy of scale of using it.
It would be a neat idea to try to run lighter-weight things on the iGPU to free up rendering time on the dGPU and make frame rates more consistent, but the incentives aren’t there.
I agree the incentives aren't there. Also agree that it is possible to use the integrated GPU for light tasks, but only light tasks.
In high-performance scenarios where all three are present (discrete GPU, integrated GPU, and CPU) and we try to use the integrated GPU alongside the CPU, it often causes thermal throttling on the die the iGPU and CPU share.
This keeps the CPU from executing well: from keeping up with state changes and sending the data needed to keep the discrete GPU utilized. In short: don't warm up the CPU, we want it to stay cool, and if that means not doing iGPU stuff, don't do it.
When we have multiple discrete GPUs available (a render farm), this on-die thermal bottleneck goes away, and there are many render pipelines made to handle hundreds, even thousands, of simultaneous GPUs working on a shared problem set of diverse tasks: similar to trying to utilize both the iGPU and dGPU on the same machine, but bigger.
Whether or not to use the iGPU is less about scheduling and more about thermal throttling.
That’s probably a better explanation of why it’s not invested in, although most games are not CPU-bound, so thermal throttling wouldn’t apply as much there. I think it’s multiple factors combined.
The render pipelines you refer to are all offline, non-realtime rendering for movies/animation/etc., right? That’s a somewhat different UX and problem space than realtime gaming.
I actually have my desktop AMD iGPU enabled despite using a discrete Nvidia card. I use the iGPU to do AI noise reduction for my mic in voice and video calls.
I'm not sure if this is really an ideal setup, having both GPUs enabled with both driver packages installed (as compared to just running Nvidia's noise reduction on the dGPU, I guess), but it all seems to work with no issues. The onboard HDMI can even be used at the same time as the discrete card's monitor ports.
It looks like the GPU is doing most of the work… From that point of view, when do we start to wonder whether the GPU can “offload” anything to the whole computer that is hanging off of it, haha.
Yes. The GPU is doing most of the work in a lot of modern games.
It isn't great at everything though, and there are limitations due to its architecture being structured almost solely for executing massively parallel workloads.
> when do we start to wonder if the GPU can “offload” anything to the whole computer that is hanging off of it
The main bottleneck for speed on most teams is not having enough "GPU devs" to move stuff off the CPU and onto the GPU. Many games suffer in performance due to folks not knowing how to use the GPU properly.
Because of this, nVidia/AMD invest heavily in making general purpose compute easier and easier on the GPU. The successes they have had in doing this over the last decade are nothing less than staggering.
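As a rough illustration of what "moving stuff onto the GPU" tends to look like in code: you express the work as one data-parallel operation over a big array instead of a hand-written sequential loop. The sketch below uses standard C++ parallel algorithms (the particle update is a made-up stand-in for real per-element work); compilers like NVIDIA's nvc++ can offload this style of code to the GPU, and otherwise it just fans out across CPU threads.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

struct Particle { float x, y, vx, vy; };

// The same per-element update, written once; the runtime decides how to
// spread it across however many lanes/threads/cores are available.
void step_particles(std::vector<Particle>& particles, float dt) {
    std::for_each(std::execution::par_unseq, particles.begin(), particles.end(),
                  [dt](Particle& p) {
                      p.x  += p.vx * dt;   // each particle is independent:
                      p.y  += p.vy * dt;   // no ordering, no shared writes,
                      p.vy -= 9.81f * dt;  // so it maps cleanly onto a GPU
                  });
}
```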
Ultimately, the way it's looking, GPUs are trying to become good at everything the CPU does and then some. We already have modern cloud server architectures that are 90% GPU and 10% CPU as a complete SoC.
Eventually, the CPU may cease to exist entirely as its fundamental design becomes obsolete. This is usually called a GPGPU in modern server infrastructure.
I’m pretty sure CPUs destroy GPUs at sequential programming, and most programs are written in a sequential style. Not sure where the 90/10 claim comes from, but there are plenty of cloud servers with no GPU installed whatsoever and zero servers without a CPU.
Yup, and until we get a truly general purpose compute GPU that can handle both styles of instruction with automated multi-threading and state management, this will continue.
What I've seen shows me that nVidia is working very hard to eliminate this gap though. General purpose computing on the GPU has never been easier, and it gets better every year.
In my opinion, it's only a matter of time before we can run anything we want on the GPU and realize various speed gains.
As for where the 90/10 comes from, it's from the emerging architectures for advanced AI/graphics compute like the DGX H100 [0].
AI is different. Those servers are set up to run AI jobs and nothing else, and they’re still a small fraction of overall cloud machines at the moment. Even if they overtake in volume, that’s just because the huge surge in demand for AI, multiplied by the compute requirements associated with it, eclipses the compute requirements for “traditional” cloud compute that keeps businesses running. I don’t think you’ll see GPUs running things like databases or the Linux kernel. GPUs may even come with embedded ARM CPUs to run the kernel and only run AI tasks as part of the package as a cost reduction, but I think that’ll take a very long time because you have to figure out how to do co-tenancy. It’ll depend on whether the CPU remains a huge unnecessary cost for AI servers. I doubt that GPUs will get much better at sequential tasks, because it’s an essential programming tradeoff (e.g. it’s the same reason you don’t see everything written in SIMD: SIMD is much closer to GPU-style programming than the more general sequential style).
> Eventually, the CPU may cease to exist as its fundamental design becomes obsolete. This is usually called a GPGPU in modern server infrastructure.
There’s no reason yet to think CPU designs are becoming obsolete. SISD (Single Instruction, Single Data) is the CPU core model; it’s easier to program and does lots of things you don’t want to use SIMD for. SISD is good for heterogeneous workloads, and SIMD is good for homogeneous workloads.
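A toy contrast to make that concrete (purely illustrative code): branchy, pointer-chasing work wants a fast SISD core, while uniform element-wise work maps cleanly onto SIMD lanes or a GPU.

```cpp
#include <cstddef>

struct Node { int value; Node* next; };

// Heterogeneous work: unpredictable branches and a serial dependency chain
// (you can't fetch the next node before you've read this one). A single
// fast SISD core is the right tool here.
int sum_positive(const Node* n) {
    int total = 0;
    while (n) {
        if (n->value > 0) total += n->value;
        n = n->next;
    }
    return total;
}

// Homogeneous work: the same operation over every element with no
// dependencies between iterations. This is what SIMD units (and GPUs)
// are built for, and compilers can often auto-vectorize it.
void scale(float* dst, const float* src, float s, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] * s;
}
```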
I thought GPGPU was waning these days. That term was used a lot during the period when people were ‘hacking’ GPUs to do general compute, when APIs like OpenGL didn’t offer general programmable computation. Today, with CUDA and compute shaders in every major API, it’s a given that GPUs are for general purpose computation, and it’s even becoming an anachronism that the G in GPU stands for graphics. My soft prediction is that GPU might get a new name & acronym soon that doesn’t have “graphics” in it.
This is somewhat reassuring. A decade ago when clock frequencies had stopped increasing and core count started to increase I predicted that the future was massively multicore.
Then the core count stopped increasing too -- but only if you look in the wrong place! It has in CPUs, but the cores moved to GPUs.
SIMD parallelism has been improving on the CPU too – although the register width hasn’t increased that much since the SSE days (128 to 512 bits), the spectrum of available vector instructions has grown a lot. And being able to do eight or sixteen operations for the price of one is certainly nothing to scoff at. Autovectorization is a really hard problem, though, and manual vectorization is still something of a dark art, especially due to the scarcity of good abstractions. Programming with cryptically named, architecture-specific intrinsics and doing manual feature detection is not fun.
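For a taste of what those cryptically named intrinsics look like, here's a hand-vectorized "multiply an array by a scalar" using AVX. It assumes the length is a multiple of 8 and that the CPU actually supports AVX; real code needs a scalar tail loop and runtime feature detection, which is exactly the un-fun part.

```cpp
#include <immintrin.h>  // x86 AVX intrinsics
#include <cstddef>

// Multiply n floats by s, eight lanes at a time (256-bit registers).
// Assumes n % 8 == 0 and AVX support; no tail handling, no dispatch.
void scale_avx(float* dst, const float* src, float s, std::size_t n) {
    const __m256 vs = _mm256_set1_ps(s);            // broadcast s to 8 lanes
    for (std::size_t i = 0; i < n; i += 8) {
        const __m256 v = _mm256_loadu_ps(src + i);  // load 8 floats (unaligned ok)
        _mm256_storeu_ps(dst + i, _mm256_mul_ps(v, vs));
    }
}
```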
A game is more than just rendering, and modern games will absolutely get bottlenecked on lower-end CPUs well before you reach say 144 fps. GamersNexus has done a bunch of videos on the topic.
You are not wrong that there are many games that are bottlenecked on lower-end CPUs.
I would argue that for many CPU bound games, they could find better ways to utilize the GPU for computation and it is likely they just didn't have the knowledge, time, or budget to do so.
It's easier to write CPU code, every programmer can do it, so it's the most often reached for tool.
Also, at high frame rates, the bottleneck is frequently the CPU due to it not feeding the GPU fast enough, so you lose frames. There is definitely a real world requirement of having a fast enough CPU to properly utilize a high end video card, even if it's just for shoving command buffers and nothing else.
Now that LLMs run on GPU too, future GPUs will need to juggle between the graphics, the physics and the AI for NPCs. Fun times trying to balance all that.
My guess is that the load will become more and more shared between local and remote computing resources.
In high-performance scenarios, the GPU is running full blast while the CPU runs full blast just feeding data and pre-processed work to the GPU.
The GPU is the steam engine hurtling forward, the CPU is just the person shoveling coal into the furnace.
Using the integrated GPU heats up the main die where the CPU is because they live together on the same chip. The die heats up, CPU thermal throttles, CPU stops efficiently feeding data to the GPU at max speed, GPU slows down from under utilization.
In high performance scenarios, the integrated GPU is often a waste of thermal budget.
Doesn't this assume inadequate cooling? A quick google indicates AMD's X3D CPUs begin throttling around 89°C, and that it's not overly challenging to keep them below 80 even under intense CPU load, although that's presumably without any activity on the integrated GPU.
Assuming cooling really is inadequate for running both the CPU cores and the integrated GPU: for GPU-friendly workloads (i.e. no GPU-unfriendly preprocessing operations for the CPU) it would surely make more sense to use the integrated GPU rather than spend the thermal budget having the CPU cores do that work.
What the specs say they'll do and what they actually do are often very different realities in my experience.
I've seen thermal throttling happening at 60°C because the chip is cool overall but one or two cores are maxed out. That's common in game dev, with your primary thread feeding the GPU command buffer queues and another scheduling the main game loop.
Even with water cooling, or the high-end air cooling on my server blades, I see that long term the system just hits a trade-off point of ~60-70°C and ~85% of max CPU clock, even when the cooling system is industrial grade, loud as hell, and has an HVAC unit backing it. Probably part of why scaling out to distribute load is so popular.
When I give real work to any iGPUs on these systems, I see the temps bump 5-10°C and the clocks on the CPU cores drop a bit. Could be drivers, could be temp curves; I would think the fancy cooling systems I'm running are performing well, though. shrug
What do you mean in the future? Multiple GPU chips is already pretty common (on the same card or having multiple cards at the same time). Besides, GPUs are massively parallel chips as well, with specialized units for graphics and AI operations.
Welp. It used to be that PhysX math would run on a dedicated GPU of your choice. I remember assigning (or realising) this during the Red Faction game with its forever-destructing walls.
Almost Minecraft, but with rocket launchers, on Mars.
I remember PhysX, but my feeling at the time was “yeah I guess NVIDIA would love for me to buy two graphics cards.” On the other hand, processors without any iGPUs are pretty rare by now.
You don't have to be gaming to use a GPU. Plenty of rendering software has a GPU mode now. But writing GPU algos is often different from writing a CPU simulation algo, because it has to be highly parallelized.