
Nvidia has been making different architectures for gaming and datacenter for a few generations now: Volta and Turing, then Ampere and Ampere (same name, different architectures on different nodes). Hopper and Lovelace are different architectures too. The SMs are built differently: different cache amounts, different numbers of shading units per SM, different FP16/FP32 throughput ratios, no RT cores in Hopper, and I can go on and on. They are different architectures where some elements are the same.



No, the NVIDIA datacenter and gaming GPUs do not have different architectures.

They have some differences beyond the implemented feature sets, e.g. ECC memory or FP64 speed, but these are caused less by their target markets than by the offset in time between their designs, which gives whichever comes later the opportunity to pick up more improvements.

The architectural differences between NVIDIA datacenter and gaming GPUs of the same generation are much smaller than those between different NVIDIA GPU generations.

This can be seen clearly in the CUDA compute capability numbers, which correspond to lists of implemented features.

For example, datacenter Volta is 7.0, automotive Volta is 7.2 and gaming Turing is 7.5, while different versions of Ampere are 8.0, 8.6 and 8.7.

The differences between any Ampere and any Volta/Turing are larger than between datacenter Volta and gaming Turing, or between datacenter Ampere and gaming Ampere.

The differences between two successive NVIDIA generations can be as large as between AMD CDNA and RDNA, while the differences between datacenter and gaming NVIDIA GPUs are less than between two successive generations of AMD RDNA or AMD CDNA.
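The compute-capability grouping described above can be sketched as a small table. The figures come from NVIDIA's public CUDA documentation; the chip-to-name mapping here is just an illustration of how same-generation datacenter and gaming parts sit under the same major version:

```python
# Compute capability (SM version) by chip, per NVIDIA's CUDA docs.
# Major number = generation; minor number = variant within it.
compute_capability = {
    "V100 (datacenter Volta)": (7, 0),
    "Xavier (automotive Volta)": (7, 2),
    "TU102 (gaming Turing)": (7, 5),
    "A100 (datacenter Ampere)": (8, 0),
    "GA102 (gaming Ampere)": (8, 6),
    "Orin (automotive Ampere)": (8, 7),
}

def same_generation(a, b):
    """Chips share a generation iff their major SM version matches."""
    return compute_capability[a][0] == compute_capability[b][0]

print(same_generation("V100 (datacenter Volta)", "TU102 (gaming Turing)"))  # True
print(same_generation("TU102 (gaming Turing)", "GA102 (gaming Ampere)"))    # False
```

By this reading, datacenter Volta and gaming Turing are closer to each other than either is to any Ampere part.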


I don't agree.

Turing is an evolution of Volta. In fact, in the CUDA slides of Turing, they mention explicitly that Turing shaders are binary compatible with Volta, and that's very clear from the whitepapers as well.

Ampere A100 and Ampere GeForce have the same core architecture as well.

The only differences are in HPC features (MIG, ECC), FP64, the beefiness of the tensor cores, and the lack of RT cores on the HPC parts.

The jury is still out on Hopper vs Lovelace. Today's presentation definitely points to a difference similar to the one between the A100 and Ampere GeForce.

It's more that the architectures are the same, with some minor differences.

You can also see this with the SM feature levels:

Volta: SM 70, Turing: SM 75

Ampere: SM 80 (A100) and SM 86 (GeForce)


Turing is an evolution of Volta, but they are different architectures.

A100 and GA102 DO NOT have the same core architecture. There is 192KB of L1 cache in an A100 SM, 128KB in a GA102 SM. That already means it is not the same SM. And there are other differences. For example, Volta introduced a second datapath that could process one INT32 instruction alongside floating-point instructions. This datapath was upgraded in GA102 so it can now handle FP32 instructions as well (not FP16; only the first datapath can process those). A100 doesn't have this improvement, which is why we see such a drastic (basically 2x) difference in FP32 flops between A100 and GA102. That is not a "minor difference", and neither is the huge difference in L2 cache (40MB vs 6MB). It's a different architecture on a different node designed by a different team.
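For concreteness, the per-SM and chip-level numbers cited above can be tabulated. The figures are as stated in this thread (and match the A100/GA102 whitepapers); the per-SM FP32 unit counts are added here to make the "basically 2x" arithmetic explicit:

```python
# Specs cited in the discussion (A100 vs GA102); illustrative only.
specs = {
    "A100":  {"l1_per_sm_kb": 192, "l2_mb": 40, "fp32_units_per_sm": 64},
    "GA102": {"l1_per_sm_kb": 128, "l2_mb": 6,  "fp32_units_per_sm": 128},
}

# GA102's second datapath can issue FP32 as well as INT32, doubling
# peak FP32 throughput per SM versus A100 -- the "basically 2x" flops gap.
ratio = specs["GA102"]["fp32_units_per_sm"] / specs["A100"]["fp32_units_per_sm"]
print(ratio)  # 2.0
```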


GP100 and the GeForce GP10x parts have a different shared memory structure as well, so much so that GP100 was listed as having 30 SMs instead of 60 in some Nvidia presentations. But the base architecture (ISA, instruction delays, …) was the same.

It’s true that GA102 has double the FP32 units, but the way they work is very similar to the way SMs have 2x FP16, in that you need to go out of your way to benefit from them. Benchmarks show this as well.

I like to think that Nvidia’s SM version nomenclature is a pretty good hint, but I guess it just boils down to personal opinion about what constitutes a base architecture.


AMD as well. The main difference is that Nvidia kills you big time with the damn licensing (often more expensive than the very pricey card itself) while AMD does not. It's quite unfortunate we don't have more budget options for these types of cards, as it would be pretty cool to have a bunch of VMs or containers with access to "discrete" graphics.


Nvidia's datacenter product licensing costs are beyond onerous, but even worse to me is that their license server (both its on-premise and cloud version) is fiddly and sometimes just plain broken. Losing your license lease makes the card go into super low performance hibernation mode, which means that dealing with the licensing server is not just about maintaining compliance -- it's about keeping your service up.

It's a bit of a mystery to me how anyone can run a high availability service that relies on Nvidia datacenter GPUs. Even if you somehow get it all sorted out, if there was ANY other option I would take it.




