Developers weren't forced into using CUDA; they chose it because NVIDIA's ecosystem was much better than anyone else's.
Facebook and Google obviously wouldn't want to lock themselves into CUDA for PyTorch and Tensorflow, but there genuinely wasn't any other realistic option. OpenCL existed but the implementation on AMD was just as bad as the one on NVIDIA.
Consider that Blender's Cycles render engine only got OpenCL support after AMD assigned some devs specifically to help work through driver bugs, and even then they had to resort to a 'split kernel' hack. OpenCL support was recently dropped entirely because the situation hadn't really improved over the decade; instead, the CUDA version was ported to HIP and a HIP-capable Windows driver was released.
Even now, if you need to do GPGPU on PCs, CUDA is essentially the easiest option. Every NVIDIA card supports it pretty much right from launch on both Linux and Windows, while with AMD you currently get support on only a few distros (Windows support is probably not too far off now), slow support for new hardware, and a pattern of phasing out older cards even when they're still very common. On top of that, NVIDIA offers amazing profiling and debugging tools that the competition hasn't caught up to.
... no, just as any other consumer isn't necessarily "forced" by companies employing anticompetitive practices.
> Facebook and Google...
lol, they have such a high churn rate on hardware that I seriously doubt they'd give it much thought at all. Their use case is unique to a tiny number of companies - high churn, low capital constraint, no tolerance for supplier delay. In such a scenario CUDA vendor lock in wouldn't even register as a potential point of pain.
> OpenCL existed but the implementation on AMD was just as bad as the one on NVIDIA.
For those unaware of how opencl works: an API is provided by the standard, to which software can be written by people - even those without signed NDAs. The API can interface to a hardware endpoint that has further open code and generous documentation... like an open source DSP, CPU, etc - or it can hit an opaque pointer. If your hardware vendor is absurdly secretive and insists on treating microcode and binary blobs as competitive advantages, then your opencl experience is wholly dependent on that vendor implementation. Unfortunately for GPUs that means either NVIDIA or AMD (maybe Intel, we'll see)... so yeah - not good.

AMD has improved things by open sourcing a great deal of their code, but that is a relatively recent development. While I'm familiar with some aspects of their codebase (had to fix an endian bug, guess what ISA I use), I dunno how much GPGPU functionality they're still hiding behind their encrypted firmware binary blobs.

Also, to the point on NVIDIA's opencl sucking: anybody else remember that time that Intel intentionally crippled performance for non-Intel hardware running code generated by their compiler or linked to their high performance scientific libraries? Surely NVIDIA would never sandbag opencl...
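To make the first part of that concrete, here's a minimal sketch (assuming the Khronos headers and an ICD loader are installed; error handling omitted) of what "the API is provided by the standard" looks like in practice. The entry points are the standard's; everything behind them is whatever implementation the vendor shipped, open or opaque.

    // Minimal sketch: the API surface is the open Khronos standard,
    // but each call is dispatched into whichever vendor driver the
    // ICD loader finds on the system.
    #include <CL/cl.h>
    #include <stdio.h>

    int main(void) {
        cl_uint n = 0;
        clGetPlatformIDs(0, NULL, &n);          // standard entry point
        cl_platform_id platforms[16];
        if (n > 16) n = 16;
        clGetPlatformIDs(n, platforms, NULL);

        for (cl_uint i = 0; i < n; ++i) {
            char vendor[256] = {0};
            // The answer comes from the vendor's implementation,
            // open or opaque as the case may be.
            clGetPlatformInfo(platforms[i], CL_PLATFORM_VENDOR,
                              sizeof(vendor), vendor, NULL);
            printf("platform %u: %s\n", i, vendor);
        }
        return 0;
    }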
Anyway, this is kind of a goofy thing to even discuss given two facts:
* There are basically two GPU vendors - so vendor lock is practically assured already.
* CUDA is designed to run parallel code on NVIDIA GPUs - full stop. Opencl is designed for heterogeneous computing, and GPUs are just one of many computing units possible. So not apples to apples.
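A quick hedged sketch of that second point, since it's easy to gloss over: the same standard API enumerates CPUs, GPUs and other accelerators alike, a scope CUDA never even attempts. (Assumes an OpenCL SDK is present; error handling omitted.)

    #include <CL/cl.h>
    #include <stdio.h>

    // OpenCL targets heterogeneous devices: CPUs, GPUs, and other
    // accelerators are all queried through the same API.
    int main(void) {
        cl_platform_id platform;
        clGetPlatformIDs(1, &platform, NULL);

        struct { cl_device_type type; const char *name; } kinds[] = {
            { CL_DEVICE_TYPE_CPU,         "CPU"         },
            { CL_DEVICE_TYPE_GPU,         "GPU"         },
            { CL_DEVICE_TYPE_ACCELERATOR, "accelerator" },
        };
        for (int i = 0; i < 3; ++i) {
            cl_uint count = 0;
            clGetDeviceIDs(platform, kinds[i].type, 0, NULL, &count);
            printf("%s devices: %u\n", kinds[i].name, count);
        }
        return 0;
    }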
> CUDA is designed to run parallel code on NVIDIA GPUs - full stop. Opencl is designed for heterogeneous computing, and GPUs are just one of many computing units possible. So not apples to apples.
This is really why OpenCL failed. You can't write code that works just as well on CPUs as it does on GPUs. GPGPU isn't all that general purpose; it's still quite specialized in what it's actually good at and in the hoops you need to jump through to make it perform well.
That is CUDA's real strength. It's not the API or ecosystem or lock-in; it's that CUDA is all about a specific category of compute and isn't afraid to tell you all the nitty-gritty details you need to know to use it effectively. And you actually know where to go for complete documentation.
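To illustrate what "nitty-gritty" means here, a minimal CUDA sketch (sizes and data are arbitrary placeholders): you choose the launch geometry, stage data into on-chip __shared__ memory yourself, and place the barriers explicitly, because the model assumes you want that level of control.

    #include <cuda_runtime.h>
    #include <stdio.h>

    // CUDA exposes the machine: explicit block/thread indexing, on-chip
    // __shared__ memory you manage yourself, and explicit barriers.
    __global__ void block_sum(const float *in, float *out, int n) {
        __shared__ float tile[256];                 // one tile per block, on-chip
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
        __syncthreads();                            // everyone has loaded

        // Tree reduction within the block.
        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (threadIdx.x < stride)
                tile[threadIdx.x] += tile[threadIdx.x + stride];
            __syncthreads();
        }
        if (threadIdx.x == 0) out[blockIdx.x] = tile[0];
    }

    int main(void) {
        const int n = 1 << 20, block = 256, grid = (n + block - 1) / block;
        float *in, *out;
        cudaMalloc((void **)&in, n * sizeof(float));    // you manage device memory too
        cudaMalloc((void **)&out, grid * sizeof(float));
        // (device data left uninitialized; this only illustrates the mechanics)
        block_sum<<<grid, block>>>(in, out, n);         // explicit launch geometry
        cudaDeviceSynchronize();
        printf("launched %d blocks of %d threads\n", grid, block);
        cudaFree(in); cudaFree(out);
        return 0;
    }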
> There are basically two GPU vendors - so vendor lock is practically assured already.
Depends on how you scope your definition of "GPU vendor." If you only include cloud compute then sure, for now. If you include consumer devices then very definitely no, not at all. You also have Intel (Intel's integrated being the most widely used GPU on laptops, after all), Qualcomm's Adreno, ARM's Mali, IMG's PowerVR, and Apple's PowerVR fork. There's also Broadcom's VideoCore, still in use at the very low end like the Raspberry Pi and TVs.
CUDA is designed to support C, C++, and Fortran as first-class languages, with PTX bindings for anyone else that wants to join the party, including .NET, Java, Julia, and Haskell, among others.
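The "PTX bindings" route is worth spelling out, because it's the mechanism all those other languages use: the front end emits PTX, and the binding loads it through the driver API. A hedged host-side sketch (the PTX string and the kernel name are placeholders):

    #include <cuda.h>
    #include <stdio.h>
    // Uses the CUDA driver API: link against -lcuda.

    // Placeholder: a real front end (Julia, .NET, a JIT, ...) would emit
    // actual PTX text here; this string is just a stand-in.
    static const char *kernel_ptx = "// PTX emitted by some language front end";

    int main(void) {
        CUdevice dev; CUcontext ctx; CUmodule mod; CUfunction fn;
        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);

        // This is the whole trick behind the PTX bindings: hand PTX to the
        // driver and let it JIT for whatever GPU is present.
        if (cuModuleLoadData(&mod, kernel_ptx) != CUDA_SUCCESS) {
            fprintf(stderr, "placeholder PTX, nothing real to load\n");
            return 1;
        }
        cuModuleGetFunction(&fn, mod, "my_kernel");   // hypothetical kernel name
        cuLaunchKernel(fn, 1, 1, 1,    // grid
                           32, 1, 1,   // block
                           0, NULL, NULL, NULL);      // no shared mem, stream, or args
        cuCtxSynchronize();
        cuModuleUnload(mod);
        cuCtxDestroy(ctx);
        return 0;
    }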
OpenCL was born as a C-only API and requires compilation at run time. The later additions of SPIR and C++ were an afterthought, made after they started to take a heavy beating. There is still no IDE or GPGPU debugging that compares to CUDA's, and OpenCL 3.0 is basically 1.2.
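For contrast, "requires compilation at run time" means the classic OpenCL flow ships the kernel as source text and the end user's driver compiles it on the fly, roughly like this (a hedged sketch with a trivial kernel; most error handling omitted):

    #include <CL/cl.h>
    #include <stdio.h>

    // The classic OpenCL 1.x flow: kernel source travels as a string and is
    // compiled by the vendor's driver at run time, on the user's machine.
    static const char *src =
        "__kernel void scale(__global float *x, float k) {"
        "    x[get_global_id(0)] *= k;"
        "}";

    int main(void) {
        cl_platform_id platform; cl_device_id device;
        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_DEFAULT, 1, &device, NULL);
        cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

        // Compilation happens here, inside the vendor's driver.
        cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
        if (clBuildProgram(prog, 1, &device, "", NULL, NULL) != CL_SUCCESS) {
            char log[4096] = {0};
            clGetProgramBuildInfo(prog, device, CL_PROGRAM_BUILD_LOG,
                                  sizeof(log), log, NULL);
            fprintf(stderr, "driver's compiler said:\n%s\n", log);
            return 1;
        }
        cl_kernel k = clCreateKernel(prog, "scale", NULL);
        printf("kernel compiled by the driver at run time\n");
        clReleaseKernel(k); clReleaseProgram(prog); clReleaseContext(ctx);
        return 0;
    }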
>lol, they have such a high churn rate on hardware that I seriously doubt they'd give it much thought at all. Their use case is unique to a tiny number of companies - high churn, low capital constraint, no tolerance for supplier delay. In such a scenario CUDA vendor lock in wouldn't even register as a potential point of pain
Considering that PyTorch and Tensorflow are the two most popular deep learning frameworks used in the industry, this argument doesn't make sense. Of course they care about CUDA lock-in, it makes them dependent on a competitor and limits the range of hardware they support and thus potentially limits the adoption of their framework. The fact that they chose CUDA anyway is essentially confirmation that they didn't see any other viable option.
>Also, to the point on NVIDIA's opencl sucking: anybody else remember that time that Intel intentionally crippled performance for non-Intel hardware running code generated by their compiler or linked to their high performance scientific libraries? Surely NVIDIA would never sandbag opencl...
If NVIDIA were somehow intentionally crippling OpenCL performance on non-NVIDIA hardware, it would be pretty obvious since they don't control all the OpenCL compilers/runtimes out there. They very likely were crippling OpenCL on their own hardware, but that obviously wouldn't matter if the competitors (as you mentioned, OpenCL was designed for heterogeneous compute in general, so there would have been competition from more than just AMD) had a better ecosystem than CUDA's.
>For those unaware of how opencl works: an API is provided by the standard, to which software can be written by people - even those without signed NDAs
And no one has made it work as well as CUDA - developers that want performance will choose CUDA. If OpenCL worked as well people would choose it, but it simply doesn't.
>I seriously doubt they'd give it much thought at all.
Having talked to people at both companies about exactly this, they have put serious thought into it - it amounts to powering their multi-billion dollar cloud AI infrastructure. The alternatives are simply so bad that they choose CUDA/NVidia, as do their clients. Watching them (and AWS and MS) choose NVidia for their cloud offerings is not because they're all stupid or cannot make new APIs if needed - they choose it because it works.
>Surely NVIDIA would never sandbag opencl...
So fix it. There are enough people who can and do reverse engineer such things that someone would likely have found such a conspiracy by now. Or publish the proof. Reverse engineering is not so hard that, if this mythical problem existed, you could not find it, prove it, and write it up, or even fix it. There are enough companies besides NVidia that could fix OpenCL, or make a better API for NVidia hardware and sell that, yet neither has happened. If you really believe it is possible, you are sitting on a huge opportunity.
Or, alternatively, NVidia has made really compelling hardware and the best software API so far, and people use that because it works.
Open source fails at many real world tasks. Choose the tool best suited to solve the problem you want solved, regardless of religious beliefs.
> Choose the tool best suited to solve the problem you want solved, *regardless of religious beliefs*.
...is nonsense. Open source isn't about "religion"; it's about actually being able to do something like...
> So fix it.
...without needing to do stuff like...
> do reverse engineer such things
...which is a pointless waste of time regardless of how "not that hard" it might be (it certainly isn't easy, and it's certainly much easier when you have the source code around).
This association of open source / free software with religion has no place here. People didn't come up with open source / free software because of some mystical experience with otherworldly entities; they came up with it because they were faced with actual practical issues.
OP complains people use CUDA instead of a non-existent open source solution.
That's religion.
And a significant amount of open source solutions are the result of reverse engineering. It's a perfectly reasonable and time tested method to replace proprietary solutions.
> they came up with it because they were faced with actual practical issues
People use CUDA for actual practical issues. If someone makes a cross platform open source solution that solves those issues people will try it.
First of all, I replied to the generalization "Open source fails at many real world tasks. Choose the tool best suited to solve the problem you want solved, regardless of religious beliefs", which is not just about CUDA. Open source might fail at tasks, but it isn't pushed or chosen because of religion; it has nothing to do with religion. In fact...
> OP complains people use CUDA instead of a non-existent open source solution. That's religion.
...that isn't religion either. The person you replied to complains because CUDA is not only closed source but also vendor locked to Nvidia, and both of those carry a ton of issues, largely around control; the complaint comes from those issues. For many people these issues are either showstoppers or reasons to look for, wish for, and push for alternatives, and they come from practical concerns, not religious ones.
> And a significant amount of open source solutions are the result of reverse engineering. It's a perfectly reasonable and time tested method to replace proprietary solutions.
It is not reasonable at all. It is the last-ditch effort when nothing else will do, it can be a tremendous waste of time, and telling people "so fix it" when doing so would require reverse engineering is practically the same as telling them to shut up; IMO it can't be taken seriously as anything else.
The proper way to fix something is to have access to the source code.
And again to be clear:
> People use CUDA for actual practical issues. If someone makes a cross platform open source solution that solves those issues people will try it.
The "actual practical issues" i mentioned have nothing to do with CUDA or any issues they might use with CUDA or any other closed source (or not) technology. The "actual practical issues" i mentioned are about the issues inherent to closed source technologies in general - like fixing any potential issues one might have and being under the control of the vendor of those technologies.
These are all widely known and talked about issues, it might be a good idea to not dismiss them.
MS DirectCompute also works. Yet last time I checked, MS Azure didn't support DirectCompute with their fast GPUs. These virtual machines come with the TCC (Tesla Compute Cluster) driver, which only supports CUDA; DirectCompute requires a WDDM (Windows Display Driver Model) driver. https://social.msdn.microsoft.com/forums/en-US/2c1784a3-5e09...
> C++ AMP headers are deprecated, starting with Visual Studio 2022 version 17.0. Including any AMP headers will generate build errors. Define _SILENCE_AMP_DEPRECATION_WARNINGS before including any AMP headers to silence the warnings.
So please don't rely on DirectCompute. It's firmly in legacy territory. Microsoft didn't invest the effort necessary to make it thrive.
DirectCompute is a low-level tech, a subset of D3D11 and 12. It’s not deprecated, used by lots of software, most notably videogames. For instance, in UE5 they’re even rasterizing triangles with compute shaders, that’s DirectCompute technology.
Some things are worse than CUDA: a different programming language (HLSL), manually managed GPU buffers, and compatibility issues related to FP64 math support.
Some things are better than CUDA: no need to install huge third-party libraries, and it's integrated with other GPU-related things (D2D, DirectWrite, desktop duplication, Media Foundation). It's also vendor agnostic, working on AMD and Intel too.
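For a feel of what that looks like in practice, here's a hedged D3D11 sketch (Windows-only; error handling and the buffer/UAV setup are omitted): the HLSL is compiled with the OS-provided compiler and dispatched through the same device you'd use for graphics, no vendor SDK required.

    #include <d3d11.h>
    #include <d3dcompiler.h>
    #include <string.h>
    #include <stdio.h>
    // link: d3d11.lib d3dcompiler.lib

    // The compute shader is HLSL text, compiled with the OS-provided compiler;
    // it runs on NVIDIA, AMD, and Intel alike.
    static const char *hlsl =
        "RWStructuredBuffer<float> data : register(u0);"
        "[numthreads(64, 1, 1)]"
        "void main(uint3 id : SV_DispatchThreadID) { data[id.x] *= 2.0f; }";

    int main(void) {
        ID3D11Device *dev = NULL;
        ID3D11DeviceContext *ctx = NULL;
        D3D11CreateDevice(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, 0,
                          NULL, 0, D3D11_SDK_VERSION, &dev, NULL, &ctx);

        ID3DBlob *cs = NULL, *err = NULL;
        D3DCompile(hlsl, strlen(hlsl), NULL, NULL, NULL,
                   "main", "cs_5_0", 0, 0, &cs, &err);

        ID3D11ComputeShader *shader = NULL;
        dev->CreateComputeShader(cs->GetBufferPointer(), cs->GetBufferSize(),
                                 NULL, &shader);

        ctx->CSSetShader(shader, NULL, 0);
        // Buffers/UAVs would be created and bound here (the "manually managed
        // GPU buffers" part); omitted to keep the sketch short.
        ctx->Dispatch(16, 1, 1);     // 16 groups of 64 threads
        printf("dispatched a compute shader without any vendor SDK\n");
        return 0;
    }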
I think I tried that a year ago, and it didn’t work. Documentation agrees, it says “GRID drivers redistributed by Azure do not work on non-NV series VMs like NCv2, NCv3” https://docs.microsoft.com/en-us/azure/virtual-machines/wind... Microsoft support told me the same. I wanted NCv3 because on paper, V100 GPU is good at FP64 arithmetic which we use a lot in our compute shaders.
In my experience the AMD OpenCL implementation was worse than NVIDIA's OpenCL implementation, and not a little worse, but a lot worse. NVIDIA beat AMD at AMD's own game -- even though NVIDIA had every incentive to sandbag. It was shameful.