
> This is what having a monopoly looks like!

As someone who has been in the AI/ML space for over a decade, and even had an AMD/Radeon card for more than half of that, I can't help but feel that this is partially AMD's own fault.

For many, many years it seemed to me that AMD just didn't take AI/ML seriously whereas, for all its faults, NVIDIA seemed to catch on very early that ML presented a tremendous potential market.

To this day getting things like Stable Diffusion to run on an AMD card requires extra work. At least from my perspective it seems like dedicating a few engineers to getting ROCm working on all major OSes with all major scientific computing/deep learning libraries would have been a pretty good investment.

Is there some context I'm missing for why AMD never caught up in this space?



Until very recently, AMD was struggling for survival. Rather than making the big bet on AI, they went for the sure thing by banking on revolutionary CPU tech. I'm sure if they had been in a better financial position 5 years ago, they would have gone bigger on AI.


And arguably their bet on CPU tech worked! AMD is in a much better position today than they were 5+ years ago. They have some catching up to do, but that doesn't mean they're completely out of the game.


"much better" is an understatement! "AMD predicted to go bankrupt by 2020"[0]

[0] https://www.overclock3d.net/news/cpu_mainboard/amd_predicted...


Great achievement! Of course, they can also thank Intel for waiting for them to catch up.


They also focused on game consoles, where they won contracts for both major platforms this generation.


The other, non-major, Nvidia-based console has higher sales numbers than both of the so-called major consoles combined.


It's ... difficult to compare a low-power SoC released in 2015 (so, design dating back to 2014 if not earlier) with high-power consoles developed in 2019 onwards.


They belong to the same generation. Until Switch 2 is released.


That doesn't mean they're meaningfully comparable (in the context we are talking about). A $100 budget Android phone that was released at the same time as the latest iPhone also belongs to the same generation, but that doesn't mean the chip manufacturers profit equally from the two (of course Apple makes their own chips, so in a literal sense this comparison doesn't make sense, but I'm sure you understand what I mean).

And I don't mean to disparage the Switch, just to point out that the way it's designed is very different, which makes the comparison questionable.


Previous one too. Xbox One and Series and PS4 and PS5 are all AMD, and there’s several revisions of each (e.g. PS4 OG, Slim, Pro).


Yes, this is true.

However, a lot of this has to do with the fact that AMD was on the brink of bankruptcy before the launch of Zen in 2016 (when their share price was ~$10). They simply did not have the capital to do the kind of things Nvidia was doing (since '08?).

The bet on OpenCL and the 'open-source' community failed. However, ROCm/HIP etc. really seem to be catching up (I even see them packaged on Arch Linux).


> However, ROCm/HIP etc. really seem to be catching up (I even see them packaged on Arch Linux).

There are now distro-provided packages on Arch, Gentoo, Debian, Ubuntu, and Fedora.


What really strikes me is that Nvidia was already working hard on practical uses of their GPUs 10-15 years ago with PhysX, while both Intel and AMD just existed.

Nvidia's dominance today is the product of at least over a decade of work and investments to make better products. Today they are finally reaping their rewards.


I remember meeting NVIDIA in the late aughts (2007?) when they were first launching their CUDA efforts. Really the product was a re-branded 780GTX or whatever their high-end gaming card was at the time, but they had already laid out a clear pathway to today (more or less).


I remember meeting with them in the mid aughts when they were first talking to HPC folks about using their cards for science. I'll never forget what the chief scientist from nVidia said: "What is the color of a NaN? That is, when you render a texture with a NaN value, what does it look like? I'll tell you: it's Nvidia green."


That is a funny way to signal their commitment to HPC! But compared to other (non-GPU) tooling, CUDA is still really clunky. Way ahead of everything else in the GPGPU space, but still surprisingly clunky. Also, I don't get the "Account required for download" thing (e.g. for cuDNN): what are they afraid of? And is it really worth the trade-off for the pain it causes for dev environments and CI pipelines? It really seems like Intel and AMD have to step in and break this monopoly to force them to improve the situation for everyone.


No, you're not missing anything: NVIDIA's software is super clunky by the standards of most of the software world. However, for the last decade, the competition has been much worse: OpenCL development on AMD would be riddled with VRAM leaks, hard lockups, invisible limits on things like function length and registers that would cause the hard lockups when you tripped over them without any indication as to what you did wrong or how to fix it, that sort of thing. Cryptic error messages would lead to threads scattered around the internet, years old, with pleas for help and no happy endings.

The thing that caused me to ragequit the AMD ecosystem was when I took an OpenCL program I had been fighting for two days straight and ran it on my buddy's Nvidia system in hopes of getting an error message that might point me in the right direction. Instead, the program just ran, and it ran much faster, even though the nvidia card was theoretically slower.

In terms of quality, I expect the competition to catch up in a generation or two, but then there is still the decade+ of legacy code to consider. Hopefully with how fast AI/ML churns that isn't actually an insurmountable obstacle.


Years ago I gave up on OpenCL (1.2 on an AMD card) because of those hard lockups, with no way to debug them. nVidia didn't even support OpenCL 1.2 (and IIRC didn't support the synchronisation primitives I wanted in CUDA either -- AMD was more capable on paper). Thanks, I feel better hearing just how bad it was -- so it wasn't just my fault for quitting.


It's a quality meme but I'm having trouble figuring out the settings that make it work. It looks like RGBA8 would be blue:

    >>> import math, struct
    >>> struct.pack('f', math.nan)
    b'\x00\x00\xc0\x7f'
maybe that becomes green if you composite over white or something? Or maybe there is a common type of NaN that fills some of the unspecified bits? ("Just use the particular NaN that makes it green" is cheating unless you have an excuse)


They mean big-endian NaN, taking only the first 3 bytes. No alpha channel.

https://encycolorpedia.com/76b900 says Nvidia green #76b900.
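
For what it's worth, a quick check in the same REPL style backs that up approximately (the big-endian '>f' format and CPython's default quiet NaN are assumptions on my part): the first three bytes come out as #7FC000, which is in the same yellow-green neighbourhood as #76b900 rather than an exact match.

    >>> import math, struct
    >>> # big-endian float32 NaN; keep the first 3 bytes as an RGB hex colour
    >>> struct.pack('>f', math.nan)[:3].hex()
    '7fc000'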


Encycolorpedia looks like a great resource, thank you very much. A similar one would be Colorhexa. Not affiliated.

https://colorhexa.com

[Edit] Could not find the color under that name, but it shows how color-blind users perceive it. And it loads much faster.

https://www.colorhexa.com/76b900

[Edit] Encycolorpedia has a color blindness simulator too. Have to check on desktop.


Cool, thanks!


It was an arbitrary decision by the engineers who made the early GPUs; they just mapped NaN to an RGB value.

It was a nice way to debug tensors: render them to the screen, and the green sticks out.


32-bit NaN is encoded: s111 1111 1xxx xxxx xxxx xxxx xxxx xxxx

Where both the sign (s) and the x bits can be anything (as long as the x bits aren't all zero, which would be an infinity instead) and it will still be treated as a NaN.

There are lots of ways to map those bytes to a colour, but with RGBA there would be too much red, and with ARGB you could get almost any opaque colour, except that the red channel has to be at least 0x80, which is still too much red.

So NaNs are too red to encode Nvidia green.
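
A small sketch in the same REPL style as upthread makes that concrete (the three bit patterns are just example NaNs I picked): the first big-endian byte is always 0x7F or 0xFF and the second is always at least 0x80, so whichever of those lands in the red channel is stuck high.

    >>> import struct
    >>> # exponent all 1s, mantissa non-zero, sign and x bits varying
    >>> example_nans = [0x7F800001, 0x7FC00000, 0xFFC0FFEE]
    >>> [struct.pack('>I', bits)[:2].hex() for bits in example_nans]
    ['7f80', '7fc0', 'ffc0']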


I once ended up having NaNs get interpreted as 32-bit colors accidentally, and it made everything red and white, like a Christmas decoration.

I wonder what caused the difference in the latter bits being all 1s or all 0s together.


I remember a uni course on GPGPU, and only discovering during the first lecture that Nvidia had donated hardware to make sure it would be a CUDA-only course.


>doing practical work ... even just 10~15 years ago with PhysX

practical work with PhysX 13 years ago: https://www.realworldtech.com/physx87/

"For Nvidia, decreasing the baseline CPU performance by using x87 instructions and a single thread makes GPUs look better."

Nvidia magically released PhysX compiled with multithreading enabled and without the flags disabling SSE a week after this publication. But a couple of days before that release they made these funny statements:

"It's fair to say we've got more room to improve on the CPU. But it's not fair to say, in the words of that article, that we're intentionally hobbling the CPU," Skolones told Ars.

"nobody ever asked for it, and it wouldn't help real games anyway because the bottlenecks are elsewhere"

>Nvidia's dominance today is the product of at least over a decade of work

Nvidia's decade of work:

Ubisoft comments on Assassin’s Creed DX10.1 controversy https://techreport.com/news/14707/ubisoft-comments-on-assass...

AMD says Nvidia’s GameWorks “completely sabotaged” Witcher 3 performance https://arstechnica.com/gaming/2015/05/amd-says-nvidias-game...

AMD Dubs Nvidia’s GameWorks Tragic And Damaging, Fight Over The Developer Program Continues https://wccftech.com/fight-nvidias-gameworks-continues-amd-c...

"Number one: Nvidia Gameworks typically damages the performance on Nvidia hardware as well, which is a bit tragic really. It certainly feels like it’s about reducing the performance, even on high-end graphics cards, so that people have to buy something new.

"That’s the consequence of it, whether it’s intended or not - and I guess I can’t read anyone’s minds so I can’t tell you what their intention is. But the consequence of it is it brings PCs to their knees when it’s unnecessary. And if you look at Crysis 2 in particular, you see that they’re tessellating water that’s not visible to millions of triangles every frame, and they’re tessellating blocks of concrete – essentially large rectangular objects – and generating millions of triangles per frame which are useless."

"The world's greatest virtual concrete slab" https://web.archive.org/web/20121002034311/http://techreport... (images "somehow" vanished from original article at techreport where Nvidia runs marketing campaigns)

"Unnecessary geometric detail slows down all GPUs, of course, but it just so happens to have a much larger effect on DX11-capable AMD Radeons than it does on DX11-capable Nvidia GeForces. The Fermi architecture underlying all DX11-class GeForce GPUs dedicates more attention (and transistors) to achieving high geometry processing throughput than the competing Radeon GPU architectures."


> But GameWorks' capabilities are necessarily Nvidia-optimized; such code may perform poorly on AMD GPUs.

From the arstechnica article about Witcher 3.

How dare Nvidia optimize their game-enhancing effects for Nvidia hardware and forget to do it for their competitors' hardware as well! And as for a lot of these complaints, could it be that a lot of companies only optimize for the hardware that has the largest market share?

According to the Steam hardware survey in July of 2023, Nvidia accounts for 75% of the GPUs[0]. Nvidia and AMD have a lot of incompatibilities, and it can be hard to make the same code performant on both. It makes sense, as a game company, to prioritize optimizations for the largest market. No collusion and evil corporate mega-lord scheming needed for this.

Edit: Also, Nvidia does put out a lot of research efforts for free. Path rendering on the GPU for example (PhysX being another). You can find research papers and videos published by Nvidia for these things. I would consider that practical work. You can hate on Nvidia for lots of things, but this is one thing I find weird to be combative over.

Second Edit: Also, why do you find the statements Nvidia made about the PhysX improvements funny? They're right. Most games 13 years ago left a lot of idle time on the GPU while the CPU worked overtime to do logic, physics, sound, culling, etc. Lots of that stuff has since been moved to the GPU to minimize the amount of idle time on either the CPU or GPU. Nothing funny about what they said there.

[0]: https://store.steampowered.com/hwsurvey/


> could it be that a lot of companies only optimize for hardware that has the largest market share?

Yes, but Nvidia also partners directly with companies (money can/does change hands).

Now the flip side is: so does AMD.


>Now the flip side is: so does AMD.

And their antics shutting out Nvidia (e.g. FSR only, no DLSS) aren't being received well, not least because their offerings are objectively inferior to Nvidia's.


While Nvidia isn’t doing these silly antics today, they’ve absolutely done them in the past. None of the large consumer silicon companies have clean hands with respect to anticompetitive/anti-consumer behaviours. They’ve all got too much power frankly.


This is factually not the case, as confirmed by Crytek developers at the time. Wireframe mode turns off clipping and cranks up LOD to max; normally neither the water table would be visible (it's under the ground) nor would that block be rendered at that LOD.

https://old.reddit.com/r/pcgaming/comments/3vppv1/crysis_2_t...


100% this. I, and many others, bought multiple AMD cards due to disliking NVidia and tried to get ROCm set up, to no avail. It just never worked except under hard-to-maintain configurations. I switched to an Nvidia card and within the hour, import tensorflow just worked.


To be fair, Nvidia drivers are also a nightmare under Linux.


Not anymore


Until they are. When the system breaks, it breaks real bad. nvcc/gcc/CUDA/kernel mismatches are a pain to match up right. It gets gnarly super fast.

All systems hit snags. In most, you skid the tires a bit, maybe lose balance. With nvidia you're flying over the handlebars on to the asphalt.

I got snagged by this just about 2 weeks ago. It gets nasty. Not as bad as CUPS, but probably #2.


Eh, depends what you want out of them.

Do you want solid dependability and settings that are right first time, because you've only got one PC and if the graphics get broken you've got no browser to google for a fix?

Do you want the absolute most up-to-date drivers, to support the very newest GPUs, while running an LTS version of your OS?

Do you want to always run the latest driver version and upgrade without testing or worrying, like we do for web browsers?

Do you want to run CUDA and ML stuff, but also want to run Steam which for some reason wants 32-bit support available?

Do you want to run on a laptop with hybrid graphics, and have suspend/resume work reliably every time?

Do you have a small /boot/ partition, because you expected initrd.img to be 50MB or less?

Do you want to support Secure Boot?

If you want to achieve all these things at once, it'll take you a few tries to get it right :)


Catching up in this space requires a significant, sustained investment over multiple years and competent software engineers. It's not a simple thing for a hardware company to suddenly become competitive with Nvidia in AI/ML.

Instead, they've been going after the CPU market (and winning), HPC/scientific computing (high FP64 performance, in contrast to Nvidia's focus on low-precision ML compute), and integrating Xilinx.

However, I agree that it's an unfortunate situation, and I hope AMD becomes competitive in this space soon.


I think their hardware is comparable with Nvidia's. The problem is the software is awful by comparison. It's hard to run any of the AI workloads with AMD, and even when you can, the performance is poor. The software investment just hasn't been made. Until then they are not even in the game.


AMD has an entire line specifically for AI/ML... https://www.amd.com/en/graphics/instinct-server-accelerators

They just don't have those capabilities in their consumer GPUs.

AMD is also nearly 50/50 with nVidia for supercomputers in the Top500 (and dominates at the top)

It took a few years after completing the massive purchase of Xilinx to get going, but they are picking up speed rapidly.


AMD should do a high-memory-density variant of the 7900 XT/XTX with an MCD that has 4 PHYs instead of 2. You could get the 7900 XTX to 48GB with no clamshell and 96GB with clamshell, which is getting into H100 territory.
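
Rough arithmetic for that, as a REPL sketch (the six-MCD count, 32-bit PHYs, and 2 GB of GDDR6 per PHY are my assumptions about Navi 31, not anything AMD has announced):

    >>> mcds = 6                # assumed MCD count on Navi 31
    >>> phys_per_mcd = 4        # the proposed 4 PHYs instead of 2
    >>> gb_per_phy = 2          # assumed one 16Gb GDDR6 device per 32-bit PHY
    >>> mcds * phys_per_mcd * gb_per_phy
    48
    >>> mcds * phys_per_mcd * gb_per_phy * 2   # clamshell: two devices per PHY
    96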


Look on the bright side instead: they are catching up, and open source devs are starting to get serious about AMD because of its price/performance.

I believe it's a highly undervalued stock right now.


AMD gave up on the market for parallel compute entirely



