I have access to an Nvidia A100, but as a layman, what specs does the rest of the system need to use it for real work? I assume there needs to be at least as much RAM as VRAM, and maybe a few terabytes of disk space. Does anyone have experience with this?
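A rough way to sanity-check that RAM-vs-VRAM heuristic once the box is assembled, assuming PyTorch (with CUDA) and psutil are installed, would be something like this sketch:

```python
# Rough sketch: compare system RAM to GPU VRAM.
# Assumes PyTorch with CUDA support and psutil are installed.
import psutil
import torch

ram_gb = psutil.virtual_memory().total / 1e9
print(f"System RAM: {ram_gb:.1f} GB")

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        vram_gb = props.total_memory / 1e9
        print(f"GPU {i}: {props.name}, {vram_gb:.1f} GB VRAM")
        if ram_gb < vram_gb:
            print("Warning: less system RAM than VRAM; "
                  "loading checkpoints or offloading may be painful.")
else:
    print("No CUDA device visible.")
```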
A whole different department made those decisions, and I don't think they had any idea of what is actually needed. They bought the GPU because training your own model is the trendy thing, and everything else had to fit in whatever budget was left (the GPU ate most of it).
I'm just trying to scrape together something usable from the hacked-together setup I now have to deal with.
I'm running models locally on my 3090 and it's fast enough, although building a vector database, for example, can take a while. I can run LoRA training, but I haven't done anything meaningful with it so far. I chose the 3090 because of the 4090's power-cable issue (also, no NVLink on the 4090, although I'm not sure that matters), but it's debatable whether my fears are justified. I need to leave the GPU running while I'm away, and I just don't feel comfortable doing that with a 4090. I'd rather take the lower performance.
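For context, the vector-database step I mean is roughly the following. This is a minimal sketch assuming sentence-transformers and faiss are installed; the model name and corpus here are placeholders:

```python
# Minimal sketch of the slow step: embedding a corpus on the GPU and
# building a FAISS index. Model name and corpus are placeholders.
import faiss
from sentence_transformers import SentenceTransformer

docs = ["first document...", "second document..."]  # your corpus here

model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")
emb = model.encode(docs, batch_size=256, convert_to_numpy=True,
                   show_progress_bar=True)  # this is the part that takes a while

faiss.normalize_L2(emb)                  # normalize so inner product == cosine
index = faiss.IndexFlatIP(emb.shape[1])  # flat inner-product index
index.add(emb)

# Query it:
q = model.encode(["some query"], convert_to_numpy=True)
faiss.normalize_L2(q)
scores, ids = index.search(q, 2)
print(ids, scores)
```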
One caveat though: my Asus B650E-F is barely supported by the current Ubuntu kernel (e.g. my microphone doesn't work; before upgrading the kernel and BIOS I didn't even have a LAN connection...), so expect some problems if you want to use a relatively new gaming setup for Linux.
Windows generally works, but there may be a small performance hit. IMO Linux is much easier to get working, judging by all the GitHub issue threads I see about SD/LLaMA stuff on Windows - but I don't use Windows, so I don't have personal experience.
The 4090 with 24GB is 1800USD; the Ada A6000 with 48GB is like 8000USD, and idk where you'd even buy one. So if you want to run games and models locally, the 4090 is honestly the best option.
EDIT: I forgot - there's a rumored 4090 Ti with 48GB of VRAM; no idea if that's worth waiting for.
The A6000 is actually the old generation, Ampere. The new Ada-generation one is just called RTX 6000. Many places still seem to sell the A6000 (Ampere) for the same price as the RTX 6000 (Ada), even though the new one is twice as fast.
Seems you can get used RTX A6000s for around $3000 on eBay.
You're kidding? So they called it the RTX 6000, then called it the RTX A6000 for Ampere, then went back to RTX 6000 for Ada?
Why do they do this? Sometimes consumer products are versioned weirdly to mislead customers (like Intel CPUs), but that wouldn't even make sense here, since these are enterprise cards?
Actually, the first one is called the Quadro RTX 6000, while the Ada one is just RTX 6000, without the "Quadro" in front. Not that that makes the naming much more sensible.
According to GPT-4 the next generation one will be called Galactic Unicorn RTX 6000 :D
The 4090 is amazing, but it's a very large card. The 3090 is "good enough" for ML - same 24GB of VRAM - and you can pick one up used for half the price of a new 4090. That's what I did.
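The back-of-envelope math for why 24GB is "good enough": model weights take roughly parameter count times bytes per parameter. A rough sketch (weights only; KV cache, activations, and framework overhead add a few GB on top):

```python
# Back-of-envelope VRAM needed for model weights only.
# Ignores KV cache, activations, and framework overhead.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_gb(n_params_billion: float, dtype: str) -> float:
    return n_params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e9

for n in (7, 13, 33):
    for dtype in ("fp16", "int8", "int4"):
        print(f"{n}B @ {dtype}: ~{weight_gb(n, dtype):.0f} GB")

# 13B @ fp16 ~ 26 GB (doesn't fit in 24 GB); 13B @ int8 ~ 13 GB (fits);
# 33B @ int4 ~ 16 GB (fits, with room left for the KV cache).
```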
WSL on Windows is apparently decent, and so is native PyTorch, but dual-booting Windows/Ubuntu is still probably the best setup.
Getting CUDA on openSUSE was super easy. The Nvidia blob drivers are easy to install, and CUDA just needs another download and some copy-paste. Even Unreal Editor was easier to install than on Windows.
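Whichever distro you pick, a quick smoke test that the driver + CUDA stack actually works (assuming PyTorch is installed) is:

```python
# Quick smoke test: confirm the driver and CUDA stack are usable from PyTorch.
import torch

print("CUDA available:", torch.cuda.is_available())
print("Devices visible:", torch.cuda.device_count())
if torch.cuda.is_available():
    print("Device 0:", torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x  # a tiny matmul actually exercises the GPU
    torch.cuda.synchronize()
    print("Matmul OK:", y.shape)
```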
Save some money and go for a 3090: same VRAM, and the speed difference probably isn't worth the 4090's premium. Then upgrade when the rumored 5090 generational leap happens.
Unless you have two graphics cards (well, you can use an integrated GPU) and need to run both OSes at once, I think this will be less convenient than a dual-boot setup for most people.
You can’t switch which GPU Linux is using without restarting the session
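The display session is stuck, but compute jobs can still be pinned to either card per process via CUDA_VISIBLE_DEVICES. A sketch (the index "1" here is a hypothetical second card; the variable must be set before CUDA initializes):

```python
# Pin this process's compute to the second GPU without touching the
# display session. Must be set before torch/CUDA initializes.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # "1" = second card (hypothetical index)

import torch
# Inside this process, the visible card is re-indexed as device 0.
print(torch.cuda.get_device_name(0))
```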
Get a 3090 or 4090. Forget about AMD.