
From reading reviews (I don't have either yet): the Nvidia actually has unified memory, while on AMD you have to specify the allocation split. Nvidia may have some form of GPU partitioning so you can run multiple smaller models, but no one has got it working yet. The Ryzen is very different from AMD's pro GPUs, so its software support won't benefit from work done there, whereas the Nvidia is the same as its pro GPUs. You can play games on the Ryzen.


But on the Ryzen the VRAM allocation can be entirely dynamic. I saw a review showing excellent full GPU usage during inference with the BIOS VRAM allocation set to the minimum level, using a very large model. So it's not as simple as you describe (I used to think this was the case too).

Beyond that, it seems like the 395 in practice smashes the DGX Spark in inference speed for most models. I haven't seen NVFP4 comparisons yet and would be very interested to see them.


Yes, you can set it, but in the BIOS, not dynamically as you need it.

I don't think there are any models supporting NVFP4 yet, but we will probably start seeing them.


That's what I'm saying: in the review video I saw, they allocated as little memory as possible to the GPU in the BIOS, then used some kind of kernel-level dynamic control.



