
Note that while UMA is great in the sense that it allows large models to be run at all, M-series chips aren't faster[1] when the model fits in VRAM.

  1: screenshot from[2]: https://www.igorslab.de/wp-content/uploads/2023/06/Apple-M2-ULtra-SoC-Geekbench-5-OpenCL-Compute.jpg
  2: https://wccftech.com/apple-m2-ultra-soc-isnt-faster-than-amd-intel-last-year-desktop-cpus-50-slower-than-nvidia-rtx-4080/



The problem is you're limited to 24 GB of VRAM unless you pay through the nose for datacenter GPUs, whereas you can get an M-series chip with 128 GB or 192 GB of unified memory.
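As a rough illustration of that trade-off, here is a back-of-envelope check of whether quantized weights fit in a given memory budget. The parameter counts, bits per weight, and overhead factor below are assumptions for the sketch, not measurements:

  # Back-of-envelope check: do quantized model weights fit in memory?
  # Assumed values (parameter counts, bits per weight, overhead factor)
  # are illustrative, not measurements.

  def weight_footprint_gb(params_billions: float, bits_per_weight: float,
                          overhead: float = 1.2) -> float:
      """Approximate weight size in GB, with a fudge factor for
      KV cache / activations / runtime overhead."""
      weight_bytes = params_billions * 1e9 * bits_per_weight / 8
      return weight_bytes * overhead / 1e9

  budgets = {"24 GB consumer GPU": 24, "192 GB unified memory": 192}
  models = {"70B @ 4-bit": (70, 4), "120B (Goliath) @ 4-bit": (120, 4)}

  for name, (params, bits) in models.items():
      need = weight_footprint_gb(params, bits)
      for hw, budget in budgets.items():
          verdict = "fits" if need <= budget else "does NOT fit"
          print(f"{name}: ~{need:.0f} GB needed -> {verdict} in {hw}")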


Sure! The point is that they're not magic chips a million times faster that will make NVIDIA go bankrupt tomorrow. That's all. A laptop with up to 128 GB of "VRAM" is a great option, absolutely no doubt about that.


They are powerful, and I agree with you: it's nice to be able to run Goliath locally, but it's a lot slower than my 4070.


That's OpenCL compute; LLMs ideally should be hitting the neural accelerator, not running on general-purpose GPU compute shaders.
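For what it's worth, the common local runtimes on Apple Silicon (llama.cpp and its bindings) run inference on the GPU via Metal rather than OpenCL. A minimal sketch, assuming llama-cpp-python is installed with Metal support; the model path is a placeholder for any local GGUF file:

  # Minimal sketch: offload all layers to the GPU backend (Metal on
  # Apple Silicon, CUDA on an NVIDIA card) via llama-cpp-python.
  from llama_cpp import Llama

  llm = Llama(
      model_path="./models/llama-2-70b.Q4_K_M.gguf",  # hypothetical path
      n_gpu_layers=-1,  # -1 = offload every layer to the GPU backend
      n_ctx=4096,       # context window
  )

  out = llm("Explain unified memory in one sentence.", max_tokens=64)
  print(out["choices"][0]["text"])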

