
I wonder to what extent it is even possible to train cutting-edge models on this hardware. I am sure there are still tasks where a simple feed-forward network or RNN is enough.

However, you can just barely finetune any of the base pretrained transformer models (e.g. BERT base or XLM-R base) with 8GB VRAM, and you need 12GB or 16GB VRAM to finetune larger models. Given that M1 Macs are currently limited to 16GB of shared RAM, I think training competitive models is going to be severely constrained by memory.

I guess the real fun only starts when Apple releases higher-end machines with 32 or 64GB of RAM.
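To make the memory point concrete, here is a rough sketch of the kind of finetuning I mean, using Hugging Face transformers on top of TensorFlow. The model name, toy data and hyperparameters are just placeholders; the batch size and sequence length are basically the knobs that decide whether this fits in ~8GB:

    # Sketch only: fine-tune a base-sized transformer with settings that
    # (just about) fit in roughly 8GB of GPU memory. Model name, toy data
    # and hyperparameters are placeholders, not a recipe.
    import tensorflow as tf
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    model_name = "bert-base-uncased"   # or "xlm-roberta-base"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = TFAutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    texts = ["great screen", "battery died after a day"]   # stand-in data
    labels = tf.constant([1, 0])
    enc = tokenizer(texts, padding=True, truncation=True, max_length=128,
                    return_tensors="tf")

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    # A small batch and short sequences are what keep peak memory near 8GB;
    # "large" model variants typically need 12-16GB even with these settings.
    model.fit(dict(enc), labels, epochs=1, batch_size=8)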




You are probably right, but I guess there are plenty of interesting use cases that fit in << 8GB VRAM :D


The framework is supposed to work on AMD64 too, so Pros with 32GB+ RAM.


Well unless they also support acceleration on AMD GPUs, this is not so interesting. Training on x86_64 CPU cores or integrated Intel GPUs is really slow compared to training on modern NVIDIA GPUs with Tensor Cores (or AMD GPUs with ROCm, if you can get the ROCm stack running without crashing with obscure bugs).
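For what it's worth, a quick sketch of what the Tensor Core speedup looks like from the TensorFlow side: most of it comes from float16 compute, which you have to opt into via the mixed-precision policy (assuming TF >= 2.4 here; older releases exposed this under an experimental module):

    import tensorflow as tf

    # See which accelerators TensorFlow can actually use on this machine;
    # an empty list means you are falling back to (slow) CPU training.
    print(tf.config.list_physical_devices("GPU"))

    # On NVIDIA GPUs with Tensor Cores, float16 matmuls are where the big
    # speedup comes from (assumes TF >= 2.4; older versions expose this
    # under tf.keras.mixed_precision.experimental instead).
    tf.keras.mixed_precision.set_global_policy("mixed_float16")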

Also see the benchmarks in the marketing PR:

https://blog.tensorflow.org/2020/11/accelerating-tensorflow-...

The M1 blows away Intel CPUs with integrated GPUs (and modern NVIDIA GPUs will probably blow away the M1 results, otherwise they'd show the competition ;)).


AMD GPUs are supported.


Interesting! Do you have a source?


In computer vision you can go far with 8GB VRAM.



