Hacker News

Modern GPUs are extremely programmable, but neural networks exercise very little of that flexibility. NN inference is, for the most part, just a huge amount of matrix multiplication.
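A minimal sketch of that point (hypothetical layer sizes, plain NumPy): a forward pass through a small MLP is essentially two matrix multiplies plus a cheap elementwise activation, with the matmuls dominating the arithmetic.

```python
import numpy as np

def mlp_forward(x, w1, b1, w2, b2):
    # Layer 1: matmul + bias, then ReLU (the only non-matmul work).
    h = np.maximum(x @ w1 + b1, 0.0)
    # Layer 2: another matmul + bias.
    return h @ w2 + b2

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 64))        # batch of 8 inputs
w1 = rng.standard_normal((64, 128))
b1 = np.zeros(128)
w2 = rng.standard_normal((128, 10))
b2 = np.zeros(10)

y = mlp_forward(x, w1, b1, w2, b2)      # shape (8, 10)
```

Counting the work: layer 1 is roughly 8×64×128 multiply-adds and layer 2 is 8×128×10, while the ReLU and bias adds are only a few thousand elementwise ops, so fixed-function matmul hardware covers nearly all of it.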

Especially for mobile applications (where most Arm customers are), you pay extra energy for all that pipeline flexibility that goes unused. A dedicated chip saves a lot of power.


