That's pretty much what SLIDE [0] does. The motivation there was getting CPU training to performance parity with GPUs, but presumably the same approach could apply to running inference on models too large to fit in consumer GPU memory.

[0] https://github.com/RUSH-LAB/SLIDE


