
... but not in deep learning, or am I missing something important here?


Yes, absolutely in deep learning. Custom fused CUDA kernels everywhere.


Yep. MoE, FlashAttention, or sparse retrieval architectures, for example.



