
Short answer: it has been a big pain in the butt. The GPU hardware is mostly really great, but the drivers/APIs were not designed for such a low-latency use case. By audio standards, there's a large latency overhead in kernel execution scheduling. I've had to do a lot of fun optimization to reduce the runtime of the kernel itself, and a lot of less-fun evil dark magic optimization to e.g. trick macOS into raising the GPU clock speed.

Long answer: I've written a fair bit about this on my devlog. You might check out these tags:

https://anukari.com/blog/devlog/tags/gpu
https://anukari.com/blog/devlog/tags/optimization






Thanks for the extra info. I read through some of your entries on GPU optimization, and it definitely seems like it's been a journey! Thanks for blazing the trail.


