
The idea of connecting CV to audio via spectrograms predates Jeremy Howard's course by quite a bit. That's not really the interesting part here, though. What is interesting is that a simple extension of an image generation pipeline produces such impressive results for generative audio. It really emphasizes how useful the idea behind Stable Diffusion is.

edit: added a bit more to the thought


