
Question: since you're now doing diffusion, couldn't you also train something akin to an "upscaler" to improve the overall quality of the output? That seems to be a big complaint. It feels like it should be possible to train an audio upscaling model by feeding it lower-quality versions of songs alongside the high-quality FLAC, so it learns how to improve audio via diffusion upscaling.
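To make the data side of that idea concrete, here is a rough sketch of how (degraded, target) pairs might be built. It is only an illustration: `make_training_pair` and `factor` are made-up names, the block-averaging degradation is a crude stand-in for whatever lossy codec or resampling you'd really use, and a synthetic tone stands in for decoded FLAC samples.

```python
import numpy as np


def make_training_pair(hq, factor=4):
    """Return (degraded, target) waveforms of equal length.

    Hypothetical helper: the degradation here is an assumption,
    not anything described by the model's authors.
    """
    n = (len(hq) // factor) * factor  # trim so the length divides evenly
    target = hq[:n]
    # Crude low-pass + decimate: average each block of `factor` samples.
    low = target.reshape(-1, factor).mean(axis=1)
    # Restore the original length by repeating each averaged sample.
    degraded = np.repeat(low, factor)
    return degraded, target


# Stand-in for decoded FLAC audio: one second of a 440 Hz tone at 44.1 kHz.
sr = 44100
t = np.arange(sr) / sr
hq = np.sin(2 * np.pi * 440 * t).astype(np.float32)

degraded, target = make_training_pair(hq, factor=4)
```

The model would then be trained to map `degraded` back to `target`; the diffusion part itself is not shown here.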


This can definitely be done. There are approaches that turn the decoder part of the autoencoder into another diffusion model. The drawback is that it's much more computationally expensive. We think there's still a lot of room for better quality on the AE side and can't wait to show our improvements.



