
I believe YouTube still uses 40 mel-scale bands per frame as feature data, while Whisper uses 80. That gives finer spectral detail but is naturally more computationally intensive to process, though modern hardware allows for that.
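To make the 40 vs 80 difference concrete, here's a rough NumPy sketch that builds triangular mel filterbanks of both sizes and applies them to one frame. The 16 kHz sample rate and 400-sample window match Whisper's published front end; the HTK-style mel formula, the random test frame, and the function names are my own assumptions for illustration, not what either system actually ships.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel formula (an assumption; other mel variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft=400, sample_rate=16000):
    """Return an (n_mels, n_fft // 2 + 1) matrix of triangular filters."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # n_mels triangles need n_mels + 2 edges, evenly spaced on the mel scale.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, center, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_frame(power_spectrum, bank):
    # Floor the energies so the log never sees zero.
    return np.log10(np.maximum(bank @ power_spectrum, 1e-10))

# One synthetic 25 ms frame at 16 kHz (random noise, purely illustrative).
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)
power = np.abs(np.fft.rfft(frame * np.hanning(400))) ** 2

features_40 = log_mel_frame(power, mel_filterbank(40))
features_80 = log_mel_frame(power, mel_filterbank(80))
print(features_40.shape, features_80.shape)  # (40,) (80,)
```

Same audio frame, twice as many numbers per frame going into the model, which is where the extra compute comes from.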

