
It was really hard to do the first time. :) I'm honored to have been part of the first team to do any viable acoustic music recognition, in 2001 (much earlier than Shazam, a point of pride of course[0]).

You're dead on that it's pretty difficult if you can't build on others' work; we did a ton of work that, in retrospect, wasn't necessary. I liked the advanced psychoacoustic model, faithfully implemented in high-performance C straight from Zwicker (Psychoacoustics). To a first approximation: about 10 model frames/s -> PCA -> top 16 dimensions -> VQ, and the resulting bytes contain more than 50% of the entropy (!!). Shove all of those into a homegrown what-you-now-call-a vector DB, do dozens of range queries, and search for any song common to multiple results. Boom, music recognition. Understandable in retrospect, but things like that aren't Everest, they're more like... multiple unclimbed mountains.

0. And far too early to have any applications. Company existed 2000-2001 \o/
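
For readers who want to see the lookup step concretely, here is a rough sketch of the "range queries, then find a song common to multiple results" idea from the comment above. It is not the original system: the psychoacoustic model, PCA, and VQ stages are skipped entirely, the 16-dimensional vectors are synthetic random data, the "vector DB" is a brute-force list scan, and every name (`make_song`, `build_index`, `range_query`, `recognize`) and parameter (`radius`, `min_hits`) is made up for illustration.

```python
import math
import random
from collections import Counter

DIM = 16  # stand-in for the "top 16 dimensions" kept after PCA


def make_song(seed, n_frames=50):
    """Hypothetical stand-in for real features: random 16-dim vectors."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(n_frames)]


def build_index(songs):
    """Flatten {song_id: frames} into a list of (vector, song_id) entries."""
    return [(vec, song_id) for song_id, frames in songs.items() for vec in frames]


def range_query(index, query_vec, radius):
    """Brute-force range query: ids of songs with a stored vector within radius."""
    return {sid for vec, sid in index if math.dist(vec, query_vec) <= radius}


def recognize(index, query_frames, radius=1.0, min_hits=3):
    """Count how many query frames hit each song; accept the top song
    only if it shows up in at least min_hits separate range queries."""
    votes = Counter()
    for q in query_frames:
        votes.update(range_query(index, q, radius))
    if not votes:
        return None
    song_id, hits = votes.most_common(1)[0]
    return song_id if hits >= min_hits else None


names = ["song_a", "song_b", "song_c"]
songs = {name: make_song(seed) for seed, name in enumerate(names)}
index = build_index(songs)

# A noisy 12-frame excerpt of song_b, simulating a recording of it.
noise = random.Random(42)
excerpt = [[x + noise.gauss(0.0, 0.05) for x in vec] for vec in songs["song_b"][10:22]]

print(recognize(index, excerpt))
```

Requiring several independent hits on the same song is what makes the scheme tolerant of individual bad frames. A real system would presumably replace the linear scan with an index structure suited to range queries and would query quantized codes rather than raw floats.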




Looks like others, including Shazam, beat you to the punch in 2000:

https://patents.google.com/patent/US7853664B1/

https://patents.google.com/patent/US6941275

Very interesting hearing about all of the differing approaches people have taken to solving this problem! Do you have further writings on this topic?



