
> Adjustable vocals: Users now have control over a song’s vocal levels. They can sing with the original artist vocals, take the lead, or mix it up on millions of songs in the Apple Music catalog.

I think this only requires pre-making two audio files per track, and simultaneously streaming these.

Real-time lyrics, Background vocals, and Duet view are all nice features too, but the hardest part processing-wise is analysing how loud you're singing into the microphone. It's just karaoke with a good UI.
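To be concrete (this is a hypothetical sketch, not Apple's actual pipeline): with pre-separated stems, "adjustable vocals" is just a weighted sum of two synced audio buffers.

```python
import numpy as np

def mix_stems(instrumental, vocals, vocal_level=1.0):
    """Mix a pre-separated vocal stem over the instrumental.

    vocal_level: 0.0 mutes the vocals, 1.0 restores the original mix.
    Both inputs are float sample arrays of identical shape,
    already time-aligned (i.e. streamed in sync).
    """
    return instrumental + vocal_level * vocals

# e.g. a slider at 30% vocals:
# output = mix_stems(inst_buf, vocal_buf, vocal_level=0.3)
```

The client-side math really is that trivial; the cost is everywhere else (storage, bandwidth, keeping two streams in sync), which is what the replies below get into.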



> I think this only requires pre-making two audio files per track, and simultaneously streaming these.

That’s the understatement of the century. “this only requires […] simultaneously streaming these”


Don't movies already have stereo sound? Is going from two audio channels to four really that difficult?


Well, it is indeed tricky but not requires-dedicated-chip tricky :-)


They say it supports millions of songs so I doubt that’s how they are doing this.

They are likely using a sophisticated ML version of what old karaoke machines did, and removing the vocals in real-time.
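For reference, the trick old karaoke machines used is center-channel cancellation: vocals are typically mixed identically into both stereo channels, so subtracting one channel from the other cancels them (along with anything else panned dead-center, like bass and kick, which is why it sounds bad). A minimal sketch:

```python
import numpy as np

def remove_center(left, right):
    """Classic analog karaoke trick: cancel center-panned content.

    Anything mixed identically into both channels (usually the lead
    vocal) disappears in the difference signal. Side effects: bass,
    kick, and any other center-panned instruments vanish too, and the
    result is mono.
    """
    return left - right
```

An ML approach instead estimates a vocal mask per time-frequency bin, so it can suppress only the voice and at an adjustable level rather than all-or-nothing.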


Even if they're using ML, I don't imagine they'd do it on-device. They don't even do voice recognition on-device by default.


> [Apple says it is] relying on an on-device machine learning algorithm that processes the music in real-time. The tech builds on Apple’s noise-cancellation expertise and other developments it’s made for FaceTime, the company said.

Source: https://techcrunch.com/2022/12/06/apple-music-is-getting-a-n...


I stand corrected as well!

Wonder why they take this approach though, as it is clearly over-engineering (if I correctly understand that the goal is just to make vocals volume adjustable).


> Wonder why they take this approach though, as it is clearly over-engineering (if I correctly understand that the goal is just to make vocals volume adjustable).

Depends what the other non-functional requirements were. i.e. if the NFRs were as follows:

* Cannot increase bandwidth / mobile data usage.

* Cannot impact music quality / bitrate.

* Has to work offline.

* Cannot increase on-device storage.

* Has to be responsive.

Then two audio streams might not work.

Another advantage of doing it on-device is that it doesn't change any of the backend architecture either. It might be a lot of change to a lot of systems for a feature which only adds a small amount of functionality - i.e. architecting your entire backend and streaming pipeline around separating audio tracks might not be the right focus.


Maybe it's licensing? I can imagine copyright holders being squeamish about Apple processing, permanently storing, and serving heavily altered versions of their music. The difference is silly and pedantic, but by processing it in real-time during playback, one might argue it's just a filter effect like EQ.


Ah, I stand corrected! I wonder why they took that approach...


Not sure - although I would imagine that it would effectively double the storage and bandwidth/data requirements for Apple Music in general if they had to send two files with equal bitrate.



