Hacker News
"MP3 is dead" missed the real, much better story (2017) (marco.org)
123 points by Tomte on Oct 11, 2023 | 72 comments



This is a great piece and deserves to be revisited from time to time - it shows how much journalism is actually just PR for for-profit companies.

The most insane thing about this is that none of the patents specify enough detail to actually implement MP3 - the official specifications are under lock and key, and can no longer be purchased at any price. Unless they leak from someone who had a license, the public will never get to see them.

For all intents and purposes, the LAME source code is now the official specification for MP3.


IIRC, encoding is deliberately not specified exactly, leaving implementers some degree of freedom in what acoustic models to use, what audio data to throw away, etc. This is one area where LAME lagged in its early days (not today).

How a decoder should work is exactly defined (a quick look at Wikipedia mentions ISO/IEC 11172-3; not sure if that's a free/open standard or not). So one should expect a 1:1 correspondence between the encoded stream and the decoded sound (minus rounding errors?), regardless of the decoder used.

For already-encoded MP3s in people's archives this doesn't matter anyway, as the loss in "lossy compression" was already eaten, and no amount of re-coding or bitrates can recover that. Transcoding = more loss.

Not to mention the hassles involved in transcoding, likely for little gain if any.


Extremely minor nitpick: In this case the Fraunhofer Society is a non-profit, although it earns 70% of its income through contracts. That's why I can’t be that angry about the MP3 patent monopoly: The income financed other applied science research of the Fraunhofer Society, and financed research seems better to me than non-financed (and thus non-existent) research.

I'm with you on the closed specifications: those should definitely be open by now.


Is that not the same logical argument as the one in favor of taxes, though, e.g. property tax to fund schools? I still don't see software patents as a net positive for society, even if some companies are attempting to do some kind of societal good with them, and that logic gets really tenuous IMO with things like this.


I did an experiment recently where I managed to read all the English-language investigative journalism worldwide, produced in the preceding week and indexed via a search engine, in just a few hours.

Even though there are literally over a billion people who can write English across the entire planet, and probably a few hundred thousand journalists active at any given time, very, very little is actually published in any given week.


Do you have any information on how you did that? I am curious.

I used to read the Wikipedia current events portal to get a relatively unfiltered stream of world news: https://en.m.wikipedia.org/wiki/Portal:Current_events


I just cross-referenced the major search engines with a normal search query.


I remember fondly that I did a master's thesis back in 1997 implementing an MP3 decoder on a 40 MIPS DSP that was also network-connected. It might have been the first networked audio streamer in the world :) But I had to go with their super-unoptimized C-based specification. I did have the MP3 standard to read as a reference as well, and I'm sure it's spread all over the internet and its corners.


> none of the patents specify enough detail to actually implement MP3

So, it was never a valid patent to begin with.


No, you misunderstood what patents are for.

Patents only cover the non-obvious details needed to do some very specific thing. They aren’t CAD drawings with every detail specified, so you may be able to make a windshield wiper from one, but it wouldn’t be a drop-in replacement for a specific vehicle’s windshield wiper. In fact, multiple different designs can all fall under the same patent.

The existence of multiple MP3 related patents makes it clear it’s not simply implementing a single idea. So you may be able to make a similar compression algorithm, but it wouldn’t be MP3.


Unlike other software patents?


I've built an analysis of audio usage in podcasts. Currently 93% are using MP3 and 7% AAC. Ogg and Opus have virtually no usage at all.

https://podcast-standard.org/audio/

"Spotify for Podcasters" is the biggest hosting service pushing AAC usage, with ~30% of their inventory being AAC. https://podcast-standard.org/hosting_systems/spotify/


> Ogg and Opus have virtually no usage at all.

I mostly use OPUS nowadays. Just download a podcast/audiobook/whatever, convert it to 32 kbps OPUS with the voip profile (which is even more than needed! Even lower is totally okay: 24 kbps can still sound subjectively lossless for human speech, and 16 kbps will probably still be good yet audibly different) and fit tons of content to listen to offline on whatever humble storage your portable device has left.

OPUS really saves the day, as today smartphone storage is usually filled with the high-quality pictures and videos you have taken and with apps getting bigger and bigger every year.

I wish Apple would introduce first-class OPUS support in the M4B container in iTunes, but as long as it doesn't (and I doubt it ever will), third-party audiobook players do a great job.
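
A minimal sketch of the conversion described above, assuming ffmpeg is installed and was built with libopus. The helper name `opus_voip_command` and the file names are made up for illustration; `-c:a libopus`, `-b:a`, `-application voip` and `-vn` are regular ffmpeg options.

```python
def opus_voip_command(src, dst, kbps=32):
    """Build an ffmpeg command line that encodes `src` to speech-tuned Opus.

    Hypothetical helper: it only assembles the argument list and does not
    run ffmpeg itself.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-vn",                   # drop embedded cover art (ffmpeg sees it as a video stream)
        "-c:a", "libopus",       # encode audio with libopus
        "-b:a", f"{kbps}k",      # target bitrate, e.g. 32k, 24k or 16k
        "-application", "voip",  # libopus mode tuned for speech
        dst,
    ]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to actually encode.
    print(" ".join(opus_voip_command("episode.mp3", "episode.opus")))
```

Lowering `kbps` to 24 or 16 gives the smaller variants mentioned above.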


What audiobook player do you recommend?


On iPhone we use "MP3 Audiobook Player Pro"[1]. Despite saying "MP3" in its title it supports many formats including OPUS. They broke OPUS speed-up and hadn't fixed it for some time but have fixed it recently so it's great again.

On Android "Smart AudioBook Player"[2] does a perfect job.

[1] https://apps.apple.com/us/app/mp3-audiobook-player-pro/id889...

[2] https://play.google.com/store/apps/details?id=ak.alizandro.s...


Not OP, but I have all mine stored on my plex server, and use Prologue to access them on iOS devices and like it a lot. https://apps.apple.com/app/id1459223267


> OPUS with the voip profile and fit tons of content to listen to offline on whatever a humble storage your portable device has left.

I'm glad you're saving that old 64MB player from landfill, but the size of podcasts isn't a problem even from a budget phone's perspective.


Not really true, I've got 51.7GB/128GB of just podcasts on my phone.
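
For a sense of scale, a rough back-of-the-envelope sketch. It assumes the podcasts average 128 kbps (an assumption, the real bitrates aren't stated) and that 1 GB is 10^9 bytes; the function names are made up for illustration.

```python
def hours_of_audio(size_gb, kbps):
    """Hours of audio that fit in `size_gb` gigabytes at a constant `kbps`."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (kbps * 1000)
    return seconds / 3600


def size_in_gb(hours, kbps):
    """Gigabytes needed to store `hours` of audio at a constant `kbps`."""
    return hours * 3600 * kbps * 1000 / 8 / 1e9


if __name__ == "__main__":
    hours = hours_of_audio(51.7, 128)  # assumed average bitrate of 128 kbps
    print(f"{hours:.0f} hours at 128 kbps")
    print(f"{size_in_gb(hours, 32):.1f} GB for the same hours at 32 kbps")
```

Under those assumptions, 51.7 GB is a bit under 900 hours of audio, and the same hours at 32 kbps would need about a quarter of the space.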


Sadly my old "64 MB" (not really, it's 128 MB IIRC) player doesn't support OPUS. That's where I have to fall back to AAC or MP3. But it luckily supports FLAC and Vorbis (which weren't listed in the specs), so I have some FLACs and OGGs on it as well. It also lacks speed adjustment, so I had to pre-speed-up books with Audacity back in the days when the first Android devices were a luxury thing. If only it had support for OPUS, pitch-preserving speed adjustment and position persistence, it could indeed make a great audiobook/podcast listening device.


Ogg and Opus will never have much adoption on the web as long as Apple devices exist. I don't think any single tech company has managed to be as destructive to the proliferation of patent-unencumbered formats in the short history of computing.


That's one nice thing about pirate releases: they just use whatever technology is best, patents be damned!


About the only place I saw OGG gain any traction was in the video game space. A lot of games used it. But that is anecdotal, so not a great data point.


Opus hasn't caught on for music and podcasts, but is very popular in some other areas. YouTube uses Opus heavily, for example, and it's the default audio codec for WebM video files. It's also very popular for real-time audio like in video conferencing.


This needs to be more widely shouted: Opus derives from CELT, which was primarily focused on low-latency audio encoding (which MP3 really sucks at), and it's the de facto standard for VoIP. For professional applications like wireless, though, companies will use their own proprietary codecs.


Do the AAC podcasts provide MP3 files as a fallback for podcast players that don't support AAC? Is that even a possibility?

Thank you for the fantastic website, by the way! It's fascinating.


There are ways to do that, but I don't know how many podcatchers support them.

The most common solution I can see is to provide multiple RSS feeds, one for each audio format.
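
A minimal sketch of that approach, using only Python's standard library: the same episode list rendered once per audio format, with a matching RSS `<enclosure>` type. The function name, the example.com URLs and the file sizes are made up for illustration, and a real podcast feed needs more channel metadata than shown here.

```python
import xml.etree.ElementTree as ET

# MIME types for the enclosure "type" attribute.
MIME_TYPES = {"mp3": "audio/mpeg", "m4a": "audio/mp4"}


def build_feed(title, base_url, episodes, ext):
    """Render a minimal RSS 2.0 feed whose enclosures all use format `ext`.

    `episodes` is a list of (slug, episode_title, {ext: size_in_bytes}).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{title} ({ext.upper()})"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = f"{title}, {ext.upper()} feed"
    for slug, episode_title, sizes in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode_title
        ET.SubElement(
            item,
            "enclosure",
            url=f"{base_url}/{slug}.{ext}",
            length=str(sizes[ext]),  # file size in bytes for this format
            type=MIME_TYPES[ext],
        )
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    episodes = [("ep1", "Episode 1", {"mp3": 48_000_000, "m4a": 30_000_000})]
    for ext in MIME_TYPES:
        print(build_feed("Example Show", "https://example.com/podcast", episodes, ext))
```

Each feed would then be published at its own URL, and listeners subscribe to whichever one their player can handle.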


It's the standard for YouTube, which runs both Opus and AAC.


> AAC and other newer audio codecs can produce better quality than MP3, but the difference is only significant at low bitrates

AFAIR this is not correct (it's based on the Opus marketing page). In the last decade, interest in transparent bitrates (192+ kbps) has faded, so it's hard to find listening tests.

A few blind tests on the Hydrogenaudio page¹ report MP3 to be inferior to AAC at 192 kbps. One study² reports that the quality is the same (although, if I understand correctly, looking at one analysis, the noise/distortion results are inconsistent, e.g. WAV being in some cases noisier/more distorted than MP3³). The same study references another study that finds MP3 to be transparent at bitrates >= 256 kbps.

Something I remember is that MP3 spends a disproportionate amount of storage in order to encode frequencies > 16 kHz. This may explain why, in some tests, it performs worse at mid-high bitrates (160/192 kbps), although I don't know the technical details.

¹=https://hydrogenaud.io/index.php/board,40.0.html

²=https://www.hindawi.com/journals/ijdmb/2019/8265301

³=https://www.hindawi.com/journals/ijdmb/2019/8265301/fig2


I was listening to some of the LAME quality test samples[0] and I found that the hi-hat pre-echo test sounds worse at 160kbps than at 128kbps (LAME 3.100, -q 0). I thought it might be because of the different low-pass filter frequency, but even with manually specified matching lowpass (16805Hz) for the 160kbps version the pre-echo sounds worse.

The 128kbps version doesn't sound much like the original, but it's still a pleasant sound with hardly any pre-echo. The 160kbps version, even with the extra lowpass, has obvious and annoying pre-echo on the first hit.

[0] https://lame.sourceforge.io/quality.php


I can confirm this. I tested with 3.100.1 and the following encodes of hihat.wav:

    -q0 -b128 (b128)
    -q0 -b160 (b160)
    --preset cbr 128 (pc128)
    --preset cbr 160 (pc160)
    --preset 128 (pa128)
    --preset 160 (pa160)
Using the Foobar2000 ABX comparator I was able to 5/5 ABX:

    wav vs. b128
    wav vs. b160
    b128 vs. b160 (this one was 7/8 ABX)
    wav vs. pa128
    wav vs. pa160
    b128 vs. pa128 (more difficult)
    pa128 vs. pa160 (more difficult)
I did not successfully ABX b160 vs. pc160.

I didn't do ABC/HR so I can't say for sure what my ranking would be, but my guess is:

wav >> b128 > (b160 or pa160) > pa128
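
For reference, a quick sketch of how likely those ABX scores are from pure guessing. It assumes each trial is an independent 50/50 guess (a one-sided binomial test); the function name is made up for illustration.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of getting at least `correct` of `trials` right by coin-flip guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print(abx_p_value(5, 5))  # 1/32
    print(abx_p_value(7, 8))  # 9/256
```

Both 5/5 and 7/8 come out just under the usual 0.05 threshold, so they count as successful ABX runs, but only narrowly. More trials would make the result more convincing.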


I did an investigation into MP3 vs AAC. My tests were at 320kbps.

https://andrewrondeau.com/blog/archive/2016-07

Basically, MP3 is crap at 320kbps, and AAC is somewhat better.

The problem is that both formats roll off high frequencies. It's fine if you're listening in a car, through cheap speakers, on a noisy plane, aren't sensitive to higher frequencies, etc., etc. But if you're in a quiet room with good speakers / headphones, and you have good ears, the high-frequency roll-off is noticeable.


This appears to be visual comparisons of graphs of various automated measurements. Lossy codecs are designed to exploit weaknesses of human hearing, so the only way to test if they're working as intended is with listening tests. This is most conveniently done with ABX testing software. If that high frequency roll off is really noticeable you should be able to produce a statistically significant result.

Here's an online ABX test of 320kbps MP3 using a modern encoder:

http://abx.digitalfeed.net/lame.320.html

Very few people are capable of hearing the difference, and even then only in difficult-to-encode samples. I don't think it's reasonable to call something of this quality "crap".


> Very few people are capable of hearing the difference, ... I don't think it's reasonable to call something of this quality "crap".

But some people can tell the difference. Therefore, the roll off is crap.

Granted, if you only give yourself ~3.5 bits / sample, MP3 is quite impressive for what it does.
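
The arithmetic behind that figure, as a small sketch. It assumes CD-style source audio (44.1 kHz, stereo, 16-bit), which is my assumption here; the function names are made up for illustration.

```python
def bits_per_sample(kbps, sample_rate=44_100, channels=2):
    """Average encoded bits available per audio sample at a given bitrate."""
    return kbps * 1000 / (sample_rate * channels)


def compression_ratio(kbps, sample_rate=44_100, channels=2, bit_depth=16):
    """Uncompressed PCM bitrate divided by the encoded bitrate."""
    pcm_bps = sample_rate * channels * bit_depth
    return pcm_bps / (kbps * 1000)


if __name__ == "__main__":
    print(f"{bits_per_sample(320):.2f} bits per sample at 320 kbps")
    print(f"{compression_ratio(320):.2f}:1 versus 16-bit PCM")
```

At 320 kbps that works out to roughly 3.6 bits per sample, against 16 bits per sample in the source.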


The audible difference is mostly pre-echo and warbling noises, not low-pass filtering.


"Investigations" without blind tests are useless. Try a blind test (the website of the sibling post is very interesting) and you'll be surprised at the bitrate at which you'll fail to notice differences.


Did you consider re-testing 16bit vs 24bit?


HE-AACv2 is better than MP3 at low bitrates (spoken podcasts):

    ffmpeg -i input.wav -c:a libfdk_aac -vbr 3 -b:a 32k -profile:a aac_he_v2 output.m4a

There is also xHE-AAC https://gitlab.com/ecodis/exhale https://www.mainconcept.com/hubfs/PDFs/User%20Guides/MainCon...


Most encoders apply a low pass filter to cut off frequencies above 16kHz. Some do it above 18kHz but most choose 16kHz as default setting. This is common in all lossy codecs - mp3, aac etc.


It depends on the bitrate.

AFAIR, LAME applies a 16 kHz cutoff at 128 kbps, and 17 kHz at 160 kbps. I don't have data about the others, but AFAIK a similar approach should apply.

I'm not able to hear above 16 kHz. I've only once tried an experiment with another person, and they were definitely able to discern the difference.


> Something I remember is that MP3 spends a disproportionate amount of storage in order to encode frequencies > 16 kHz. This may explain why, in some tests, it performs worse at mid-high bitrates (160/192 kbps), although I don't know the technical details.

I wonder if just filtering them out entirely would result in better 160 kbps MP3s; it's not like most people can hear them...


LAME does by default.


MP3 above 16 kHz is the sfb21 (scale factor band 21) problem. It can’t be solved without lots of bits being thrown at it.


Thank you, that's very interesting. Are more modern formats less susceptible to this problem?


Discussed at the time with 290 comments: https://news.ycombinator.com/item?id=14347648


One of the amusing outtakes from the previous comments:

I somewhat recently found out my phones support FLAC natively, so I don't need to transcode. Not that I've transcoded anything in years, besides the obvious CD ripping decades ago, but there has been no need to for some years now, because everything just plays and storage is not a concern with both TransFla^W sorry, microSD cards and modern phones with gigabytes of flash.


I only recently started switching to FLAC, because I always felt space-constrained ;) At some point, I’ll have to redownload all those Bandcamp .ogg Albums.


I'd highly recommend bandcamp-collection-downloader[1]. As a heavy buyer of music from Bandcamp, it's great to just run a command[2] and have everything download, while also keeping track of previous downloads so it doesn't re-download them. Personally, I have them moved to a temp folder and then process/tag them with Picard[3]. You could probably automate that step with Beets[4], but I've not delved into it yet.

1. https://github.com/Ezwen/bandcamp-collection-downloader

2. https://gist.github.com/ryanwalder/d5d6d1d43b4b77fb92bde75b5... (stick it in ~/bin/bandcamp and just run `bandcamp` to download everything)

3. https://picard.musicbrainz.org/

4. https://github.com/beetbox/beets


Thanks, I actually got the idea from recently finding bandcamp-collection-downloader, as doing it by myself just does not sound like fun, at all :D

The tagging I’ll keep manual (I have the Jellyfin library location shared via SMB and added the share to MediaMonkey on my Desktop PC) as I have specific needs for genre, by far my most important field besides the basics (Artist/Album/Year/Song, and those are all tagged properly by Bandcamp).


> The tagging I’ll keep manual

Picard is not automatic, I'd describe it mostly as "assisted".

There's a two step process of track clustering and release metadata searching that are automatic, but each transition to the next step is manual so that you can review and adjust (either by using another matching method, fixing the lookup by pointing it to the correct release, or manually editing).

There's a "gold disc" visual feedback for when it detected perfect matches.

You can also decide if some/all tags should be updated from the global database or kept from the original file thus using the database as enrichment.


Yeah, but what is the advantage? Artist/Album/Track are already correct, genre I do manually. I don’t really use other tags.


You know FLAC is an archive format, right? If you are planning to transcode the music into new files in the future, then it's OK. If you're planning to simply listen to it, then MP3 V0 sounds identical and is much smaller.


It's good to treat your music collection as an archive - it's not unheard of that artists pull their albums from Bandcamp or other services, with no possibility for you to redownload it if you ever decide that the format you downloaded it in is no longer up for the task. And especially with commercial releases you will never know if buying the release again in some other place will give you the same experience - it might be a bad remaster and getting hold of the original version might be impossible legally.


Well, while I listen to FLAC natively (pretty sure Jellyfin knows that all my clients are capable of FLAC playback), the whole point is that I won’t be dependent on Bandcamp existing in 40 years to still enjoy my music in the LLM generated meta-crypto-space that only supports Matroobis files ;)

With the quality of my speakers (the fanciest being a Logitech 5.1 system), I could probably go even lower without hearing a difference ;)


Well, it doesn't but I'll not reiterate why, for the third time.

If your system is not transparent enough, you can't hear what FLAC or a CD offers, and that's OK. MP3 v0 sounds pretty impressive given the album is not brickwalled.


I also use FLAC exclusively now. Space is not a problem any more and your archive can be used in many situations - different players, parties, as samples etc. It's not worth thinking about it.

Furthermore, people who share FLAC are way more pedantic about their archive in general - like checking for re-encoded low-quality MP3s, file hashes etc.


I used to use FLAC in my phone, but now I have over two terabytes of music in my Plex server, so I just stream it as OPUS which works great for this.


Patents give patent pirates an advantage while hindering people who play by the rules.


Software patents also don't apply in some jurisdictions, like the EU.

For a long time, Linux Mint didn't bother asking whether you wanted MP3 support on install (as Ubuntu of the era did), because it was packaged in France.


That's true for any law and its scofflaws.


Anyone know of a Rust crate that encodes MP3?

I'm looking for one that isn't a wrapper around LAME, whose licensing is complicated.


It is annoying that a lot of Wikipedia pages still have sound samples in some crazy Ogg format that normal people’s browsers cannot play.


According to caniuse.com [1] all major browsers support ogg vorbis except for one: Safari.

[1] https://caniuse.com/ogg-vorbis


And even that is limited to iOS; macOS Safari does support Vorbis (with some limitations). It is so widely supported that you would consider a JS decoder as a polyfill for iOS.


Yeah, that's what Wikipedia does, using this: https://github.com/brion/ogv.js/


But why not choose a format that every device supports that is also non-proprietary?


MP3 was proprietary when they started. The Vorbis polyfill works. Probably many files do not have a higher quality copy to encode.


Wikimedia Commons recommends Ogg Opus over Vorbis these days.


Wikipedia has, rightfully imo, chosen to use an open format. My browser plays it just fine.


RTFA. MP3 has been a free format for over 6 years.


Does every desktop Linux distro include MP3 support in the base system? How many of the audio files on Wikipedia are older than 6 years old?


Linux users are self-selecting. Wikipedia is an encyclopedia for everyone, not just for tech enthusiasts who decide to pick software that doesn't support an extremely common, non-proprietary (!!!!!!) file format.


Ogg Vorbis kicks ass. The 320k Ogg files that Spotify uses are top-notch.



