This is a great piece and deserves to be revisited from time to time - it shows how much journalism is actually just PR for for-profit companies.
The most insane thing about this is that none of the patents specify enough detail to actually implement MP3 - the official specifications are behind lock and key, and can no longer be purchased for any price. Unless they leak from someone who had a license, the public will never get to see them.
For all intents and purposes, the LAME source code is now the official specification for MP3.
IIRC, encoding is deliberately not specified exactly, leaving implementers some degree of freedom in which acoustic models to use, what audio data to throw away, etc. This is one area where LAME lagged in its early days (not today).
How a decoder should work is exactly defined (a quick look on Wikipedia mentions ISO/IEC 11172-3, not sure if that's a free/open standard or not). So one should expect 1:1 correspondence between encoded stream and decoded sound (minus rounding errors?), regardless of decoder used.
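To make the "minus rounding errors" part concrete, here is a minimal sketch of how one might compare the PCM output of two decoders for the same stream. It assumes both outputs are already available as equal-length lists of samples normalised to [-1.0, 1.0]; the function names and the default tolerance are illustrative assumptions, not values taken from the standard.

```python
import math


def compare_pcm(a, b):
    """Return (max_abs_diff, rms_diff) for two equal-length sample lists.

    Samples are assumed to be floats normalised to [-1.0, 1.0].
    """
    if len(a) != len(b):
        raise ValueError("decoded outputs differ in length")
    if not a:
        return 0.0, 0.0
    diffs = [x - y for x, y in zip(a, b)]
    max_abs = max(abs(d) for d in diffs)
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return max_abs, rms


def outputs_match(a, b, tolerance=2 ** -14):
    """True if no sample differs by more than `tolerance`.

    The default tolerance is a hypothetical choice for illustration only.
    """
    max_abs, _ = compare_pcm(a, b)
    return max_abs <= tolerance


if __name__ == "__main__":
    reference = [0.0, 0.5, -0.25]
    other = [0.0, 0.5 + 2 ** -16, -0.25]
    print(compare_pcm(reference, other))
    print(outputs_match(reference, other))
```

If two decoders only differ in rounding, the maximum difference should stay tiny and roughly constant across files; a real decoding bug would usually show up as a much larger difference.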
For already-encoded MP3s in people's archives this doesn't matter anyway, as the loss in "lossy compression" was already eaten, and no amount of re-coding or bitrates can recover that. Transcoding = more loss.
Not to mention the hassles involved in transcoding, likely for little gain if any.
Extremely minor nitpick: In this case the Fraunhofer Society is a non-profit, although it earns 70% of its income through contracts. That's why I can’t be that angry about the MP3 patent monopoly: the income financed other applied science research of the Fraunhofer Society, and financed research seems better to me than non-financed (and thus non-existent) research.
I'm with you on the closed specifications: those should definitely be open by now.
Is that not the same logical argument in favor of taxes though, eg property tax to fund schools? I still don't see software patents as a net positive for society, even if some companies are attempting to do some kind of societal good with them, and that logic gets really tenuous IMO with things like this.
I did an experiment recently where I managed to read all the English-language investigative journalism worldwide, produced in the preceding week and indexed via a search engine, in just a few hours.
Even though there are literally over a billion people who can write English across the entire planet, and probably a few hundred thousand journalists active at any given time, very little is actually published in any given week.
I remember fondly that I did a master's thesis back in 1997 implementing an MP3 decoder on a 40 MIPS DSP that was also network connected. It might have been the first networked audio streamer in the world :) But I had to go with their super-unoptimized C-based specification. I did have the MP3 standard to read as a reference as well, and I'm sure it's spread all over the internet and its corners.
Patents only cover the non obvious details to do some very specific thing. They aren’t CAD drawings with every detail specified, so you may be able to make a windshield wiper from one but it wouldn’t be a drop in replacement for a specific vehicle’s windshield wiper, in fact multiple different designs can all fall under the same patent.
The existence of multiple MP3 related patents makes it clear it’s not simply implementing a single idea. So you may be able to make a similar compression algorithm, but it wouldn’t be MP3.
I mostly use OPUS nowadays. Just download a podcast/audiobook/whatever, convert it to 32 kbps OPUS with the voip profile (even that is more than needed! Lower is totally okay: 24 kbps can still sound subjectively lossless for human speech, and 16 kbps will probably still be good yet audibly different) and fit tons of content to listen to offline on whatever humble storage your portable device has left.
OPUS really saves the day, as smartphone storage today is usually filled with the high-quality pictures and videos you have taken and with apps getting bigger and bigger every year.
I wish Apple would introduce first-class OPUS support in the M4B container in iTunes, but as long as it doesn't (and I doubt it ever will), third-party audiobook players do a great job.
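For a rough sense of the storage win, here is a back-of-the-envelope sketch. It assumes a constant bitrate, 1 kbps = 1000 bits per second, 1 GB = 10^9 bytes, and it ignores container overhead, so treat the numbers as approximate. The function name is made up for the example.

```python
def hours_per_gigabyte(bitrate_kbps, gigabytes=1.0):
    """Approximate hours of audio that fit in `gigabytes` at a constant bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    seconds = gigabytes * 1_000_000_000 / bytes_per_second
    return seconds / 3600


if __name__ == "__main__":
    # 16/24/32 kbps are the speech bitrates discussed above; 128 kbps is
    # included only as a familiar point of comparison.
    for kbps in (16, 24, 32, 128):
        print(f"{kbps:>3} kbps: {hours_per_gigabyte(kbps):6.1f} hours per GB")
```

At 32 kbps that works out to roughly 69 hours per GB, and halving the bitrate doubles it.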
On iPhone we use "MP3 Audiobook Player Pro"[1]. Despite saying "MP3" in its title it supports many formats including OPUS. They broke OPUS speed-up and hadn't fixed it for some time but have fixed it recently so it's great again.
On Android "Smart AudioBook Player"[2] does a perfect job.
Not OP, but I have all mine stored on my plex server, and use Prologue to access them on iOS devices and like it a lot. https://apps.apple.com/app/id1459223267
Sadly my old "64 MB" (not really, it's 128 MB IIRC) player doesn't support OPUS. That's where I have to fall back to AAC or MP3. But it luckily supports FLAC and Vorbis (which weren't listed in the specs) so I have some FLACs and OGGs on it as well. It also lacks speed adjustment, so I had to pre-speed-up books with Audacity back in the days when the first Android devices were a luxury thing. If only it had support for OPUS, pitch-preserving speed adjustment and position persistence, it could indeed make a great audiobook/podcast listening device.
Ogg and Opus will never have much adoption on the web as long as Apple devices exist. I don't think any single tech company has managed to be as destructive to the proliferation of patent-unencumbered formats in the short history of computing.
Opus hasn't caught on for music and podcasts, but is very popular in some other areas. YouTube uses Opus heavily, for example, and it's the default audio codec for WebM video files. It's also very popular for real-time audio like in video conferencing.
This needs to be more widely shouted: Opus derives from CELT, which was primarily focused on low-latency audio encoding (which MP3 really sucks at), and it's the de facto standard for VoIP. For professional applications like wireless, though, companies will use their own proprietary codecs.
> AAC and other newer audio codecs can produce better quality than MP3, but the difference is only significant at low bitrates
AFAIR this is not correct (it's based on the Opus marketing page). In the last decade, interest in transparent bitrates (192+ kbps) has faded, so it's hard to find listening tests.
A few blind tests on the Hydrogenaudio page¹ report MP3 to be inferior to AAC at 192 kbps. One study reports² that the quality is the same (although if I understand correctly, looking at one analysis, the noise/distortion results are inconsistent, e.g. WAV being in some cases noisier/more distorted than MP3³). The same study references another one that finds MP3 to be transparent at bitrates >= 256 kbps.
Something I remember is that MP3 spends a disproportionate amount of storage in order to encode frequencies > 16 kHz. This may explain why in some tests, it performs worse in mid-high bitrates (160/192 kbps), although I don't know the technical details.
I was listening to some of the LAME quality test samples[0] and I found that the hi-hat pre-echo test sounds worse at 160kbps than at 128kbps (LAME 3.100, -q 0). I thought it might be because of the different low-pass filter frequency, but even with manually specified matching lowpass (16805Hz) for the 160kbps version the pre-echo sounds worse.
The 128kbps version doesn't sound much like the original, but it's still a pleasant sound with hardly any pre-echo. The 160kbps version, even with the extra lowpass, has obvious and annoying pre-echo on the first hit.
Using the Foobar2000 ABX comparator I was able to 5/5 ABX:
wav vs. b128
wav vs. b160
b128 vs. b160 (this one was 7/8 ABX)
wav vs. pa128
wav vs. pa160
b128 vs. pa128 (more difficult)
pa128 vs. pa160 (more difficult)
I did not successfully ABX b160 vs. pc160.
I didn't do ABC/HR so I can't say for sure what my ranking would be, but my guess is:
Basically, MP3 is crap at 320kbps, and AAC is somewhat better.
The problem is that both formats roll off high frequencies. It's fine if you're listening in a car, through cheap speakers, on a noisy plane, aren't sensitive to higher frequencies, etc. But if you're in a quiet room with good speakers / headphones, and you have good ears, the high-frequency roll-off is noticeable.
This appears to be visual comparisons of graphs of various automated measurements. Lossy codecs are designed to exploit weaknesses of human hearing, so the only way to test if they're working as intended is with listening tests. This is most conveniently done with ABX testing software. If that high frequency roll off is really noticeable you should be able to produce a statistically significant result.
Here's an online ABX test of 320kbps MP3 using a modern encoder:
Very few people are capable of hearing the difference, and even then only in difficult-to-encode samples. I don't think it's reasonable to call something of this quality "crap".
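On the "statistically significant" point: an ABX score can be turned into a p-value with a one-sided binomial test, i.e. the chance of doing at least that well by pure guessing (50% per trial). A minimal sketch, using the 5/5 and 7/8 scores reported upthread as inputs; the function name is made up for the example and it needs Python 3.8+ for `math.comb`.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` of `trials` right by guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print(abx_p_value(5, 5))  # 0.03125
    print(abx_p_value(7, 8))  # 0.03515625
```

Both scores land just under the usual 0.05 threshold, which is why more trials per comparison (say 16) are generally recommended before drawing conclusions.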
"Investigations" without blind tests are useless. Try a blind test (the website of the sibling post is very interesting) and you'll be surprised at the bitrate at which you'll fail to notice differences.
Most encoders apply a low pass filter to cut off frequencies above 16kHz. Some do it above 18kHz but most choose 16kHz as default setting. This is common in all lossy codecs - mp3, aac etc.
AFAIR, LAME applies a 16 kHz cutoff at 128 kbps, and 17 kHz at 160 kbps. I don't have data about the others, but AFAIK a similar approach should apply.
I'm not able to hear above 16 kHz. I've only once tried an experiment with another person, and they were definitely able to discern the difference.
>Something I remember is that MP3 spends a disproportionate amount of storage in order to encode frequencies > 16 kHz. This may explain why in some tests, it performs worse in mid-high bitrates (160/192 kbps), although I don't know the technical details.
I wonder if just straight out filtering them would result in better 160Kbps MP3s, it's not like most people can hear them...
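This is easy enough to try with LAME itself, which has a `--lowpass` option (frequency in kHz) alongside `-b` (bitrate in kbps) and `-q` (quality, 0 being the slowest/best). A minimal sketch that only builds the command line; the helper name and the file names are placeholders, and actually running it assumes a `lame` binary on your PATH.

```python
def lame_command(src, dst, bitrate_kbps=160, lowpass_khz=16.0, quality=0):
    """Build a LAME command line that encodes `src` with an explicit lowpass."""
    return [
        "lame",
        "-b", str(bitrate_kbps),         # constant bitrate in kbps
        "-q", str(quality),              # algorithm quality, 0 = best/slowest
        "--lowpass", str(lowpass_khz),   # lowpass cutoff in kHz
        src,
        dst,
    ]


if __name__ == "__main__":
    # Placeholder file names; pass the list to subprocess.run(...) to encode.
    print(" ".join(lame_command("input.wav", "output-160.mp3")))
```

Encoding the same sample with and without the explicit lowpass and then ABX-ing the two files would be the way to find out whether it actually helps.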
One of the amusing outtakes from the previous comments:
I somewhat recently found out my phones support FLAC natively, so I don't need to transcode. Not that I've transcoded anything in years, besides the obvious CD ripping decades ago; there has been no need for some years, because everything just plays and storage is not a concern with both TransFla^W sorry, microSD cards and modern phones with gigabytes of flash.
I only recently started switching to FLAC, because I always felt space-constrained ;) At some point, I’ll have to redownload all those Bandcamp .ogg Albums.
I'd highly recommend bandcamp-collection-downloader[1]. As a heavy buyer of music from Bandcamp, I find it great to just run a command[2] and have everything download, while it also keeps track of previous downloads so it doesn't re-download them. Personally, I have them moved to a temp folder and then process/tag them with Picard[3]. You could probably automate that step with Beets[4], but I've not delved into it yet.
Thanks, I actually got the idea from recently finding bandcamp-collection-downloader, as doing it by myself just does not sound like fun, at all :D
The tagging I’ll keep manual (I have the Jellyfin library location shared via SMB and added the share to MediaMonkey on my Desktop PC) as I have specific needs for genre, by far my most important field besides the basics (Artist/Album/Year/Song, and those are all tagged properly by Bandcamp).
Picard is not automatic, I'd describe it mostly as "assisted".
There's a two-step process of track clustering and release metadata searching. Each step is automatic, but each transition to the next step is manual so that you can review and adjust (either by using another matching method, fixing the lookup by pointing it to the correct release, or manually editing).
There's a "gold disc" visual feedback for when it detected perfect matches.
You can also decide if some/all tags should be updated from the global database or kept from the original file thus using the database as enrichment.
You know FLAC is an archive format right? If you are planning to transcode the music in future into new files then it's ok. If you're planning to simply listen to it, then mp3 v0 sounds identical and is much smaller.
It's good to treat your music collection as an archive - it's not unheard of that artists pull their albums from Bandcamp or other services, with no possibility for you to redownload it if you ever decide that the format you downloaded it in is no longer up for the task. And especially with commercial releases you will never know if buying the release again in some other place will give you the same experience - it might be a bad remaster and getting hold of the original version might be impossible legally.
Well, while I listen to FLAC natively (pretty sure Jellyfin knows that all my clients are capable of FLAC playback), the whole point is that I won’t be dependent on Bandcamp existing in 40 years to still enjoy my music in the LLM generated meta-crypto-space that only supports Matroobis files ;)
With the quality of my speakers (the fanciest being a Logitech 5.1 system), I could probably go even lower without hearing a difference ;)
Well, it doesn't but I'll not reiterate why, for the third time.
If your system is not transparent enough, you can't hear what FLAC or a CD offers, and that's OK. MP3 v0 sounds pretty impressive given the album is not brickwalled.
I also use FLAC exclusively now. Space is not a problem any more and your archive can be used in many situations - different players, parties, as samples etc. It's not worth thinking about it.
Furthermore, people who share FLAC are way more pedantic about their archive in general - like checking for re-encoded low-quality MP3s, file hashes etc.
And even that is limited to iOS; macOS Safari does support Vorbis (with some limitations). It is so widely supported that you would consider a JS decoder as the iOS polyfill.
Linux users are self-selecting. Wikipedia is an encyclopedia for everyone, not tech enthusiasts who decide to pick software that doesn't support an extremely common, non-proprietary (!!!!!!) file format.