VBR: is it actually as bad as they say?
I recently wrote up a lengthy post about the MP3 encoding settings of the top one hundred podcasts on the iTunes charts. One of my suggestions was very controversial: folks on Reddit disagreed about whether it’s okay to use VBR encoding in podcasts.
I was surprised by the vehemence with which folks insisted that VBR is bad. “It shouldn’t be used!” “Stay away from VBR.” There was no shortage of folks saying to avoid VBR, but little in the way of substance behind those claims.
I set out to collect all of the arguments against VBR that I could find, and researched each one to see whether the claims behind it could be verified.
First, though, some background.
What is VBR?
To save you the click, I’ll give some quick background. In an MP3, you have a bitrate. The bitrate is the number of bits it takes to store one second of audio. A 128kbps MP3 file takes 128 kilobits to store one second of audio. If you have a 128kbps MP3 file that’s ten seconds long, it’ll take 1280 kilobits to store the file. Simple.
That’s how CBR, or Constant BitRate, works. The whole file has one bitrate. The downside to this is that not all audio is created equal. Some audio requires fewer bits to store (say, a moment of silence). Some audio requires more. Having one bitrate means that you’re potentially wasting bits storing audio fidelity that you don’t need. That’s where VBR, or Variable BitRate, comes in.
VBR allows chunks of the file to be encoded at different bitrates. That second of near-silence might squish down to 40kbps, while a second of music might jump up to 160kbps. Done correctly, this can yield very substantial savings in size.
What are the arguments against VBR?
Rather than beat around the bush, let’s jump in and look at the arguments against VBR and test the validity of each.
VBR breaks seeking in lots of apps.
This is true, and I specifically call this out in my post:
With a CBR file, skipping forward or backward is easy because you can calculate exactly where to jump to. With VBR, skipping ahead ten seconds might mean skipping up to 1280 kilobits — but that might be too much if the quality is lowered within those ten seconds.
Essentially, you can’t know where to jump to in the file to start playing at a specific timecode, because instead of being a simple multiplication, you need to know the bitrates of all of the audio leading up to that timecode.
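To make the difference concrete, here is a minimal Python sketch. The function names and the list of (duration, bitrate) segments are illustrative assumptions, not how any particular player works; a real MP3 changes bitrate frame by frame rather than in multi-second segments.

```python
def cbr_byte_offset(seconds, bitrate_kbps):
    """With a constant bitrate, the byte offset is a single multiplication."""
    return int(seconds * bitrate_kbps * 1000 / 8)


def vbr_byte_offset(seconds, segments):
    """With a variable bitrate, every segment before the target must be summed.

    `segments` is a hypothetical list of (duration_seconds, bitrate_kbps)
    tuples describing the file from the start.
    """
    offset = 0.0
    remaining = seconds
    for duration, bitrate_kbps in segments:
        span = min(remaining, duration)
        offset += span * bitrate_kbps * 1000 / 8
        remaining -= span
        if remaining <= 0:
            break
    return int(offset)


# Hypothetical file: five seconds at 64kbps, then five seconds at 192kbps.
segments = [(5, 64), (5, 192)]
naive_guess = cbr_byte_offset(5, 128)        # assumes 128kbps throughout
true_offset = vbr_byte_offset(5, segments)   # accounts for the quieter start
```

In this made-up file, the naive guess for the five-second mark is twice the true offset, so a player relying on it would land well past the point the listener asked for.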
There are ways to avoid this. Long, long ago, folks created a number of standards that allow metadata to be embedded in the MP3, allowing decoders to figure out where to seek to. I could write more about this, but it’s a moot point because virtually nobody implements them.
It’s worth noting that the amount by which the timecode is off grows as you get further along in the file. At the beginning of the audio file, it’s unlikely that the quality was dropped by very much at all, and the difference might only be a few milliseconds. After a few minutes, though, that will grow into seconds. After an hour or more, it can grow to a minute or more.
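A simplified model shows why the error scales with position. The sketch below assumes the player computes a byte offset from one assumed bitrate while the audio actually averages a slightly lower one; the 128kbps and 126kbps figures are hypothetical, and real drift depends on where in the file the bitrate dips.

```python
def seek_drift_seconds(target_seconds, assumed_kbps, actual_average_kbps):
    """Return how far past the target a naive seek lands, in seconds.

    The player jumps to the byte offset implied by `assumed_kbps`. If the
    audio up to that point really averaged `actual_average_kbps`, that
    offset corresponds to a later moment in the audio.
    """
    byte_offset = target_seconds * assumed_kbps * 1000 / 8
    landed_seconds = byte_offset / (actual_average_kbps * 1000 / 8)
    return landed_seconds - target_seconds


# Hypothetical numbers: player assumes 128kbps, audio averages 126kbps.
drift_after_one_minute = seek_drift_seconds(60, 128, 126)    # under a second
drift_after_one_hour = seek_drift_seconds(3600, 128, 126)    # close to a minute
```

Under this model the drift is proportional to the target time, which is also why short relative skips stay accurate even when absolute seeks late in a long episode do not.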
Some podcasts are very short. Consider The Memory Palace, whose episodes generally run less than 15 minutes. I would be more than surprised to hear that seeking in a VBR-encoded T.M.P. episode was off by more than a few handfuls of seconds by the end of the file. (I would measure this, but it’s impossible to do correctly without access to the raw source audio.)
Other podcasts don’t really require a robust seeking feature. ASMR podcasts, podcasts with little or no dialog, and podcasts where the hosts chatter idly while, say, playing video games don’t need the ability to accurately seek to a particular timecode. This is a trade-off that a non-zero number of podcasts are willing to make.
Relative seeking is also largely unaffected by VBR encoding. The podcast My Brother My Brother and Me uses VBR encoding, and it’s possible to skip ahead by thirty seconds and back by ten seconds with very good accuracy. There’s a good reason for this technically: just like seeking from the beginning of a file, it’s unlikely that the quality dips by very much during the small chunk of time you’re skipping ahead by. Skipping ahead by thirty seconds might mean actually skipping ahead by, say, thirty one seconds. The amount of inaccuracy is determined by the amount of audio you’re skipping past, which with relative seeking is usually quite small.
VBR doesn’t actually make files smaller.
This is half true. VBR will produce files of almost equal size to CBR if the average bitrate of the VBR file is the same as the fixed bitrate of the CBR file. VBR will also produce files equal in size to a CBR file if it never changes the bitrate (i.e., the encoder never chooses to lower the quality, such as with random noise).
Excluding the case where the file contains only random noise (why are you publishing that in your podcast anyway?) the difference in size has the obvious caveat that the VBR file will have an equal or greater audio quality overall than the CBR file.
Consider this: you have a ten second file. The first half is near-silence, and the second half is high-fidelity music. If we encode this as CBR at 128kbps, it’ll be 1280 kilobits. If we encode it as VBR, and the encoder hypothetically encodes the first half at 64kbps and the second half at 192kbps, the file size will still be 1280 kilobits (5 × 64 + 5 × 192), and the average bitrate is still 128kbps. Comparing the quality, though, we’ll find the VBR file sounds much better, since the silence is using only the bits that it needs and more bits were devoted to the music.
By tuning your encoder’s settings, you can effectively lower the average bitrate of your VBR-encoded file such that the quality roughly matches the equivalent CBR-encoded file. In theory, this will lead to an overall reduction in file size. If you choose VBR settings without knowing what you’re doing, though, you can easily end up negating any file size benefit you would derive from using VBR to begin with.
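The arithmetic from the ten-second example can be checked with a few lines of Python. This is only a sketch: the segment lists are hypothetical, the "tuned" bitrates are numbers I made up for illustration, and the code says nothing about perceived quality, which is what actually determines whether two encodes are equivalent.

```python
def size_kilobits(segments):
    """Total size of a list of (duration_seconds, bitrate_kbps) segments."""
    return sum(duration * bitrate_kbps for duration, bitrate_kbps in segments)


def average_bitrate_kbps(segments):
    """Average bitrate across the whole file."""
    total_seconds = sum(duration for duration, _ in segments)
    return size_kilobits(segments) / total_seconds


cbr = [(10, 128)]                # ten seconds at a constant 128kbps
vbr_same_average = [(5, 64), (5, 192)]   # the hypothetical split from the example
vbr_tuned_down = [(5, 40), (5, 128)]     # made-up settings aiming at CBR-like quality

sizes = {
    "cbr": size_kilobits(cbr),
    "vbr_same_average": size_kilobits(vbr_same_average),
    "vbr_tuned_down": size_kilobits(vbr_tuned_down),
}
```

The first two entries come out identical, which is the "VBR doesn’t make files smaller" case; only the tuned-down variant is smaller, and only because its average bitrate is lower.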
VBR files don’t show the correct duration.
By default, no, a VBR file’s duration will be calculated by its byte length, resulting in an overestimate (for the same reason that seeking doesn’t work). This is easily remedied, though: simply specifying the audio duration in the ID3 tags using a TLEN frame will fix the duration. Some decoders don’t read the TLEN frame correctly, but they’re few and far between and are almost never used with the apps and devices someone might consume a podcast from.
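For the curious, here is a rough Python sketch of both halves of that paragraph: a naive duration estimate from byte length, and the bytes of a TLEN frame as I understand the ID3v2.3 layout (a four-character frame ID, a four-byte big-endian size, two flag bytes, then a text-encoding byte followed by the length in milliseconds as a numeric string). The function names are mine, and in practice you would let a tagging tool or library write the tag rather than assembling bytes by hand.

```python
import struct


def naive_duration_seconds(file_bytes, assumed_kbps):
    """Estimate duration from byte length, as a player without TLEN might."""
    return file_bytes * 8 / (assumed_kbps * 1000)


def build_tlen_frame(duration_ms):
    """Build the bytes of an ID3v2.3 TLEN frame (sketch, not a full tag).

    This is only the frame; a complete ID3v2 tag also needs a tag header,
    which is omitted here.
    """
    if duration_ms < 0:
        raise ValueError("duration must not be negative")
    # Encoding byte 0x00 means ISO-8859-1; the value is milliseconds as text.
    payload = b"\x00" + str(duration_ms).encode("latin-1")
    # Frame ID, 4-byte big-endian payload size, two zeroed flag bytes.
    header = b"TLEN" + struct.pack(">I", len(payload)) + b"\x00\x00"
    return header + payload


# Hypothetical ten-second VBR file of 105,000 bytes whose first frame is 64kbps.
estimate = naive_duration_seconds(105_000, 64)   # longer than the real 10 seconds
frame = build_tlen_frame(10_000)                 # states the real duration outright
```

A player that honors the frame can skip the estimate entirely, which is why a missing TLEN frame shows up as a wrong duration rather than as broken audio.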
Encoders like Adobe Audition generate broken VBR-encoded files.
This is something I found mentioned online in a number of places, tracing back to a post on Adobe’s forums. Without reading the details, it’s easy to create a cloud of FUD around this issue. It turns out that this is directly related to the last claim about duration: Audition simply wasn’t (allegedly) adding the TLEN data.
Update: I’d like to note that I haven’t been able to reproduce this issue with Adobe Audition. It may be that an issue existed in a previous version, but that no longer appears to be the case. I’ve updated this section to more explicitly state that I do not believe there is an issue with Adobe Audition. Thanks to @audiblychuck on Twitter for reaching out.
I’d make the argument that this is the responsibility of the podcaster, not a problem for the listener. It’s easy to add ID3 tags, and Audition isn’t the only horse in this race. Behind the scenes, Audition uses the Fraunhofer MP3 encoder. The post on Adobe’s forums also refers to Audition CS6, released in 2012; I would be unsurprised if a more recent version fixed the issue.
Even if Adobe didn’t fix this, numerous posts around the internet recommend tools (MP3val, MP3Diag, etc.) that detect and fix this problem. FFmpeg and LAME both correctly add the appropriate ID3 tag, meaning most other audio editing software will work correctly by default.
Almost all modern MP3 decoders do not require a TLEN ID3 tag to determine the correct duration of a VBR MP3 file.
VBR doesn’t work with certain devices.
There is anecdotal evidence to support this. I found a Hacker News comment thread about device support. Here is the root comment of the discussion, talking about an experience from over a decade ago:
As it turns out, not everyone is listening using a modern device. When we tried VBR a significant number of people could not listen because their MP3 playing hardware/software of choice did not support VBR files properly. They didn’t realize this was the problem. They just complained that the file was corrupted while it was working fine for everyone else.
One commenter had a problem with their EigerMan F20:
My favorite bug about this was on an _ancient_ MP3 player I had (an EigerMan F20), which supported VBR MP3s…incompletely. It didn’t support decoding regions with certain bitrates, so it would just silently skip them, leading to extreme confusion on my part.
Another commenter had better luck with his Nomad Jukebox 3:
I’m pretty certain my Nomad Jukebox 3 supported VBRs fine, and that’s coming up on 14 years old now.
A user on hydrogenaudio had bad luck with a DVD player in 2006:
My DVD player (Samsung HD-860) doesn’t play mp3 vbr files. It’s about 2 years old and even comes with a HDMI output.
Another commenter in the same thread had trouble with his car:
My friend purchased a new 2008 Pontiac G5 (this is basically the Grand Am but they have since renamed it to G5) and it came with a factory installed mp3-CD compatible deck. The unit will play VBR files just fine but we have discovered that all frames in the mp3 must be encoded at 128kbps or higher.
I won’t keep copying and pasting posts about cars and MP3 players from over a decade ago. Most of the devices that folks are mentioning wouldn’t even be able to hold a full podcast episode from 2017!
My research across the rest of the web yielded similar results. I couldn’t find a single report of a device made in the last ten years that failed to play VBR files, and this does not surprise me. An uncited claim on Wikipedia states:
As of December 2006, devices that support only CBR encoded files are largely obsolete, as the vast majority of modern portable music devices and software support VBR encoded files.
Without any evidence to the contrary, I don’t believe device compatibility is a valid argument against VBR.
If you have experienced VBR compatibility issues with a device, I’d love to hear about it. Please reach out!
Firefox doesn’t support VBR.
This is no longer true. Firefox does support VBR files; I tested this myself on both macOS and Windows 10. Firefox uses the host platform’s audio decoder to play MP3 rather than bundling its own MP3 decoder. On Windows, files allegedly used to stop playing mid-stream because of the timecode issues discussed above. That no longer seems to be the case: the file played just fine, with no truncation and no seek issues.
The professionals say not to use VBR.
I was referred to podcast authorities and other industry professionals for their advice on why to avoid VBR. I was interested in the arguments that these folks put forth.
Update: At the time of writing, a bug in my analysis’s code incorrectly identified 15 podcasts in the iTunes top 100 podcasts as using VBR. In truth, only one uses VBR encoding. This number was cited in my correspondence with Rob Walch.
The first person I was told to get in contact with is Rob Walch, who’s the current VP of podcaster relations at Libsyn. I sent him an email, and he responded with a link to a blog post. Here’s a snippet from that post:
VBR is an old tech / hack that was created to make MP3 music files smaller and was popular back in the heyday of file sharing. Today there is no need for it — available bandwidth and storage today is much different than 15 and 20 years ago. But more importantly ISO standards for MP3 do not require players support it.
According to the standard (ISO/IEC 11172–3:1993) Section 2.4.2.3
“In order to provide the smallest possible delay and complexity, the decoder is not required to support a continuously variable bitrate when in layer I or II. Layer III supports variable bitrate by switching the bitrate index. However, in free format, fixed bitrate is required.”
and
“For Layer II, not all combinations of total bitrate and mode are allowed.”
Hence, most Layer II coders would not have been written with VBR in mind, and Layer II VBR is a hack. It works for limited cases. Getting it to work to the same extent as MP3-style VBR will be a major hack.
In short VBR’s day in the light and mass use is way way behind us — back in the late 1990’s and pre-podcasting.
All of these arguments are the same as we’ve covered above, with a handful of exceptions. For one, Rob claims that bandwidth and storage is cheap. This is true, but podcast listenership has also exploded in recent years (even since his post in 2014). Internationally, especially in emerging markets, bandwidth is expensive for the listener, which can be a barrier to increasing listenership outside the US.
He also cites the MPEG ISO spec, but the quotes he extracts are misinterpreted. MP3 is MPEG Audio Layer III (the cited standard, ISO/IEC 11172-3, is the MPEG-1 audio spec, which makes it MPEG-1 Audio Layer III), so the quote “Layer III supports variable bitrate by switching the bitrate index,” really means “MP3 supports variable bitrate.” To my understanding, you cannot be MP3-compatible and not support VBR (per the spec). The second quote about “Layer II” refers to MPEG Audio Layer II, which is a different codec from MP3 entirely and is irrelevant to the discussion.
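“Switching the bitrate index” is easier to picture with the frame header in hand. Every MP3 frame begins with a four-byte header that carries a 4-bit bitrate index, so each frame can announce its own bitrate. The Python sketch below reads that index for MPEG-1 Layer III frames only; the function name is my own, the two sample headers are hand-built for illustration, and it deliberately rejects free format and anything that isn’t MPEG-1 Layer III.

```python
# Bitrates in kbps for MPEG-1 Layer III, indexed by the 4-bit bitrate index.
# Index 0 is "free format" and index 15 is invalid; this sketch rejects both.
BITRATES_KBPS = (None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None)
SAMPLE_RATES_HZ = (44100, 48000, 32000, None)


def parse_frame_header(header):
    """Decode bitrate, sample rate, and frame length from a 4-byte header."""
    if len(header) < 4:
        raise ValueError("a frame header is 4 bytes long")
    b0, b1, b2 = header[0], header[1], header[2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        raise ValueError("missing frame sync bits")
    version_bits = (b1 >> 3) & 0b11   # 0b11 means MPEG-1
    layer_bits = (b1 >> 1) & 0b11     # 0b01 means Layer III
    if version_bits != 0b11 or layer_bits != 0b01:
        raise ValueError("not an MPEG-1 Layer III frame")
    bitrate_kbps = BITRATES_KBPS[(b2 >> 4) & 0b1111]
    sample_rate_hz = SAMPLE_RATES_HZ[(b2 >> 2) & 0b11]
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("free-format or reserved value, unsupported here")
    padding = (b2 >> 1) & 0b1
    # Frame length in bytes for MPEG-1 Layer III.
    frame_bytes = 144 * bitrate_kbps * 1000 // sample_rate_hz + padding
    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "frame_bytes": frame_bytes,
    }


# Two hand-built headers that differ only in the bitrate index.
quiet_frame = parse_frame_header(bytes([0xFF, 0xFB, 0x50, 0x00]))
loud_frame = parse_frame_header(bytes([0xFF, 0xFB, 0xB0, 0x00]))
```

A VBR file is simply one where consecutive frames carry different indexes, as the two sample headers do; a decoder that reads each header as it goes needs nothing extra to play it.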
I replied with these comments, asking if he had data to help substantiate these claims. The response I got was a bit…salty.
Matt,
Honestly — the article title said it all — The first and last word on VBR.
VBR is dead — anyone pushing for it is just fighting windmills.
CBR = good
VBR = bad
It really is that simple — don’t try to make more out of this — VBR is NOT fully support by players and standards.
If you are trying to push for VBR — then eventually you will look back on this email and wish you just listened to me. :)
and followed quickly by
Hi Matt,
If you were thinking of using VBR or are using VBR and after you read my article you are not convinced to change — you need to really really read this:
There’s a bitter irony in his reply, which I’ll let you find as you read Matthew Inman’s fine strip about the backfire effect. I pressed him again to provide details, and received another chilly response:
Good luck on your quest.
I consider VBR a dead issue and roll eyes when it comes up. Which is the reason for the post I made.
It seems every couple of years it raises its ugly head.
Not sure what 15% you saw — the last time I checked top shows it was 0%
http://podcast411.libsyn.com/will-increasing-your-bit-rate-equal-more-listeners
See this post.
At this point — it is my last reply on VBR.
Too much to do to waste time on this — the post I made gives you all the info you need if you look at it objectively.
I really recommend you move on to CBR and you will not have any issues.
The linked post only repeats Rob’s mantra: “VBR = bad.” Without pointing to objective facts to back up the claims that he makes, I can’t say that Rob’s opinions on the matter hold much water.
Todd Cochrane was another name mentioned to me. Todd is the CEO of RawVoice, which is the company that runs Blubrry. I was unable to find any public comment by Todd on the matter at all, even in his hour-long Podcast Engineering School interview (yes, I listened to the whole thing). I reached out to him over Twitter, but at the time of writing have not heard back.
Update: Mr. Cochrane responded to my tweet shortly after this was posted.
I replied with a request for additional information as I did with Mr. Walch, and will update this post if I receive a response.
Marco Arment was the last that I was referred to. He co-founded Tumblr and created the Overcast podcast aggregator. He has a very direct post from last year about why you should avoid VBR with podcasts:
Without accurate seeking, streaming and web audio players don’t work properly, including share-at-timestamp links that are becoming key drivers of the sharing and spreading of podcasts.
That’s the crux of the piece, and this is the first argument addressed here. Marco applauds VBR and its benefits in his post, but seeking is a dealbreaker for him. If seeking is not an issue for you, I would go so far as to call Marco’s post a mixed endorsement of VBR.
The major hosting platforms say not to.
Libsyn has this to say about VBR:
Your editing software may offer a checkbox for VBR or Variable Bit Rate. Do not check this box, your bit rate needs to be constant. The reason behind this is variable bit rate will change how different players and systems scan the file, and they often will either be incompatible with VBR completely, or at the very least will get the duration of the file incorrect causing the file to end up cut off towards the end.
These are the same arguments that we’ve already seen. The last point about the file getting cut off is likely a reference to the Firefox issue, but that’s moot.
Blubrry has this to say:
As noted above, never use a Variable Bit Rate (VBR) for podcasting. Reasons to avoid a VBR include:
- Some hardware and software may not play VBR media correctly, if at all
- Streamed playback (playing within a web browser) complications. Firefox browser can have issues playing VBR mp3 files.
- Many players display the duration (total time in seconds) wrong for VBR media. Sometimes this appears as a constantly changing value during the loading of the media file in the media player.
- Older versions of Flash media players (commonly found on older Web sites) cannot play VBR media files
Again, these arguments are almost all mentioned above. I hadn’t heard the argument about Flash before, and searching the internet produced little useful information. In researching it, I read an anecdote that VBR support was added around Flash Player 7 (2004). I also read that Flash CS3 (2007) had some trouble with VBR MP3 files. Both of these technologies are over a decade old, and I strongly doubt anything that relies on them is in wide use.
Conclusion
Many of the long-repeated arguments against VBR are obsolete. It’s clear, through both its adoption and active use, that VBR isn’t the half-baked technology that it was two decades ago. If you understand the tradeoffs around VBR and are okay with them, there is no compelling reason to avoid it.
That said, VBR does come with costs. For one, seeking may be unacceptably inaccurate. It also can be more difficult on the podcaster’s end to produce a correctly encoded VBR file. It is important to understand these tradeoffs before considering whether or not to use VBR.
If VBR adoption increases, Apple may even invest the engineering time to add proper support for VBR timecode indexes. It may be the case that they already have, with the forthcoming release of iOS 11 featuring an updated Podcasts app.
If you have more information that I’ve missed, please don’t hesitate to reach out. I’m open to amending and updating this post to reflect objective facts, and will not hesitate to change my conclusion if there is a compelling reason to do so.
If you found this post interesting or useful, you can find more like it on the Pinecast blog. I try to post content interesting and relevant to podcasters and software nerds of a variety of skill levels.