As someone who needs closed captioning at this point in their life, but can still understand most things, let me tell you how bad closed captioning is. I would not rely on cc to be more than 65% accurate.
I am sorry and surprised to hear that. I would think that, especially now, it wouldn't be too hard to auto-generate a good portion of it.
My experience is very limited but sometimes people ask me to turn on subtitles and they are usually perfectly fine, even the ones that random people on the internet contribute for free - can you elaborate?
> even the ones that random people on the internet contribute for free - can you elaborate
these typically are the best available, but not always quite right.
the cc provided by WGBH is usually pretty good as well, but then they are mission-focused on delivering good closed captioning - it's just that not everyone wants to use them (or they can't schedule? not sure)
unfortunately, once you leave those two, it's a crapshoot as to whether the captions make sense or truly convey the message of the audio. some of them seem to go by what I'm guessing is the original script, which can be fairly close in meaning to what is actually said; others are just a mish-mash of misspelled words that don't necessarily convey the meaning of the audio.
the worst is live sports, which really doesn't have to be: I've been to conferences that were extremely inclusive and had real-time (remote) closed captioning - seeing how good those can be makes me shake my head in wonder when I see the typical sports closed captioning. eek.
I'm stuck in the middle where I can still hear and interpret most words, so I get a lot of incongruence when I'm relying on both the cc and the audio and they don't match.
You ever try Google Docs voice dictation? It is surprisingly good; I think they pushed an update recently. I know it isn't quite the same thing, but it seems like such a solvable problem considering they have all the bits.
YouTube CC is hit or miss but sometimes is perfect. I figure a narrow domain like live sports is eminently solvable. Maybe it is time to promote audio captchas. Hmm. This is a bummer, thank you for sharing, I would not have guessed.
I've done some commercial CC work recently for a studio and it is REALLY, REALLY hard work. Way harder than it looks to get it right.
Some pieces I worked on would have a computer-generated first pass. This would be good in places where the words were very clear and the vocabulary was regular. But TV and films can use a lot of weird domain words (see e.g. sci-fi or fantasy) that the computer can't track. And the computer has serious problems with proper nouns and names.
On top of that you have to assign each phrase to a character who might or might not be visible when they are speaking (which sucks in a whole room full of people with similar voices), and you have to time it correctly.
I got fired over an argument about comma placement.
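For anyone wondering what "assign each phrase to a character" and "time it correctly" look like as data, here is a rough sketch in Python. It is not any studio's actual tooling: the `Cue` type, the `[SPEAKER]` label convention, and the function names are all made up for illustration. The only real thing is the SubRip (.srt) cue layout of index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, then text.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption: who speaks, when it shows, what it says (hypothetical type)."""
    speaker: str
    start_ms: int
    end_ms: int
    text: str


def srt_timestamp(ms: int) -> str:
    """Format milliseconds as the HH:MM:SS,mmm form used in SubRip files."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues in start-time order, prefixing each with a speaker label.

    The "[SPEAKER]" prefix is an assumed convention, not part of the format.
    """
    blocks = []
    ordered = sorted(cues, key=lambda c: c.start_ms)
    for index, cue in enumerate(ordered, start=1):
        timing = f"{srt_timestamp(cue.start_ms)} --> {srt_timestamp(cue.end_ms)}"
        blocks.append(f"{index}\n{timing}\n[{cue.speaker}] {cue.text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_srt([
        Cue("CAPTAIN", 0, 1_500, "Report."),
        Cue("ENSIGN", 2_000, 3_000, "All clear, sir."),
    ]))
```

Every one of those fields is something a human has to get right by hand when the first pass can't: the speaker when nobody is on screen, the start and end times, and the text itself.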