
I don't know if this extension fixes it, but I've found learning a language through Netflix difficult because the subtitles and dialog don't match, for either the dubbed or the original audio.



I find YouTube to be vastly better for language learning. Good subtitles are surprisingly easy to come by, the variety of content is much greater, and you can also download the transcripts for offline study. And, perhaps even better yet, the videos tend to be short. Repetition is super important for rapid progress, and it's much easier for me to watch a 10 minute video two or three times than it is something that's 30 minutes or longer.

If you're willing to shell out some money, YouTube + LingQ (which has a plugin for automatically ripping audio+transcripts into lessons) is so effective it's almost like cheating.


YouTube appears to be using ML STT (speech-to-text) subtitles most of the time. They tend to trip over simple things like homonyms, and at worst they throw up their hands and just skip over difficult (noisy, cross-talk, etc.) segments. I say this as someone taking a 2nd language course where we'll watch a video together and do a worksheet and class discussion in that language after. Sometimes the instructor's reaction at the end will be "wow those subtitles were bad!"

Edit for clarity: the subtitles are in the foreign language, not English so it's not an issue of machine translation.


There's a lot of inconsistency. Channels like Tom Scott have remarkably good subtitles (done by a team of humans), with a different color for each speaker, closed captions, and even colorful descriptions of sounds.

Auto generated subtitles work a lot better for Video Essays and Podcasts than for TV style content (And it's really inconsistent, Spanish <-> English works surprisingly well as long as there aren't crazy accents in the way)


[Music]


This plugin (Language Reactor) works with YouTube as well, you should try it.
+ Target and native captions in parallel.
+ Translate any word on hover (more on click).
+ Hotkeys to repeat, move to previous/next segment.
+ Pronounce single word.
- Moderate downside: it's not possible to select and translate multiple words.

I hardly use LingQ with this plugin; it's more immersive and allows you to focus on listening even more (which is good).


Really? Most transcripts and translations I encounter are so bad. Do you mean specific videos or channels?


Right. It's not that most channels have good hand-edited subtitles; it's that it's not too hard to find channels that have them. Usually once you find an interesting video with subtitles, most of the rest of the ones on that channel will also have them, so that can easily be tens or hundreds of hours of subtitled content to work with.

It gets even easier if you set up a separate account that's dedicated to your target language so the algorithm's not just feeding you endless content in your primary language.


To this day I can't understand YouTube's asinine decision to not let me deactivate autotranslation of titles, and the fact that it isn't consistently triggered makes it so much worse.

Their multilingual UX is terrible


> I don't know if this extension fixes it, but I've found learning language through Netflix difficult as the subtitles and dialog don't match, neither for dubbing or original.

100 000 times this. I don't understand why it's like that but they simply often don't match. And it's not some automated translation that went wrong: it's as if the subtitles didn't match exactly the final "script". They don't match but the subtitles are still totally correct. Sometimes the sentences are formulated differently.

It's honestly both a mystery and a gigantic WTF for me. Are these only meant for deaf people? And how did they manage to get "correct but non-matching" subtitles?


They typically hire two different companies to do the translations, and the translations are optimized for different goals. Subtitles are just meant to be easy to read. With dubs they try to make what's being said at least vaguely line up with what the actors' lips are doing in an effort to avoid the infamous "1970s kung fu movie" effect.


The English subtitles for Italian shows on Netflix are so bad. They just mistranslate words or sentences for some reason.


Afaik, they are done by two different teams.

Plus, dubbing is sorta kinda trying to match the length of time actors need to say stuff. You can't have sound going while the actors' mouths are not moving at all. Nor the opposite: the translation is done and the actors' mouths are still moving. And those movements can't look completely odd. Written subtitles have no such limitations, resulting in a different translation.


Good insight. In this case, why don't they display the dubbing text when dubbed audio is playing?


Call me strange, I actually like the effect. I feel that parallel translations can provide a richer context of what's being said than a single translation. For example, idiomatic phrases are frequently split where the text will provide the meaning of the idiom and the speech will transliterate the words. The cultural exposure feels richer to me.


Interesting. I watch everything with English subtitles/cc and get irritated when the subtitles/cc don't match what is being said. But maybe - as you said - I am a minority.


If I had to guess, it just does not exist in subtitle form; no one ever added timing information to the dubbing translation. Otherwise you would have it in the subtitle options alongside CC.

Some shows have two versions of subtitles available, one with CC and the other without. Likely, the majority of consumers are not specifically learning a language and are just watching the show, and normal subtitles are superior in that case.


I always assumed the opposite: the translated subtitles reflect exactly what the script says, while the actor may have remembered an approximation of the exact line, which is normally good enough not to bother with another take.


I would assume it’s the same as book translations: the point isn’t to translate it directly but in a way that makes sense in the target language. Although maybe a lot of subtitles for lesser TV and movies don’t have a lot of human input and the handler just goes with the software's suggestion a lot of the time.


Exactly. Japanese translations to English almost never match sentence-by-sentence, but then again a direct translation wouldn't be what an English-speaking person would say anyway.


Sometimes the subtitles leave out filler words, I presume to shorten the subtitle.


They translate it idiomatically (which sometimes means completely different sentences) and are constrained by length. They might also start from a voice translation that tries to match the lips.


Language learning via hearing comprehension of content not produced in the target language is almost impossible, because the subtitles never match.

However, there's a difference between CC (closed captions) and subtitles, with the former being the verbatim representation (including sfx, music etc.) in my experience.

I already commented [0] on this 2 years ago.

[0] https://news.ycombinator.com/item?id=27420959#27435311


Correct, and you can find CCs more likely on movies and shows that were shot in the respective language itself. For example, the stuff from https://www.netflix.com/browse/genre/100396 is much more likely to have 100% accurate captions if your goal is to learn Spanish


I found dubbed shows significantly easier to listen to than native shows. It is actually easier to learn from those than from native shows. Dubbing is almost always better pronounced and less mixed with background sounds.

Also, the claim that it is impossible to learn if you don't have perfect CC subtitles in the target language is absurd. You can use subtitles in your own language to get the meaning.


CC is also space constrained so won’t match word for word.

Subtitles and CC are not transcripts.


It still helped me because even though it didn't match 100%, it at least gave me an idea of what the original dialog was about. Then I could derive the content from it. And it was fun to figure out the differences between what was written and what was said.


Toucan is better for language learning. It replaces words in a page with your target language.

https://jointoucan.com/


Alas, Toucan is a security nightmare (or was when I tried it last year): it was sending all the URLs I was visiting to the server, even localhost stuff I was running on my machine. I checked that by using a local proxy and looking at all requests made to the Toucan servers.

While I love the idea, I don’t trust the company with my complete surfing history. What they should have done is have me opt into each website that I want to use Toucan on, and do nothing when I visit others.
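
To make the opt-in idea concrete, here is a minimal sketch of the check, written in Python only for brevity (a real extension would do this in JavaScript). The host list and the function name are hypothetical and have nothing to do with how Toucan actually works; the point is just that nothing is processed or sent anywhere unless the host was explicitly enabled.

```python
from urllib.parse import urlsplit

# Hypothetical opt-in list: hosts the user has explicitly enabled.
ALLOWED_HOSTS = {"example.com", "news.example.org"}


def should_process(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only if the URL's host was explicitly opted in."""
    # hostname is None for URLs without a network location (e.g. about:blank).
    host = (urlsplit(url).hostname or "").lower()
    # Accept the opted-in host itself or any of its subdomains.
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)
```

With a check like this in front of every request, a page on localhost or any site that was never opted in would be skipped before anything leaves the machine.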


Crazy idea, if local LLMs are good enough to translate languages reliably, an open source extension that translates every page you visit would be so incredibly useful. You don't even have to change your habits, just carry on like normal, but while becoming a language sponge.

I guess you can already do this with Chrome's built-in translation, but that built-in translation leaves a lot to be desired, doesn't it?


Hm, I really don't get Toucan.

It doesn't appear to even attempt to show the grammar of the target language, including the very basics, such as the word order.

Even for vocabulary acquisition, how is one going to learn noun classes (genders), case endings and articles, things like German separable verbs etc?


Is that the difference between subtitles and closed captioning?

If I recall, one is made from the original script, and one is typed up from the actually spoken audio.



