It doesn't even need to be old movies. Certain types of video content in the US are legally required to have subtitles (e.g. a lot of YouTube content). You could programmatically download them and use them as your training set. And, since training a model is a transformative use, you can train your models even on copyrighted works freely.
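For what it's worth, here is a rough sketch of the "turn subtitles into a training set" step, assuming you already have the subtitle text in SubRip (.srt) form. The `Cue` class and `parse_srt` function are names I made up for illustration, and the download step is deliberately left out since it depends on where the video comes from.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # cue start time in seconds
    end: float    # cue end time in seconds
    text: str     # subtitle text, joined onto one line


# Matches a timing line such as "00:00:01,000 --> 00:00:03,500".
# A "." before the milliseconds is accepted too, as some files use it.
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text):
    """Parse SubRip-style text into a list of Cue objects (hypothetical helper)."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match is None:
                continue  # skip the numeric index line before the timing line
            groups = match.groups()
            text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
            if text:  # drop cues that carry no text
                cues.append(Cue(_seconds(*groups[:4]), _seconds(*groups[4:]), text))
            break
    return cues


sample = (
    "1\n00:00:01,000 --> 00:00:03,500\nHello there.\n\n"
    "2\n00:00:04,000 --> 00:00:06,250\nGeneral Kenobi,\nyou are a bold one.\n"
)
for cue in parse_srt(sample):
    print(cue.start, cue.end, cue.text)
```

Each cue gives you a start time, an end time, and a transcript, so in principle you could slice the audio between `start` and `end` and pair it with `text` as one training example. Caption timings are often loose, so some alignment cleanup would probably be needed in practice.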
Much of YouTube's content has auto-generated subtitles, i.e. Google runs its speech-recognition software on the audio stream and uses the output to caption the video. If you use that as your training set, you're effectively training on the output of an AI. That's kind of a clever way to get information from Google into your open-source library, but it will necessarily be lower-fidelity than just using the Google API directly.
In the US, if it's ever been played out on broadcast TV then it must have closed captions.
This is enforced by the FCC [0], and as more and more "internet" content gets consumed I imagine the same regulations will eventually be extended to it, at which point you've got a fantastic training set.