This isn't really true.

You can do a lot with weights and no training data - for example you can pull the end layer off it and use it as a feature extractor.
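To make the feature extractor point concrete, here's a toy sketch in plain Python. The weights are made-up numbers standing in for a released checkpoint, and `dense`, `full_model` and `extract_features` are just names I picked for illustration. A real model would be far bigger, but the idea is the same: run the network without its last layer and keep the activations.

```python
import math

# Hypothetical "released" weights for a tiny 2-layer network (values are made up).
W_HIDDEN = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]  # 2 hidden units x 3 inputs
W_OUT = [[1.0, -1.0]]                            # final layer: 1 output x 2 hidden units


def dense(weights, x, activation=None):
    """Apply one fully connected layer (no bias) to the input vector x."""
    out = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return [activation(v) for v in out] if activation else out


def full_model(x):
    """The model as shipped: hidden layer followed by the final layer."""
    return dense(W_OUT, dense(W_HIDDEN, x, math.tanh))


def extract_features(x):
    """Same weights minus the final layer: the hidden activations are the features."""
    return dense(W_HIDDEN, x, math.tanh)


features = extract_features([1.0, 2.0, 3.0])  # a 2-element feature vector
```

No training data is touched anywhere here, only the weights.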

And to modify it for Japanese speakers you'd fine-tune the existing model on additional data. If you wanted to modify the model itself, you can (sometimes, depending on what you want to do) take an existing architecture, remove layers, add replacements and fine-tune.
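Roughly what that looks like, again as a toy sketch in plain Python rather than anything production-grade. The frozen weights and the "additional data" are made-up numbers, and `fine_tune` is a hypothetical helper doing plain gradient descent on a replacement final layer while the existing layers stay untouched:

```python
import math

# Hypothetical frozen weights from an existing model (values are made up).
W_FROZEN = [[0.5, -0.2], [0.3, 0.8]]


def features(x):
    """Frozen layers: never updated during fine-tuning."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_FROZEN]


def predict(head, x):
    """Replacement final layer (a single linear unit) on top of the frozen features."""
    return sum(h * f for h, f in zip(head, features(x)))


def mean_squared_error(head, data):
    return sum((predict(head, x) - y) ** 2 for x, y in data) / len(data)


def fine_tune(head, data, lr=0.1, steps=200):
    """Gradient descent on the replacement head only; W_FROZEN is left alone."""
    head = list(head)
    for _ in range(steps):
        grad = [0.0] * len(head)
        for x, y in data:
            f = features(x)
            err = predict(head, x) - y
            for i, fi in enumerate(f):
                grad[i] += 2 * err * fi / len(data)
        head = [h - lr * g for h, g in zip(head, grad)]
    return head


# Made-up "additional data" for the new task.
new_data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 0.0)]
new_head = [0.0, 0.0]  # freshly initialised replacement for the removed layer
tuned_head = fine_tune(new_head, new_data)
```

Only the small new dataset is needed; the original training data never comes into it.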

I don't quite know what the right analogy for trained weights is. In many ways they are more valuable than the training data, because the compute needed to generate them is significant. In other ways it is nice to be able to inspect the data.

> The source code must be the preferred form in which a programmer would modify the program.

As a machine learning programmer I'd much prefer the weights to the raw data. It's not realistic for me to use that training data in any way with any compute I have access to.