I hope that one day we have a tool that can convert any proprietary binary to source code with a single click. It would be so much fun to have an "open source" version of all games. Currently, there are projects like https://github.com/Try/OpenGothic and https://github.com/SFTtech/openage, but these require years of community effort.
Current SOTA models are really bad at reverse engineering, and I don't really expect this to improve through training on open data.
There are just not a lot of high-quality examples on the internet, and more importantly, the people shipping proprietary binaries are doing their best to make reverse engineering them actively more difficult.
It is quite easy to produce high-quality synthetic data to train models for reverse engineering. Just take any open source project, compile it, and ask the model to produce the source code (or something equivalent) given the binary.
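To make the idea concrete, here is a rough sketch of building (binary, source) training pairs. To keep it self-contained, Python bytecode stands in for a native binary; a real pipeline would presumably run an actual compiler over C or C++ projects at several optimization levels. The names `make_pair` and `build_dataset` and the record layout are hypothetical, chosen only for illustration.

```python
import dis
import io
import marshal


def make_pair(source: str, name: str = "<sample>") -> dict:
    """Compile one source snippet and pair the compiled form with its source.

    Assumption: Python bytecode is a stand-in for a native binary here.
    """
    code = compile(source, name, "exec")

    # Human-readable disassembly, roughly what a disassembler would show.
    buf = io.StringIO()
    dis.dis(code, file=buf)

    return {
        "binary": marshal.dumps(code),   # model input: raw compiled bytes
        "disassembly": buf.getvalue(),   # optional auxiliary model input
        "target": source,                # training target: original source
    }


def build_dataset(sources):
    """Turn an iterable of source snippets into a list of training records."""
    return [make_pair(src, f"<sample_{i}>") for i, src in enumerate(sources)]


samples = [
    "def add(a, b):\n    return a + b\n",
    "x = [i * i for i in range(10)]\n",
]
dataset = build_dataset(samples)
```

Since the compile step is deterministic and cheap, the ground truth comes for free. The "or something equivalent" part can be checked by recompiling the model's output and comparing behavior rather than text.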
Much of reverse engineering involves analyzing existing code, and this is not a secret. There are forums where people discuss and share their reverse engineering findings. Without this, creating a nearly 100% compatible clone, such as one that can use the original game files, would be nearly impossible.