> Maybe we'll start seeing licenses with a section saying "not for use as training data for commercial models."
Considering that a single example has an extremely small impact on training a model, and that the model is trained on an ungodly number of examples, I wonder whether the effort of forbidding its use has any real benefit.
Yes, of course it does: if every user opted out, the model would not work as well as it does, and GitHub would not be able to profit off the work of others to the degree it does (or will). Just because they are taking code on a massive scale does not mean the outcome is inevitable. Don't get it twisted: Copilot only works because of the code human beings have written.