Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Maybe we'll start seeing licenses with a section saying "not for use as training data for commercial models."

Considering that the impact of a single example is extremely small in training a model, and that it is trained on an ungodly amount of examples, then I wonder if the effort of forbidding its use has any real benefits.



I would change your question from “does it have any real benefits” to “does it have a practical effect on the model”

Benefits to me are clear: giving a developer choice over how their source code is used with for-profit, opaque, next generation ML models.

But yes, drop in the ocean in terms of the full data set. But that shouldn’t be an excuse to remove user choice.


Yes, of course it does, because if every user opted out then the model would not work as well as it does, and github would not be able to profit off the work of others to the degree they are (or will be). Just because they are taking code on a massive scale does not mean the outcome is inevitable: don't get it twisted, copilot only works because of the code human beings have written.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: