> In any case, I hope that GitHub is at least limiting any training data to a sensible whitelist of licenses (MIT, BSD, Apache, and similar). Otherwise, I think it would probably be too much risk to use this for anything important/revenue-generating.
I'm going to assume that there is no sensible whitelist of licenses until someone at GitHub is willing to go on the record that there is one.