
There is logic to ensure that Copilot does not emit exact duplicates of code in the training set... but that logic is significantly newer than that tweet.


Link? I couldn't find anything "significantly newer" than 7/2/21 (though I'm sure GitHub is doing a lot here). They had this blog post 6/30/21 regarding efforts on avoiding raw code: https://github.blog/2021-06-30-github-copilot-research-recit.... They concluded:

> We will both continue to work on decreasing rates of recitation, as well as making its detection more precise.


Source: I work on the copilot team.


Was that decision informed by legal or product? Because derivative works are still derivative works even if you don't replicate the original verbatim.


I mean, it was informed by both, but basically everyone thinks it's a good idea.



