Also, the hundreds of GitHub employees who worked on GA. It was an internal dogfooding project before talented engineers at GitHub turned it into a commercially viable product.
He's not calling himself "one of the creators" though. He's calling himself the creator, which shits all over the hard work of OpenAI, "Albert" and more.
> We built a filter to help detect and suppress the rare instances where a GitHub Copilot suggestion contains code that matches public code on GitHub. You have the choice to turn that filter on or off during setup. With the filter on, GitHub Copilot checks code suggestions with its surrounding code for matches or near matches (ignoring whitespace) against public code on GitHub of about 150 characters. If there is a match, the suggestion will not be shown to you. We plan on continuing to evolve this approach and welcome feedback and comment.
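For anyone wondering what a filter like the one quoted above might look like mechanically, here is a minimal sketch. It is an assumption-laden illustration, not GitHub's implementation: the names `normalize`, `build_index` and `should_suppress` are hypothetical, the "public code" corpus is just a list of strings, and it only handles exact matches after stripping whitespace (the "near matches" part of the quote is not modeled).

```python
import re

WINDOW = 150  # the "about 150 characters" figure from the quote


def normalize(code: str) -> str:
    """Drop all whitespace so formatting differences are ignored."""
    return re.sub(r"\s+", "", code)


def build_index(public_files, window=WINDOW):
    """Collect every window-length substring of each normalized file.

    Hypothetical stand-in for whatever index of public code the real
    service uses; a set of raw substrings would not scale.
    """
    index = set()
    for text in public_files:
        norm = normalize(text)
        for i in range(len(norm) - window + 1):
            index.add(norm[i:i + window])
    return index


def should_suppress(before, suggestion, after, index, window=WINDOW):
    """Return True if any window overlapping the suggestion is in the index.

    `before` and `after` are the surrounding code; windows that lie
    entirely inside the surrounding code are not checked.
    """
    b, s, a = normalize(before), normalize(suggestion), normalize(after)
    if not s:
        return False
    text = b + s + a
    first = max(0, len(b) - window + 1)                # earliest overlapping window
    last = min(len(b) + len(s) - 1, len(text) - window)  # latest overlapping window
    return any(text[i:i + window] in index for i in range(first, last + 1))
```

Note that a check like this only decides whether a suggestion is shown to the person using Copilot. It says nothing about whether the matched code was used for training.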
That is for people using Copilot. I’d like a setting that tells GitHub not to scan my code at all. I am also curious whether that was slipped into the terms sometime between my signing up as a paying customer and their taking code for free.
I have also never heard of “public code” being used in that way.
> I’d like a setting that tells GitHub to not scan my code at all.
What about people forking/mirroring your code? Or people merely contributing code? There is no one-to-one correspondence between copyright holders and GitHub users.
Copilot should just comply with the license, that's it.
> OpenAI Codex was trained on publicly available source code and natural language, so it works for both programming and human languages. The GitHub Copilot extension sends your comments and code to the GitHub Copilot service, and it relies on context, as described in Privacy below - i.e., file content both in the file you are editing, as well as neighboring or related files within a project. It may also collect the URLs of repositories or file paths to identify relevant context. The comments and code along with context are then used by OpenAI Codex to synthesize and suggest individual lines and whole functions.
> Depending on your preferred telemetry settings, GitHub Copilot may also collect and retain the following, collectively referred to as “code snippets”: source code that you are editing, related files and other files open in the same IDE or editor, URLs of repositories and files paths.