
This response expresses some of the things that are worth criticizing about MS's Copilot project. But I also don't like the immediate attempt to subtly discredit the report by dropping something like "I don’t know how the original poster’s machine was set-up" in the first (or second, if you want to be technical) sentence.

First consider that you made a mistake yourself, _then_ ask whether the fault could be on the other side. I really dislike this high-horse, talking-down tone. Maybe it was not meant to sound like that; maybe this kind of talk has become a habit without anyone noticing. Let's assume that and give the benefit of the doubt.

On to the actual matter:

> If similar code is open in your VS Code project, Copilot can draw context from those adjacent files. This can make it appear that the public model was trained on your private code, when in fact the context is drawn from local files. For example, this is how Copilot includes variable and method names relevant to your project in suggestions.

How come Copilot hasn't indicated where the code came from? How can it ever seem like the code came from elsewhere? That is the actual question. We still need Copilot to point us to repositories or snippets on GitHub when it suggests copies of code (including copies where only the variables have been renamed). Otherwise the human is taken out of the loop and no one is checking for copyright infringement and license violations. This has been requested for a long time. It is time for Copilot to actually respect the rights of developers and users of software.

> It’s also possible that your code – or very similar code – appears many times over in public repositories.

So basically it propagates license violations. Great. Like I said, the human needs to be kept in the loop and Copilot needs to empower the user to check where the code came from.

> This is a new area of development, and we’re all learning.

The problem is not that this is a new development or that we are all learning. That is fine. Sure, we all need to learn. However, when there is clearly a problem with how Copilot works, it is the responsibility of the Copilot development team to halt any further violations and fix that problem first, before letting the train roll on and violating more people's rights. The way this is being handled, by just shrugging and rolling on and maybe fixing things at some point, is simply not acceptable.



> How come Copilot hasn't indicated where the code came from?

I can't say for sure about Copilot, but in general you don't have that kind of information. The problem is a bit like trying to add debug symbols back to a highly optimized binary.



