How to take credit for someone else's work on GitHub (repography.com)
491 points by arraypad on Feb 28, 2022 | 180 comments


I do see a point in it working like it does, though. I'm one of the lead developers on a free software project with over 20 years of history. Even though the project has used multiple version control systems (and hosting providers) over time, we have imported our entire project's history going back to the very first commit into git and GitHub.

Not every contributor has kept their email address for over 20 years. Some don't have access to the old addresses they once used for commits. Still they want the commits to be associated with their current GitHub account; even if it's just for statistics and "bragging rights".

If GitHub required email address verification, how would this be done?

EDIT: To be clear: With "working like it does" I'm referring to the possibility to add unverified email addresses to your account and have commits attributed to you.


> Still they want the commits to be associated with their current GitHub account

Well, tough luck? I don't think it's that important. Just accept it as a fact of life: you lost access to your email account and can't verify you still own it (you don't, clearly).

GitHub should just show the e-mail address when it can't associate that to an account, maybe show it's unverified and link to a help page explaining anyone could have faked that in the commit. I don't understand why they care so much to put a face (the GH account) on the commit when the address is not verified.


> Well, tough luck? I don't think it's that important. Just accept it as a fact of life: you lost access to your email account and can't verify you still own it (you don't, clearly).

This case might not be super important in the long run, but why does it have to be a fact of life? If a system doesn't work as its human operators intend, that's a system failure, not a human being failure.


> This case might not be super important in the long run, but why does it have to be a fact of life?

For the very reason mentioned in this article: people can claim the commits without verification.


Why doesn't the fact of life go the other way?

"people can claim the commits without verification." - Well, tough luck? I don't think it's that important. Just accept it as a fact of life. You didn't cryptographically sign your commit and now nobody (including you) can prove who made it.


The distinction is in where the potential harm can be.

With the current status quo (unverified email addresses can "steal" commits), you create confusion in the general developer community. Anyone who looks at those mis-attributed commits will be confused, and possibly misled.

If GH didn't associate commits unless the email address was verified, then, yes, some people wouldn't get "bragging rights", but the harm would be limited to that person. Others who look at those commits would still see the correct person's name, even though it wouldn't be associated with a particular GH account.


> "Others who look at those commits would still see the correct person's name,"

They would see the name which was written into the commit; assuming that's "the correct person" is the same mistake. Associating to the GitHub verified email account is incorrect in the same fashion, but going the other way. They're both only text saying "Linus Torvalds", in the absence of signing, neither is more or less authoritative than the other. Connecting it to a random profile looks wrong, but trying to correct it to the 'right' profile lends it an air of legitimacy it shouldn't have.

Papering over that is like teaching people to click through warning messages, or that HTTP is fine because the site shows the right looking text.


> Papering over that is like teaching people to click through warning messages, or that HTTP is fine because the site shows the right looking text.

That's a completely separate problem, and whether it is papered over is completely independent of this problem.

With this problem, even if you already verified the repo, even if there are signatures, it still shows the wrong profile.

> Connecting it to a random profile looks wrong, but trying to correct it to the 'right' profile lends it an air of legitimacy it shouldn't have.

Displaying nothing is not "trying to correct it to the 'right' profile"


I think the issue is that this isn't a "fact of life", so github could either show the original email or pull in the account with a verified email. What it does now appears to be a bug.


Because we exist in the real world with real constraints. We have only two options: allow people to stick their name on others' commits, or have unverified emails just show plainly.

Neither of these is ideal but at least the second one never tells lies.


This- if the commit is unsigned, then just show what the commit says. Don't "enhance" the commit in an even more insecure way than the "original sin"


You could add humans into the verification process. I imagine the number of people who want to associate old emails they don't have access to with accounts to be small. If you could prove that you owned the account via other means it could be manually added to your profile.


Exactly, such as.... by contacting Github support (which they offer as a way to undo the incorrect enhancement on the original commit anyway). I think Github should reverse course on this one.


> If a system doesn't work as its human operators intend, that's a system failure, not a human being failure.

Well, I think we have to consider who was responsible for it not working as its human operators intended (my use of 'who', rather than 'what', in that sentence is not a grammatical error; it is a clue as to the correct answer.)

Unless the outcome described here was one of the explicit goals of the creators of the system, then you cannot assume it was an intended outcome, as opposed to an unintended consequence of what they chose to do. If it was the latter, then "that's what it will do" is not a justification for what it does do; if it was the former, then it was just a bad choice. The only way to justify the situation is if there was no alternative that was not strictly worse, and if so, then it should be made clear that it is so.


Bitbucket allows the repository administrator to set up a mapping of email address to user. I think this is a nice middle ground.


I guess they can also say tough luck to you, because they don’t think preventing people from posting old commits that are credited to an unverified email is that important. Just accept that we don’t verify it as a fact of life; you can’t be sure of commit ownership.


What do you mean tough luck? The system works exactly like he said. It’s tough luck for people who want it to work differently.


Maybe github should use gravatar if the email doesn't match a github account. Not that that helps with old email address you no longer control but it does let you add an image to an arbitrary email you do control, separate from github.


In this case I think you could use a .mailmap [1] in the repo to associate the old email addresses with current, verified addresses.

[1] https://git-scm.com/docs/gitmailmap
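
To make the .mailmap idea concrete, here is a rough Python sketch of how such a file could translate an old commit address into a current one. The names and addresses are made up, and the parser is an assumption-laden simplification: it only handles the two-address entry forms and ignores the name-only form and matching on the commit name, both of which real git supports.

```python
import re

# Hypothetical .mailmap content; every name and address here is invented.
MAILMAP = """\
# canonical name and address first, then the address found in old commits
Jane Doe <jane@new.example> <jdoe@old-isp.example>
<sam@new.example> <sam@defunct.example>
"""

# Matches "[Proper Name] <proper@email> [Commit Name] <commit@email>".
_ENTRY = re.compile(
    r"^(?P<name>[^<]*)<(?P<proper>[^>]+)>\s*(?P<cname>[^<]*)<(?P<commit>[^>]+)>\s*$"
)


def parse_mailmap(text):
    """Return {commit_email: (canonical_name_or_None, canonical_email)}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _ENTRY.match(line)
        if match is None:
            continue  # single-address forms are not handled in this sketch
        name = match.group("name").strip() or None
        # Lower-case the key so the lookup is case-insensitive.
        mapping[match.group("commit").lower()] = (name, match.group("proper"))
    return mapping


def canonical(mapping, name, email):
    """Map a commit's (name, email) to its canonical identity, if listed."""
    new_name, new_email = mapping.get(email.lower(), (None, email))
    return (new_name or name, new_email)
```

With this, `canonical(parse_mailmap(MAILMAP), "jdoe", "jdoe@old-isp.example")` yields the current name and address, while unlisted addresses pass through unchanged. Nothing in the history is rewritten, so commit hashes stay the same.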


Interesting. One never stops learning new git features...

However, while this works for git (i.e., maps old address to new address in "git log" for example), GitHub does not seem to honor this file.


Well, yes, but maybe they should? It doesn't seem like a huge feature...


Now which commit's mailmap file should be used for the association? Whatever is on the "default" branch currently?


Whatever is in the branch you are looking at? Seems fairly straightforward.


Viewing a specific commit or viewing the repositories' statistics ("insights" tab) both have no attached branch.


in Github or in Git itself? While possible, I don't believe there's many floating commits around.

The other issue of course is that at this point in time, there would not be a .mailmap. Of course, github can then fall back to a .mailmap file in the latest commit of the main branch.


Commits are not usually floating but are very commonly in more than one branch.


What about if somebody clones a repo, then adds a .mailmap pointing all the addresses in the history to their own?


If somebody clones the repo they can even rewrite history. But that is not the same as them being able to claim the ownership of a commit in the original repo.


Then in their clone, they made all the commits. But who cares about their clone?


GitHub bases the association of commits to user accounts on the list of e-mail addresses configured in the user’s profile: https://github.com/settings/emails


> I do see a point in it working like it does, though

Really? For a product whose _primary_ value proposition is "integrity and assurance over what was committed when by who", this is such an odd edge case for GitHub to refuse to fix. If their security team isn't looking at this and thinking "oh my, these aren't the secure-by-design defaults we should have", fire them.

To those who say "oh but you ought to use cryptographic keys to allow attribution": then why not a color code, or a subtle warning, whatever ... to hint at the fact that attribution in this case isn't only weak but totally impossible (instead of pointing to a user ID that is 100% wrong and saying "yup, that's the one who we believe did the commit")?

Not exactly great UE. Also terrible for supply chain security. The whole thing is hard to explain because it wasn't that UE was chosen over security. There is no logic to why it was half-arsed like this and they actually get away with it all while grand-standing about all the things they do for "security".

All Github-security brings to the world (other than hype) is:

- Cockpit: which allows me to produce security vulns in an automated manner, and

- the GHSA vulnerability severity scoring which essentially labels anything that has CVSS3 critical as moderate and allows downplaying of the real score.

As a Microsoft company perhaps this is just what it is and I'm to blame for expecting things to make sense.


> Still they want the commits to be associated with their current GitHub account; even if it's just for statistics and "bragging rights".

> If GitHub required email address verification, how would this be done?

If it can't be done securely - e.g. you verify you own the e-mail address - it shouldn't be done, IMO. It's like losing your 2FA token and all recovery methods; for the sake of security, you should consider that account lost. Because if you can get it back through other means, then a scammer / impostor can do so as well.

At some point you just have to give up. Anyway in the case of e-mail addresses, ideally you have your real name in the address itself for anything formal / work related.

Alternatively, what GitHub could do is mark these accounts as unverified. It would also mean they should allow multiple user accounts to be associated with a commit's e-mail address though.

And finally, some manual work might be involved. I'm sure there would be ways and means to get verified on github (like on twitter), and somehow claim that e-mail address in a more formalized fashion - and to dispute it. In the case of Linux and Go, it's obvious and well-known who the original authors / committers are, so a bit of manual work to associate those commits to a GH account shouldn't be too much of an issue.


One idea: just show the statistics of unverified commits, demarcated as such, in the user’s profile. This is a more transparent variation on the current behavior.


>If GitHub required email address verification, how would this be done?

author data in a commit can be replaced by repository owners. You can replace all old email addresses to new ones https://github.com/jayphelps/git-blame-someone-else https://github.com/SilasX/git-upstage


Doesn't changing the author affect the commit sha? If so, doing this would cause some amount of pain with syncing all copies of the repo, branches, etc, making it a non-starter I think.


You could just flair the username as unverified and have the flair link to an explanation.


If everyone is concerned about commit identity hijacking, you can configure your repo settings to reject any commits which aren't GPG signed.

https://docs.github.com/en/authentication/managing-commit-si...

https://www.devopsauthority.tech/2020/07/18/github-getting-s...


that's great but it requires an active step on the part of the user, which violates the secure-defaults principle. it also violates the principle of good UE


There's no way around it, because git commits by default can be forged due to the way it was designed.


Security requirements are often at odds with good UE.


> If GitHub required email address verification, how would this be done?

You could just run a script which rewrites the email address in all the git commits, and force-push the revised version.


Does this redo all the commit hashes?


Yes, as it is rewriting history. And that would be a massively bad idea.
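
A small Python sketch of why the hashes change: in a SHA-1 repository a commit's ID is the hash of its content, and that content includes the author and committer lines as well as the parent IDs. The identities, timestamp, and messages below are invented for illustration; the tree ID is git's well-known empty tree.

```python
import hashlib


def commit_id(tree, author, committer, message, parents=()):
    """Compute the SHA-1 object ID git would assign to this commit content."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode()
    header = f"commit {len(body)}\0".encode()  # object type, size, NUL byte
    return hashlib.sha1(header + body).hexdigest()


TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # the empty tree
WHEN = "1112911993 -0700"  # arbitrary timestamp for the example
OLD = f"A Dev <old@example.org> {WHEN}"  # made-up identity
NEW = f"A Dev <new@example.org> {WHEN}"  # same person, new address

original = commit_id(TREE, OLD, OLD, "Initial import\n")
rewritten = commit_id(TREE, NEW, NEW, "Initial import\n")

# A child records its parent's ID, so the change cascades down the history.
child_of_original = commit_id(TREE, OLD, OLD, "Second commit\n", [original])
child_of_rewritten = commit_id(TREE, OLD, OLD, "Second commit\n", [rewritten])

print(original != rewritten, child_of_original != child_of_rewritten)
# prints: True True
```

The second comparison is the painful part: the child commit did not change its own author line at all, yet it gets a new ID because its parent did. That is why every clone, branch, and tag has to be resynced after such a rewrite.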


Thank you, I wanted to know. If so, this should really be noted when suggesting this sort of thing, since it has unaccounted-for consequences, especially when you consider that most people use git but don't necessarily know how to use it beyond the basics.


Given that GGP's use case is:

> we have imported our entire project's history going back to the very first commit into git

Then it isn't a problem. Just change the email addresses while importing.


But what if you don't think of it at the time?

Or what if you don't actually want to change the e-mail addresses because they are important historic data? or part of a commit's signature?


What do you want; official Git identities? Sanctioned by Linus himself? Or would you rather log into Xbox Live?


In spite of GitHub's claims that nothing is wrong, something is wrong and fixable.

GitHub should be showing the identity pulled from the e-mail address, and not replacing it with the name of an associated GitHub account. Just like it does when there is no associated GH account.

A reasonable compromise would be to show that name, but turn it into a link to the account if there is one. Then only someone curious clicking on "Linus Torvalds" would see: hey, how come this leads to some VanTudor account?


That wouldn't work, for example, when you're pushing commits someone else did in another repository. Git is decentralized, so you end up pushing a lot of code that you didn't commit if you use it the way it was intended.

The solution, in my opinion, is to show a great big warning or error icon next to the name of every unverified commit, and to every unverified push as well. Developers and version control managers can easily prevent this from happening but few see a reason to sign their commits, and perhaps with a UI change discrediting commits this can change in the future. Setting up signed commits takes five minutes, less if you already have a PGP or S/MIME certificate.

The trick the article shows is a neat trick that will confuse people that don't have any knowledge of how Git works, but the dangers of unconfirmed commits go beyond that. A malicious actor could easily inject a backdoor by injecting fake commits impersonating a trusted project member and very few people would be the wiser, unless they actually check the commit manually.

Enterprise/Pro versions of version control software (such as Gitlab) have this feature, but bots exist for free versions as well. You could change CI/CD pipelines to fail if the branch contains unverified commits to hack the functionality into the free version of such systems.
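
As a sketch of the CI idea, assuming the pipeline has git available and the signers' public keys imported: `git log --format='%H %G?'` prints each commit hash followed by a one-letter signature status (`G` for a good signature, `N` for none, `B` for bad, and so on). The function names, the revision range, and the sample data below are hypothetical.

```python
import subprocess


def unverified_commits(log_output, accepted=("G",)):
    """Return (sha, status) pairs whose signature status is not accepted.

    `log_output` is expected to be the text of `git log --format='%H %G?'`.
    """
    offenders = []
    for line in log_output.splitlines():
        line = line.strip()
        if not line:
            continue
        sha, _, status = line.partition(" ")
        if status not in accepted:
            offenders.append((sha, status))
    return offenders


def check_range(rev_range):
    """Run git for a revision range and report commits lacking a good signature."""
    result = subprocess.run(
        ["git", "log", "--format=%H %G?", rev_range],
        capture_output=True, text=True, check=True,
    )
    return unverified_commits(result.stdout)


# Made-up sample output: one good signature, one unsigned, one bad signature.
SAMPLE = "1111111 G\n2222222 N\n3333333 B\n"
```

A pipeline step could call something like `check_range("origin/main..HEAD")` (the range is an assumption about the branch layout) and exit non-zero when the returned list is not empty.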


GitHub already has this, called Verified Mode [1]. It has to be opted into rather than out of though.

[1] https://docs.github.com/en/authentication/managing-commit-si...


> That wouldn't work, for example, when you're pushing commits someone else did in another repository.

What? Why??


Because git is designed to be decentralised. You can push and pull changes to your coworkers' laptops, merge everything, and then push it to a centralised place like Github. Great for when you work at an office with limited internet connectivity or develop your code across multiple mirrors (gitlab + github, for example).

Consider the way the Linux kernel is developed. Change sets are committed and emailed back and forth, and eventually merged. The merged git repository is the official, released source code, but before that change sets are just (attachments to) emails. Then, at some point, someone pushes the changes to Github, containing thousands of merges from tons of people.

Git was designed to work this way, and cryptographic signatures solve the problems that can arise from allowing random change sets. You can sign the commits themselves, as well as the pushes (although signed pushes aren't as universally supported by version control software).


Yes, so GitHub showing the individual commit's email addresses and not replacing it with the name of an associated GitHub account wouldn't work ... because?


Because it doesn't solve anything without validation of email ownership. It doesn't matter if the commit says "Jen-Hsun Huang" or "ceo@nvidia.com". The commit usually contains both (in the "name <email>" format) and both can be spoofed just as easily. The name associated with the email address of a Github account is actually harder to spoof because it requires creating an account with said email.

If people think "the" Linus Torvalds committed something just because they have a github account with the name Linus Torvalds, they're fools. Names aren't unique identifiers and they shouldn't be regarded as such.

There is something to be said for displaying the name and email address for signed commits on Github, to prevent creating an account and impersonating someone who actually signs their commits. In the default configuration, many version control systems will just add a "verified" badge to a commit if the email address of the signature matches an account with the same email address. That protection is obviously useless for unsigned commits, because there's no way to trust the author anyway, but for verified commits a clear indication of which email address created the commit makes sense. Alternatively, the github user account name could be used, of course, if such an account exists.

If you really want to find out the email address of the person who committed something, you can clone the repo and look at the git log manually, I suppose. I think hiding it was done to prevent shitty scrapers from collecting email addresses.


The difference though is, that this is about an existing commit. You can't change the mail address of the first commit to the git/git repository retroactively.

What you're describing is another issue.


GitHub’s response is pretty surprising. How can anyone think this is expected? Having to follow Git’s commit message emails makes sense and indeed anybody can use any email they want to make a commit. But then for GitHub to make the connection between (unverified) commit emails and (unverified) GitHub.com accounts is the issue for me. Since they can’t verify the commit email belongs to a GitHub account, why show that as though it were true?


The response seems a lazy, thoughtless, self-serving cop-out, one that permits a false identity claim and then pushes the entire burden of challenging it back to the primary victim, who may well be unwitting, and where secondary victims (anyone defrauded by believing a false attribution) have no standing at all.

You can file the whole line of thinking - from design to support - under "worst practices" and "things not to emulate in your own product".

When it comes to handling how unverified author identity is presented and cross-referenced, Github could stand to do a lot better.

Fortunately, the problem appears confined to Github's web interface; a git show --quiet e83c516 still produces

    Author: Linus Torvalds <torvalds@linux-foundation.org>
    Date:   Thu Apr 7 15:13:13 2005 -0700
    
    Initial revision of "git", the information manager from hell


And if "the proper way to verify committer identity is, as per GitHub's response, a cryptographic signature", Github is certainly not pushing this.

If the only real security around attribution is "a cryptographic signature", GitHub could do a lot better in pushing this, making it an essential part of the signup or "getting started" and such.


What happens if Linus Torvalds has a verified Github account, and I commit to my rudely named project on my local computer with his email address and then push to Github; do they then show the commit with his Github account because his Github email is verified?


Yes, and there are a number of past stunts that include forking the Linux repo, pushing fake/misleading commits, and then showing how Github lets you see those commits in a context that implies they are part of the upstream Linux repo.


Github now shows a warning when people visit links like that


Not on every relevant view though, so there are still new waves of pranks and surprised people.


> and I commit to my rudely named project on my local computer with his email address

and how exactly would this be out of character for the real Linus Torvalds?


Because that's git's underlying mechanisms in action. In a distributed system, there's no centralized database to check things against, so there's no (distributed) way to do verification, leading to the issue described here. With the use of public key cryptography, there's a disconnected way to authenticate commits, and it works as well as public key cryptography does, but GitHub has all the levers needed to move the needle, with `hub` and them holding public keys for registered users.


Really? In this case isn't it just the Github web service that makes a decision on what to display?

Sure if you clone the git repo you get the e-mail address, but then you also won't get any information about who the email belongs to on Github.

I don't see how the design of Git affects this issue. This is simply one if clause away from being solved in GitHub's frontend source code. Just check if email.verified: display the user, else, don't do that.

In the else-case all you'd get is a situation where one unknown user has Linus's e-mail in their list of e-mail addresses, in GitHub's DB, and a bunch of git commit logs with the same e-mail. No one would ever make the connection from the commit log to the list of emails in the DB, because the e-mail hasn't been verified. It's as simple as that to me but I could be wrong.
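
The "one if clause" could look roughly like the Python sketch below. This is not GitHub's code: the `Account` model, the `display_author` function, and the login are all invented to illustrate the proposed rule, which links a commit to a profile only when the matching address is verified.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical account record: login plus {email: verified?}."""
    login: str
    emails: dict = field(default_factory=dict)


def display_author(commit_name, commit_email, accounts):
    """Show a profile only for a verified match; otherwise show the raw text."""
    wanted = commit_email.lower()
    for account in accounts:
        if account.emails.get(wanted) is True:
            return f"@{account.login}"
    # No verified match: fall back to exactly what the commit says.
    return f"{commit_name} <{commit_email}>"


# An account that merely added the address without verifying it.
squatter = Account("some-login", {"torvalds@linux-foundation.org": False})
```

Here `display_author("Linus Torvalds", "torvalds@linux-foundation.org", [squatter])` falls through to the plain name and address. As others in the thread point out, even a verified match only proves who controls the address today, not who wrote an unsigned commit.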


Github could easily add its own handling for historical addresses of well-known developers, and block others from masquerading as them, once abuses like this are reported.


One rule for famous developers, another for all the rest of us. Completely backwards because everyone knows Linus made Git so that's no threat to him, but someone retro-claiming authorship of normal person's work is much worse because nobody knows differently. Also expecting Github support people to mediate "he said/she said" arguments about who made what commits, or owned what email address that many years ago.


There are some kinds of attacks that are more likely to be launched against well known people. But even for lesser-known developers their names with corresponding addresses will appear in the archives of mailing lists, and this could be used to correct the record.

But if there's no record of someone at all then there isn't an easy way to detect this kind of abuse.


> Because that's git's underlying mechanisms in action.

no? git just has an email field on the commit. it's github that has decided to allow people to associate unverified emails with their accounts, and also display this association wherever the email is listed in a commit.


I don't see how verifying the email fixes the problem. Emails (like domains) expire and the original user of the address may lose control of them.


I thought this might be something different. Have seen this happen multiple times over the years - even once just last week.

Colleague files an issue with a PR. Project owners close it, say 'no, not a bug', then... commits the same thing themselves as "fixed!". Saw this years before in cvs/svn, and... at least in the GH world there's some evidence of the original PR author having done the work in the first place (vs being invisibly cut out).


The owner of huey does this. He closes PRs and submits the code himself


I had this happen on a small PR I submitted within the past year. I didn’t think anything of it at the time, but your comment led me to glancing through the past PRs and it’s comical how many are closed with a “thanks, I’ve committed an equivalent patch” comment.

On the one hand, it’s his repo and he’s free to do whatever he wants. I actually admire how ruthless the maintainer is on closing issues, must be great for staving off OSS burnout.

On the other hand, I don’t love how antagonistic it is to outside contributors. Litestream[0] is an example of open source-closed contributions, but at least it’s upfront about that in the README. (And the policy has actually changed to open for bug fixes.) I would open an issue/PR on Huey suggesting adding a similar disclaimer, but it’d probably be closed, ha.

[0] https://github.com/benbjohnson/litestream#contribution-polic...


In many cases this is the right thing for a maintainer to do: a contributor produces a PR and a proposed patch, but often that patch doesn't solve the whole problem, or clashes with the coding style, or isn't very efficient, so the maintainer does their own fix, because that is faster than getting the contributor to produce a modified version.


Yes, I do this for my own OSS projects. The standard approach of giving feedback and waiting for the user to fix something is fine, but for a small change it's easier to just expedite the process and do it myself.

But if my version of the code has substantial changes (ie changes beyond just whitespace, small tweaks to the code, changing the commit message), I push it to a branch and ask the PR author to review and approve it first. Only after they approve it do I merge it into master and close the PR.

I also retain the GIT_AUTHOR of the original PR so that they still get credit; my user is only the GIT_COMMITTER. And I add a "Closes #" ref to the GH PR in the commit message so that it can be tracked later. git also has a de-facto standard of having multiple authors for a commit via `Co-authored-by:` lines in the commit message. This is useful for when my contribution is large enough to be equivalent to the PR author's.

Note that this doesn't work for workflows that require signed commits. If you have such a workflow, you have to go back to giving feedback and waiting for the PR author to make changes.
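
For anyone who hasn't seen the `Co-authored-by:` convention, here is a small Python sketch that appends such trailers to a commit message and reads them back. The helper names, the commit message, and the contributor are made up, and the parsing is a simplification rather than git's own trailer handling.

```python
import re

# One trailer per line: "Co-authored-by: Name <email>".
_TRAILER = re.compile(
    r"^Co-authored-by:[ \t]*(?P<name>.+?)[ \t]*<(?P<email>[^>]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def with_coauthors(message, coauthors):
    """Append Co-authored-by trailers, separated from the body by a blank line."""
    trailers = "\n".join(f"Co-authored-by: {name} <{email}>" for name, email in coauthors)
    return message.rstrip("\n") + "\n\n" + trailers + "\n"


def coauthors_of(message):
    """Return the (name, email) pairs listed in Co-authored-by trailers."""
    return [(m.group("name"), m.group("email")) for m in _TRAILER.finditer(message)]


# Hypothetical commit message and contributor.
MESSAGE = with_coauthors(
    "Fix off-by-one in parser\n\nReworked version of the contributed patch.",
    [("Pat Contributor", "pat@example.org")],
)
```

`coauthors_of(MESSAGE)` returns the single pair that was added. The trailer is plain text in the message, so it is just as unverified as the author field itself.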


Thank you for all of these tips!

I've lately been feeling bad, and thinking I must look like an ungrateful asshat, about closing lower quality PRs (IMHO) with valid bugfixes but which introduce some new, possibly subtle, bug instead. Or having to close abandoned PRs because the submitter gave up before that last polishing to match the standard of my own repo. :(

Now I feel better knowing that I can do that final polish myself, while keeping the submitter's original contrib!


> Note that this doesn't work for workflows that require signed commits. If you have such a workflow, you have to go back to giving feedback and waiting for the PR author to make changes.

While everyone else who uses the project suffers with the bug that was being fixed as they wait for the person who contributed the patch to go through some hazing process involving code formatting that they (hopefully: I realize some people are in it mostly for the GitHub gamification credit of being a "contributor" on their landing page and thereby will do absolutely anything to get exactly and precisely the author credit on the commit) didn't sign up for. No: please for the love of everyone you are responsible for just commit the fix and thank the person later.


Well, I wasn't talking about GH OSS repos specifically in that point. In any case, I imagine that any repo that requires signed commits is also corporate-enough that it'll have a CLA requirement, so if the PR author is unresponsive the maintainer could take the code and commit it as themselves anyway.


The polite thing is to fork the contributor's PR branch back into the project repo, make changes preserving history, and then merge or squash merge the result.


It is a bit awkward though.

Some projects by nature attract high quality PRs. Others, like a game I built, have the unfortunate curse of attracting PRs with such low quality that it's kinda heart-breaking to shut them down.

It's one thing to read a good feature request in a Github issue and build it yourself. It's a whole other thing to modify a low quality PR in a polite way, see what they were trying to do, clean it up, refactor it. It can easily be 5x the work of just doing it from scratch as the project maintainer.

This experience, especially after your hundredth time, can jade you in a way and make you seem rude when you decide it's not worth the courtesy nor charity.


Yep, and that's when you start feeling that OSS burnout creeping up on you.


Yeah, I get why this happens. And to be clear, I didn't dig into all the PRs and compare them vs the maintainer's commits, so I have no idea of the difference in code quality between the two.

I'm sure it's frustrating when maintaining a fairly popular OSS tool to receive a PR that's 95% of the way there. Having to go back and forth to coach someone on getting that last 5% (or the contributor just dropping the PR then ghosting) vs just doing it yourself, I totally get it.

However from the contributor's point of view, when GH has support for co-authored commits, it comes off as a bit of a jerky move when you take the time to submit a PR to not at least get credit via a co-author commit message.


GitHub's "support" for this is the difference between a commit credit and an author credit, which is a mechanism in git that has particular meaning with respect to cherry-picks and rebases. It should be considered awkward to attach someone else as "author" on a commit they might only sort of recognize.

Maybe instead of "taking the time to submit a PR" you should first submit an issue and only work on substantial code changes you are going to become emotionally invested in after you've negotiated the correct path forward with the maintainer? Open source used to be about communication and collaboration, not cowboy coding.


The commit log could still credit the bug discoverer even if that person's proposed fix isn't used.


No, not cool. If you modify a submitted commit, set the Author field to the original author and add a Signed-off-by field with your email. Then you BOTH appear.


On GitHub you can push commits to the PR branch. I use that to fix up rough edges myself and then merge the PR.


Oh wow never realised that was possible! Thank you!

That will save so much time and energy having to deal with back-and-forths or abandoned PRs


it can be, yes. there's chicken and egg issue of trying to help other people be more proficient in making those mods, learning about larger parts of the system, etc.

there's also the issue of being told "no, that's not a bug" or "no, WONTFIX", then... hours or days later... producing the same code patch as your own. Definitely a jerk move.


The alternative is way worse: it turns into this culture of forcing people who just tried to help with something minor to suddenly be bullied asking for updates to code they were done with. I gave you a fix. It was a potential fix. It was one of many possible fixes. It is your project, and you should figure out what you actually want to commit. And yet way too often the maintainer spends more time trying to explain to me what they don't like about my patch or what other relevant work needs to be done in order to commit the patch than it would take for them to just do it themselves now that I showed them what to do.

I am following a ton of issues on projects like Flutter and Cargo that are somehow steeped in this culture and it frankly just seems like nothing ever gets fixed. In some cases pull requests are open for years as people bike shed back and forth arguing over some extremely minor point in a patch, letting thousands of other developers suffer waiting for a patch that to them would work exactly the same either way, because of some weird culture that has been built up surrounding "maintainers may only click commit or leave comments while contributors type all of the code", and if the person who made the patch doesn't want to--or simply can't as they grok the requirement--satisfy some procedural process the world now has to wait for someone else to step up and submit themselves to this process with a new pull request, despite the maintainers having clearly all spent hours typing comments nitpicking on what at the end of the day is often literally a 30 line patch they apparently refuse to just re-type themselves.

In some sense I frankly think this is an entitlement issue on both sides: the maintainer isn't entitled to the continued time and effort of the person who submitted the patch--and certainly isn't committed to them doing exactly what the maintainer wants--and nor is the submitter entitled to some weird GitHub-specific contributor "credit" on the maintainer's project and the requisite control over the patch and how it gets applied that getting that would have to imply.

Should the patch submitter get some love? Yes! Is the bug tracker sufficient? Maybe! (I will say for myself this is absolutely sufficient: unless I am fixing some world-shaking bug--which does happen given what I do--the most important thing to me is going to be getting the feature or fix I wanted landed with minimal delay and preferably the least effort from me, not a very specific form of control over a commit.) If not, is having a file somewhere of "helpful people" enough? I want to say "yes", and I'll up it to "certainly" if it includes a mention of why.

But like, I really do think there is something super core going wrong with the open source world here, and this "pull request" model from GitHub with "contributor" status is to blame :/. In addition to the obviously-evil gamification of the codebase (from the badges) I think one reason people get this feeling of entitlement over their patch is that they put way too much work into it before it even gets presented as they go for this completed pull request model. 99.99% of the time what I want as a maintainer isn't someone who spent a month working on a patch that they are now going to argue with me about: I want a single paragraph a month earlier with an explanation of what is needed and maybe I could have solved it in a few hours or explained why the concept won't work or would conflict with planned effort.

On the other side, I then think the UI--combined with this inferred expectation from the patcher--forms that brutal entitlement from many maintainers that they maybe never have to touch code again and can just armchair quarterback / long-range pair program other people into getting some exact result. This is what chases away the quality contributors, as the quality contributors know that reformatting code is easy but predicting how someone else wants code to be formatted is nigh-unto impossible, and they also appreciate that the hardest part of a patch is knowing what worked or didn't work to fix the problem, not typing the code.

Meanwhile, the people who matter to your project want what's best for the project, not what's best for their GitHub contribution scores, and I dare say that if the maintainer is in a good position to be man-handling all the code that's the right way to go about the problem.

So like, concrete example: am I proud to be listed in the AUTHORS file of v8? Sure! But do I care whether whatever patch I had provided actually has my Author: on it? So much not to the point where I can't remember if it happened or not. Would I still be proud of my marginal work on v8 even if I weren't in the AUTHORS file? Yes! And would I have minded if they didn't put me there? No, and honestly I almost find it weird sometimes that they did... I certainly didn't ask.


> And would I have minded if they didn't put me there? No, and honestly I almost find it weird sometimes that they did... I certainly didn't ask.

Having your name in the credits brings some of your reputation to their project. Having 'name brand' contributors is a benefit all its own for many projects.



This appears to be the repo that the copyright originated from

https://github.com/mwshinn/forceatlas2-python

bhargavchippa also contributed to this repo, so this doesn't seem like one of those random "fork and lie" sort of thing. Without the necessary context, we can't assume much. It could well be that mwshinn was in the wrong or there was an "okay" from mwshinn for bhargavchippa to make those changes.


As I understand it, this is not permitted under the GNU General Public License v3.0. Is that correct? I would think that certain licenses do permit this (MIT possibly?). Could anybody with knowledge chime in?


MIT prohibits this. You can use the code and remix it or add your own copyrighted code, but pretty much the only thing you can’t do is remove the original copyright text.


I assume copyright law prohibits this too.


Copyright law doesn't care about copyright notices. They are informational and not necessary or sufficient to enforce copyright.


yikes, and by the looks of it seems like a "refactor" at best


I have used emails in the past I can no longer verify, so I see a use case for linking unverified emails to profiles if there's only one profile claiming the email address

However, if another profile verified that email address, it definitely shouldn't link to another profile that hasn't verified


That isn't fair either though. I can see the ISP I quit using 15 years ago letting someone else have my old email address, but now they can claim to be me.

I don't know how to handle this situation. It is somewhat easy to verify that a commit today comes from an email address I control now. However if I claim an unverified commit from years back is it really me just because I now control that email?


Sign your commits.


That is the right answer, but it means 15 years ago you need to have done the right thing, and also means not losing the private key (which should have expired) in the mean time.


What does it matter if someone can claim your commits? Doesn't seem important imo.


Since [by default] you own the IP for code you commit, this is effectively claiming to steal intellectual property. Also professional credit and fame, which can directly relate to employment opportunities.


> intellectual property

This seems to me to be the least important aspect of whatever is happening.

> can directly relate to employment opportunities

Given how free we are in choosing work, it seems to me you could just go work for a workplace that doesn't trawl through your charity history to figure out you're worthy enough. That's a matter of self-respect, which you should absolutely have.


Given few programmers have open source work on their résumé that doesn't really matter, more just pride.


AFAICT they can only claim it because you haven't claimed it so

>which can directly relate to employment opportunities.

doesn't seem right.

Agreed on the copyright part of the equation, but I don't think many people take the commit author on GitHub as the copyright owner.


It seems that one proper solution could be:

1 - Don't associate the commit to an account if the email is unverified, obviously

2 - If someone tries to "forge" ownership by pushing a commit with an e-mail that doesn't belong to the GitHub account being used to push, an "unverified" warning should be added to the commit, and the account owning said e-mail should have to claim it manually for its status to change.
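
A minimal sketch of those two rules in Python, just to make the proposal concrete. Everything here (`Account`, `attribute`, the status strings) is hypothetical and is one possible reading of the rules, not how GitHub works:

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    login: str
    verified_emails: set = field(default_factory=set)
    unverified_emails: set = field(default_factory=set)


def attribute(commit_email, pusher, accounts):
    """Return (login shown next to the commit, status) under the proposed rules."""
    email = commit_email.lower()
    # Rule 1: only verified addresses can link a commit to an account.
    owner = next((a for a in accounts if email in a.verified_emails), None)
    if owner is None:
        return None, "unlinked"
    # Rule 2: pushed by someone other than the address owner, so flag it
    # until the owner claims it manually.
    if owner.login != pusher.login:
        return owner.login, "unverified"
    return owner.login, "linked"


alice = Account("alice", verified_emails={"alice@example.test"})
bob = Account(
    "bob",
    verified_emails={"bob@example.test"},
    unverified_emails={"linus@example.test"},  # added but never verified
)
accounts = [alice, bob]

print(attribute("alice@example.test", alice, accounts))
print(attribute("alice@example.test", bob, accounts))
print(attribute("linus@example.test", bob, accounts))
```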


That would break any workflow that has the slightest bit of decentralization, and would only work in a github-first workflow. Suppose two developers are working on a feature that requires two changes. Each developer works on a change in their own development branch. It wouldn't make sense to submit a PR for each change, because they depend on each other to be a complete feature. Therefore, they should merge the two dev branches together, then submit a pull request for that merged development branch to then be merged into the main branch. In your proposed scheme, this would then show up as an attempted forgery, rather than a perfectly normal workflow.

The email field is for humans, who can lie, and shouldn't be used as authoritative by github.


> this would then show up as an attempted forgery, rather than a perfectly normal workflow.

they said unverified, not marked as a forgery

> The email field is for humans, who can lie, and shouldn't be used as authoritative by github.

and your solution is just to have everything marked as unverified, including commits that I push using my credentials and contain an email that I've verified with github.


but pushing commits with other authors is a really common thing in many git workflows.


Veering off topic but I absolutely hate that git requires you to have an "email address" (which cannot be empty and iirc must satisfy some regex criteria for a valid-looking address). A particular choice of user identifier or communication medium should not be hardcoded into the totally unrelated concern of source-control, IMO. Anonymous and non-email accounts should be first-class things. Instead of email maybe you'd want to have your public-key or something.


AFAIK git was made with an email workflow in mind. Git's a byproduct of the Linux kernel project, for better or worse. GitHub provides a fake email address that's tied to your GH account for commit stats, which is an appreciated utility.


Nothing is stopping you from using anonymous@localhost or YOUR_KEY_ID@publickey


I scrolled through all the comments and didn’t see this answer.

What better way to recruit famous people to your platform than to allow people to trivially claim their commits until and unless they join and claim them?

It is most likely driven by customer acquisition — hence the response “working as expected!”


Github has made multiple decisions which, whatever the rationale, damage trust in them as an identity authority and make it more difficult to believe that a Github account represents who it appears to.

They also allow for accounts to be renamed and then for someone unrelated to register the abandoned name:

https://www.theregister.com/2018/02/10/github_account_name_r...


I remember when one of our contractors refused to do a rebase for like, a week, and just ignored any messages we sent him. I changed my e-mail on git to his, rebase, push, PR merged :)

Nobody ever found out hahaha


I remember this earlier subthread where someone was criticizing GitHub for allowing this (even using Torvalds as someone to impersonate!), and others offered some defenses (which were IMHO dubious):

https://news.ycombinator.com/item?id=21025378

Also, semi-related, obligatory mention of my joke utility for stealing credit for someone else's work:

https://github.com/silasx/git-upstage

Finally, I thought this phrasing was funny, like commits have a non-substantively transferable ownership, like an NFT (though FYI it's quoting an older discussion of the same problem):

>Someone wrote about the whole situation on Medium in November 2021: "The 1st commit of git/git no longer belongs to Linus Torvalds".


Could someone also write bad code and commit it using someone else's email address in the commit message, thus making the commit link to the other person's Github profile? (Sort of the reverse problem -- "giving blame" instead of "taking credit")


Now you're thinking like the author of git-blame-someone-else: https://github.com/jayphelps/git-blame-someone-else


IIRC there was an infamous (at the time) user hostile commit made to a Google product (Android or Chrome perhaps) where the author was obfuscated to something like "Android Dev" instead of an actual individual.



Yes, simply change the email and author before committing and it should work.

Note that git already provides a way to mark a commit with someone else's authorship, but in that case you remain as the committer of the commit, usually shown as "X authored and Y committed". I sometimes use that when I need to push other coworkers' code for whatever reason, or when you start a codebase from old project files that weren't versioned (so that you are not the author of all the atrocities of the old code ;)
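
For anyone unsure where the two names come from: a raw commit object carries separate `author` and `committer` header lines. Here is a small Python sketch that pulls both out of a sample commit. The sample data and the `parse_identities` helper are made up for illustration:

```python
import re

# Made-up sample in the layout printed by `git cat-file -p <commit>`.
RAW_COMMIT = """\
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author Alice Example <alice@example.test> 1646038800 +0000
committer Bob Example <bob@example.test> 1646042400 +0000

Import legacy code
"""

IDENT_RE = re.compile(r"^(author|committer) (.*) <([^>]*)> (\d+) ([+-]\d{4})$")


def parse_identities(raw: str) -> dict:
    """Map 'author' and 'committer' to (name, email) from a raw commit."""
    headers, _, _message = raw.partition("\n\n")  # headers end at first blank line
    identities = {}
    for line in headers.split("\n"):
        match = IDENT_RE.match(line)
        if match:
            role, name, email, _timestamp, _tz = match.groups()
            identities[role] = (name, email)
    return identities


ids = parse_identities(RAW_COMMIT)
print(ids["author"])
print(ids["committer"])
```

Both fields are plain text chosen by whoever creates the commit, which is why neither proves anything without a signature.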


Yes, but it isn't limited to non-verified emails, you can do it with verified emails as well. I assume it's already used to obscure deliberate security compromises in forks etc.

There are many practical impersonation vectors. I assume Github is gonna have to require signed commits for profile links in the medium term future.


Or use it for clout-chasing by getting big names in your contributors list. :D


Arpad, your site looks like this - https://i.imgur.com/jj9Uxbl.png

Not just the linked page, the homepage too. All but illegible. That's in a recent Firefox on Windows. Just FYI.


Weird, looks like this for me: https://imgur.com/a/6qL0bWa

    Firefox 97.0.1
    Linux version 5.16.11 (gcc 10.3.0, GNU Binutils 2.35.2)


Thanks very much for letting me know, I'll look into that!


I'm on Firefox 97 on Windows 10, and it looks as intended for me.


At some point, we need to start treating Windows like Internet Explorer or Safari. Things break on it all the time, completely unpredictably, and it doesn't support many useful features you'd expect to be used.

We should really just tell Windows users to upgrade to a real operating system, rather than go out of our way to support it.


I love comparing these kinds of comments to articles like this: https://sporks.space/2022/02/27/win32-is-the-stable-linux-us...

You really think businesses like Repography should just ignore the OS with 80% marketshare?


Their response is to add a PGP key. But AFAICT they don't do verification on PGP keys either. So you could do the same.


Really? GH Enterprise definitely verifies GPG.


How does it verify them? They could ask you to sign a message to prove that you control the private key, but I don't think public Github (or Gitlab) does this. They just assume you hold the private key to any pubkey you upload. Alternatively your private installation could have a centralised trust store of keys.


It verifies the signature but I was able to just add a public key that I found online to my account.


Will Github verify a commit associated with a GH account via an unverified e-mail address?

If so then it's probably fine since you would have to demonstrate ownership of an e-mail address that was contained in the signed payload, or you would have to be able to sign payloads yourself (i.e. you have the private key).


I noticed that arraypad is really trying to push his repography project. Whilst this is not a bad thing, it seems like he is using the blog posts as an excuse to push his project more.

I don't mind that much, but I think I've seen these posts hitting the front page quite a lot already - it's a good strategy but it could be maybe against the guidelines:

> Please don't use HN primarily for promotion. It's ok to post your own stuff occasionally, but the primary use of the site should be for curiosity.


I didn't look at the previous posts you mention, but TFA is what I would consider of interest and curiosity. I didn't find it at all to be self promotion. You have to read to the bottom of the post before you are told what repography is selling.


Personally I think this submission and the other one[1] that probably made it to the front page are excellent :)

[1] https://repography.com/blog/go-first-commit


A grand total of 7 submissions to blog posts over the course of a month is probably fine.


This reminds me very much of a "hack" I performed in a workplace that used Outlook/Exchange as its primary email system. I simply sent an email (to a few, trusted people) with the "from" field set to the CEO's name/address.

In their inbox it looked completely legit. Outlook even put the CEO's avatar next to it and everything. They were genuinely shocked. Even after I explain that the "from" field is just like me writing "love from Mum" at the bottom of a letter I think they still couldn't believe it.

There is a problem with people assuming that all data they find is authoritative. People don't question whether they can trust data often enough. Another problem is when you make things look nice enough, they look trustworthy. This is a well known confidence trick, of course.

My PhD supervisor objected to me typesetting my work in LaTeX before it had been checked because he said once it's typeset it looks correct, but might still be complete rubbish.

Unfortunately this all boils down to web-of-trust, as usual. We've had the solution for decades now, but we've collectively agreed that it's more trouble than it's worth. So these kinds of problems will keep popping up again and again.
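
To illustrate the "from" point: the header is just a string the sender fills in. A minimal Python sketch using the standard library's `email` package; the addresses and the `build_message` helper are made up, and nothing is actually sent:

```python
from email.message import EmailMessage


def build_message(claimed_from: str, to: str, subject: str, body: str) -> EmailMessage:
    """Build a message whose From header is whatever the caller says it is."""
    msg = EmailMessage()
    msg["From"] = claimed_from  # nothing here checks that the caller owns this address
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


msg = build_message(
    "CEO <ceo@example.test>",      # hypothetical address
    "colleague@example.test",
    "Quick favour",
    "Please reply ASAP.",
)
print(msg["From"])
```

Whether the receiving side flags such a message depends on checks layered on top (SPF, DKIM, DMARC and similar), not on the message format itself.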


Why not add a small orange (!) icon next to the name for unverified emails, or a similar indicator? As a way of saying "this user claimed authorship, but we couldn't verify it".

When you commit from the Github page itself, a similar green "verified" check is shown, but if you do it from command line and then push nothing is shown. So the infrastructure for special verifications messages is there, and perhaps could be used.


It's possible to show the "verified" check when committing from the command line, you just have to sign the commit with a PGP key, and associate said PGP key with your account on GitHub.


I supposed that, but wasn't sure. Thanks for the confirmation!


I'm not entirely sold on the explanation "This is just how git commit (messages) work". GitHub could easily limit linking the GitHub profile to profiles whose e-mail address has been verified (by usual means, no GPG required).

They could show statistics and attribution limited to the data available in the commit messages (e.g. accumulated statistics by e-mail address) for contributors without a GitHub profile.

Am I missing something here? (Edit: just read the other comments addressing the use cases)


I knew about the ability to push commits as someone else, but GitHub allowing users to take ownership of other people's commits in their own repos, using an unverified e-mail address, seems like a whole other level of insecurity here.

Even though git -> email link is weak for reasons beyond GitHub's control, I expected email -> github account link to be reliable, since that is entirely under GitHub's control.

I think GitHub is needlessly making a bad situation even worse here.


Dated back to at least 2015 https://news.ycombinator.com/item?id=10005577

It’s old news.


This is just a fact of how attribution works in Git. It's not GitHub's responsibility to figure out exactly who should be given credit for which commit, they're just a viewer on top of Git commits.

Imagine you did some work at some workplace years ago, and you want credit for it. You don't have access to that email anymore, but you'd still like to have the credit and have it link to your account. That's the usecase.


> It's not GitHub's responsibility to figure out exactly who should be given credit for which commit, they're just a viewer on top of Git commits.

If it's not GitHub's responsibility, then why are they doing it? Nobody forces GitHub to attribute commits to GitHub user accounts. (And yes, you answer this "why" question in your next sentence, I'm just pointing out that your argument is nonsensical.)


Because they're trying to help build a social graph insofar as it helps people do their work, without the aspect of actually taking responsibility for being a source of truth around it.

Sure, I could claim an original Unix dev's work. But what does that actually do, besides raise questions I can't answer at interviews?


I'm extremely surprised as well. This seems like an obvious vector for an impersonation attack. A malicious user could do this, then perhaps they would have more success submitting a malicious change to "correct a flaw in their previous commit".

At the very least, repo owners should have some better control over how attributions display when the user is not a project member or the email used is not verified to an existing user.


Repography looks very cool, but why does it need permission to "act on my behalf"?

Is it possible to use without connecting my user at all?


An obvious and feasible improvement is to at least make it clear in the UI which addresses are verified or unverified.


One thing to note: I believe this only works if the email is not already associated with a GH account.


I think if the project included a mailmap file [1], supported by Git, and if Github honored it, this may not be a problem.

[1]: http://git-scm.com/docs/gitmailmap
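
For reference, a mailmap maps the identity recorded in a commit to a canonical one. A rough Python sketch of the idea, handling only the simpler line forms (one or two addresses, no commit-name matching). The parser and the sample entries are my own simplification, not git's implementation:

```python
import re

# Supports "Name <commit@email>", "<proper@email> <commit@email>" and
# "Name <proper@email> <commit@email>"; other forms are skipped in this sketch.
LINE_RE = re.compile(r"^(?P<name>[^<]*)<(?P<e1>[^>]*)>\s*(?:<(?P<e2>[^>]*)>)?\s*$")


def parse_mailmap(text: str) -> dict:
    """Map lowercased commit email -> (proper name or None, proper email or None)."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = LINE_RE.match(line)
        if not match:
            continue
        name = match.group("name").strip() or None
        if match.group("e2") is None:
            commit_email, proper_email = match.group("e1"), None
        else:
            proper_email, commit_email = match.group("e1"), match.group("e2")
        mapping[commit_email.lower()] = (name, proper_email)
    return mapping


def canonical(name: str, email: str, mapping: dict) -> tuple:
    """Return the (name, email) pair to display for a commit identity."""
    new_name, new_email = mapping.get(email.lower(), (None, None))
    return (new_name or name, new_email or email)


MAILMAP = """\
# old ISP address -> current one
Alice Example <alice@example.test> <alice@old-isp.example>
<bob@example.test> <bob@work.example>
"""

mapping = parse_mailmap(MAILMAP)
print(canonical("alice", "alice@old-isp.example", mapping))
```

Since the file is committed to the repo, the maintainers decide the mapping, which is a different trust model from letting any account claim an address.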


This is trackable, but I had my PR closed and my contributions redressed as another PR a week later by the members of a few different prominent opensource projects, without any communication on their part. And I only realized by chance.


how many email addresses can a person associate with their account? is there the potential for me to develop a bot that scrapes every "unclaimed" email address and claim them? seems like a very poor design choice.


As of today (Feb 28th, 2022) GitHub allows you to add up to twenty (20) unverified email addresses to your account. When you reach that limit, you will be presented with the following error message “Please verify one or more of your unverified emails before adding another.”


Is it at all relevant that github is not a source of authority for anything, unless the project itself chooses it (and maintainers/owners designate it) as the platform of choice for source control?


What happens if multiple github users add the same unverified email address for a particular commit in a repo to their accounts, how does it know which github username to pick to display next to the commit?


An email address can only be associated to a single GitHub account.

If “Alice” adds alice@example.test to their account, GitHub will check if the email is not associated to an existing account, and then proceed to handle the request. Then, when “Bob” tries to add the same email to their account, GitHub prints the following error message “ Error adding alice@example.test: email is already in use.”

In your example, GitHub will display the username of the first user to add the (unverified) email to their account.
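
Put together with the 20-address limit mentioned upthread, the observed behaviour amounts to a first-come-first-served registry. A minimal Python sketch of that model follows. The class and method names are hypothetical, and this only mirrors the behaviour described in these comments, not GitHub's real code:

```python
class EmailClaimError(Exception):
    """Raised when an address cannot be added to an account."""


class EmailRegistry:
    MAX_UNVERIFIED = 20  # limit reported in this thread as of Feb 28, 2022

    def __init__(self):
        self._owner = {}       # email -> login of the first account to add it
        self._unverified = {}  # login -> set of unverified emails

    def add_unverified(self, login: str, email: str) -> None:
        email = email.lower()
        if email in self._owner:
            raise EmailClaimError(f"Error adding {email}: email is already in use.")
        pending = self._unverified.setdefault(login, set())
        if len(pending) >= self.MAX_UNVERIFIED:
            raise EmailClaimError(
                "Please verify one or more of your unverified emails before adding another."
            )
        pending.add(email)
        self._owner[email] = login

    def display_login(self, commit_email: str):
        """Login shown next to a commit, or None if nobody added the address."""
        return self._owner.get(commit_email.lower())


reg = EmailRegistry()
reg.add_unverified("alice", "alice@example.test")
try:
    reg.add_unverified("bob", "alice@example.test")
except EmailClaimError as err:
    print(err)
print(reg.display_login("alice@example.test"))
```

Note that nothing in this model checks who actually controls the mailbox, which is the whole complaint.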


so this is obviously an ad for this repography thing, which looks pretty cool.

so.. does it do anything interesting with stuff like git blame -CCC which shows the genealogy of copypasta across time within a repo?


Whether working as expected or not, just fix the Linus thing, that is an embarrassment and a surreal one at that.


Sign your commits if you care about this!

iirc, isn't signing with the same ssh key you push with a possibility?


I believe you could do that yes, but I don't think it's recommended.

My understanding is that git itself offers no real solutions to the questions of key-management, especially revocation and rotation, so you're essentially on your own for all of that.

Revocation seems especially tricky as it seems directly at odds with git's model of immutable commits. I don't know if there's a robust solution out there.


Why does the email address hijacking only work/show up for the first commit?


It is not only for the first commit. The first commits are used here as a proof of concept, since they belong to large repositories known to everyone. You can try it yourself by changing your email to something like Linus' email in your git config and then committing code. GitHub will automatically show Linus' profile.


Because Git/GitHub is a source control tool, not a forensic tool.


This implies that assuring a commit's provenance is beyond the scope of git, which is wrong. Git supports cryptographic signing of commits for this purpose.


It implies such signing is not used all the time.


I worked on the security team at GitHub, this was a long standing part of how git works. GitHub allows users to verify commits via GPG signatures to prove that they committed something but it doesn't work for proving a negative, that you did not commit something.

We got so many of these submissions which are clearly called out in the rules/scope, usually the people who don't read the rules don't find anything useful. ¯\_(ツ)_/¯


Obviously anyone can attach any email to any commit, but why does the frontend UI work like this:

    gh_profile = get_profile_with_email(commit_email)

and not this?

    gh_profile = get_profile_with_email(commit_email)
    # no profile, or the profile never verified this address: show no link
    if gh_profile is None or not gh_profile.has_verified_email_ownership(commit_email):
        return None


for these reasons I have started using PGP signing for commits and releases I make


Just ask Sushi Swap.


[flagged]


I think you're missing the point of what the author is asking. Showing the email address from the commit is one thing (and the author is fine with showing that). That's the limit to what git gives you. Associating that email address to a GitHub user profile which never verified ownership of that email address is a GitHub UX decision, having nothing to do with git. That's what the author is saying is a security flaw.

That said, clearly users shouldn't be ascribing any level of certainty to commits that point to a GitHub profile even if the email address is verified, since AFAIK nothing is stopping the inverse attack, i.e. having someone else take credit for your work. Which is arguably more exploitable.


That's not the main complaint. The issue is that GitHub allows users to claim emails without verifying that they own them.


how are they claiming emails?


> The problem is that GitHub makes this association even for unverified email addresses. In this case of course it really was Linus who made the first commit, but all it took was someone to add Linus's email address to their GitHub profile - without any verification - and now GitHub displays this person as the author instead.


I can also write on my own web site that my email address is linus@linuxfoundation.org. But I can't send or receive messages from it, so how exactly would I be claiming it?

Does GitHub allow you to impersonate Linus via email? No, it does not.


I think you're fixated on the inverse of what this is. Imagine it's a true commit by Linus identified by his email address. This issue is when someone creates a GitHub profile with Linus' email address and by doing so, causes the GitHub UI to ascribe authorship of that commit to your user profile via your GitHub username and a link to your profile.


Also there's no need to imagine because that's what this is. A real commit with a correct email showing the wrong user.



