I debugged for a minute at 11:59 trying to push, then my eat-lunch notification came in at 12:00 and I opened Hacker News with a tuna sandwich. This is super helpful, because it means I won't need to debug locally for 10 minutes before figuring out GitHub is down.
Edit - When I say "opened hackernews with a tuna sandwich", I want to clear up that I did indeed full-on mash the keyboard letters with my sandwich. It's costing me a fortune in keyboards every day, and it's ruining my sandwich most days as well. I think I have an issue.
Well, from another perspective - a bootstrapping perspective, and hackernews does like a bootstrapping perspective - I could make a case that a good meal could be licked out of most keyboards, saving money on lunch once a week.
The GitHub status page shows 14 incidents affecting Git Operations this year alone [1]. That's quite a lot, considering it's only May. I wonder if the outages were always this frequent and just get more publicity on here now, or whether there was a significant increase in outages fairly recently.
Many outages happen because something changed, and someone/something missed one of the effects of said change, bringing the platform down immediately, or after a while.
There was a period of time when GitHub basically didn't change, for years. And the platform was relatively stable, although "unicorns" (downtime) still happened from time to time.
But nowhere near as often as now. Then again, there are a lot more moving pieces now compared to before.
Interested to hear whether anyone actually managed to get some Client Credits as per their SLA [1]? Over the last quarter they probably dropped below 99.9% in some services.
About 10 years ago someone said we should move to self-hosting because Bitbucket, which we used, was unreliable. I looked at the status page and saw 2 hours of downtime over 3 months, while we had 3-4 days of downtime on our self-hosted Jenkins during the same period. I always think of that when I see people complain about services being unreliable. Often we see one or two problems in a short span and forget about the months where we didn't see any issues.
GitHub is probably as reliable now as it has been for the past 10 years. It's always had downtime.
> Let’s hope it’s temporary and GitHub error 500 won’t become their own version of Blue Screen of Death. In this case it would be Green Screen of Death (GSoD or GhSoD).
It's a surprisingly unreliable service. It's been great for code management / reviews. But I can't imagine relying on it as the only option for deployments via CD. Imagine needing to deploy an important bug fix or a big project with a release date, but you can't because GitHub is having an outage.
Once again another GitHub incident, just 4 days after the last one [0]: GitHub Actions goes down.
You are better off self-hosting at this point, rather than centralizing everything on GitHub [1], as it has been chronically unreliable for years, ever since the Microsoft acquisition.
We switched to self-hosted Gitea last month, no regrets. Only the CI story could be a bit better. We're currently using Woodpecker but need macOS runners, and Woodpecker's "local" agent implementation is still unstable. I'm watching Gitea Actions' progress with great interest.
I'm the lone person on my team who still believes in keeping most of our stuff local, with online versions primarily as backup.
Every time some global service goes down, or our internet/intranet goes down, or there's a security breach, or a WFH person has a power outage, I'm reminded that I'm right.
I'm no luddite, but these services make you dependent on them. The worst thing I'm dependent on here is a bad computer. We have backups and keep our files on our own network, so it seems fine. We are slowly moving to an online system, and I'm constantly reminded of all the problems of shifting online.
Meanwhile, if I had a Linux server, we would be in control of our own destiny.
Git is actually great for keeping distributed copies of code. With a bit of bash, you can easily cycle through a list of backup URLs for a git repo, looking for updates.
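A minimal sketch of that idea. The mirror URLs here are placeholders, not real repos; the function just tries each one in order and fetches from the first that answers:

```shell
#!/bin/sh
# Placeholder mirror list - substitute your actual backup remotes.
MIRRORS="git@github.com:acme/foo.git git@gitlab.com:acme/foo.git"

fetch_from_first_reachable() {
    for url in $MIRRORS; do
        # `git ls-remote` is a cheap reachability check: it lists the
        # remote's refs without downloading any objects.
        if git ls-remote "$url" HEAD >/dev/null 2>&1; then
            echo "fetching from $url"
            git fetch "$url" && return 0
        fi
    done
    echo "no mirror reachable" >&2
    return 1
}
# usage: fetch_from_first_reachable
```

Run it from cron (or a pre-build hook) and your local clone keeps tracking whichever mirror is up.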
Perhaps everyone should stop complaining and be thankful for a chill morning. You can't create a PR right now - go get a pastry and some fresh air. Be in the moment for once. It's beautiful outside.
GitHub outages aren't nearly long or often enough to consider this. Git is distributed, just keep working locally until GitHub is back up. GitHub outages are nowhere near the threshold of pain I'd require to introduce a second Git hosting provider to the mix.
Really, GitHub outages barely hurt at all. It's not like an AWS or Cloudflare outage which is more likely to be a production disaster. Every outage a bunch of people on HN start screaming about owning their own on-prem destiny or wondering why we're still on GitHub. Nothing changes because it's not nearly as bad as those people are making it out to be. Life is all about tradeoffs.
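For what it's worth, if the pain ever does cross that threshold, the second-provider setup is cheap: one remote with two push URLs, so every push updates both hosts. A sketch, with two local bare repos standing in for the hosting providers:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Two local bare repos stand in for github.com and a backup host.
git init -q --bare github.git
git init -q --bare backup.git
git init -q -b main work && cd work
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
# One remote, two push URLs: a single `git push` updates both hosts,
# and either one can serve fetches if the other is down.
git remote add origin "$tmp/github.git"
git remote set-url --add --push origin "$tmp/github.git"
git remote set-url --add --push origin "$tmp/backup.git"
git push -q origin main
```

After this, both bare repos contain the same `main`; with real hosts you'd use their SSH/HTTPS URLs instead of local paths.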
Github has been down hundreds of times this year alone.
They have reported outages 72 times this year and there are multiple times when services are unavailable and they don't report it on the status page.
> there are multiple times when services are unavailable and they don't report it on the status page.
There's no evidence that the exact same doesn't happen with GitLab. I've had it (consistently) 500 on me in the past when there's nothing on their status page to indicate any issues.
That's not the point of discussion. I didn't say GitLab doesn't lie about it, or heck, that it doesn't have worse uptime than GitHub.
My argument is that a company erasing a 300 GB production database once is not a stain on its competency, and that it cannot be compared to a company that has very frequent outages and also happens to lie about them.
gitlab.com is implied, since an incident on a self-hosted instance would have nothing to do with GitLab as a service (they can't be responsible for your on-site backups).
> Trying to restore the replication process, an engineer proceeds to wipe the PostgreSQL database directory, errantly thinking they were doing so on the secondary. Unfortunately this process was executed on the primary instead. The engineer terminated the process a second or two after noticing their mistake, but at this point around 300 GB of data had already been removed.
Ah I see the link. I'd caution that many people choose between github.com, gitlab.com, and gitlab self-hosted. The reliability of self-hosted gitlab is meaningful, especially when operated competently. People need to know if there are safeguards or foot guns. Backups alone can't prevent data loss.
The point isn't that GitLab has more, the point is that running these things at global scale is pretty complicated, and everyone has problems. "Just switch to GitLab" is pithy but isn't in itself an actual solution.
You can self-host GitLab and have few incidents, if any, and those get resolved very quickly. I worked for a company that had no incidents that I observed in ~3 years, and now work at a company that has had ~2 incidents in 1.5 years.
We have a self-hosted Premium instance and have 30min of downtime _every day_ while the database is frozen and backed up. We've been told that it's a known issue being discussed with GitLab but that could just be CYA. But in any case, it's the "at scale, while changing" that tends to cause problems.
Perhaps this is a continuing argument for self-hosting, especially if you don't have to expose the instance publicly. But then, if that's an option, you can also self-host GitHub (though I have heard fewer anecdotes about the stability of that).
GitLab is quite a bit more expensive. If you have GitHub Enterprise with the security features, it's $70/month/user whereas you'll need to get GitLab Ultimate for the security features, which is $99/month/user.
From the outside, it appears GitHub doesn't have any internal sharding going on.
Outages always affect _all_ repos.
Architecturally this seems rather suboptimal?
E.g., AWS doesn't roll out changes globally - they start with an internal shard within a region and progressively roll out to more shards and more regions.
    remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
    remote: fatal error in commit_refs
    To github.com:acme/foo.git
     ! [remote rejected] HEAD -> acme/foo (failure)
    error: failed to push some refs to 'github.com:acme/foo.git'
If that makes you mad, I still need help with https://github.com/MichaelMure/git-bug ;-)
Coming at some point, kanban and pull-request support, offline-first!
You're always relying on third parties. Always. Except if you run it locally. We're way beyond that. I deployed to production just fine. It's just a helper. It adds to the stress tho.
Can anyone from GH weigh in on this? We've had several major outages from GH over the last month or two, and the company has been completely silent on the causes, as well as any sort of remediation steps to fix stability.
As a somewhat large org, we're now exploring other options for code hosting.
Setting aside the fact that, of what people actually do with GitHub, git is such a small part. Issues, PRs, CI/CD - basically everything that isn't git doesn't happen over git (besides the wiki, which somehow miraculously actually is via git).
Some people have their entire roadmap in GitHub, and every single bug report / feature request, without any offline backup. Don't ask me why, I don't get it. Especially since they have proven for the last few years that they cannot keep the platform up in a stable manner.
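The git side, at least, is easy to back up offline. A sketch, with a local repo standing in for the hosted one (issues and PRs would still need the API, which is exactly the problem):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
# A local repo stands in for the hosted upstream (placeholder for a real URL).
git init -q -b main upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "roadmap v1"
# --mirror copies every ref (branches, tags, notes); re-running
# `remote update` from cron keeps the backup fresh.
git clone -q --mirror upstream roadmap-backup.git
git --git-dir=roadmap-backup.git remote update --prune
```

Since the wiki is itself a git repo, the same trick covers it; the tracker data is what's left stranded.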
I mean, you joke but that's actually fairly true. P4 users always notice when the central server goes down because you can't reliably look at changelist history, draft CLs, and do a host of other operations that are possible on git locally. (using a central VCS confers other advantages of course).
No I'm not. This outage affects GitHub, not git itself - but if you're storing your git repos (and automation) on GitHub, then you cannot git clone, push, etc. from or to them - all of which are critical to CI/CD.
They are adding affected services to the status entry title (started with Issues, Actions, Operations). Can't even do a simple push due to this so-called "degraded performance".
I've started working on a Forgejo instance for myself (a Gitea fork). It's honestly disappointing how bad GitHub's uptime has gotten. I hope they get their stuff together.
Due to GitHub's chronic unreliability, it is guaranteed to continue happening every month.
Looks like the advice to avoid 'centralizing everything to GitHub' has aged very well [1], and at this point you would get better uptime by self-hosting instead of using GitHub.
Just ask the many open source organizations like RedoxOS, ReactOS, WireGuard, GNOME, KDE, etc.