Where a tool like this really makes sense is not the git repository itself; most of us know, or should know, how to back that up properly.
The real power comes in dumping the issues, wiki, PRs, and any other information that is not completely contained in git. A tool that dumps this "git+" data would mean that even if a site like GitHub were to shut down, you would at least have your data, hopefully in a human-readable format like JSON, that can be parsed at a later time if you so choose.
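To make that concrete, here's a minimal sketch of the idea against the public GitHub REST API (the owner/repo names and token are placeholders; note that GitHub's issues endpoint includes PRs):

    # Dump a repository's issues (PRs included, in GitHub's API) to JSON.
    import json
    import urllib.request

    OWNER, REPO = "someuser", "somerepo"   # placeholder repository
    TOKEN = "..."                          # a personal access token

    url = f"https://api.github.com/repos/{OWNER}/{REPO}/issues?state=all&per_page=100"
    req = urllib.request.Request(url, headers={
        "Authorization": f"token {TOKEN}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(req) as resp:
        issues = json.load(resp)

    # Plain JSON on disk: human readable, and parseable long after the site is gone.
    with open(f"{REPO}-issues.json", "w") as f:
        json.dump(issues, f, indent=2)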
This would also open the door to interop between the likes of GitLab, GitHub, and Bitbucket, which all seem to have their own version of what "git+" should mean. If this could be a pathway to standardizing that API on top of git, so that all these different players can talk with one another, that would be kinda huge as well. It's hard to see GitHub and Bitbucket wanting to work together on something like this when they have mouths to feed, but something tells me "if we build it, they will come."
I wrote a tool like that for GitHub almost a year ago and finally got around to packaging it. It copies issues, issue events, watched/starred/private/public repos, wikis, etc., all configured via arguments. It's written in Python with no external dependencies, so it should work fine in most versions:
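The only slightly fiddly part of going stdlib-only is pagination; a loop like this (a sketch assuming the API's standard Link header, not the tool's actual code) handles it:

    # Follow the rel="next" links GitHub returns in its Link header.
    import json
    import urllib.request

    def fetch_all(url, token):
        items = []
        while url:
            req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
            with urllib.request.urlopen(req) as resp:
                items.extend(json.load(resp))
                link = resp.headers.get("Link", "")
            url = None
            for part in link.split(","):
                if 'rel="next"' in part:
                    url = part.split(";")[0].strip().strip("<>")
        return items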
I built it because our company's GitHub org is getting quite big, and I wanted to back up my own repositories (at the time I had around 300 repos, which I've since slimmed down). It worked well enough for me.
That looks very nice. Much more comprehensive than my tool. If all my tool accomplished was to prompt you to release your better tool, then I'm a happy camper. Yay open source!
I seem to remember this being implemented but had overlooked it. Thanks for bringing it up. Ideally, that would be the preferred way to "back up" GitHub repos: just replace .wiki with .issues or .pullrequests [insert naming convention here].
It would require remembering the specific repository for each entity, but that's a minor inconvenience if they all follow this suggested pattern.
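For reference, the wiki half of that pattern already works today; the .issues variant below is purely hypothetical:

    # GitHub wikis really are git repositories reachable at <repo>.wiki.git.
    import subprocess

    repo = "https://github.com/someuser/somerepo"   # placeholder
    subprocess.run(["git", "clone", f"{repo}.wiki.git"], check=True)

    # If GitHub ever exposed issues the same way, the pattern would extend to:
    # subprocess.run(["git", "clone", f"{repo}.issues.git"], check=True)  # hypothetical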
The thought of standardizing the data makes sense because these formats change over time, and I expect they will as features or workflows are added. If I back up my private repo today, I would prefer to have the ability to import it three years from now if I so choose, and not have to "get lucky" migrating the data format. It likely wouldn't be hard to figure out if the GitHub backend APIs are as well documented as I believe, but this is a pretty big caveat to using any of these backup approaches.
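One cheap hedge against that caveat is stamping every dump with the schema version it was written against, so a future importer can pick a migration path instead of guessing. The field names here are just illustrative:

    # Write a version-stamped backup so old dumps stay importable.
    import datetime
    import json

    dump = {
        "schema_version": 1,    # bump whenever the dump format changes
        "exported_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "source": "github",
        "issues": [],           # the actual payload goes here
    }
    with open("backup.json", "w") as f:
        json.dump(dump, f, indent=2)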
I released this as a product a few years ago and subsequently open sourced it. With a bit of work, it could be a good little web app to run on your own server to manage repo backups.
This tool helps you stay distributed by keeping a server somewhere that constantly backs up all your repos. With it, you can't accidentally lose work by spinning up a quick repo in a coffee shop somewhere and then deleting the local copy days later while cleaning up.
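Roughly, the whole idea boils down to a loop like this (a sketch with a placeholder repo list and destination; a cron job calling the body once would work just as well):

    # Keep bare mirrors of each repo and refresh them on a schedule.
    import os
    import subprocess
    import time

    REPOS = ["https://github.com/someuser/somerepo.git"]   # placeholder list
    DEST = "/var/backups/git"

    while True:
        for url in REPOS:
            path = os.path.join(DEST, os.path.basename(url))
            if not os.path.exists(path):
                # --mirror keeps every ref (branches, tags, notes), not just HEAD
                subprocess.run(["git", "clone", "--mirror", url, path], check=True)
            else:
                subprocess.run(["git", "-C", path, "remote", "update", "--prune"], check=True)
        time.sleep(3600)   # re-sync hourly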