
Where a tool like this really makes sense is not the git repository itself. Most of us know, or should know, how to back that up properly.

The real power comes from dumping the issues, wiki, PRs, and any other information that is not completely contained in git. I'd like something that dumps the "git+" data so that even if a site like GitHub were to shut down, you would at least have your data, hopefully in a human-readable format like JSON that can be parsed later if you so choose.
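
To make the "dump it as JSON" idea concrete, here is a minimal sketch that pages through the GitHub REST API's issue list for one repository and writes the result to a file. It is only an illustration under some assumptions: the function names (`iter_issues`, `dump_issues`) and the pluggable `fetch` argument are mine, the request is unauthenticated (so it only sees public repos and is rate limited), and as far as I know this endpoint returns pull requests alongside issues.

```python
import json
import urllib.request
from typing import Callable, Iterator

API_ROOT = "https://api.github.com"


def http_get_json(url: str) -> list:
    """Fetch one page of JSON (unauthenticated: public data only, rate limited)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_issues(owner: str, repo: str,
                fetch: Callable[[str], list] = http_get_json) -> Iterator[dict]:
    """Yield every issue, open and closed, one page at a time."""
    page = 1
    while True:
        url = (f"{API_ROOT}/repos/{owner}/{repo}/issues"
               f"?state=all&per_page=100&page={page}")
        batch = fetch(url)
        if not batch:  # an empty page means we are past the last one
            return
        yield from batch
        page += 1


def dump_issues(owner: str, repo: str, path: str,
                fetch: Callable[[str], list] = http_get_json) -> int:
    """Write all issues to `path` as pretty-printed JSON; return the count."""
    issues = list(iter_issues(owner, repo, fetch))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(issues, handle, indent=2, sort_keys=True)
    return len(issues)
```

Something like `dump_issues("jquery", "jquery", "jquery-issues.json")` would then leave a plain JSON file on disk. Passing your own `fetch` is how you would add a token header, or swap in canned data for testing.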

This would also open the door for interop between things like GitLab, GitHub, and Bitbucket, which all seem to have their own version of what "git+" should be. If this could be a pathway to standardizing that API on top of git so that all these different players can talk with one another, that would be kinda huge as well. It's hard to see GitHub and Bitbucket wanting to work together on something like this when they have mouths to feed, but something tells me "if we build it, they will come."
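
As a rough picture of what a shared "git+" shape could look like, here is a small sketch that maps an issue record from two hosts onto one common dictionary. The common shape and the `normalize_issue` name are hypothetical, and the per-host field names (`number`/`body` for GitHub, `iid`/`description` and the `opened` state for GitLab) are how I understand those APIs, so check the current docs before relying on them.

```python
from typing import Any, Dict


def normalize_issue(host: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a host-specific issue record onto one hypothetical common shape."""
    if host == "github":
        return {
            "host": host,
            "number": raw["number"],
            "title": raw["title"],
            "body": raw.get("body") or "",
            "state": raw["state"],  # assumed to be "open" or "closed"
        }
    if host == "gitlab":
        # GitLab is assumed to report open issues as "opened"
        state = "open" if raw["state"] == "opened" else raw["state"]
        return {
            "host": host,
            "number": raw["iid"],  # per-project number, not the global id
            "title": raw["title"],
            "body": raw.get("description") or "",
            "state": state,
        }
    raise ValueError(f"no mapping defined for host {host!r}")
```

Each new host is one more branch, and anything that does not fit the common shape would need a place to live too (an extra "raw" field, say), which is exactly the part a real standard would have to settle.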



I wrote a tool like that for GitHub almost a year ago and finally got around to packaging it. It copies issues, issue events, watched/starred/private/public repos, wikis, etc., all configured via arguments. It's written in Python with no external dependencies, so it should work fine in most Python versions:

- https://pypi.python.org/pypi/github-backup

I built it because our company GitHub org is getting quite big, and I wanted to back up my own repositories (at the time I had something like 300 repos, which I've since slimmed down). Worked well enough for me.


That looks very nice. Much more comprehensive than my tool. If all my tool accomplished was to prompt you to release your better tool, then I'm a happy camper. Yay open source!


I'm not sure about issues or PRs, but you can clone the wiki like any other repo. Just add ".wiki" before the ".git" at the end of the clone URL.

    git clone git@github.com:jquery/jquery.wiki.git
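
If you want to script that across many repos, the URL rewrite is easy to automate. A small sketch, where `wiki_clone_url` and `clone_wiki` are just names I made up; note that the clone will fail for a repo that has no wiki content.

```python
import subprocess


def wiki_clone_url(repo_url: str) -> str:
    """Turn a repo clone URL into its wiki clone URL by inserting ".wiki"."""
    base = repo_url.rstrip("/")
    if base.endswith(".git"):
        base = base[: -len(".git")]
    return base + ".wiki.git"


def clone_wiki(repo_url: str, destination: str) -> None:
    """Mirror-clone the wiki; raises CalledProcessError if git fails."""
    subprocess.run(
        ["git", "clone", "--mirror", wiki_clone_url(repo_url), destination],
        check=True,
    )
```

`--mirror` is my choice here since the goal is a backup; a plain `git clone` works just as well if you want a working copy to read.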


I seem to remember this being implemented, but I overlooked it. Thanks for bringing it up. Ideally, that would be the preferred way to "back up" GitHub repos. Just replace .wiki with .issues or .pullrequests [insert naming convention here].

It would require remembering the specific repositories for each entity, but that's a minor inconvenience if they follow this suggested pattern.

The thought of standardizing the data makes sense because these formats change over time, and I expect they will as features or workflows are added. If I back up my private repo today, I would prefer to be able to import it 3 years from now if I so choose, and not have to "get lucky" with migrating the data format. It likely wouldn't be hard to figure out if the GitHub backend APIs are as well documented as I believe, but this is a pretty big caveat to using any of these backup approaches.
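
One cheap way to avoid "getting lucky" later is to stamp every dump with a format version, so whatever reads it in three years knows which layout it is looking at. A minimal sketch, where the envelope keys (`schema_version`, `kind`, `exported_at`, `records`) are purely hypothetical and not taken from any existing tool:

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List

SCHEMA_VERSION = 1  # bump whenever the layout of "records" changes


def wrap_backup(kind: str, records: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap raw records in an envelope that states its own format version."""
    return {
        "schema_version": SCHEMA_VERSION,
        "kind": kind,  # e.g. "issues" or "pull_requests"
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "records": list(records),
    }


def load_backup(text: str) -> List[Dict[str, Any]]:
    """Return the records, refusing any version this reader does not know."""
    document = json.loads(text)
    version = document.get("schema_version")
    if version != SCHEMA_VERSION:
        raise ValueError(f"unsupported backup schema version: {version!r}")
    return document["records"]
```

A real importer would convert older versions instead of rejecting them, but even this much turns a silent format mismatch into an explicit error.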



