The complete list of every GitHub backup script

Sarah Bader | Last updated on May 5, 2023 | 9 minute read

Your hosted code repositories are a vital part of your business, but many people never think of backing them up. Creating backups for your hosted code repositories is crucial, because an adverse event could cause you to lose access to your repos, leading to an interruption or complete halt of your business operations. Repository hosting providers like GitHub and GitLab, while mostly reliable, can still suffer outages and unexpected downtime, or may perform maintenance leading to their services being unavailable. In some cases, their outages may result in the complete loss of your hosted code and assets.

In other instances, your organization may be targeted by malicious actors who compromise your accounts and demand a ransom to restore your access, or fall victim to human error. Organizations may also want to maintain backups to ensure redundancy for compliance and policy reasons.

GitHub Backup Scripts

One fairly inexpensive way to create backups is to use scripts. Various authors and organizations publish free and open source backup scripts online. When run with appropriate GitHub credentials, these scripts back up the repositories of individual user or organization accounts. In this article, you'll learn about some of these scripts, including how to set them up and create backups with them.

You’ll look at the features and backup customization options they offer, and how well they handle automation for recurrent backups. You’ll also see if these scripts are actively maintained, and if they receive regular bug fixes and security updates.

Note: This article was written and tested on Ubuntu 20.04. If you encounter problems with any of these scripts, it may be the result of using a different operating system or version.

backup-github.sh gist by Rod Waldhoff

backup-github.sh is a bash script that backs up GitHub organization repositories, or, with some modification, individual user repositories. It’s written and maintained by Rod Waldhoff. It backs up repositories, their wikis, and their issues.

To use it, you’ll first need to download the script:

curl -o backup-github.sh https://gist.githubusercontent.com/rodw/3073987/raw/d5e9ab4785647e558df488eb18623aa6c52af86b/backup-github.sh

Next, you need to replace all instances of <CHANGE-ME> strings with the value described in the associated comments. The values you need to provide for this script are an organization name, your username, and a personal access token as a password. You also have the option of providing these values as environment variables. If you haven’t set up SSH for GitHub, you’ll need to modify GHBU_GIT_CLONE_CMD to clone with HTTPS.
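If you edit the script in place rather than supplying environment variables, a quick check for leftover placeholders can catch a missed value before the first run. The sketch below is a minimal helper, not part of Waldhoff's script; the function name check_placeholders is made up for illustration, and it only searches for the literal <CHANGE-ME> string mentioned above.

```shell
#!/usr/bin/env bash
# check_placeholders FILE
# Hypothetical helper: prints every line that still contains the literal
# <CHANGE-ME> placeholder. Returns 0 when none remain, 1 otherwise.
check_placeholders() {
  local script="$1"
  # -F treats the pattern as a fixed string, -n prefixes line numbers
  if grep -n -F '<CHANGE-ME>' "$script"; then
    echo "Placeholders still present in $script" >&2
    return 1
  fi
  echo "No placeholders left in $script"
}
```

After sourcing the function, running check_placeholders backup-github.sh lists the lines that still need a value. If you pass the values as environment variables instead, leftover placeholders in the file are expected and this check does not apply.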

The backup created by this script is a tar archive.

To create a backup, run:

bash backup-github.sh

[Screenshot: sample output of Rod Waldhoff's `backup-github.sh` script]

backup-github.sh is easy to use and set up. It has no dependencies. All you have to do is provide configuration values and run it. While you can back up both user and organization repos, it doesn’t have much in the way of customization options, like backing up specific repositories or ignoring others.

You can set a backup directory using its GHBU_BACKUP_DIR config option, and decide whether and when old backups are pruned using GHBU_PRUNE_OLD and GHBU_PRUNE_AFTER_N_DAYS. However, it doesn't offer any automation options, so if you want to perform regular backups, you'll need to set up automation with something like a cron job.
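As a rough illustration of the cron approach, the sketch below wraps any backup command so that its output and exit status end up in a log file. The function name run_backup, the log path, and the /opt/backup paths in the comments are all hypothetical; none of them come from the script itself.

```shell
#!/usr/bin/env bash
# run_backup LOGFILE COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND, appends its output and a timestamped
# status line to LOGFILE, and returns the command's exit status.
run_backup() {
  local log_file="$1"
  shift
  local status=0
  echo "$(date '+%Y-%m-%d %H:%M:%S') starting: $*" >> "$log_file"
  # Capture stdout and stderr; remember a non-zero exit status
  "$@" >> "$log_file" 2>&1 || status=$?
  if [ "$status" -eq 0 ]; then
    echo "$(date '+%Y-%m-%d %H:%M:%S') finished OK" >> "$log_file"
  else
    echo "$(date '+%Y-%m-%d %H:%M:%S') FAILED with exit status $status" >> "$log_file"
  fi
  return "$status"
}

# Example crontab entry (hypothetical paths), every night at 02:00.
# nightly.sh is assumed to source this file and then call:
#   run_backup /var/log/github-backup.log bash /opt/backup/backup-github.sh
# 0 2 * * * /bin/bash /opt/backup/nightly.sh
```

Because the wrapper returns the backup command's exit status, a failed run can also be picked up by whatever monitoring you already use for cron jobs.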

The script is actively maintained, and regularly updated with fixes proposed by commenters. Waldhoff frequently answers queries left in the comments, as well, so some form of support is relatively easy to come by.

GitHubBackup by Tango Controls Core Projects

GitHubBackup, published by Tango Controls Core Projects, is a fork of Rod Waldhoff’s script. It improves upon that script with features such as authentication via token, proxy support, and the ability to back up more than thirty issues in a repository. However, unlike the original script, this only backs up repos from organizations, and doesn’t support backups for user repos.

To get this script, run:

curl -o backup-github.sh https://raw.githubusercontent.com/tango-controls/GitHubBackup/master/backup-github.sh

This script depends on the jq JSON parser. Install it (as root or with sudo) by running:

sudo apt-get install jq

Next, in the script, add your organization name and your personal access token. The places where these go are marked by tango-controls and <Your token here>, respectively. You can set the directory where backups are stored using the GHBU_BACKUP_DIR env var: replace <Your generated backup directory> with the directory path. Backups will be created as tar archives.

To create a backup, run:

bash backup-github.sh

[Screenshot: sample output of the Tango Controls Core Projects GitHubBackup script]

GitHubBackup is relatively easy to use and set up, and has only one dependency that you need to install, jq. The script allows you to set a backup directory, a proxy server, and how long to wait before pruning old backups. Other than that, it does not offer any customization options, such as selecting specific repos to back up or ignore.

Although it offers no automation, its README provides some pointers on how to create recurrent backups with a crontab. This script is not actively maintained, and as of this writing, the most recent update was almost fifteen months ago.

GitHub backup script by abusesa

GitHub backup script is a Python script published by abusesa that backs up repositories. On each run, it updates the repositories it has already backed up and adds any new ones created since the last run.

To use it, begin by cloning its repository and changing directories into its working directory:

git clone https://github.com/abusesa/github-backup.git
cd github-backup

This script has only one dependency, which you need to install by running:

pip3 install -r requirements.txt

Next, you need to create a config file. This file contains configuration values needed to run the script and create backups:

echo -e '{\n\t"token": "",\n\t"directory": "",\n\t"owners": []\n}' > config.json

In this file, add your personal access token to token, the path of the backup directory to directory, and the organizations and users whose repos you’d like to back up to owners.
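For illustration, a filled-in config might look like the sketch below. Every value is a placeholder: the token string, the /var/backups/github path, and the example-org and example-user names are invented, and the assumption that owners takes a list of plain account names should be checked against the script's README.

```shell
#!/usr/bin/env bash
# Write a filled-in config.json. All values are placeholders (assumptions).
cat > config.json <<'EOF'
{
  "token": "<your personal access token>",
  "directory": "/var/backups/github",
  "owners": ["example-org", "example-user"]
}
EOF

# The file holds a credential, so restrict it to the current user
chmod 600 config.json

# Fail early on JSON syntax errors before running the backup
python3 -m json.tool config.json > /dev/null && echo "config.json is valid JSON"
```

Validating the file first is optional, but a stray comma or quote is easier to spot here than in a stack trace from the backup script.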

To create the backups, run:

python3 backup.py config.json

It creates backups as bare repositories.

[Screenshot: sample output of abusesa’s GitHub backup script]

Although its setup is a bit more involved, once you create a configuration file, running the script is pretty straightforward. The GitHub backup script allows you to select what organization and user repositories to back up, and what directory the backups should be saved in, but that’s the limit of its backup customization options. It also doesn’t provide any automation, so that’s something you’d have to manage yourself. This script is not actively maintained and, as of the writing of this article, it hadn’t been updated in two years.

Python GitHub Backup by Jose Diaz-Gonzalez

Python GitHub Backup by Jose Diaz-Gonzalez is a collection of Python scripts bundled into an installable CLI tool that back up GitHub user and organization repositories.

You can install it by running the following code:

pip install git+https://github.com/josegonzalez/python-github-backup.git#egg=github-backup

After installation, all you need to do is specify a personal access token using the --token flag and a backup directory with --output-directory, then create a backup. Use the --repositories flag to back up repositories, and add -O when the name you pass is an organization rather than a user.

github-backup <organization> --token <token> --output-directory <backup directory> --repositories -O

It backs up each repository with a working directory unless you request a bare repository with the --bare flag.

[Screenshot: sample output of Jose Diaz-Gonzalez’s Python GitHub Backup]

Python GitHub Backup is installed with just a single command, and doesn’t require that you install its dependencies separately, making it very easy to set up. It’s also pretty simple to use, and offers a wide variety of features and customization options. You can back up everything from issues and pull requests to milestones and wikis, and the script allows you fine-grained control over what you back up. For example, with pull requests, you can decide to only back up commits, and leave out comments and other details.
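As a sketch of that fine-grained control, the function below assembles a command that backs up bare repositories plus wikis, issues, milestones, and pull requests with their commits but without their comments. The function name backup_org and the RUN switch are invented for this example, and the flag names beyond those shown earlier are given as I understand them from the tool's help output, so confirm them against your installed version before relying on them.

```shell
#!/usr/bin/env bash
# backup_org ORG TOKEN OUTPUT_DIR
# Hypothetical helper: prints the github-backup command it would run.
# Set RUN=1 to actually execute it (requires github-backup to be installed).
backup_org() {
  local org="$1" token="$2" out_dir="$3"
  # Flag names are assumptions; verify them against the tool's help output
  set -- "$org" -O \
    --output-directory "$out_dir" \
    --repositories --bare \
    --wikis --issues --milestones \
    --pulls --pull-commits
  if [ "${RUN:-0}" = "1" ]; then
    github-backup "$@" --token "$token"
  else
    # Dry run: show the command with the token hidden
    echo "github-backup $* --token <hidden>"
  fi
}
```

Calling backup_org example-org "$TOKEN" /var/backups/github prints the assembled command, which makes it easy to review the selection before a real run with RUN=1.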

To view all the options it offers, you can run github-backup --help. Unfortunately, it doesn’t offer any automation. Its README lists it as feature complete: bug fixes and pull requests for enhancements are welcome, but as of this writing it hasn’t been updated in fourteen months.

Downsides of Using GitHub Backup Scripts

While using free and sometimes open source backup scripts can be cost-effective, they may not be the best choice for your backups. Most of these scripts only back up repositories, and fail to capture metadata like pull requests and issues. They also give you limited control over what you back up—it’s often impossible to select specific repositories to include and others to ignore.

If a script author does not make regular updates, you’ll also need to maintain the script. Free scripts are often outdated, and don’t receive regular security patches or bug fixes. As a result, they may contain security vulnerabilities, or become unusable due to things like API changes.

Support is limited, as well, as most of them are maintained by an individual creator, who may be slow to merge external contributions and respond to queries. If a backup fails, it’s nearly impossible to track the cause of the failure, as these scripts offer no reports or notifications. Finally, restorations from these backups are challenging, as they offer no mechanisms for recovery.
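To give a sense of what a manual restoration involves, the sketch below pushes a bare or mirror backup to an empty target repository using standard Git commands. The function name restore_mirror is hypothetical, and the sketch assumes the backup has already been extracted from its tar archive if it was stored as one. It restores Git data only, not issues, pull request discussions, or other metadata.

```shell
#!/usr/bin/env bash
# restore_mirror BACKUP_DIR TARGET_URL
# Hypothetical helper: pushes every ref from a bare/mirror backup
# to an empty target repository.
restore_mirror() {
  local backup_dir="$1" target_url="$2"
  # Refuse to continue if the path is not a bare Git repository
  if [ "$(git -C "$backup_dir" rev-parse --is-bare-repository 2>/dev/null)" != "true" ]; then
    echo "$backup_dir is not a bare Git repository" >&2
    return 1
  fi
  # --mirror pushes all branches, tags, and other refs in one go
  git -C "$backup_dir" push --mirror "$target_url"
}
```

For example, restore_mirror /var/backups/github/example-repo.git git@github.com:example-org/example-repo.git would repopulate a freshly created empty repository (both paths are placeholders). When the target is GitHub, refs that track pull requests are typically rejected, so some errors in the push output may be expected even when branches and tags arrive intact.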

Using Rewind For Backups

Rewind is a backup service that automatically makes daily GitHub backups of your repositories. It’s easy to set up, as all you have to do is provide your GitHub credentials—no technical knowledge or extensive scripting needed. After authorization, your backup will begin, and will run regularly without your intervention. Rewind backs up not only your repositories, but also their pull requests, wikis, issues, projects, milestones, and other crucial data.

Rewind is GDPR and SOC 2 Type 2 compliant, and offers an audit log, year-long data retention, and multiple storage location choices. Additionally, you can copy your backups to Azure Sync or AWS S3 for improved redundancy. If anything happens to your GitHub repo, you can restore your code and assets in a few simple steps, making restorations hassle-free.

Conclusion

Maintaining GitHub backups is essential to business continuity in the face of security compromises, repo hosting provider outages, or accidental data loss. Using GitHub backup scripts, you can set up backups inexpensively. However, it’s important to consider how easy they are to use and set up, customization options, automation options, and how well the script is maintained when selecting a script. While scripts can be great choices in some instances, they often don’t back up repository metadata, are insecure, and offer no automation or monitoring capabilities. With Rewind, you can set up recurring backups and make restorations securely and seamlessly in just a few clicks.

Sarah Bader
Sarah Bader is a content writer, tech enthusiast, and passionate supporter of the Oxford Comma. When she puts her pen down, she can often be found riding her bike around Ottawa or watching trashy reality TV with her dog (he’s a big fan).