GitHub Enterprise Server is the on-premises git repository hosting offering from GitHub. Large organizations commonly run GitHub Enterprise Server for improved control and security over their code repositories. With Enterprise Server, you can limit access to a private network, set rules for creating and accessing repositories, use SAML for single sign-on across your organization, and get access to premium GitHub support. You can also migrate to different hardware as your team and repositories grow, so there’s a lot of power in GitHub Enterprise Server.
GitHub also offers an Enterprise Cloud option that gives you dedicated enterprise resources, but it runs on GitHub’s hardware. With a 99.95% uptime SLA and access to GitHub’s built-in security features, Enterprise Cloud is an appealing offering. It allows you to restrict user access based on a user’s network and add SAML sign-on, but it removes your need to maintain potentially expensive infrastructure.
For many organizations, starting with GitHub Teams or GitHub Enterprise Cloud makes sense. If you eventually need more control over your servers, it’s relatively straightforward to migrate the data using GitHub’s API. But what if you want to migrate from GitHub Enterprise Server to GitHub Enterprise Cloud?
Moving your data in this direction proves to be much more challenging. In this article, I’ll walk you through my first two unsuccessful attempts to migrate from GitHub Enterprise Server to Enterprise Cloud. Finally, I’ll share what worked for me based on GitHub’s recommendations.
The Challenge of Migrating From GitHub Enterprise Server to Enterprise Cloud
A git repository tracks changes to all the files in a project. Typically, software projects put a .git/ directory inside their codebase that includes all of git’s records. Because git is an open-source standard, there’s nothing special about a git repository hosted on your server, GitHub’s servers, or even a static file hosting solution like Amazon S3.
The challenge in migrating between git hosting providers is that most hosts offer additional features that enhance your team’s ability to work with git repositories. For example, GitHub includes issues, pull requests, metadata, and projects. While both GitHub Enterprise Server and the hosted GitHub solutions include these features, GitHub does not provide an official method for migrating this data from GitHub Enterprise Server to GitHub Enterprise Cloud.
Before I found an approach that worked, I tried two methods that didn’t.
Attempt 1: GitHub Enterprise Server Export
After reading the GitHub Enterprise Server documentation, the first idea I had was to use GitHub’s Enterprise export command-line tool to export my Enterprise Server data. The tool is designed for migrating your GitHub Enterprise data from one server to another, but I was hoping I could import the same data into a GitHub Enterprise Cloud or Team account too.
When you run the export, you end up with a folder containing JSON files and a subdirectory for each repository you exported.
[Image: GitHub Enterprise Server export results]
While this has all the information you need, there’s no easy way to import the GitHub Enterprise Server export into GitHub Enterprise Cloud. GitHub’s import API endpoint uses the repository’s git URL to grab the data from an external version control system and import it into GitHub.
This method might work if you wrote a script to parse the Enterprise Server export data and call each GitHub API endpoint sequentially, but this would be a huge challenge. There’s no detailed documentation on the format of the Enterprise Server export data, and it’s not clear if it contains any data that can’t be replicated in GitHub Enterprise Cloud.
Before I broke down and started writing custom code, I decided I would try the GitHub Importer first.
Attempt 2: GitHub Importer
To make it easier to import your repositories into GitHub, you can use the GitHub Importer tool or import API endpoint. Either of these methods lets you enter a repository URL and user credentials to import the repo into your GitHub Cloud account.
The problem is that GitHub Enterprise Server is not designed to publicly share your repositories. Most Enterprise Server installations are on private networks, and if they’re not, they require private mode to be enabled.
To use the GitHub Importer, your Enterprise Server repositories must be publicly available on the internet and given read access. This runs counter to the goal of GitHub Enterprise Server as a means to increase the security and privacy of your code, so I didn’t pursue this option further.
After my first two attempts stalled, I decided to try one last method to migrate from Enterprise Server to Cloud.
How to Migrate from GitHub Enterprise Server to Enterprise Cloud
Moving your git repos from one repository host to another is easy if you don’t care about all the added features (like pull requests, issues, etc.). Git provides command-line arguments that allow you to add new remotes and push your repository to a new host. The challenging part is bringing along the rest of GitHub’s data during the migration.
The solution that worked for me (and that GitHub’s support team recommends) is to migrate from Enterprise Server to Enterprise Cloud using a two-step process:
- Step 1: Push each repository to a new GitHub Cloud remote
- Step 2: Use the GitHub API to save and then migrate GitHub’s data piece-by-piece
Here’s how I migrated a single test repository from GitHub Enterprise Server to GitHub Cloud.
Step 1: Push Each Repository to GitHub Enterprise Cloud
Moving a repository and all its tags and branches from GitHub Enterprise Server to GitHub Cloud is the most straightforward part of this process.
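First, clone the repository you want to transfer. Passing the --mirror flag makes the local copy include every branch and tag, not just the default branch a plain clone checks out:
git clone --mirror git@<YOUR_SERVER_URL>:<YOUR_REPOSITORY_NAME>.git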
Next, create a new repository in your GitHub Enterprise Cloud account with the same repository name. Once you’ve created the new repository, GitHub will show you its SSH address.
Now you can add the new repository as a remote for the repository you cloned from your GitHub Enterprise Server account:
git remote add new-origin git@github.com:<NEW_REPOSITORY_NAME>.git
Push the repository to this new-origin using the --all flag to transfer every local branch, then push again with the --tags flag, because --all does not include tags:
git push new-origin --all
git push new-origin --tags
Refresh the page in your GitHub Cloud account, and you should see the repository with all the branches. You should let everyone in your organization know that the repository has been moved and ensure they update their origin remotes appropriately.
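Because Step 1 has to be repeated for every repository, the git commands above could be wrapped in a small script. The following is only a sketch under stated assumptions: `build_git_commands` and `run_migration` are hypothetical helper names, the host and repository paths are placeholders, and the mirror clone is my choice for making sure every branch and tag exists locally before pushing.

```python
import subprocess


def build_git_commands(server_host, source_path, cloud_path):
    """Return (argv, working_directory) pairs for moving one repository.

    server_host, source_path, and cloud_path are placeholders, for example
    "git.example.com", "my-org/my-repo", and "new-org/my-repo".
    """
    # A mirror clone is bare, so git names the directory "<repo>.git".
    local_dir = source_path.split("/")[-1] + ".git"
    return [
        (["git", "clone", "--mirror",
          f"git@{server_host}:{source_path}.git", local_dir], None),
        (["git", "remote", "add", "new-origin",
          f"git@github.com:{cloud_path}.git"], local_dir),
        (["git", "push", "new-origin", "--all"], local_dir),
        (["git", "push", "new-origin", "--tags"], local_dir),
    ]


def run_migration(server_host, source_path, cloud_path):
    """Run each command in order and stop at the first failure."""
    for argv, cwd in build_git_commands(server_host, source_path, cloud_path):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_migration("git.example.com", "my-org/my-repo", "new-org/my-repo")` for each repository would repeat the manual steps. The empty target repository still has to exist in GitHub Enterprise Cloud before the push.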
Now your repository has been moved, but none of the other GitHub data was transferred. In the next step, you’ll see how to transfer your issues, pull requests, and projects.
Step 2: Use GitHub’s API to Migrate Other Data
Depending on the features your team uses, you probably want to transfer the GitHub issues, comments, projects, pull requests, settings, and repository metadata. Unfortunately, this information isn’t stored in git, so you’ll have to use GitHub’s API to move it to your new repository.
I won’t go through every piece of data you could transfer, but I’ll show you an example of how you can migrate your issues from GitHub Enterprise Server to Enterprise Cloud. The same process will work for all of GitHub’s data.
First, download a list of all the issues from your GitHub Enterprise Server repository:
curl -X GET -u "<YOUR_USERNAME>:<YOUR_ACCESS_TOKEN>" -H "Accept: application/vnd.github.v3+json" "https://<YOUR_ENTERPRISE_SERVER_URL>/api/v3/repos/<ORG_NAME>/<REPO_NAME>/issues?state=all&per_page=100" --output ~/issues.json
Note that <YOUR_USERNAME> and <YOUR_ACCESS_TOKEN> in this call are your username and personal access token for your GitHub Enterprise Server account. You’ll need your username and access token for your GitHub Enterprise Cloud account in the next step.
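This produces a file called issues.json which contains the issues for this repository. By default, the API lists only open issues and returns them one page at a time, so a repository with many issues needs the state and per_page query parameters and possibly several requests. Each issue includes the data you’ll need to create the issue in your new GitHub Enterprise Cloud environment using the API. Unfortunately, there’s no bulk create endpoint for issues, so you’ll have to write a script that iterates through each issue.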
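For repositories with more issues than fit in one response, the download step can be scripted so that it follows GitHub’s pagination. This is a minimal sketch rather than a finished tool: `next_page_url` and `fetch_all_issues` are hypothetical names, the credentials are the same username and personal access token used with curl, and the `state=all` and `per_page=100` query parameters are there to include closed issues and cut down the number of requests.

```python
import base64
import json
import re
import urllib.request


def next_page_url(link_header):
    """Return the rel="next" URL from a Link response header, or None."""
    if not link_header:
        return None
    for url, rel in re.findall(r'<([^>]+)>;\s*rel="([^"]+)"', link_header):
        if rel == "next":
            return url
    return None


def fetch_all_issues(api_base, org, repo, username, token):
    """Collect every page of issues from a repository.

    api_base is an assumption such as "https://<YOUR_ENTERPRISE_SERVER_URL>/api/v3".
    """
    url = f"{api_base}/repos/{org}/{repo}/issues?state=all&per_page=100"
    credentials = base64.b64encode(f"{username}:{token}".encode()).decode()
    headers = {
        "Accept": "application/vnd.github.v3+json",
        "Authorization": f"Basic {credentials}",
    }
    issues = []
    while url:
        request = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(request) as response:
            issues.extend(json.load(response))
            # Follow the Link header until there is no "next" page.
            url = next_page_url(response.headers.get("Link"))
    return issues
```

The returned list can be written out with `json.dump` to get the same issues.json file the curl command produces. The issues endpoint also lists pull requests, which are marked with a `pull_request` key.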
To test this out, you can create a single issue manually via curl:
curl -X POST -u "<YOUR_USERNAME>:<YOUR_ACCESS_TOKEN>" -H "Accept: application/vnd.github.v3+json" -d '{"body":"It is just a test though.","title":"This is a new issue"}' https://api.github.com/repos/<ORG_NAME>/<REPO_NAME>/issues
Depending on how much data you decide to migrate, repeat this process of downloading resources from your GitHub Enterprise Server account and iteratively recreating them in GitHub Cloud until you’re finished. Then, repeat the process for each repository you want to migrate.
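The iteration described above might look something like the following. Treat it as a sketch: the helper names are hypothetical, it copies only the title, body, and label names (this assumes those label names are usable in the target repository), and the one-second pause between requests is my own precaution against rate limits, not something this article’s process requires.

```python
import base64
import json
import time
import urllib.request


def issues_to_migrate(exported):
    """Drop pull requests (the issues endpoint lists them too), oldest first."""
    only_issues = [item for item in exported if "pull_request" not in item]
    return sorted(only_issues, key=lambda item: item["number"])


def to_create_payload(issue):
    """Keep only fields that the create-issue endpoint accepts."""
    return {
        "title": issue["title"],
        "body": issue.get("body") or "",
        "labels": [label["name"] for label in issue.get("labels", [])],
    }


def create_issue(org, repo, username, token, payload):
    """POST one issue to GitHub Enterprise Cloud and return the created issue."""
    credentials = base64.b64encode(f"{username}:{token}".encode()).decode()
    request = urllib.request.Request(
        f"https://api.github.com/repos/{org}/{repo}/issues",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Accept": "application/vnd.github.v3+json",
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def migrate_issues(path, org, repo, username, token):
    """Recreate every issue from issues.json; return {old number: new number}."""
    with open(path) as handle:
        exported = json.load(handle)
    number_map = {}
    for issue in issues_to_migrate(exported):
        created = create_issue(org, repo, username, token, to_create_payload(issue))
        number_map[issue["number"]] = created["number"]
        time.sleep(1)  # assumed pause between writes
    return number_map
```

Creating the issues in ascending order of their original numbers keeps them in roughly the same sequence in the new repository. Every issue is created open, so closed issues would need a follow-up update through the API.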
You’ll quickly find that this method isn’t perfect.
GitHub assigns its own sequential numbers and IDs to every resource you create, so the issues you recreate won’t necessarily keep the numbers they had before, and there’s no simple way to correlate the issues, users, comments, and other resources to those in your Enterprise Server account. This means you can either go through each resource and manually convert the IDs, or accept the broken references and make the best of it. Neither solution is ideal. The timestamps will also be set to the date each issue is imported, so the order of items in your GitHub issues could be affected.
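One way to soften both problems is to keep a mapping from old issue numbers to new ones and to record the original author and date in the issue body. The sketch below is an illustration, not part of GitHub’s recommended process: `rewrite_references` and `with_provenance` are hypothetical helpers, and `number_map` is assumed to be a dictionary of old-to-new issue numbers that you collect while recreating the issues.

```python
import re

# Matches "#123" references that are not glued to a preceding word or path.
ISSUE_REF = re.compile(r"(?<![\w/])#(\d+)\b")


def rewrite_references(text, number_map):
    """Replace "#old" with "#new" for every number present in number_map."""
    def swap(match):
        new_number = number_map.get(int(match.group(1)))
        return f"#{new_number}" if new_number is not None else match.group(0)
    return ISSUE_REF.sub(swap, text)


def with_provenance(issue):
    """Prefix the body with the original author and creation time."""
    header = (
        f"Originally opened by {issue['user']['login']} "
        f"on {issue['created_at']}."
    )
    body = issue.get("body") or ""
    return f"{header}\n\n{body}" if body else header
```

For example, `rewrite_references("Fixes #12", {12: 40})` returns `"Fixes #40"`. Because references can point at issues that have not been recreated yet, the rewrite is easiest to apply as a second pass that updates each new issue once the full mapping is known.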
Conclusion
It may seem tedious to migrate from GitHub Enterprise Server to GitHub Enterprise Cloud like this, but if you want to do the migration yourself, it’s the only way. The other option for teams with a budget is to contact GitHub’s Professional Services division. They do custom development and implementation work like this and have likely faced similar migrations in the past.
Another common challenge for teams with changing git hosting needs is managed backups. BackHub (now known as Rewind) is the best way to make sure you never lose data in your GitHub repositories, and their nightly automated backups can help you with compliance and governance. BackHub offers backups to Amazon S3, and it’s available in the GitHub Marketplace today.