How Mercado Libre backs up 13,000+ repositories with Rewind Backups for GitHub

As microservices continue to gain adoption in enterprise software applications, more companies are using them to solve interesting problems at scale.

“Breaking a monolith into microservices has clear engineering benefits including improved flexibility, simplified scaling, and easier management—all of which result in better customer experiences.”

Mary Treseler, Vice President of Content Strategy at O’Reilly

But new technologies always bring new challenges, and microservices are no exception. One of the biggest hurdles in adopting microservices is the need for new, unique tooling that works across multiple codebases, programming languages, and HTTP layers. One such example is handling GitHub backups at scale when you use a microservice architecture.

In this post, you’ll see how Mercado Libre—the largest online retailer in Latin America—uses Rewind Backups for GitHub to manage backups for their massive collection of over 13,000 repositories. You’ll learn about some of the unique challenges companies like Mercado Libre face when backing up their GitHub repositories and how Rewind has designed an industry-leading solution to handle these challenges.

Introducing Mercado Libre

North American and European readers may not be familiar with Mercado Libre, but the company is an e-commerce giant in Latin America. It operates in eighteen countries and processed over $14 billion in payments in the first three quarters of 2020. In addition to its e-commerce platform, the company runs a point-of-sale system, a payments platform, and a growing logistics operation.

Technology at Mercado Libre

All these services were built in-house by their team of over 5,000 software engineers, product managers, and designers using a unique microservices architecture. The team now maintains over 13,000 GitHub repositories, which are managed by a custom-built orchestrator.

Each microservice is written in the best programming language for the job at hand (typically Java, Go, or Python) and independently deployed to Amazon Web Services or Google Cloud Platform. This architecture makes their system incredibly robust, but it introduced some unique challenges once the team started to explore backing up all their repositories.

Mercado Libre’s Journey to Rewind

As you might imagine, keeping track of 13,000 repositories is a big job in itself, but the problem gets even more complicated when you face the level of scrutiny that a publicly traded company like Mercado Libre faces. As their payment processing platform grew, Mercado Libre decided they needed to have internal backups of all their codebases hosted on GitHub.

Repository backups are a common part of SOC 2 compliance. GitHub is a very reliable place to keep your code, but your code is a critical part of your infrastructure, and backups ensure you have access to it at all times. Like many teams, Mercado Libre started by building an in-house backup solution.

One of their engineers created a script that ran nightly as an AWS Lambda function. It performed a `git clone` on each of Mercado Libre’s GitHub repositories and pushed the data to an S3 bucket. This simple solution worked for a while, but pretty soon, it was clear that it had some shortcomings.
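To make the approach concrete, here is a minimal sketch of what a clone-and-upload script of this kind might look like. It is not Mercado Libre's actual script: the `--mirror` flag, the tarball step, the S3 key layout, and every function name here are assumptions made for illustration.

```python
import datetime
import shutil
import subprocess
import tempfile
from pathlib import Path


def clone_command(clone_url: str, dest: Path) -> list[str]:
    """Build the git command; --mirror copies every ref, not just the default branch."""
    return ["git", "clone", "--mirror", clone_url, str(dest)]


def backup_key(full_name: str, day: datetime.date) -> str:
    """Build an S3 object key. The 'github/<date>/<org>/<repo>' layout is hypothetical."""
    return f"github/{day.isoformat()}/{full_name}.tar.gz"


def backup_repo(clone_url: str, full_name: str, bucket: str, day: datetime.date) -> str:
    """Clone one repository, archive it, and upload the archive to S3."""
    import boto3  # imported lazily so the helpers above work without AWS dependencies

    with tempfile.TemporaryDirectory() as tmp:
        dest = Path(tmp) / "repo.git"
        subprocess.run(clone_command(clone_url, dest), check=True)
        # Pack the mirrored repository into <tmp>/repo.tar.gz
        archive = shutil.make_archive(str(Path(tmp) / "repo"), "gztar", root_dir=dest)
        key = backup_key(full_name, day)
        boto3.client("s3").upload_file(archive, bucket, key)
    return key
```

A script like this copies git history only. Nothing in it reports whether a given night's run succeeded, which is where the shortcomings described next come from.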

First, there was very little visibility into the backup process. Engineers at Mercado Libre could check the dates on each repository in their S3 bucket, but nobody had time to check each repository every day. Second, the script didn’t use the GitHub API, so it wasn’t able to back up issues, pull requests, or metadata from each repository. If GitHub failed, this data would have been lost. Finally, there wasn’t an easy path to restore data from these backups. Mercado Libre’s engineers would have had to recreate all the GitHub repositories manually, so restoring all 13,000 repositories would have been a time-consuming task.
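The visibility gap is the easiest of the three to illustrate. The sketch below shows one way a team might automate the "check the dates" step: given the newest backup timestamp per repository (for example, collected from the `LastModified` field of S3 objects), it flags anything missing or too old. The function name, the 36-hour threshold, and the repository names in the example are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_backups(expected_repos, last_backup, now, max_age=timedelta(hours=36)):
    """Return repositories whose newest backup is missing or older than max_age.

    expected_repos: iterable of repository names that should have a backup.
    last_backup: dict mapping repository name to the datetime of its newest backup.
    """
    stale = []
    for repo in expected_repos:
        seen = last_backup.get(repo)
        # A repository with no backup at all counts as stale too.
        if seen is None or now - seen > max_age:
            stale.append(repo)
    return sorted(stale)
```

Feeding the result into an alert would turn a silent failure into a visible one, but it still would not cover issues, pull requests, or restores.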

After running the Lambda script for a while, someone on the engineering team noticed that the backups were no longer working. The permissions on the S3 bucket had been changed, so the IAM role performing the backups could no longer save git repositories to the bucket.

After realizing the rabbit hole their team would be going down to ensure this didn’t happen again, Mercado Libre’s engineers started looking for a better solution. They asked GitHub, who recommended Rewind Backups for GitHub.

Adopting Rewind Backups at Mercado Libre

When Mercado Libre approached Rewind to ask about backing up their 13,000 GitHub repositories, the Rewind team was excited.

Rewind was already backing up over 80,000 repositories every day, but Mercado Libre would be the biggest single customer on the platform, so it would give the Rewind engineers the chance to test a new level of scale.

Mercado Libre tried Rewind Backups for GitHub for three weeks and monitored the results. The initial backup took four days due to the time-based limits imposed by the GitHub API and the sheer volume of data that needed to be saved, but incremental backups after that took just an hour. Mercado Libre’s team was impressed with the results, but before we get to that, let’s look at how Rewind handles backups at this scale.
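As a rough illustration of why API rate limits alone can stretch an initial backup over days, here is a back-of-the-envelope estimate. It assumes the 5,000 requests per hour that GitHub documents for authenticated REST API users. The limit that actually applied to Rewind's integration is not stated in this post, and the number of requests needed per repository is a made-up figure.

```python
import math

# Documented GitHub REST API limit for authenticated users; other auth types differ.
GITHUB_REST_LIMIT_PER_HOUR = 5000


def hours_for_initial_backup(repo_count, requests_per_repo,
                             limit_per_hour=GITHUB_REST_LIMIT_PER_HOUR):
    """Lower bound on hours needed if every request counts against one hourly limit."""
    total_requests = repo_count * requests_per_repo
    return math.ceil(total_requests / limit_per_hour)
```

With 13,000 repositories and a hypothetical 25 requests each, `hours_for_initial_backup(13000, 25)` comes to 65 hours, a little under three days, before counting the time needed to transfer the git data itself.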

Final results

Mercado Libre’s use case is an excellent example of how Rewind Backups for GitHub can save you weeks of engineering time while improving your infrastructure’s redundancy.

“Rewind Backups for GitHub has been very good for Mercado Libre. We know it’s working, and we don’t have to maintain our own solution anymore. It runs without any intervention from us.” –Mariano Guelar, Governance Project Leader at Mercado Libre

Mercado Libre’s team got Rewind up and running quickly and with very little configuration required. Once their backups were being stored in S3 every night, they set up a lifecycle policy that moved files to Amazon Glacier storage after sixty days. This allows them to stay compliant while minimizing their cloud storage costs.
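For readers who want to set up something similar, the sketch below builds a lifecycle configuration that transitions objects to the Glacier storage class after sixty days and applies it with boto3. The rule ID, the empty prefix (meaning the whole bucket), and the function names are assumptions. Mercado Libre's actual policy is not shown in this post.

```python
def glacier_lifecycle(days=60, prefix=""):
    """Build an S3 lifecycle configuration that moves objects to Glacier after `days`."""
    return {
        "Rules": [
            {
                "ID": f"archive-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},  # empty prefix applies to every object
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_lifecycle(bucket, config):
    """Attach the lifecycle configuration to a bucket (requires AWS credentials)."""
    import boto3  # imported lazily so glacier_lifecycle() works without boto3

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )
```

Calling `apply_lifecycle("my-backup-bucket", glacier_lifecycle())` would apply the sixty-day rule, where the bucket name is a placeholder.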

If your organization is looking for a robust GitHub repository backup solution, check out Rewind. Whether you’re running a single monorepo or 13,000 small repositories, Rewind can ensure you never lose access to your code.