Safeguarding your SaaS data: debunking the top 9 backup and restore myths

Joel Hans | Last updated on May 16, 2024 | 9 minute read

If you’re an IT administrator or DevOps engineer worth their salt (and if you’re reading this, then we think you’re one of them!), you already have some strong opinions about backing up and restoring on-premises data. You’ve heard and disproven all the myths in day-to-day practice as you protected data from accidental deletion or malicious attack. You’ve maybe even restored swaths of mission-critical data while your entire organization watched over your shoulder as you pecked away at your command line.

But do you have the same strong feelings about your organization’s SaaS data? Do you have a similarly strong mental model for protecting mission-critical data when you no longer control the environment?

In our experience, a lot of top-tier administrators and engineers learn the hard way that many of the ideas and strategies they once proved on-premises are actually myths when it comes to SaaS data. If you’re shaky on the fundamentals, even as your organization starts integrating supposedly helpful AI assistants in this era of quickly spreading shadow IT, let us offer you a new sorting algorithm to help you rebuild that mental model and learn where to start tightening up your practices.

Myth #1: Your SaaS data is protected by the provider

You might assume that when you use SaaS apps, particularly when you pay for them, the provider’s engineers are doing everything they can to back up your data. You might also assume that they would have a strong restoration process to bring you back up to speed ASAP. If your SaaS data disappeared without any path to remediation, they would lose a customer…right?

The truth? The logic is sound, but SaaS providers have come up with a sneaky strategy for escaping the obligation of protecting your data: the Shared Responsibility Model. This paradigm means the SaaS provider can restore their service and your data in improbable disaster scenarios, like an asteroid strike, but does nothing for far more probable situations, like an employee accidentally deleting your Jira Cloud instance at 4:59 pm on a Tuesday. You’re not alone in assuming your SaaS data is inherently safe: in a recent Rewind survey, we found 83 percent of IT professionals still assumed their SaaS vendors would complete data restoration requests.

Myth #2: Ransomware is your biggest threat

A 2022 survey from Odaseva, which targeted senior data professionals from enterprises worldwide, found that 48 percent of organizations experienced a ransomware attack in the last 12 months, and that for 51 percent of them, SaaS data was the target.

We don’t deny the pervasive threat of ransomware and other external attacks—but there are others you should worry about first, especially in the SaaS landscape.

The truth? Human error still dominates the list of root causes of data incidents. A pivotal study by Stanford University Professor Jeff Hancock and Tessian found that mistakes cause 88 percent of all data breaches, and in the 2022 follow-up, more than half of employees had fallen for a phishing email that impersonated a senior executive. Even sending emails to the wrong person was prevalent in their findings: folks are stressed, burned out, and simply trying to work too quickly. All of these are situations ripe for SaaS data loss your provider won’t cover.

Myth #3: The 3-2-1 Backup Rule is still relevant with SaaS data

The 3-2-1 rule establishes a basic strategy for backing up production data in a way that’s resilient to data loss incidents driven by factors like fire or flood:

  • Keep 3 copies of current data.
  • Use 2 different media for backups, such as a physical copy, a virtual machine, or a cloud data store like S3.
  • Maintain 1 offsite backup in a different location or cloud environment than the others.

This rule makes a lot of sense when you’re dealing with employees in a central office who work from networked storage in the server closet. If you create one backup on an external RAID you stash in your cubicle and another with a cloud provider, you’ve effectively protected yourself against many “common” disasters.
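To make the rule concrete, here is a minimal Python sketch that checks a list of copies against 3-2-1. The Copy type, the medium labels, and the office example are hypothetical names we made up for illustration, and the sketch assumes your production copy counts as one of the three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # hypothetical labels, e.g. "nas", "external-raid"
    offsite: bool  # True if stored away from the primary location


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# The office scenario: networked storage, a RAID in your cubicle,
# and one copy with a cloud provider.
office = [
    Copy("production share", "nas", offsite=False),
    Copy("cubicle RAID", "external-raid", offsite=False),
    Copy("cloud bucket", "cloud-object-store", offsite=True),
]
print(satisfies_3_2_1(office))      # all three conditions hold
print(satisfies_3_2_1(office[:2]))  # only two copies, nothing offsite
```

Drop the cloud copy and the check fails on both the copy count and the offsite requirement, which is exactly the gap a fire or flood would expose.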

The truth? In an era where remote work is often the default, and because you no longer own the environment your SaaS platforms operate on, you have far fewer levers of trust. When your business depends on SaaS data, you need to flip the 3-2-1 rule on its head and focus first on an automated, complete, offsite backup. Once you’ve established that baseline, you can start to think about additional backups on different media.

Myth #4: An export file is a suitable backup

Nearly every SaaS offers a data export feature, which you can leverage to download a vault of your data. Save one of those locally and one on Google Drive, and you’re set!

The truth? Most SaaS companies built their data export to comply with data sovereignty laws, not to help you build a sophisticated backup strategy. If you dump a JSON or CSV file to your local filesystem and consider your work done, you’ll be in for a rough realization come the day you need to restore. Plus, what happens if you stored the export file locally and you’re on vacation? What if you accidentally deleted the export file, or didn’t timestamp your exports correctly, leaving you unsure which is the most recent?
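The timestamp problem, at least, is easy to avoid. As a rough sketch (the function names and the "jira-export" prefix are our own, not any provider's convention), you can put a UTC timestamp in every file name so that sorting by name is the same as sorting by age:

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def export_name(prefix: str, when: datetime, suffix: str = ".json") -> str:
    """Build a name such as 'jira-export-20240516T143000Z.json'.

    Pass a timezone-aware datetime; it is converted to UTC so names
    from different machines sort consistently.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}-{stamp}{suffix}"


def latest_export(directory: Path, prefix: str) -> Optional[Path]:
    """Return the newest export by name, or None if there are none."""
    candidates = sorted(directory.glob(f"{prefix}-*"))
    return candidates[-1] if candidates else None
```

This only solves the "which file is newest?" question. It does nothing for an export that lives on one laptop or that can't actually be restored.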

Myth #5: You can craft a backup and restore solution in a day (or with 3 story points)

With a Bash script, you can add sophistication and automation to an export file-based backup, and the process does sound simple enough: Export SaaS data using the provider’s API and an authentication key, rename the resulting file with mv and the current Unix time, and use rsync to send the file to a remote VM for long-term storage.
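For reference, here is roughly what that pipeline looks like, sketched in Python rather than Bash so it is easier to test. Treat every specific as an assumption: the export URL, the bearer-token header, and the remote path are placeholders, and your provider’s API will almost certainly differ. The sketch writes straight to a file named with the current Unix time instead of renaming with mv afterward.

```python
import subprocess
import time
import urllib.request
from pathlib import Path


def fetch_export(url: str, token: str) -> bytes:
    """Download an export. Bearer auth is an assumption, not a given."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()


def rsync_command(local_file: Path, remote: str) -> list[str]:
    """Build the rsync call; --archive preserves times and permissions."""
    return ["rsync", "--archive", "--compress", str(local_file), remote]


def run_backup(url: str, token: str, workdir: Path, remote: str,
               fetch=fetch_export, run=subprocess.run) -> Path:
    """Export, save under a Unix-time name, and ship to the remote VM."""
    data = fetch(url, token)
    target = workdir / f"export-{int(time.time())}.json"
    target.write_bytes(data)
    # check=True raises if rsync exits non-zero instead of failing quietly.
    run(rsync_command(target, remote), check=True)
    return target
```

The fetch and run parameters exist only so you can swap in fakes while testing; in real use you would call run_backup with your own URL, token, working directory, and a destination such as user@host:/path.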

The truth? The script isn’t the hard part. The ongoing maintenance is. At Rewind, we work with thousands of companies and back up terabytes of data daily, so we’ve seen the mistakes that come from assuming you can keep taking on more responsibility forever. Have you considered pruning old data to control costs or prevent failing backups due to a destination drive that’s full? Are you running your Bash script from a more reliable source than your local workstation? Are you validating existing data? Are you regularly testing your restoration process?
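Two of those questions, pruning and validation, can at least be sketched. The following is a minimal example under our own assumptions: backups are plain files in one directory, age is judged by modification time, and you recorded a SHA-256 checksum when each backup was created. The function names and defaults are hypothetical.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large backups don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_sha256: str) -> bool:
    """True if the file exists and still matches its recorded checksum."""
    return path.is_file() and sha256_of(path) == expected_sha256


def prune(directory: Path, keep_days: int, keep_min: int = 3,
          now: Optional[float] = None) -> list[Path]:
    """Delete files older than keep_days, always keeping the newest keep_min."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    files = sorted(
        (p for p in directory.iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = []
    for path in files[keep_min:]:
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

The keep_min floor is a deliberate choice: if your backups quietly stop running, an age-only policy would eventually delete every copy you have left. A matching checksum also only shows the file hasn’t changed since you wrote it, not that it restores cleanly, so it doesn’t replace testing your restoration process.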

Myth #6: A backup of content from a SaaS is good enough

If your organization works heavily on GitHub, you can always git clone your repository for a complete backup of your content, which is ostensibly the many lines of code from your development peers. For everything else, like a Jira Cloud instance for your internal ticketing system, you might think the export file mentioned above is comprehensive, containing the content, assignees, and status of every issue and project.

The truth? The metadata around your content, like the code review comments left on a particular pull request, is often just as important as the content itself. It reflects not just the current or production data, but the thinking and collaboration that got folks there. SaaS data export files often don’t include all metadata, which means you risk losing some unknown portion of your data during an incident. If you can’t wholly restore metadata from an export file, you’ll also have to rebuild the structure yourself, pushing your actual recovery time well past your recovery time objective (RTO).
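One way to find out what an export leaves behind is to audit it before you need it. The sketch below assumes a made-up export shape, a JSON object with an "issues" list whose entries carry a "key", and a hypothetical set of fields you consider essential. Real export formats vary by provider, so adjust both to match what you actually receive.

```python
import json

# Hypothetical field names; replace with the metadata you can't afford to lose.
REQUIRED_FIELDS = frozenset(
    {"summary", "status", "assignee", "comments", "attachments"}
)


def missing_metadata(export_json: str,
                     required=REQUIRED_FIELDS) -> dict[str, list[str]]:
    """Map each issue key to the required fields absent from the export."""
    issues = json.loads(export_json)["issues"]
    gaps = {}
    for issue in issues:
        absent = sorted(set(required) - set(issue))
        if absent:
            gaps[issue["key"]] = absent
    return gaps
```

An empty result means every issue carries every field you listed. It says nothing about fields you forgot to list, which is the harder part of the problem.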

Myth #7: SaaS simplifies—or even eliminates—the work for IT around data

One of the great selling points of SaaS and cloud environments in general is that they free you from all the administrative burdens of an on-premises deployment. Instead of worrying about networking infrastructure and the health of your hardware, you can redirect your focus toward security, compliance, or responding faster to requests from your peers.

The truth? That 2022 Odaseva survey mentioned in myth #2 also found that while 81 percent of respondents could completely recover all on-premises or private cloud data after a ransomware attack, only half could achieve a full recovery of SaaS data. Once you accept that your SaaS vendor is not obligated to restore your data, you inherit the operational complexity of protecting it yourself.

Myth #8: My organization is too small to require backups… or too big to fail

You might feel you can roll with the data-related punches if you’re a small business or a startup. You’re not an obvious target for attack, and even if you accidentally delete some of your SaaS data, you can rebuild quickly and without a ton of cost. You’re not like these billion-dollar global enterprises that can lose hundreds of thousands of dollars in just a few minutes.

Conversely, your organization might be so large, and you’re under so much pressure to deal with day-to-day operations, that you assume even a data loss incident in one unit couldn’t possibly impact everyone else. Heck, other folks might not even know it happened. The bigger you are, the more isolated even a catastrophic loss of SaaS data seems.

The truth? Whether you’re new and scrappy or decades-old and slow-moving at best, one reality applies: The longer you kick the can down the road, the harder it is to backfill a backup and restoration solution into the sprawling landscape of shadow IT your peers have incorporated into their day-to-day. The more SaaS becomes our working standard, the more likely a single incident will impact not just one person but daisy chain its way across your organization, leading to downtime at a cost you can’t afford.

Myth #9: Someone else has backups under control!

Imagine that you’re employee number 5 on an IT or DevOps team. No one would blame you for assuming that your sharp-minded new peers have already solved the “backup problem” and that you’re free to work on more exciting projects.

The truth? Even sophisticated organizations fail to delegate basic backup tasks to specific teams or engineers. Case in point: GitLab suffered a massive data loss incident partly because no one was responsible for monitoring whether their backup system was working appropriately, and when push came to shove, they discovered it had broken a while before, due to a cron job that failed silently day after day. Plus, many teams have built a sophisticated and secure system for on-premises or private cloud data, but that doesn’t mean they’ve even considered the organization’s growing SaaS landscape.
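A silent failure like that is cheap to catch if someone owns a check for it. As a minimal sketch, assuming your backups land as files in one directory and that "healthy" means recent and non-empty (the function name and thresholds are ours), a freshness check might look like this:

```python
import time
from pathlib import Path
from typing import Optional


def backup_problem(directory: Path, max_age_hours: float,
                   min_bytes: int = 1,
                   now: Optional[float] = None) -> Optional[str]:
    """Describe what is wrong with the newest backup, or return None."""
    now = time.time() if now is None else now
    files = [p for p in directory.iterdir() if p.is_file()]
    if not files:
        return "no backups found"
    newest = max(files, key=lambda p: p.stat().st_mtime)
    info = newest.stat()
    age_hours = (now - info.st_mtime) / 3600
    if age_hours > max_age_hours:
        return f"newest backup {newest.name} is {age_hours:.1f}h old"
    if info.st_size < min_bytes:
        return f"newest backup {newest.name} is only {info.st_size} bytes"
    return None
```

Run a check like this on a different schedule, and ideally a different machine, than the backup job itself, and route any non-None result to a person or channel that is actually watched. A monitor that fails as silently as the job it monitors hasn’t solved anything.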

What’s next?

Data recovery exists on a spectrum. Establishing an initial system and procedure doesn’t mark the process as “complete,” and neither do incremental improvements based around new technology, hard-learned lessons from data loss incidents, or learning what separates a myth from a truth—the best you can ever hope for is “more complete than last week/month/year.”

Here are two more ways to get you there:

Joel Hans
Joel Hans writes copy and marketing content that energizes startups with the technical and strategic storytelling they need to win developer trust. Learn more about how he helps clients like ngrok, CNCF, Rewind, and others at