The SaaS Resilience Gap: Why Continuity Stops at IaaS

Answer: The SaaS resilience gap is the structural mismatch between the resilience guarantees infrastructure (IaaS/PaaS) vendors provide and the resilience requirements businesses now have for the SaaS applications that run on top of them. Cloud infrastructure has had a mature, four-tier disaster recovery framework for over a decade. The SaaS layer above it, where business-critical workflow, IP, customer data, and compliance evidence now live, operates under a shared responsibility model that most organizations have not fully internalized. Closing the gap requires applying IaaS-grade resilience discipline (RTO/RPO, tiered recovery postures, tested restoration, governed retention) to SaaS systems. Gartner expects 75% of enterprises to treat SaaS application backup as a critical requirement by 2028, up from 15% in 2024.

For two decades, the cloud infrastructure industry built a sophisticated, well-understood, well-rehearsed playbook for resilience. Recovery Time Objectives and Recovery Point Objectives became standard vocabulary in engineering organizations. AWS’s Well-Architected framework codified four tiers of disaster recovery posture, with clear cost-benefit trade-offs. Compliance regimes like SOC 2, ISO 27001, HIPAA, and PCI DSS were built around the assumption that resilience could be designed, tested, and proven.

Then, in parallel, the locus of business value moved. Engineering work and code moved to Jira, GitHub, and Bitbucket. Customer data moved to Salesforce and HubSpot. Documentation moved to Confluence and Notion. Financial data moved to NetSuite and QuickBooks. The mission-critical operational layer of the modern business is now overwhelmingly SaaS. And the resilience playbook that protects the infrastructure beneath these applications largely does not extend to the applications themselves.

This mismatch between IaaS-grade resilience expectations and SaaS-era operational reality is the SaaS resilience gap. This piece is an attempt to define it precisely, examine why it exists, and propose what closing it actually requires.

How resilience was solved at the infrastructure layer

The IaaS resilience model is mature. AWS’s Well-Architected Reliability Pillar defines four disaster recovery strategies in increasing order of capability and cost: Backup and Restore, Pilot Light, Warm Standby, and Multi-site Active/Active. Each posture is paired with documented RTO/RPO targets. The framework is widely internalized across cloud-native engineering organizations, and the vocabulary is shared. The trade-offs are well understood: lower RTO costs more, and the right choice is a function of the business cost of downtime versus the cost of the standby capacity.
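The trade-off described above can be sketched as a small cost comparison. This is a minimal illustration, not AWS guidance: the RTO hours and annual standby costs are hypothetical placeholders, and the names `Posture`, `expected_annual_cost`, and `choose_posture` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    name: str
    rto_hours: float            # assumed recovery time, illustrative only
    annual_standby_cost: float  # assumed cost of keeping the posture ready


# Hypothetical figures chosen only to show the shape of the trade-off.
POSTURES = [
    Posture("Backup and Restore", 24.0, 5_000),
    Posture("Pilot Light", 4.0, 20_000),
    Posture("Warm Standby", 0.5, 60_000),
    Posture("Multi-site Active/Active", 0.0, 150_000),
]


def expected_annual_cost(posture, downtime_cost_per_hour, incidents_per_year):
    """Standby cost plus the expected cost of downtime while recovering."""
    downtime_cost = incidents_per_year * posture.rto_hours * downtime_cost_per_hour
    return posture.annual_standby_cost + downtime_cost


def choose_posture(downtime_cost_per_hour, incidents_per_year=1.0):
    """Return the posture with the lowest expected annual cost."""
    return min(
        POSTURES,
        key=lambda p: expected_annual_cost(p, downtime_cost_per_hour, incidents_per_year),
    )


if __name__ == "__main__":
    for hourly_cost in (100, 10_000, 50_000, 1_000_000):
        print(hourly_cost, "->", choose_posture(hourly_cost).name)
```

With these placeholder numbers, a system whose downtime is cheap lands on Backup and Restore, and the choice moves up the tiers as the hourly cost of downtime grows. Real inputs would come from the business owner and the vendor's actual pricing.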

Around this technical framework, an entire ecosystem of compliance, audit, and contractual practice has grown. SOC 2 Availability criteria require documented RTO/RPO, tested DR plans, and reviewed business continuity processes. Customer security reviews routinely include backup and DR questions. Cyber insurance underwriting examines whether DR is documented and exercised. The IaaS layer is a solved problem, with operational disagreements at the margins but consensus on the framework itself.

Why the SaaS layer didn’t inherit this maturity

Three factors explain the lag:

Speed of adoption outran the discipline. Organizations moved to SaaS rapidly to get the productivity benefits. The deliberate, slow work of building resilience frameworks for each SaaS system was deprioritized in favor of adoption velocity.
The shared responsibility model is poorly understood. Every major SaaS vendor publishes a shared responsibility model: they guarantee infrastructure availability, but the customer is responsible for application-level data protection and disaster recovery. A 2024 State of SaaS Data and Recovery report found that 79% of IT professionals believed SaaS applications include backup and recovery capabilities by default. They don’t.
The tooling lagged. IaaS-grade DR tooling like replication, point-in-time recovery, and automated failover took years to develop. SaaS-grade equivalents are still emerging, with different platforms exposing very different primitives for backup and recovery.

What the gap looks like in concrete terms

The gap manifests in four specific failure modes that organizations encounter operationally:

The outage gap

SaaS vendors commit to high uptime: Atlassian Cloud commits to 99.90% for Premium and 99.95% for Enterprise, and customers reasonably assume the vendor handles availability. But Atlassian's published RTO for major incidents is 6 hours, and the historical worst case is much longer. The April 2022 Atlassian incident affected 775 customers for up to 14 days. Most of them had no customer-side resilience posture to bridge that gap.
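To put those percentages in perspective, the downtime each commitment permits can be worked out directly. The sketch below assumes a 30-day month and a simple proportional calculation; actual SLA terms define their own measurement windows and exclusions, and the function names are illustrative.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed period)


def allowed_downtime_minutes(uptime_percent, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Minutes of downtime permitted in one period at a given uptime commitment."""
    return period_minutes * (100.0 - uptime_percent) / 100.0


def months_of_budget_consumed(outage_days, uptime_percent):
    """How many monthly downtime budgets a single outage of this length uses up."""
    outage_minutes = outage_days * 24 * 60
    return outage_minutes / allowed_downtime_minutes(uptime_percent)


if __name__ == "__main__":
    for sla in (99.90, 99.95):
        budget = allowed_downtime_minutes(sla)
        print(f"{sla}% uptime allows about {budget:.1f} minutes of downtime per month")
    # A 14-day outage measured against a 99.90% monthly budget.
    print(f"{months_of_budget_consumed(14, 99.90):.0f} monthly budgets consumed")
```

Under these assumptions, 99.90% allows roughly 43 minutes a month and 99.95% roughly 22, so a 14-day outage consumes several hundred months of downtime budget in one event.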

The configuration gap

SaaS applications store significant business logic in configuration: workflows, custom fields, automation rules, app-level settings. Native backups frequently capture the data without faithfully preserving the configuration that gives the data meaning. After a restore, organizations encounter the “hollow restore” pattern, where tickets exist, but the system that uses them is broken. Recovery time inflates from hours to weeks.
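One way to detect a hollow restore is to compare a configuration inventory taken before the incident with the inventory of the restored system. The sketch below assumes each inventory is a plain dictionary mapping a configuration category to a list of item names. That structure, the example names, and the function `find_missing_config` are hypothetical and not tied to any vendor's API.

```python
def find_missing_config(baseline, restored):
    """Return configuration items present in the baseline but absent after restore.

    Both arguments map a category (e.g. "workflows") to a list of item names.
    Only categories with at least one missing item appear in the result.
    """
    missing = {}
    for category, items in baseline.items():
        lost = sorted(set(items) - set(restored.get(category, [])))
        if lost:
            missing[category] = lost
    return missing


if __name__ == "__main__":
    # Hypothetical inventories for illustration.
    baseline = {
        "workflows": ["bug-triage", "release"],
        "custom_fields": ["severity", "customer-tier"],
        "automation_rules": ["auto-assign"],
    }
    restored = {
        "workflows": ["release"],
        "custom_fields": ["severity", "customer-tier"],
    }
    print(find_missing_config(baseline, restored))
```

An empty result means every inventoried item survived the restore. It does not prove the items behave identically, so a functional test of key workflows is still needed.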

The compliance gap

Auditors increasingly expect SaaS systems to meet the same controls as infrastructure systems: documented retention, tested restoration, role-based access on backup data, audit logs of backup operations. Native SaaS tooling typically does not provide audit-ready evidence for these controls. Veeam ransomware data puts the share of ransomware attacks that target backup repositories at 96%, and the controls auditors look for assume infrastructure-grade backup hygiene that SaaS native tools often lack.

The vendor concentration gap

When backup lives inside the same vendor as production, no posture exists for events that affect both simultaneously: ransomware, account compromise, mass deletion. The ShinyHunters Salesforce attacks in 2025 allegedly compromised over a billion records across more than 30 organizations, forcing affected enterprises without independent backups to choose between paying ransom and accepting permanent data loss.

Three case studies that illustrate the pattern

April 2022: Atlassian Cloud

An internal Atlassian script deleted 883 sites belonging to 775 customers. First customers were restored three days later; final restorations completed 14 days later. Atlassian’s post-incident review is unusually transparent: the issue was not just the deletion but that the company’s restoration tooling had not been designed for simultaneous multi-customer recovery at scale. Customer-side resilience would have meant continued read access during the 14 days. Almost no customer had it.

July 2024: CrowdStrike

A single content update from a security vendor disabled an estimated 8.5 million Windows systems globally. While not strictly a SaaS resilience event, Gartner cited the CrowdStrike outage as a catalyst for its prediction that 75% of enterprises will treat SaaS backup as a critical requirement by 2028. The lesson the industry absorbed: vendors fail, and resilience that depends on a single vendor is fragile by definition.

2025: ShinyHunters Salesforce campaign

Social engineering attacks on major enterprises, including Adidas, Allianz Life, and TransUnion, extracted data from Salesforce instances, allegedly compromising over a billion records. Organizations without independent backup faced a stark choice between paying ransom and accepting permanent data loss. Salesforce had by then acquired Own, now offered as Salesforce Backup & Recover. The market signal is unambiguous.

What closing the gap looks like

Closing the SaaS resilience gap is not a single product or a single decision; it is the deliberate application of IaaS-grade discipline to SaaS systems. Four practices distinguish organizations that have closed it from those that have not:

Defined RTO and RPO per SaaS system. Not aspirational. Documented, agreed to by the business owner, and tested.
A chosen DR posture per system. Backup and Restore, Pilot Light, Warm Standby, or Multi-site, explicitly chosen against the business cost of downtime.
Tested restoration on a regular cadence. Annual testing is the baseline auditors typically expect for SOC 2; quarterly is mature practice. The test produces auditor-ready evidence and prepares the team for the real event.
Configuration-aware backup. The backup preserves not just the data, but the logic that makes the data functional. Restores produce working systems, not hollow ones.
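The four practices above can be tracked as a simple register with one record per SaaS system. The sketch below is an assumption about how such a register might look: the `SaaSSystem` fields, the `resilience_gaps` function, and the 365-day default test window are illustrative choices, not a standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class SaaSSystem:
    name: str
    rto_hours: Optional[float] = None         # agreed recovery time objective
    rpo_hours: Optional[float] = None         # agreed recovery point objective
    posture: Optional[str] = None             # e.g. "Backup and Restore"
    last_restore_test: Optional[date] = None  # date of the last tested restore
    config_in_backup: bool = False            # backup covers configuration too


def resilience_gaps(system: SaaSSystem, today: date, max_test_age_days: int = 365) -> List[str]:
    """List which of the four practices are not yet in place for one system."""
    gaps = []
    if system.rto_hours is None or system.rpo_hours is None:
        gaps.append("RTO/RPO not defined")
    if system.posture is None:
        gaps.append("no DR posture chosen")
    if (
        system.last_restore_test is None
        or (today - system.last_restore_test).days > max_test_age_days
    ):
        gaps.append("restore test missing or stale")
    if not system.config_in_backup:
        gaps.append("configuration not covered by backup")
    return gaps


if __name__ == "__main__":
    jira = SaaSSystem(
        name="Jira",
        rto_hours=8,
        rpo_hours=24,
        posture="Backup and Restore",
        last_restore_test=date(2023, 1, 15),
    )
    print(resilience_gaps(jira, today=date(2025, 1, 15)))
```

Setting `max_test_age_days` to about 90 would model a quarterly cadence. An empty list means the system meets all four practices as recorded, which is only as reliable as the register itself.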

Where the market is going

The shift is happening. Gartner’s prediction that 75% of enterprises will prioritize SaaS application backup as a critical requirement by 2028, up from 15% in 2024, is a five-fold expansion in four years. The SaaS application backup market is expanding rapidly, initially led by specialized vendors and now joined by established enterprise backup software. Major SaaS platforms are themselves acquiring backup capabilities, with Salesforce’s acquisition of Own being the most visible recent example.

The companies closing the gap ahead of their peers gain three durable advantages: faster recovery when incidents happen, easier compliance posture across multiple frameworks, and a credible answer to the security reviews that increasingly gate enterprise deals.

The reframe

“Backup” is no longer the right word for what enterprise SaaS now requires. What is required is resilience: a continuous, tested, configuration-aware, governance-ready capability to keep the business operating when SaaS systems fail. The framework exists; it just needs to migrate from the infrastructure layer where it was developed to the application layer where the business actually lives. The companies that complete that migration first will not just be more resilient. They will be operationally and competitively advantaged in a market that is rapidly making this an expectation rather than a differentiator.

The SaaS resilience gap is closeable. It is closing. The question for any individual organization is not whether to close it—that decision is increasingly being made by regulators, customers, and insurers regardless—but whether to do so on the timeline of strategic advantage or on the timeline of crisis response.

Sources

AWS — Disaster Recovery Options in the Cloud (Well-Architected) — https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_planning_for_recovery_disaster_recovery.html
Atlassian — Service Level Agreement — https://www.atlassian.com/legal/sla
Atlassian — Post-Incident Review on the April 2022 outage — https://www.atlassian.com/blog/atlassian-engineering/post-incident-review-april-2022-outage
Atlassian — Approach to resilience (shared responsibility model) — https://www.atlassian.com/trust/security/data-management
Gartner — 75% of enterprises will prioritize SaaS backup by 2028 — https://www.gartner.com/en/newsroom/press-releases/2024-08-28-gartner-predicts-75-percent-of-enterprises-will-prioritize-backup-of-saas-applications-as-a-critical-requirement-by-2028
Spin.AI — The Shared Responsibility Gap in SaaS Security — https://spin.ai/blog/shared-responsibility-gap-saas-security/
TechTarget — SaaS shared responsibility model: What vendors don’t cover (ShinyHunters incident) — https://www.techtarget.com/searchdatabackup/tip/SaaS-shared-responsibility-model-What-vendors-dont-cover
IT Brief NZ — Gartner predicts rise in SaaS backup prioritisation by 2028 (CrowdStrike context) — https://itbrief.co.nz/story/gartner-predicts-rise-in-saas-backup-prioritisation-by-2028
Konfirmity — SOC 2 Backup and Recovery (Veeam ransomware data) — https://www.konfirmity.com/blog/soc-2-backup-and-recovery-for-soc-2

Rewind

Rewind is a leading and trusted provider of cloud backup and data recovery solutions, helping businesses safeguard their critical SaaS data from loss, corruption, and cyber threats.