In the first two blog posts of the series, Kurt talked about the concepts of protection and detection as they relate to Ransomware threat readiness. The industry has recognized that data integrity attacks such as Ransomware are not a matter of “if” but “when”. This is illustrated by the fact that various industry frameworks are starting to treat recovery as a problem domain in its own right.

Another change in approach is that recovery needs to evolve from traditional disaster recovery to include mechanisms to address data integrity threats.  Traditional disaster recovery assumes data integrity is still intact and that functional compute and storage simply become unavailable, presumably because of a natural or man-made physical disaster.  

Traditional Disaster Recovery is arguably simpler to address because of the assumption that data integrity is unaffected and that only the problem domain of offline physical compute and/or storage needs to be accounted for. 

So how does the scenario change if we add in the wrinkle of data integrity threats? It changes everything. For instance, some traditional Disaster Recovery approaches simply rely on continual replication of data to a warm/hot standby. This is a reasonable approach assuming the data has not been tampered with in any way. But what if some of the records have been altered or deleted? In that case you would simply be replicating corrupt data. Such corruption is not always malicious; it can also result from dropped database tables or other administrative errors. To meet Recovery Time Objective (RTO) and Recovery Point Objective (RPO) targets, processes need to be put in place that account for potential data integrity threats.
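The replication problem can be shown with a small sketch. It is a toy model only: the `primary`, `replica` and `snapshots` names are hypothetical in-memory stand-ins, not any real replication product. It shows why a standby that mirrors the primary offers no protection once records are altered, while an independently retained point-in-time copy does.

```python
import copy

# Hypothetical in-memory "databases": record id -> value.
primary = {"acct-1": 100, "acct-2": 250}
replica = {}
snapshots = []  # retained point-in-time copies, oldest first


def replicate():
    """Continuous replication: the standby always mirrors the primary."""
    replica.clear()
    replica.update(primary)


def take_snapshot():
    """Keep an independent copy of the primary as it is right now."""
    snapshots.append(copy.deepcopy(primary))


take_snapshot()            # known-good state captured
replicate()

primary["acct-1"] = -999   # record altered (malicious or accidental)
del primary["acct-2"]      # record deleted
replicate()                # the damage is faithfully copied to the standby

print("replica: ", replica)       # same corrupt data as the primary
print("snapshot:", snapshots[0])  # earlier known-good data is still available
```

In this model the age of the last good snapshot is the effective recovery point, which is why snapshot or log-archive frequency has to be chosen against the RPO target.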

 

Cyber Resilient Solution Design Tenets:

Cyber Resilient Backup and Recovery is not just standard backup and DR.  

  • Must have enhanced data integrity checking at each phase of the backup and recovery process.
  • Requires enhanced “separation of duties” architected into the technology and process.
  • Needs to address administrative, logical, and physical separation from the primary backup and recovery infrastructure.
  • Must support single transaction level backup and recovery for structured data.
  • Must keep backups accessible to high-speed rehydration (restore) mechanisms.
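As an illustration of the first tenet, the sketch below records a SHA-256 digest when a backup is taken and re-checks it at each later phase. The phase names and the `verify` helper are assumptions made for this example; real products implement their own validation, typically at the block or database level.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of a backup payload as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify(phase: str, data: bytes, expected: str) -> None:
    """Raise if the payload no longer matches the digest recorded at backup time."""
    if digest(data) != expected:
        raise ValueError(f"integrity check failed at phase: {phase}")


source = b"customer table export"
recorded = digest(source)          # recorded when the backup is taken

backup_copy = bytes(source)
verify("backup", backup_copy, recorded)

vault_copy = bytes(backup_copy)    # copy moved to the (hypothetical) vault
verify("vault", vault_copy, recorded)

# Simulate tampering that happens before the restore phase.
tampered = vault_copy.replace(b"customer", b"attacker")
try:
    verify("restore", tampered, recorded)
    restore_ok = True
except ValueError as err:
    restore_ok = False
    print(err)
```

For a check like this to be meaningful, the recorded digest has to be stored where the party who can modify the backup cannot also modify the digest, which is where the separation of duties tenet comes in.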

 

Without these controls there is inadequate recovery assurance:

  • Changes since last backup may be lost
  • Corrupt backups not easily discovered
  • No comprehensive recovery validation & status
  • High Production Impact
      ◦ Extensive CPU & I/O load for full + archived log backups
      ◦ Impacts performance of business-critical applications
  • Ineffective Ransomware Protection
      ◦ Ransomware can penetrate production, DR, and even backup
      ◦ Backups not validated – unknown status, RTO, RPO
      ◦ Ransom paid to hopefully regain access – no other option

 

Design tenets serve as a guide for solutioning. State-of-the-art cyber resilience reference architectures distill these tenets into industry-accepted constructs, which include components such as air-gapped enclaves, immutable vaults, clean rooms, and safe restoration environments.

 

Figure 2 – Logical Cyber Resilient Reference Architecture

Figure-2

Structured and unstructured data are two key data types to account for when designing cyber resilient backup and recovery deployments.  

To meet RTO and RPO targets for structured data, a solution must allow recovery at the level of individual transactions. In principle, this means that transaction logs are archived at regular intervals and can be replayed to roll transactions forward or back to a specific point in time.
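The roll-forward half of that idea can be sketched in a few lines. This is a simplified model, not how any particular database implements recovery: the `LogEntry` record, the integer timestamps and the `recover` function are all hypothetical, and rolling back would additionally require undo information that is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    ts: int                 # commit time on a hypothetical integer clock
    key: str                # record that the transaction touched
    value: Optional[int]    # new value, or None if the record was deleted


def recover(base: dict, log: list, target_ts: int) -> dict:
    """Start from a base backup and replay archived log entries up to target_ts."""
    state = dict(base)
    for entry in sorted(log, key=lambda e: e.ts):
        if entry.ts > target_ts:
            break           # stop just before the unwanted transaction
        if entry.value is None:
            state.pop(entry.key, None)
        else:
            state[entry.key] = entry.value
    return state


base_backup = {"acct-1": 100}
archived_log = [
    LogEntry(10, "acct-1", 120),
    LogEntry(20, "acct-2", 50),
    LogEntry(30, "acct-1", None),   # e.g. a malicious or accidental delete
]

# Recover to the moment just before the delete at ts=30.
print(recover(base_backup, archived_log, target_ts=20))
```

The more frequently log entries are archived, the closer the recoverable point in time can get to the moment of the incident, which is what drives the RPO.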

Special care is needed to ensure that individual database transactions are protected from deletion or alteration. This is where immutability and logical air-gapping are required.
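One generic way to make deletion or alteration of archived records detectable is a hash chain, sketched below. This illustrates the general technique only and is not a description of how Oracle's products enforce immutability; the `chain` and `is_intact` helpers are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # starting value for the chain


def _link(prev_hash: str, payload: str) -> str:
    """Hash a payload together with the hash of the entry before it."""
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def chain(payloads):
    """Return [(payload, hash)] where each hash also covers the previous hash."""
    out, prev = [], GENESIS
    for payload in payloads:
        prev = _link(prev, payload)
        out.append((payload, prev))
    return out


def is_intact(chained, head_hash: str) -> bool:
    """Recompute the chain and compare against a separately stored head hash."""
    prev = GENESIS
    for payload, stored in chained:
        if _link(prev, payload) != stored:
            return False    # an entry was altered or an earlier one removed
        prev = stored
    return prev == head_hash  # also catches truncation at the end


archive = chain(["txn-1", "txn-2", "txn-3"])
head = archive[-1][1]       # kept somewhere the attacker cannot write

altered = list(archive)
altered[1] = ("txn-2-modified", archive[1][1])

print(is_intact(archive, head))                    # untouched archive
print(is_intact(altered, head))                    # altered entry
print(is_intact([archive[0], archive[2]], head))   # deleted entry
```

A hash chain only detects tampering. Preventing it still requires write-once storage and a logically separated location for the head hash, which is the role immutability and air-gapping play here.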

Oracle offers solutions for both on-prem and cloud workloads which facilitate immutability and transaction level recovery for Oracle Databases. For on-prem workloads, the combination of Oracle’s Zero Data Loss Recovery Appliance and immutable ZFS (with vaulting to OCI immutable Object Storage) ensures that structured data is protected at the transaction level. For cloud deployments in OCI, Oracle offers the Zero Data Loss Autonomous Recovery Service. This solution is similar to the on-prem offering but is delivered as a fully managed PaaS service.

 

Figure 3 – Oracle Zero Data Loss Autonomous Recovery Service in OCI

Figure-3

 

The diagram below shows the Zero Data Loss Autonomous Recovery Service in action as part of a Cyber Resilient Reference Architecture:

Figure 4 – Oracle Zero Data Loss Autonomous Recovery Service Reference Architecture

Figure-4

Additionally, unstructured data should be versioned wherever possible, so that each change to a file is retained as its own version. Object Storage in OCI supports versioning through simple bucket-level configuration. OCI File Storage now natively supports policy-based snapshots. Snapshots give you point-in-time copies of data to protect against accidental or unintended file deletions and modifications, and you can take as many snapshots as you need. Policy-based snapshots can be combined with cloning and file system-consistent File Storage replication to automatically create, replicate, and maintain snapshots at a different geographical location. You can also create file system clones based on a policy-based snapshot to provide a separate writable copy of data at the source or target side of the replication.
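To make the versioning idea concrete, here is a minimal in-memory sketch. The `VersionedStore` class is hypothetical and is not the OCI Object Storage API; it only models the behavior that matters for recovery, namely that overwrites and deletes add a new version instead of destroying the old one.

```python
from collections import defaultdict


class VersionedStore:
    """Toy versioned store: every write or delete appends a new version."""

    def __init__(self):
        # name -> list of versions, oldest first; None marks a delete
        self._versions = defaultdict(list)

    def put(self, name: str, data: bytes) -> None:
        self._versions[name].append(data)

    def delete(self, name: str) -> None:
        self._versions[name].append(None)   # delete marker, history is kept

    def get(self, name: str, version: int = -1):
        """Return the requested version (default: latest)."""
        return self._versions[name][version]

    def history(self, name: str) -> list:
        return list(self._versions[name])


store = VersionedStore()
store.put("report.txt", b"q1 numbers")
store.put("report.txt", b"ENCRYPTED-BY-RANSOMWARE")  # malicious overwrite
store.delete("report.txt")                           # followed by a delete

print(store.get("report.txt"))             # latest version is a delete marker
print(store.get("report.txt", version=0))  # the original content survives
```

Versioning on its own does not stop an attacker with sufficient privileges from purging old versions, so in practice it is paired with retention rules and least-privilege access.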

Figure 5 – File Storage Snapshot Scenarios

Figure-5

 

Cyber-Resilient reference architectures such as the one illustrated below serve as a roadmap for designing Cyber Resilient solutions in OCI. The sample reference architecture addresses both structured and unstructured data and adheres to the key design tenets of logical air-gapping, principle of least privilege, clean rooms, immutable vaulting and single transaction level recovery through the use of technologies such as the Zero Data Loss Autonomous Recovery Service, OCI Backup Services and Object Storage immutability and retention rule locks.

Figure 6 – OCI Cyber Resilience Reference Architecture

Click HERE to download a full draw.io copy of the OCI Cyber-Resilience Reference Architecture

Figure-6

 

Series Recap and Conclusion

Preparation for modern data integrity threats such as ransomware requires a full lifecycle approach that accounts for threat protection, detection and robust recovery to meet an organization’s most stringent RTO and RPO targets.

 

 

Additional reference materials:

Website:  https://www.oracle.com/recoveryservice/

Blog:  https://blogs.oracle.com/maa/post/introducing-recovery-service

Cost Estimator:  https://www.oracle.com/cloud/costestimator/#/load&tag=ZRCV

Documentation: https://docs.oracle.com/en/cloud/paas/recovery-service/dbrsu 

Product demo: