OCI DNS Traffic Steering - Oracle Analytics Cloud Disaster Recovery

April 5, 2023 | 8 minute read
Radu Nistor
Principal Cloud Solution Architect

Oracle Analytics Cloud is a scalable and secure Public Cloud Service that provides a full set of capabilities to explore and perform collaborative analytics for you, your workgroup, and your enterprise. If you want to learn more about the various deployment models for OAC, take a look at this blog. In this entry, we will create a Disaster Recovery solution for OAC instances deployed in the Public mode.

 

Oracle Analytics - Disaster Recovery

When we think about Disaster Recovery, we want a solution that meets the company's Recovery Point Objective (RPO) and Recovery Time Objective (RTO) for the service. To get the best values, we need as much automation as possible. We will focus on automating the DNS failover of two Public OAC instances deployed in different regions, but we will not cover any OAC configuration beyond basic deployment and enabling a Vanity URL. The main steps are:

  1. Deploy a Primary OAC instance in region 1.
  2. Configure a Vanity URL for region 1.
  3. Deploy a Backup/DR OAC instance in region 2.
  4. Configure the same Vanity URL for region 2 as it was configured for region 1.
  5. Enable OCI’s DNS Traffic Steering for automatic failover between the two instances.

As noted above, this entry focuses on automating the DNS part (step 5). However, it is important to note that you need to configure both OAC instances so that the backup instance can become primary at any point in time. Two possible approaches are:

   a. Have both instances up and running at the same time with identical configuration – with this solution the recovery will take only a few minutes mostly because of the DNS propagation time. Of course, this solution implies you will pay for two full OAC instances.
   b. Take regular backups of the primary's configuration to Object Storage in its region, and have the backup files copied to Object Storage in the DR region. Keep the DR instance paused. In a DR event, bring the DR instance online and restore the configuration from Object Storage. This solution is more cost-effective, but recovery takes longer.

For a guide on how to set the OAC backup solution and a manual switchover solution, please read this excellent blog.

Before we begin the demo please note the following prerequisites:

  • The DNS domain used in the Vanity URL must be handled by OCI’s Public DNS service.
  • You need the certificate, certificate chain up to the root, and the private key for the Vanity URL hostname.

DEMO

For the demo part I will deploy a primary instance in OCI’s region Frankfurt and a DR instance in the Amsterdam region.

1. Deploy the primary OAC instance.  The instance must be Public. After it is deployed, OCI will assign a Public DNS name and a Public IP.

[Screenshot: primary OAC instance details]

2. Create a Vanity URL. We need to provide the correct certificates, certificate chain and key. I will use the following Vanity URL for both OAC instances: oac.oci-lab.cloud

[Screenshots: Vanity URL configuration on the primary instance]

After the OAC instance updates, test the reachability to the Vanity URL in a browser.

[Screenshot: Vanity URL opened in a browser]

3. Deploy the DR instance in another region. For this demo I will deploy in Amsterdam. Again, OCI will assign a Public DNS name and IP.

[Screenshot: DR OAC instance in Amsterdam]

4. Configure the same Vanity URL for the DR instance. It has to be the same as the Vanity URL configured on the primary instance.

[Screenshot: Vanity URL configuration on the DR instance]

5. DNS Traffic Steering.

OCI’s DNS Traffic Steering solution is great for providing DNS Load Balancing solutions and Active/Fail-over architectures. If you want to read more about it, please follow this link.

For our scenario, there are two parts:
- A Health Check policy that monitors both OAC instances from Vantage Points outside OCI. For more information on this service, please check our documentation.
- A Traffic Steering policy that takes input from the Health Check service and decides which instance the Vanity URL resolves to.
Put on a diagram, the two parts look like this:

[Diagram: Health Check and Traffic Steering policy]

a. Health Check – Go to Burger menu -> Observability & Management -> Monitoring and click Health Checks. Create a new health check:
- Name it whatever you like.
- For Targets, enter both OAC instances by their Oracle-generated DNS names.
- For Vantage Points, select 2-3 nodes that are reasonably close, geographically, to both OAC instances.
- The request type must be HTTP (ICMP will not work).
- Protocol is HTTPS.
- Target path should be /ui/.
- Method can be either HEAD or GET; I will use HEAD.
- The Timeout and Interval values can be tuned to make the policy as aggressive as you want.
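The settings above amount to an HTTPS HEAD request against /ui/ on each instance. As a rough local sketch of the same kind of probe (not the OCI Health Checks service itself), the Python below uses only the standard library. The function names, the 10-second default timeout, and the rule that any 2xx or 3xx status counts as healthy are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def build_probe_url(host: str, path: str = "/ui/") -> str:
    """Build the HTTPS URL the probe will request."""
    return f"https://{host}{path}"


def is_healthy_status(code: int) -> bool:
    """Assumption for this sketch: any 2xx or 3xx response counts as healthy."""
    return 200 <= code < 400


def probe(host: str, timeout: float = 10.0) -> bool:
    """Send one HEAD request and report whether the host looks reachable."""
    request = urllib.request.Request(build_probe_url(host), method="HEAD")
    try:
        # urlopen follows redirects on its own, so the final status is classified.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_healthy_status(response.status)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; classify them the same way.
        return is_healthy_status(err.code)
    except OSError:
        # DNS failures, refused connections, TLS errors, and timeouts.
        return False
```

Calling `probe()` with the Oracle-generated DNS name of each instance gives a quick manual check before the real health check is created.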

In the end, it should look similar to this:

[Screenshot: health check configuration]

After you create the health check, give it a few minutes to collect results. It should show that both instances are reachable.

[Screenshot: health check overview]

b. Traffic Steering policy
Go to Networking -> DNS Management -> Traffic Management Steering Policies and click Create:
- Type is Failover.
- Give Pool 1 a name that references the Frankfurt OAC setup.
- Pool 1 Answer type is CNAME and RDATA is the Oracle-provided name of the Frankfurt OAC instance (e.g., fraoac-ociateam-fr.analytics.ocp.oraclecloud.com).
- Pool 2 configuration is similar to Pool 1, but it references the Amsterdam DR OAC instance.
- Pool priority is Pool 1 first, followed by Pool 2.
- Under Attach health check, select the health check we created in the previous step.
- Under Attached domains, enter the DNS domain and the host that the policy will answer for. In this case it is the Vanity URL. Note that the domain must be configured in OCI under the Public DNS service.
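The failover behaviour configured above can be pictured as "answer with the highest-priority pool whose target is healthy". The Python below is a minimal model of that idea, not the service's implementation. The `Pool` class, `pick_answer`, and the Amsterdam hostname are hypothetical names for illustration, and returning `None` when every pool is unhealthy is a simplification (what the real service returns in that case is not covered here).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Pool:
    name: str      # label for the pool, e.g. "frankfurt"
    cname: str     # RDATA: the Oracle-provided name of the OAC instance
    priority: int  # lower number = preferred


def pick_answer(pools: Iterable[Pool], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the CNAME of the most preferred pool that is currently healthy."""
    for pool in sorted(pools, key=lambda p: p.priority):
        # A target with no health data is treated as unhealthy in this sketch.
        if healthy.get(pool.cname, False):
            return pool.cname
    return None  # simplification: no healthy pool left


POOLS = [
    Pool("frankfurt", "fraoac-ociateam-fr.analytics.ocp.oraclecloud.com", 1),
    # Hypothetical name for the Amsterdam DR instance.
    Pool("amsterdam", "amsoac-example.analytics.ocp.oraclecloud.com", 2),
]
```

With both targets healthy, `pick_answer` returns the Frankfurt name; once Frankfurt is marked unhealthy it returns the Amsterdam name, and it switches back as soon as Frankfurt recovers.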

The configuration should look like this:

[Screenshots: traffic steering policy configuration]

After you create the policy, you should see a result similar to this:

[Screenshot: traffic steering policy overview]

And if we try to resolve the Vanity URL:

[Screenshot: nslookup output for the Vanity URL]
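The same lookup can be scripted. The sketch below resolves a hostname to its IPv4 addresses with Python's standard `socket` module and checks whether the Vanity URL shares an address with a given instance name. `resolve_ipv4` and `points_to` are hypothetical helper names; the `resolver` parameter exists only so the lookup can be replaced in tests. Note that this shows the final addresses, not the CNAME chain that nslookup prints.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str, resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses for a hostname."""
    infos = resolver(hostname, 443, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def points_to(vanity: str, instance_name: str,
              resolver: Callable = socket.getaddrinfo) -> bool:
    """True if the vanity hostname resolves to an address of the given instance."""
    vanity_ips = set(resolve_ipv4(vanity, resolver))
    instance_ips = set(resolve_ipv4(instance_name, resolver))
    return bool(vanity_ips & instance_ips)
```

For example, `points_to("oac.oci-lab.cloud", "fraoac-ociateam-fr.analytics.ocp.oraclecloud.com")` should be true while the primary is serving, assuming both names resolve from where the script runs.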

 

Testing

Let’s do a quick test and see how this works. To simulate a failure of the primary Frankfurt OAC instance, we can enable OAC Access Control to limit the IPs that can reach the instance. This will make the AWS/Azure Vantage Points of the health check service report the instance as down, and the policy will move the resolution to the DR instance. Let’s add a rule that only allows traffic to the OAC instance from 192.168.0.50 (an arbitrary private IP).

[Screenshots: OAC Access Control configuration]
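Why this rule takes the primary "down" from the health check's point of view can be sketched as a plain allowlist test: any source address outside the allowed entries is rejected. The helper `is_allowed` is hypothetical, and 198.51.100.7 is a documentation-range address standing in for a Vantage Point, not a real one.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allowed: Iterable[str]) -> bool:
    """True if source_ip falls inside any allowed address or CIDR block."""
    address = ipaddress.ip_address(source_ip)
    # A bare address such as "192.168.0.50" is treated as a single-host network.
    return any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in allowed
    )


ALLOWED = ["192.168.0.50"]  # the only source permitted by the demo rule

print(is_allowed("192.168.0.50", ALLOWED))  # the allowed private address
print(is_allowed("198.51.100.7", ALLOWED))  # stand-in for a Vantage Point
```

Since no Vantage Point can ever originate from that private address, every probe is rejected and the instance is reported as unreachable.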

After we enable the traffic restriction on the primary instance, we should see the health check policy report that it cannot reach the instance:

[Screenshot: health check reporting the primary instance as unreachable]

The traffic steering policy also reports the node as unhealthy.

[Screenshot: traffic steering policy reporting the primary as unhealthy]

And, of course, the steering policy now resolves our Vanity URL to the DR instance, so all traffic goes there.

[Screenshot: nslookup output resolving to the Amsterdam instance]

When we remove the restriction from the primary, the traffic policy will redirect all traffic back to it automatically.

And this concludes our demo!
