Automating Secondary IP Address Failover on Windows Server

July 30, 2020 | 4 minute read
Leo Yuen
Cloud Solutions Architect

In a previous blog, I introduced a solution for deploying a highly available Windows file server on OCI, so that workloads requiring SMB file sharing can run on OCI. However, one crucial part was not described in detail: how to automate the handling of the secondary IP address when a failover event happens. Instead of creating the scripts from scratch, this blog takes you through the steps to modify the scripts in an existing GitHub repository provided by Oracle to add this functionality.

Before we dive into the details of the changes needed, please follow the prerequisite section of the repository to set up the environment. After you have completed the tasks listed in the prerequisite section, you should have Python and the OCI SDK ready in your environment.

To simplify the configuration of the scripts, I recommend that you set up a dynamic group that contains the two Windows instances in your failover cluster and grant that group permission to make OCI API calls.

The rest of this blog assumes that you have put the scripts in a directory called "C:\oci-msfailovercluster" on both cluster nodes.

The script that we are going to modify is called "oci-mscluster-instance-principals.py". Its functionality is equivalent to that of "oci-mscluster.py"; the only difference is that it uses an instance principal to authenticate its OCI API calls, while the other uses a config file.

Here are the overall steps we need to go through:

  1. Create and configure scripts to set/unset a secondary IP address
  2. Modify the script "oci-mscluster-instance-principals.py" to handle failover of the secondary IP address
  3. Modify "settings.json" to add a new parameter
  4. Copy the scripts to the other node

Let's look at each step in detail:

1. Create and configure scripts to set/unset a secondary IP address

Create a script called "set-secondary-ip.ps1" with the content shown in the screenshot below and save it to the same directory where the existing scripts are located. In this example, it is "C:\oci-msfailovercluster".

Set the parameter $NetAdapterName to the name of the network adapter in your environment. Use the PowerShell command "Get-NetAdapter" to obtain the name.

Set $IP and $MaskBits to the secondary IP address and the prefix length of its network mask, respectively.
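If you only know the subnet's CIDR block, the value for $MaskBits can be derived from it. The following is a minimal sketch using Python's standard ipaddress module; the function name and the sample addresses are hypothetical and not part of the repository.

```python
import ipaddress


def mask_bits_for(secondary_ip, subnet_cidr):
    """Return the prefix length to use for $MaskBits.

    Raises ValueError if the secondary IP does not belong to the subnet,
    which usually means one of the two values was mistyped.
    """
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)
    address = ipaddress.ip_address(secondary_ip)
    if address not in subnet:
        raise ValueError(secondary_ip + ' is not inside ' + subnet_cidr)
    return subnet.prefixlen


# Hypothetical values for illustration only.
print(mask_bits_for('10.0.1.50', '10.0.1.0/24'))  # 24
```

The check that the address falls inside the subnet is there to catch typos before the value is copied into the PowerShell script.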

Create a script called "remove-secondary-ip.ps1" with the content shown in the screenshot below and save it to the same directory where the existing scripts are located. In this example, it is "C:\oci-msfailovercluster".

Set the parameters $NetAdapterName and $IP accordingly.

2. Modify the script "oci-mscluster-instance-principals.py" to handle failover of the secondary IP address

1. Add a new parameter called "private_ip_id" to hold the OCID of the secondary IP address:

On line 28, change the following line

list = ['node1_name', 'node2_name', 'private_ip_id_default_cluster', 'vnic_1', 'vnic_2']

to

list = ['node1_name', 'node2_name', 'private_ip_id_default_cluster', 'vnic_1', 'vnic_2', 'private_ip_id']

2. Add the following line after line 64:

private_ip_id = str(settings['private_ip_id'])

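3. Locate the function called "first_contact" and change the block of code starting at line 86 to the following. The new lines are the two assign_to_different_vnic(private_ip_id, ...) calls and everything from the line "if var2 == os.environ['COMPUTERNAME'].lower():" onward. The block uses the os and subprocess modules, so make sure both are imported at the top of the script: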

if var1.startswith(default_cluster_name):
    var2 = str(var1.split()[int(len(default_cluster_name.split(" ")))])
    if var2 == node1_name or var2 == node2_name or var2 == skip_dr_node_name:
        history_nodes.append(var2)
        print ('New MASTER DEFAULT NODE detected --> ' + var2)
        error_log('New MASTER DEFAULT NODE detected --> ' + var2)
        if var2 == node1_name:
            assign_to_different_vnic(private_ip_id_default_cluster, vnic_1)
            assign_to_different_vnic(private_ip_id, vnic_1)
        elif var2 == node2_name:
            assign_to_different_vnic(private_ip_id_default_cluster, vnic_2)
            assign_to_different_vnic(private_ip_id, vnic_2)
        elif var2 == skip_dr_node_name:
            pass
        if var2 == os.environ['COMPUTERNAME'].lower():
            # This node is now the active node: clear any stale entry, then add the address.
            # subprocess.call waits for each script to exit, so the removal finishes before the add.
            error_log('Setting secondary IP address')
            subprocess.call(['C:\\windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe',
                             'C:\\oci-msfailovercluster\\remove-secondary-ip.ps1'],
                            shell=True)
            subprocess.call(['C:\\windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe',
                             'C:\\oci-msfailovercluster\\set-secondary-ip.ps1'],
                            shell=True)
            error_log('secondary IP address configured!')
        else:
            # Another node is active: make sure the address is not configured here.
            error_log('Removing secondary IP address')
            subprocess.call(['C:\\windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe',
                             'C:\\oci-msfailovercluster\\remove-secondary-ip.ps1'],
                            shell=True)
            error_log('secondary IP address removed')

4. Locate the "while True" loop in the script and replace the block of code starting at line 135 with the same block shown in the previous step.
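Because the same block ends up in two places ("first_contact" and the "while True" loop), one option is to factor the PowerShell handling into a small helper and call it from both. The sketch below is only an illustration: the names scripts_for and run_secondary_ip_scripts are hypothetical and do not exist in the repository, and the paths are the ones assumed throughout this blog.

```python
import subprocess

POWERSHELL = 'C:\\windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe'
SCRIPT_DIR = 'C:\\oci-msfailovercluster'  # directory assumed in this blog


def scripts_for(new_master, local_name):
    """Return the script names this node should run, in order.

    The active node removes any stale entry and then sets the address;
    every other node only removes it.
    """
    if new_master == local_name.lower():
        return ['remove-secondary-ip.ps1', 'set-secondary-ip.ps1']
    return ['remove-secondary-ip.ps1']


def run_secondary_ip_scripts(new_master, local_name):
    """Run the scripts one after another on the local node."""
    for name in scripts_for(new_master, local_name):
        # subprocess.call blocks until PowerShell exits, so the scripts
        # never run concurrently.
        subprocess.call([POWERSHELL, SCRIPT_DIR + '\\' + name])
```

Under these assumptions, both places in the script would reduce to a single call such as run_secondary_ip_scripts(var2, os.environ['COMPUTERNAME']), placed between the existing error_log calls.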

3. Modify "settings.json" to add a new parameter

Add a new parameter called "private_ip_id" and set it to the OCID of the secondary private IP address in your environment.

Don't forget to put a comma at the end of the line above the new parameter.
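A quick way to catch a missing comma or a forgotten parameter before the cluster script runs is to load the file yourself. The following is a minimal sketch, assuming the required keys are exactly the ones in the list edited in step 2; the function names are hypothetical and not part of the repository.

```python
import json

# Keys taken from the list edited in step 2 of this blog.
REQUIRED_KEYS = ['node1_name', 'node2_name', 'private_ip_id_default_cluster',
                 'vnic_1', 'vnic_2', 'private_ip_id']


def missing_settings(settings):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


def load_settings(path):
    """Parse settings.json and fail early if a required key is missing."""
    with open(path, encoding='utf-8') as handle:
        # json.load raises json.JSONDecodeError on syntax errors
        # such as a missing comma between two parameters.
        settings = json.load(handle)
    missing = missing_settings(settings)
    if missing:
        raise KeyError('settings.json is missing: ' + ', '.join(missing))
    return settings
```

Running load_settings('C:\\oci-msfailovercluster\\settings.json') on each node after editing the file would surface both kinds of mistake immediately.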

4. Copy the scripts to the other node

Copy all the modified scripts and settings.json to the other node.

Finally, register the script "oci-mscluster-instance-principals.py" in Task Scheduler on both cluster nodes and ensure it is running.
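If you prefer the command line to the Task Scheduler GUI, the registration could be sketched with the built-in schtasks tool. The task name, the Python path, the ONSTART trigger, and the SYSTEM account below are assumptions for illustration, not settings taken from the original setup; adjust them to your environment.

```python
def schtasks_create_command(task_name, python_exe, script_path):
    """Build a schtasks command that starts the script at system boot.

    Trigger (/SC ONSTART) and account (/RU SYSTEM) are assumed values.
    """
    return ['schtasks', '/Create',
            '/TN', task_name,
            '/TR', python_exe + ' ' + script_path,
            '/SC', 'ONSTART',
            '/RU', 'SYSTEM']


# Hypothetical task name and Python location.
command = schtasks_create_command(
    'oci-mscluster',
    'C:\\Python\\python.exe',
    'C:\\oci-msfailovercluster\\oci-mscluster-instance-principals.py')
print(' '.join(command))
```

On each node, the resulting list can be passed to subprocess.call from an elevated prompt, or the printed command can be reviewed and run by hand.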

At this point, everything is ready, and your Windows cluster can handle secondary IP address failover automatically.
