Best Practices from Oracle Development's A‑Team

Deploying a Highly Available Windows File Server on OCI

Leo Yuen
Cloud Solutions Architect

For application workloads that require SMB file sharing, several solutions are available on OCI, from high-performance bare metal deployments to open source Samba running on Linux virtual machines. Customers can choose the solution that best fits their workload requirements.

This blog takes you through the setup of a Windows file server running on virtual machines that uses native Windows technologies: Windows Failover Clustering to provide high availability for the file sharing service, and DFS Replication to protect and synchronize your data among the cluster nodes.

This solution uses OCI Compute VM shapes and OCI Block Volume, both of which are elastic and scalable. Customers may start with a small deployment and grow the solution as required with minimal service interruption.

Our sample deployment is depicted in the following diagram:

This sample deployment runs in a region with a single availability domain that has 3 fault domains.

There are 4 VMs in the sample deployment. The two on Fault Domain 1 are an IIS server running on Windows Server 2012 and a domain controller running on Windows Server 2016 Standard edition.

The IIS server simulates a customer workload that requires SMB file sharing access. Customers with Windows-based workloads may already have domain controllers running in their environment, and these can be used for this solution without creating a new one.

In production environments, domain controllers should be set up in high availability mode. If a domain controller fails, a cluster that is already running should not be affected, but failover operations may be.

How to deploy domain controllers in HA mode is not covered in this blog.

A 2-node cluster is created with 2 VM instances running in different fault domains. Each node has its own set of block volumes attached. The block volumes attached to the cluster nodes should be the same size and configured in the same way, including the file system format and drive letter assignment.
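The requirement that both nodes use the same volume layout can be checked mechanically before the cluster is built. The following is a minimal sketch in Python, not part of the original deployment: VolumeLayout and layout_mismatches are hypothetical names, and the layout values would have to be collected from each node by hand or by a separate inventory step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeLayout:
    """How one block volume appears inside a cluster node (hypothetical model)."""
    drive_letter: str
    size_gb: int
    file_system: str


def layout_mismatches(node_a, node_b):
    """Return human-readable differences between two nodes' volume layouts."""
    # Index each node's volumes by drive letter, ignoring case.
    by_letter_a = {v.drive_letter.upper(): v for v in node_a}
    by_letter_b = {v.drive_letter.upper(): v for v in node_b}
    problems = []
    for letter in sorted(set(by_letter_a) | set(by_letter_b)):
        a, b = by_letter_a.get(letter), by_letter_b.get(letter)
        if a is None or b is None:
            missing_on = "node A" if a is None else "node B"
            problems.append(f"drive {letter}: missing on {missing_on}")
            continue
        if a.size_gb != b.size_gb:
            problems.append(f"drive {letter}: size {a.size_gb} GB vs {b.size_gb} GB")
        if a.file_system.upper() != b.file_system.upper():
            problems.append(
                f"drive {letter}: file system {a.file_system} vs {b.file_system}"
            )
    return problems
```

An empty list means the two nodes agree on drive letters, sizes and file systems. Any entry in the list points at a difference that should be fixed before DFS replication is configured.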

Customers may choose the VM shape, the number of block volumes attached to each instance, and the size and performance characteristics of those volumes according to their requirements.

A cluster IP and an additional secondary IP are allocated for the cluster. Clients use the additional secondary IP to access the shared folder. Both IP addresses are configured on the master node of the cluster, and when failover occurs, both should be migrated to the surviving node.

During failover, the client will experience a brief pause in the file sharing service. Most operations simply resume after the failover succeeds. If some operations time out and fail, the application should retry them.
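One way an application could retry operations that fail during the failover pause is a small wrapper with exponential backoff. This is a minimal sketch under stated assumptions: retry_file_operation is a hypothetical helper, the attempt count and delays are illustrative rather than measured against a real failover, and it assumes the failed operation surfaces as an OSError (which includes TimeoutError) and is safe to repeat.

```python
import time


def retry_file_operation(operation, attempts=5, initial_delay=1.0, backoff=2.0,
                         sleep=time.sleep):
    """Run operation(), retrying on OSError with exponential backoff.

    attempts, initial_delay and backoff are illustrative defaults, not
    values tuned for a real cluster.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            sleep(delay)  # wait for the share to come back on the other node
            delay *= backoff
```

The application would pass in a function that performs a single read or write against the share. Operations that are not idempotent, such as appending to a file, need extra care before being retried this way.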

This solution works for all Windows Server versions available on OCI, i.e. 2019, 2016 and 2012 R2.

Setting up the cluster

This section describes the steps to create and configure the failover cluster on OCI:

  1. Provision VM instances and block volumes
  2. Configure Block Volumes on Windows
  3. Create and configure secondary IP addresses
  4. Create a Windows failover cluster
  5. Install and Configure DFS replication
  6. Configure scripts to handle secondary IP address failover

1. Provision VM instances and block volumes

In the sample deployment, the instances are created with the VM.Standard.E2.2 shape and the Oracle-provided Windows Server 2016 Standard edition image.

Notice that node01 and node02 are located in different Fault Domains.

Two block volumes, LUN01 and LUN02, are created. Each is 50 GB in size.

Attach LUN01 to node01 and LUN02 to node02. Either paravirtualized or iSCSI attachment can be used.

2. Configure Block Volumes on Windows

Once the block volume is attached to the Windows instance, format the disk with NTFS and assign a drive letter to it. The same drive letter should be used on both instances.

The following screenshot shows the Disk Management window. LUN01 is formatted with NTFS and assigned the drive letter O.

3. Create and configure secondary IP addresses

From the OCI Console, select node01 and create 2 additional private IP addresses on the attached VNIC.

In the following screenshot, one address is for the cluster IP and the other is for file sharing.

There is no need to create additional IP addresses on node02 because these are floating IP addresses.

The IP addresses should be statically assigned.

Nothing needs to be done within the Windows operating system at this time.

4. Create a Windows failover cluster

Follow this Microsoft documentation to create a failover cluster with the 2 Windows instances.

5. Install and Configure DFS replication

Once the cluster is successfully created, install the DFS replication feature on the Windows instances and follow this Microsoft documentation to create a DFS replication group.

6. Configure scripts to handle secondary IP address failover

When failover occurs on a Windows failover cluster, Windows handles the cluster IP address configuration at the operating system level. However, it does not migrate the underlying private IP address that was created in step 3 from the failed node to the surviving node. The scripts here provide exactly this capability.

In our deployment, however, the file sharing service does not use the cluster IP but an additional secondary IP address. The scripts need to be enhanced to handle this additional IP address. The required changes will be described in detail in a future blog post.
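In the meantime, the decision such a script has to make can be sketched in a few lines. This is not the scripts referenced above, only a hedged illustration of the idea: plan_ip_moves is a hypothetical function, the VNIC names and IP addresses in any usage are placeholders, and carrying out each move (reassigning the private IP to the other VNIC through the OCI API or CLI) is deliberately left out.

```python
def plan_ip_moves(ip_to_vnic, floating_ips, surviving_vnic):
    """List (ip, from_vnic, to_vnic) moves needed after a failover.

    ip_to_vnic maps each private IP to the VNIC it is currently assigned to.
    Only floating IPs that are not already on the surviving node's VNIC
    are included in the plan.
    """
    moves = []
    for ip in floating_ips:
        current = ip_to_vnic.get(ip)
        if current is None:
            raise KeyError(f"floating IP {ip} is not assigned to any known VNIC")
        if current != surviving_vnic:
            moves.append((ip, current, surviving_vnic))
    return moves
```

Passing both the cluster IP and the file sharing IP as floating_ips covers the case described here, where two addresses rather than one must follow the active node.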

So stay tuned for now.