ECMP (Equal-Cost Multi-Path) is a useful feature for active-active load balancing and failover of network traffic between On-Premise and OCI. In this blog post, we will discuss the ECMP capability of the new DRGv2.
With DRGv2, OCI can now load-balance network traffic over multiple FastConnect virtual circuits or multiple IPSec tunnels, up to a maximum of eight circuits. All circuits must be of the same type; a mix of FastConnect and IPSec is not supported.
Flows are distinguished for load balancing purposes by protocol, source IP, destination IP, source port, and destination port. Because every packet of a given flow takes the same path, multiple flows are necessary to utilize all available paths.
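To make the per-flow behavior concrete, here is a minimal Python sketch of hash-based path selection over the five fields listed above. It is only an illustration: the hash function DRGv2 actually uses is not described in this post, so the SHA-256 choice, the Flow type, the pick_path function, and the port numbers are all assumptions made for the example.

```python
import hashlib
from typing import NamedTuple, Sequence


class Flow(NamedTuple):
    """The five fields used to tell flows apart."""
    protocol: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def pick_path(flow: Flow, paths: Sequence[str]) -> str:
    """Map a flow to one path by hashing its 5-tuple (illustrative only)."""
    if not 1 <= len(paths) <= 8:
        raise ValueError("expected between 1 and 8 equal-cost paths")
    key = "|".join(str(field) for field in flow).encode()
    # Assumption: SHA-256 stands in for whatever hash the DRGv2 really uses.
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]


paths = ["tunnel-1", "tunnel-2"]
# Hypothetical flow: the protocol and ports are made up for the example.
flow = Flow("tcp", "10.0.0.227", "172.31.0.2", 40000, 443)

# Every packet of the same flow hashes to the same path.
assert pick_path(flow, paths) == pick_path(flow, paths)

# Changing the source port creates new flows, which can land on other paths.
used = {pick_path(flow._replace(src_port=port), paths) for port in range(40000, 40064)}
print(sorted(used))
```

The point of the sketch is that a single flow never spreads across paths, so one long-lived connection will not exceed the capacity of one circuit, while many flows tend to spread over all of them.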
ECMP is off by default and can be enabled on a per-route-table basis. We will cover how to activate ECMP in the Configuration section.
You can read more about DRGv2 at this link: https://docs.oracle.com/en-us/iaas/Content/Network/Tasks/managingDRGs.htm
In order to grasp the concept, we will use a simple networking topology focusing on the most important parts of DRGv2 ECMP.
VCN 1 has two VMs, at 10.0.0.173 and 10.0.0.227, which are used for traffic testing. VCN 1 is attached to the DRGv2.
On-Premise, we have another two VMs at 172.31.0.2 and 172.31.0.10 and these VMs will receive the traffic from OCI VMs defined above.
We will use two IPSec tunnels between the On-Premise CPE and the DRGv2, and we will run BGP on top for route exchange. The public IP addresses of the IPSec headends are not depicted in the diagram since they are not important for our discussion. The IP addresses used for the two BGP sessions are listed, however, since we will perform packet captures on the CPE vti2 and vti3 interfaces to see how the DRGv2 uses the paths when ECMP is activated and the traffic has different source/destination IPs per flow.
The On-Premise CPE announces a default route on both BGP sessions. ECMP is activated on the CPE.
Note: BGP over IPSec is used as an example. You can use FastConnect Private Peering with multiple VCs to accomplish the same result.
2.1 Configure the BGP over IPSec and make sure the BGP sessions are UP on both tunnels
2.2 On the DRGv2, create a Route Import, attach the Route Import to a newly created Route Table, and attach the Route Table to the VCN 1 attachment
2.3 On the DRGv2, create a Route Import, attach the Route Import to a newly created Route Table, and attach the Route Table to both IPSec tunnel attachments
Note: You can use a separate Route Table for each IPSec tunnel if you have a strict requirement on which IP prefixes need to be sent On-Premise on each tunnel.
2.4 In the RT_FC_VC Route Table, add a static route to the VCN 1 CIDR (10.0.0.0/24), using the VCN 1 attachment as the next hop
This configuration will trigger a BGP advertisement on both tunnels from DRGv2 to CPE for 10.0.0.0/24. You can also use a rule on the Route Import attached to the route table and dynamically import the VCN 1 subnets.
Let's check how the CPE route table looks (remember that CPE has the ECMP function activated):
The CPE has imported 10.0.0.0/24 via two different next-hops in the routing table and will load-balance the traffic to 10.0.0.0/24 across both next-hops.
2.5 In the Route Import for the Route Table attached to VCN 1, add a rule to import what the DRGv2 received from the CPE over the IPSec Tunnel1 and Tunnel2 attachments: the default route originated in BGP by the CPE
The Route Table for VCN 1 will have the following routes:
As we can see, both default routes have been imported, but one is marked as a Conflict. This means the DRGv2 will not use it at all for forwarding traffic. Why did this happen?
We have not yet enabled ECMP on the Route Table for the VCN 1 attachment, and that is the reason for the conflict. Remember, ECMP is disabled by default and is enabled on a per-Route Table basis.
Let's activate the ECMP for this Route Table. Edit the Route Table and check ECMP (by default this is disabled):
Now we need to wait a few seconds, and both routes in the Route Table will show as Active and will be used for traffic forwarding:
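The change in route status can be pictured with a small Python model. This is a simplified sketch, not the DRGv2 implementation: the resolve_routes function and the attachment names are hypothetical, keeping the first route when ECMP is off is an arbitrary tie-break chosen for the example, and marking routes beyond the eighth as Conflict is also an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

MAX_ECMP_PATHS = 8  # maximum number of circuits stated for DRGv2 ECMP

Route = Tuple[str, str]  # (prefix, next-hop attachment)


def resolve_routes(routes: List[Route], ecmp_enabled: bool) -> Dict[Route, str]:
    """Return "Active" or "Conflict" for each (prefix, next_hop) route."""
    by_prefix = defaultdict(list)
    for prefix, next_hop in routes:
        by_prefix[prefix].append(next_hop)

    # Assumption: with ECMP off only the first route per prefix is kept;
    # the real DRGv2 tie-break rules are not modeled here.
    limit = MAX_ECMP_PATHS if ecmp_enabled else 1

    status = {}
    for prefix, next_hops in by_prefix.items():
        for position, next_hop in enumerate(next_hops):
            status[(prefix, next_hop)] = "Active" if position < limit else "Conflict"
    return status


# Hypothetical attachment names for the two tunnels used in this post.
routes = [("0.0.0.0/0", "ipsec-tunnel-1"), ("0.0.0.0/0", "ipsec-tunnel-2")]

print(resolve_routes(routes, ecmp_enabled=False))  # one Active, one Conflict
print(resolve_routes(routes, ecmp_enabled=True))   # both Active
```

With ECMP disabled, the second default route is present in the table but unused, which matches the Conflict status seen earlier; with ECMP enabled, both next hops become eligible for forwarding.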
For testing, we will generate traffic from OCI to On-Premise as follows:
10.0.0.227 -> 172.31.0.2
10.0.0.173 -> 172.31.0.10
The source and destination are different for each flow, so the DRGv2 treats them as separate flows and can place them on different paths. To show that ECMP is in effect, we will start two tcpdump captures on the CPE, one on vti2 (Tunnel1) and one on vti3 (Tunnel2). Each interface is connected to its respective OCI BGP peer.
As we can see, the traffic from 10.0.0.173 to 172.31.0.10 is using vti3 (Tunnel2) and the traffic from 10.0.0.227 to 172.31.0.2 is using vti2 (Tunnel1). You can check the vti to Tunnel mapping in the Networking Diagram from point 1.
Now, both our tunnels are being used by the DRGv2 with the new ECMP feature activated. Wonderful, isn't it?