When you have multiple connections to OCI services, questions often come up about how to configure routing on the On-Premise router/firewall so that asymmetric routing is avoided.
The scope of this blog is to explain how the On-Premise CPEs need to be configured to access Oracle Services Network (OSN) when multiple paths are available: FastConnect Private Peering, FastConnect Public Peering and the Internet.
We will also explain how to avoid the asymmetric routing that can occur when OSN initiates connections to customer On-Premise machines from services such as Oracle Analytics Cloud (OAC), Oracle Data Integration Platform Cloud and Oracle Integration Cloud. This type of connection is also known as Service to Consumer and in some cases (documented below) can be asymmetric.
Note: To explain the configuration steps and the testing, we used a Cisco router (On-Premise) and simulated the DRG (to have full control over the testing). Any device supporting BGP, NAT and Policy Based Routing can be used instead on the On-Premise network.
Topology Architecture Diagram
A) Traffic initiated from On-premise to OSN:
The topology diagram below shows the case where the On-Premise hosts initiate connections to OSN with the following routing preference:
1) Prefer Private FastConnect;
2) If Private FastConnect experiences issues use the Public FastConnect;
3) If Public FastConnect experiences issues use the Internet;
The CPE in this example is a Cisco IOS device.
The green line represents the Private FastConnect, the purple line represents the Public FastConnect and the red line represents the Internet.
B) Traffic initiated from OSN to On-Premise public IPs via Public FastConnect or Internet (asymmetric traffic can occur):
1) Prefer Public FastConnect (if it exists) to initiate connections from OAC to the On-Premise public IP;
2) If Public FastConnect is not available, use the Internet to initiate connections from OAC to the On-Premise public IP;
The response is sent from the On-Premise host to OAC over the Private FastConnect link (asymmetric traffic might occur);
On the customer side we are using the following interfaces:
- Interface to peer with the Service Provider (SP), fa1/0 (10.0.0.5/30 CPE Internet facing interface IP);
- Interface to peer for FastConnect Public Peering, fa0/0 (10.0.0.17/30 CPE Public Peering interface IP);
- Interface to peer for FastConnect Private Peering, fa0/1 (10.0.0.13/30 CPE Private Peering interface IP);
- Internal interface to receive the traffic from On-Premise hosts, fa2/0 (10.0.0.1/30 CPE Internal interface);
Later we will also use Policy Based Routing, a feature available on most routers and L3 switches. We assume the reader is familiar with it.
A) Traffic initiated from On-Premise to OSN (uses the network topology defined at point A in the preceding section).
A1) OSN CIDRs announced via BGP from Oracle side in the same way on Public and Private Peering (for example 188.8.131.52/24 announced on both FCs):
The customer's edge router/firewall has three BGP sessions: one with the SP (the customer can receive BGP routes from the Internet or simply use a default route pointing to the SP), one over the FC Public Peering and one over the FC Private Peering.
In this example OSN is allocated the CIDR 184.108.40.206/24.
When Public Peering or the Internet is used, the customer performs PAT using a NAT pool of public IPs from the range 220.127.116.11-18.104.22.168/24.
NAT is enabled with fa1/0 and fa0/0 as outside interfaces and fa2/0 as the inside interface. There is no NAT on fa0/1, because over the private peering all traffic from the customer side keeps its private source IPs.
Toward the Internet and the Public Peering the customer announces only the public prefix used for NAT, in this example 22.214.171.124/24:
The OSN prefix received:
- over Internet will have the BGP local_pref of 200
- over Public Peering will have the BGP local_pref of 500
- over Private Peering will have the BGP local_pref of 1000
a) BGP, route-map and ip prefix-list configuration:
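As an illustration only, here is a minimal sketch of how the local_pref values listed above could be set on a Cisco IOS CPE. The local AS 65000, the OSN prefix 192.0.2.0/24, the customer public prefix 203.0.113.0/24, the On-Premise prefix 172.16.0.0/16 and all prefix-list and route-map names are placeholders chosen for this sketch. The neighbor addresses are assumed to be the far ends of the /30 links listed earlier, and AS 65535 (SP) and 31898 (Oracle) are taken from the AS path discussed in this post.

```
! Placeholders (assumptions): AS 65000, 192.0.2.0/24 (OSN),
! 203.0.113.0/24 (customer NAT prefix), 172.16.0.0/16 (On-Premise hosts).
ip prefix-list OSN-IN seq 5 permit 192.0.2.0/24
ip prefix-list NAT-OUT seq 5 permit 203.0.113.0/24
ip prefix-list ONPREM-OUT seq 5 permit 172.16.0.0/16
!
! Inbound policy: same OSN prefix, different local preference per path.
! Prefixes not matched by OSN-IN hit the implicit deny of each route-map.
route-map FROM-INTERNET permit 10
 match ip address prefix-list OSN-IN
 set local-preference 200
!
route-map FROM-FC-PUBLIC permit 10
 match ip address prefix-list OSN-IN
 set local-preference 500
!
route-map FROM-FC-PRIVATE permit 10
 match ip address prefix-list OSN-IN
 set local-preference 1000
!
! Anchor route so the NAT prefix can be originated with a network statement.
! The 172.16.0.0/16 route is assumed to exist already via the inside network.
ip route 203.0.113.0 255.255.255.0 Null0
!
! Sessions: 10.0.0.6 = SP (Internet), 10.0.0.18 = FC Public, 10.0.0.14 = FC Private.
router bgp 65000
 network 203.0.113.0 mask 255.255.255.0
 network 172.16.0.0 mask 255.255.0.0
 neighbor 10.0.0.6 remote-as 65535
 neighbor 10.0.0.6 route-map FROM-INTERNET in
 neighbor 10.0.0.6 prefix-list NAT-OUT out
 neighbor 10.0.0.18 remote-as 31898
 neighbor 10.0.0.18 route-map FROM-FC-PUBLIC in
 neighbor 10.0.0.18 prefix-list NAT-OUT out
 neighbor 10.0.0.14 remote-as 31898
 neighbor 10.0.0.14 route-map FROM-FC-PRIVATE in
 neighbor 10.0.0.14 prefix-list ONPREM-OUT out
```

Only the public NAT prefix is advertised toward the Internet and the Public Peering, while the private peering sees the private On-Premise prefix. If other routes received from the SP or from Oracle need to be kept, each inbound route-map would need a trailing permit clause.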
The BGP table on the edge router/firewall:
b) PAT configuration:
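A minimal sketch of the PAT setup described in this section, assuming a Cisco IOS CPE. The pool range 203.0.113.1 to 203.0.113.254, the On-Premise source range 172.16.0.0/16 and the names PUBLIC-POOL and ONPREM-SOURCES are placeholders, not values from the tested setup.

```
! Inside interface facing the On-Premise hosts
interface FastEthernet2/0
 ip nat inside
!
! Outside interfaces: Internet and FC Public Peering
interface FastEthernet1/0
 ip nat outside
interface FastEthernet0/0
 ip nat outside
!
! fa0/1 (FC Private Peering) is deliberately left without "ip nat outside",
! so traffic leaving through it keeps its private source IP.
!
! Placeholder public pool and placeholder On-Premise source range
ip nat pool PUBLIC-POOL 203.0.113.1 203.0.113.254 prefix-length 24
ip access-list standard ONPREM-SOURCES
 permit 172.16.0.0 0.0.255.255
!
! "overload" turns the translation into PAT
ip nat inside source list ONPREM-SOURCES pool PUBLIC-POOL overload
```

Because translation only happens between an inside and an outside interface, the choice of exit interface made by BGP decides whether a packet is NAT-ed.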
The above configuration translates a private On-Premise source IP to a public IP only when the exit interface is the one used for FC Public Peering or the one connected to the Internet.
c) Traffic test over the Private Peering using the path CPE->DRG->SGW->OSN:
This packet capture was taken on the private FC link. The traffic flows with the private IPs as source, so no NAT is required on the customer side; the Service Gateway (SGW) handles the source NAT for the customer.
d) Traffic test over the Public Peering:
To simulate a Private Peering failure and observe the switchover, we put the FC private peering interface in the down state (a BGP keepalive of 60s and a hold time of 180s are used):
1. We put the interface in the down state;
2. The traffic is interrupted;
3. The hold timer expires on the edge router/firewall, the BGP route through the private peering is removed from the routing table and the next one, through the public peering, is installed;
4. Traffic starts to flow again, but now it is NAT-ed (the packet capture taken on the Public FC reveals the source IP address 126.96.36.199);
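The timers used in this test (keepalive 60s, hold time 180s) are the Cisco IOS defaults for BGP. If they need to be stated explicitly for the private peering neighbor, a sketch could look like this; the local AS 65000 is a placeholder and 10.0.0.14 is assumed to be the private peering neighbor.

```
router bgp 65000
 ! keepalive 60 seconds, hold time 180 seconds
 neighbor 10.0.0.14 timers 60 180
```

Shorter timers would reduce the interruption seen in step 2, at the cost of more sensitivity to brief link problems.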
As a side note, the route received from the Internet has an AS path of length two: 65535 and 31898. The AS path length alone would therefore be enough to prefer the public FC over the Internet. It is up to the customer whether to assign a local_pref to the Internet routes; in this example local_pref is set for the Internet routes as well.
e) Traffic test over the Public Internet:
To simulate a Public Peering failure and observe the switchover, we put the FC public peering interface in the down state (a BGP keepalive of 60s and a hold time of 180s are used):
1. We put the interface in the down state;
2. The traffic is interrupted;
3. The hold timer expires on the edge router/firewall, the BGP route through the public peering is removed from the routing table and the next one, through the public Internet, is installed;
4. Traffic starts to flow again, NAT-ed (the packet capture taken on the Internet link reveals the source IP address 188.8.131.52):
As we can see, the customer takes no manual action to move the traffic to a specific link; BGP path selection and the above NAT configuration handle it automatically.
A2) OSN CIDRs announced via BGP as more specific routes on Private Peering and less specific over Public Peering (for example 184.108.40.206/16 announced on Public Peering and 220.127.116.11/24 on Private Peering):
Suppose that, in a given region, Oracle announces 18.104.22.168.0/24 over the Private Peering and 22.214.171.124/16 over the Public Peering and over the Internet.
In this case, based on the configuration made above, the BGP table will look like:
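Based on the longest-match rule, the traffic prefers the Private Peering, with no PAT. The more specific /24 wins in the forwarding lookup regardless of local_pref, because local_pref only compares paths to the same prefix.
If Private Peering is down, the /24 is withdrawn and the next route used is the /16 over the Public Peering, with PAT. If Public Peering is also down, the path through the Internet is used.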
B) Traffic initiated from OSN to On-Premise public IPs via Public FastConnect or Internet (asymmetric traffic can occur - using the networking topology defined at point B in the preceding section).
OSN includes a number of services (https://www.oracle.com/cloud/networking/service-gateway.html) that can initiate connections back to a customer On-Premise public IP or to VCN VMs; we use OAC as the example. These types of connections are usually called Service to Consumer. The case discussed here is OAC initiating connections to On-Premise machines when both the public and the private FastConnect are configured. If Public FastConnect is not configured, the Internet is used to route the traffic from OAC to the customer public IP located On-Premise. As of now OAC does not use the private FastConnect, because the SGW does not support connections from OAC to the customer On-Premise.
The asymmetric routing that can occur (follow the numbers from the networking diagram):
- OAC initiates the traffic to the customer On-Premise via the public FastConnect (if it exists); if there is no public FastConnect, the Internet is used to reach the On-Premise host;
- When the On-Premise host responds to OAC, the configuration already in place (preferring the private FastConnect to reach OAC) sends the return traffic along the path: private FastConnect → DRG → SGW. The SGW drops this traffic because it is asymmetric: the initial connection did not flow through the SGW;
To avoid this behavior, we need to create a Policy Based Routing (PBR) policy and activate it on demand (when OAC → On-Premise traffic is expected) on the CPE interface that receives the response from the On-Premise host. We apply the PBR only to traffic whose destination is the public IP of OAC.
# define the access-list to catch the traffic that needs to be routed using the policy based routing and not using the routing table that is directing the traffic to 126.96.36.199 (representing OAC instance in OSN as an example) via the FC private peering:
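A minimal sketch of such an access list on Cisco IOS. The name OAC-RETURN and the address 192.0.2.10, which stands in for the OAC public IP, are placeholders for illustration.

```
! Match response traffic from any On-Premise source to the OAC public IP
! (192.0.2.10 is a placeholder, not the address used in the test)
ip access-list extended OAC-RETURN
 permit ip any host 192.0.2.10
```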
# define the next-hop for this traffic which is the Public FastConnect peer (10.0.0.18/30) or peer Service Provider IP address (10.0.0.6/30) if Public FastConnect is not available (in our case we consider public FastConnect is available):
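A minimal sketch of the route-map, assuming an extended access list named OAC-RETURN (a placeholder name) that matches traffic destined to the OAC public IP. The route-map name PBR-OAC is a placeholder as well.

```
route-map PBR-OAC permit 10
 match ip address OAC-RETURN
 ! Public FastConnect peer; use 10.0.0.6 (SP peer) instead
 ! when Public FastConnect is not available
 set ip next-hop 10.0.0.18
```

Traffic that does not match the access list is not policy routed and keeps following the normal routing table.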
# apply the PBR on the CPE interface that receives the response traffic from the On-Premise host (the router also performs NAT):
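A minimal sketch of this step, assuming a route-map named PBR-OAC (a placeholder name) and fa2/0 as the internal interface listed earlier.

```
! Internal interface receiving traffic from the On-Premise hosts
interface FastEthernet2/0
 ip policy route-map PBR-OAC
```

Removing the policy with `no ip policy route-map PBR-OAC` on the same interface returns the traffic to the normal routing table, which fits the on-demand use described above.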
The routing table on the CPE shows that traffic back to 188.8.131.52 uses the next-hop 10.0.0.14 (private peering):
With the PBR defined above, we instead want this traffic sent to the next-hop 10.0.0.18, the public FastConnect peer.
The debug on the CPE confirms that for forwarding this traffic, the next-hop 10.0.0.18 has been used as per the PBR defined above and applied on the interface:
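To reproduce this check on a Cisco IOS CPE, commands along the following lines can be used. The route-map name PBR-OAC and the address 192.0.2.10, standing in for the OAC public IP, are placeholders. Output is omitted because it depends on the device and software release.

```
! Next-hop the routing table alone would pick for the OAC address
show ip route 192.0.2.10
! Interfaces that have a policy route-map attached
show ip policy
! Match counters for the policy route-map
show route-map PBR-OAC
! Per-packet policy routing decisions (use with care on a busy router)
debug ip policy
```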
The Policy Based Routing should be applied only when OAC is initiating the traffic to customer On-Premise using the On-Premise public IP and when the FastConnect Private peering is configured and preferred for OSN services.