ISV Architecture - Operations

June 9, 2020 | 9 minute read
Tal Altman
Sr. Manager

This is the fifth blog in a series on the ISV Validated Design.

The blog series covers the following topics:

  1. ISV Home Page
  2. ISV Architecture Validated Design 
    • Requirements, Design, Solution
    • Life of a packet
    • High Availability (HA) Concepts
  3. Core Implementation
  4. Failover Implementation – you can choose between two implementation options
  5. Operations
    • How to add a customer to an existing POD
    • How to create a new POD
    • References, key files and commands

 

Note: This document assumes that the initial configuration and setup are done for at least one customer network, and that end-to-end connectivity was tested successfully.

 

Introduction

This blog focuses on operating the design covered in the earlier posts. It also includes operational references to key files, commands, and troubleshooting suggestions. ISVs will need to address a few different use cases in this design:

  1. Adding a new customer to an existing POD
  2. Adding a new POD to an existing set of vRouters

As you might remember, a POD design contains a POD VCN with up to 20 Customer VCNs attached in a hub-and-spoke model via local peering gateways (LPGs). By default a single VCN can have up to 10 LPG peerings; with a service limit increase the limit can be raised to 20. Operationally, once you start to approach the upper limit you should plan, test, and implement a new POD into your existing set of vRouters. Some customers may decide to pre-provision a number of PODs in anticipation of future customer growth.

Use Case #1 - Adding new customers to an Existing POD:

Existing connectivity. In this use case, the following connectivity is already established.

Adding a new Customer to the network.

To add a new customer to this design there are a few basic steps to take. Any time you create a new Customer VCN, in this example named "New Customer 2", you need to create a peering relationship by attaching Local Peering Gateways (LPGs) between the ISV-POD and the new customer. After the peering relationship is established, the following steps enable end-to-end routing between the customer network and the ISV management network (an OCI CLI sketch for the route table updates follows the list):

  1. In the new Customer VCN (New Customer 2), add a route table entry for the ISV management server subnet (172.20.136.0/25) with the next hop being the locally attached LPG named "CUST-LPG".
  2. In the POD network (ISV-POD1), the local LPG named POD-LPG2 will need a static route to forward traffic from "New Customer 2" into the Linux Virtual Router listening on 1.1.1.10.
  3. In the POD network (ISV-POD1), the subnet that the vRouters are attached to needs route table entries pointing to the LPGs for the customer networks. When traffic comes back from the ISV management servers, it will land in the ISV-POD1 subnet (1.1.1.0/28), but it needs a route table entry to determine which Local Peering Gateway is for Customer 1 versus Customer 2.
  4. In the virtual routers, add static routes for the networks that are accessible behind ISV-POD1. In this example we add a static route for 172.20.138.0/24 with a next hop of 1.1.1.1 (the default gateway for the ISV-POD1 network). Remember to add the static route on each router, and make sure the routes are persistent.
  5. In the ISV Management Server subnet, add a static route for the new customer network, 172.20.138.0/24, with a next hop of the vRouter virtual IP (172.20.136.140).
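
The OCI Console steps above can also be scripted. Below is a minimal OCI CLI sketch for step 1; the route table and LPG OCIDs are placeholders, and the same pattern applies to the POD-side route tables in steps 2 and 3. Note that --route-rules replaces the whole rule list, so include any existing rules you want to keep.

# Step 1: in the New Customer 2 route table, send the ISV management subnet to the local CUST-LPG.
# Both OCIDs below are placeholders - substitute your own.
oci network route-table update \
  --rt-id ocid1.routetable.oc1..exampleNewCustomer2RouteTable \
  --route-rules '[{"destination": "172.20.136.0/25", "destinationType": "CIDR_BLOCK", "networkEntityId": "ocid1.localpeeringgateway.oc1..exampleCustLpg"}]' \
  --force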

 

Traffic flow and changes to the network. Make sure to re-evaluate your security lists as well to ensure that the new customer networks are able to reach the management servers.
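
If you maintain security lists with the CLI, the update follows the same pattern. This is only a sketch with a placeholder security list OCID, assuming you allow all protocols from the new customer CIDR; --ingress-security-rules also replaces the existing rule list, so merge it with your current rules.

# Placeholder OCID; merge this rule with the rules already on the management subnet's security list.
oci network security-list update \
  --security-list-id ocid1.securitylist.oc1..exampleMgmtSecurityList \
  --ingress-security-rules '[{"source": "172.20.138.0/24", "sourceType": "CIDR_BLOCK", "protocol": "all"}]' \
  --force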

 

How to add static routes in Linux

Update the Linux hosts with static routes to the customer. The next hop should be the OCI default gateway for that segment; the OCI route tables will then push the traffic to the correct LPG.

 

Note: Make sure to update both virtual routers with the same commands
[opc@vrouter1 ~]$ sudo ip route add 172.20.138.0/24 via 1.1.1.1 dev ens5
[opc@vrouter1 ~]$ sudo vi /etc/sysconfig/network-scripts/route-ens5
   172.20.138.0/24 via 1.1.1.1 dev ens5

 

Route summarization

 

Steps 4 and 5 above could leverage a "summary" route if the customer networks in a given POD are non-overlapping and contiguous. For example, in POD1 the 16 customer VCN CIDRs might be allocated as follows:

  • Customer 0: 10.1.0.0/24
  • Customer 1: 10.1.1.0/24
  • Customer 2: 10.1.2.0/24
  • Customer 3: 10.1.3.0/24
  • Customer 4: 10.1.4.0/24
  • Customer 5: 10.1.5.0/24
  • ....up to:
  • Customer 15: 10.1.15.0/24

On the route tables, instead of having 16+ route table entries, they can be summarized into one, such as:

  • 10.1.0.0/20 (covering the address range 10.1.0.0 - 10.1.15.255)

The routing now becomes simple:

  • In the ISV management network, add a route entry of 10.1.0.0/20 with a next hop of 172.20.136.140 to get the traffic from the management servers to the vRouters' virtual IP.
  • In the vRouters, add a route entry of 10.1.0.0/20 with a next hop of 1.1.1.1 to push the traffic into the ISV-POD1 route table.

IP Address Management (IPAM) is a large topic, and for various reasons it requires some advance planning with customers to ensure they have a solid IPAM strategy.

Continuing with our example, if you use route summarization you can use a /20 to summarize 16 customer networks at a time.
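
On the vRouters, for example, the single summary route replaces the per-customer routes shown earlier; a sketch assuming the same ens5 interface and 1.1.1.1 gateway used above:

# One summary route instead of up to 16 per-customer /24 routes (run on both vRouters)
sudo ip route add 10.1.0.0/20 via 1.1.1.1 dev ens5
# Persist it the same way as the individual routes
echo "10.1.0.0/20 via 1.1.1.1 dev ens5" | sudo tee -a /etc/sysconfig/network-scripts/route-ens5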

Note: IP subnetting is based on powers of two: 2^4 = 16 networks fit behind one /20 summary, and the next size up would be 2^5 = 32, but you can't have 32 peerings in a VCN at the moment.

 

Example IPAM strategy for the ISV PODs....

 

POD   Summary Network   Netmask          Range of addresses
1     10.1.0.0/20       255.255.240.0    10.1.0.0 - 10.1.15.255
2     10.1.16.0/20      255.255.240.0    10.1.16.0 - 10.1.31.255
3     10.1.32.0/20      255.255.240.0    10.1.32.0 - 10.1.47.255
4     10.1.48.0/20      255.255.240.0    10.1.48.0 - 10.1.63.255
5     10.1.64.0/20      255.255.240.0    10.1.64.0 - 10.1.79.255
6     10.1.80.0/20      255.255.240.0    10.1.80.0 - 10.1.95.255
7     10.1.96.0/20      255.255.240.0    10.1.96.0 - 10.1.111.255
8     10.1.112.0/20     255.255.240.0    10.1.112.0 - 10.1.127.255
9     10.1.128.0/20     255.255.240.0    10.1.128.0 - 10.1.143.255
10    10.1.144.0/20     255.255.240.0    10.1.144.0 - 10.1.159.255
11    10.1.160.0/20     255.255.240.0    10.1.160.0 - 10.1.175.255
12    10.1.176.0/20     255.255.240.0    10.1.176.0 - 10.1.191.255
13*   10.1.192.0/20     255.255.240.0    10.1.192.0 - 10.1.207.255
14    10.1.208.0/20     255.255.240.0    10.1.208.0 - 10.1.223.255
15    10.1.224.0/20     255.255.240.0    10.1.224.0 - 10.1.239.255
16    10.1.240.0/20     255.255.240.0    10.1.240.0 - 10.1.255.255

 

Note: After 13 PODs you will have reached the maximum number of VCNs in a given tenancy per region. You will have to engage engineering to see if it is possible to go higher than 200 VCNs in a given tenancy in a given region.

 

If you need help summarizing your networks check out the Visual Subnet Calculator: http://www.davidc.net/sites/default/subnets/subnets.html

 

Use Case #2 - Adding a new POD to an existing set of vRouters:

In use case #2 we are transitioning from the current network topology:

Our goal is to implement a new topology such as the following:

Update the OCI Network and Instance configuration

  1. In the OCI Console, create the following network elements (a CLI sketch for steps 1 and 2 follows this list)
    1. Create the new POD VCN with its name and CIDR block (POD2 - 2.2.2.0/28)
    2. Create a regional subnet inside the new POD VCN
    3. Create a security list permitting SSH and ICMP to and from everywhere (0.0.0.0/0), or adjust the filters as appropriate
  2. In the OCI Console, update the network settings of your Virtual Router Linux instances (repeat on each vRouter).
    1. Create a new VNIC on each vRouter and choose the new POD network.
    2. Make sure each VNIC has the skip source/destination check enabled
    3. Add 1 secondary IP address (the floating Virtual IP - VIP) for that segment
  3. Update the OCI Console route tables and create LPGs as necessary (see above for adding customers)
    1. Create a route table on POD-LPG1 for your Local Peering Gateways to access the management network (next hop is 2.2.2.10).
    2. Update the route table in the ISV-POD2 subnet so it knows that if the target address is 172.20.139.0/24 the next hop is POD-LPG1.
  4. SSH into each vRouter and update the network configuration
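
For reference, steps 1 and 2 above can also be scripted with the OCI CLI. This is only a sketch: the compartment, VCN, subnet, and instance OCIDs are placeholders, and you would still assign the secondary VIP (2.2.2.10) with the assign-private-ip command shown later in this post.

# Step 1: new POD VCN and regional subnet (placeholder OCIDs and names)
oci network vcn create --compartment-id ocid1.compartment.oc1..example \
  --cidr-block 2.2.2.0/28 --display-name ISV-POD2
oci network subnet create --compartment-id ocid1.compartment.oc1..example \
  --vcn-id ocid1.vcn.oc1..examplePod2Vcn --cidr-block 2.2.2.0/28 --display-name ISV-POD2-subnet

# Step 2: attach a VNIC in the new subnet to each vRouter with the source/destination check skipped
oci compute instance attach-vnic --instance-id ocid1.instance.oc1..exampleVrouter1 \
  --subnet-id ocid1.subnet.oc1..examplePod2Subnet --skip-source-dest-check true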

 

Specific Linux commands to update each vRouter

All of these commands need to run as root, or with sudo privileges.
wget https://docs.cloud.oracle.com/iaas/Content/Resources/Assets/secondary_vnic_all_configure.sh
chmod a+x secondary_vnic_all_configure.sh
./secondary_vnic_all_configure.sh

If the previous command shows a new IFACE (such as ens6), use that going forward as the interface to the new POD.

# Bring up the new POD2-facing interface with the primary IP and the floating VIP
ip link set ens6 mtu 9000
ip addr add 2.2.2.8/28 dev ens6 label ens6
ip addr add 2.2.2.10/28 dev ens6 label ens6:0
# Persist the primary IP across reboots
vi /etc/sysconfig/network-scripts/ifcfg-ens6
  DEVICE="ens6"
  BOOTPROTO=static
  IPADDR=2.2.2.8
  NETMASK=255.255.255.240
  ONBOOT=yes
  MTU=9000
systemctl restart network
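
Before touching Pacemaker, it is worth confirming that the new interface is up and can reach the new subnet. The gateway address below (2.2.2.1) is an assumption, mirroring 1.1.1.1 in POD1:

ip addr show ens6            # should show 2.2.2.8 plus the 2.2.2.10 VIP label added above
ping -c 3 -I ens6 2.2.2.1    # default gateway of the new POD subnet (assumed)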

 

Pacemaker updates

To update the Pacemaker configuration you'll have to stop the Corosync and Pacemaker services on each box. If you edit the configuration while the services are running, the cluster can act unpredictably.

systemctl stop pcsd.service
systemctl stop pacemaker
systemctl stop corosync
cp /usr/lib/ocf/resource.d/heartbeat/IPaddr2 /usr/lib/ocf/resource.d/heartbeat/IPaddr2.ORIG

Editing the IPaddr2 file

Note: Make sure to collect the OCIDs for the new VNICs that were added on each vRouter.
  • If you modify the file in vi, jump to the blocks of text that start with OCI.
  • The /OCI search command in vi finds those blocks quickly.
  • There are two sections of the file with OCI-specific commands.
    • The first section is the variables we define
    • The second section is the actual commands to execute

In section 1 (##### OCI vNIC variables), add 3 new variables:

  1. vrouter1vnicpod2="<OCID of vRouter #1  VNIC attached to POD2 >"
  2. vrouter2vnicpod2="<OCID of vRouter #2  VNIC attached to POD2 >"
  3. vnicippod2="<floating IP>"

In section 2 (##### OCI/IPaddr Integration), add a new command on each vRouter to move the VIP for the new POD. Make sure to enter the command before the network restart commands.

  • For vrouter1 - 

    /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter1vnicpod2 --ip-address $vnicippod2
     

  • For vrouter2 - 

    /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter2vnicpod2 --ip-address $vnicippod2

Full Example

##### OCI vNIC variables
server="`hostname -s`"
vrouter1vnic="ocid1.vnic.oc1.ca-toronto-1.ab2g6ljrzowh2sa6ucqq2wjmawi6XYZ1"
vrouter1vnicpod1="ocid1.vnic.oc1.ca-toronto-1.ab2g6ljrlujfgyav4frscb7uXYZ2"
vrouter1vnicpod2="ocid1.vnic.oc1.ca-toronto-1.ab2g6ljrizfu73egoxxvvixjXYZ3"
vrouter2vnic="ocid1.vnic.oc1.ca-toronto-1.ab2g6ljrfqopd3j6qdm3xlqhtsghXYZ4"
vrouter2vnicpod1="ocid1.vnic.oc1.ca-toronto-1.ab2g6ljr7lkic77hy4geg65cXYZ5"
vrouter2vnicpod2="ocid1.vnic.oc1.ca-toronto-1.ab2g6ljrforjbxlo2kuxopzjXYZ6"
vnicip="172.20.136.140"
vnicippod1="1.1.1.10"
vnicippod2="2.2.2.10"
 
##### OCI/IPaddr Integration
        if [ $server = "vrouter1" ]; then
                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter1vnic  --ip-address $vnicip
                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter1vnicpod1  --ip-address $vnicippod1
                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter1vnicpod2  --ip-address $vnicippod2
                /bin/systemctl restart network
        else
                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter2vnic  --ip-address $vnicip
                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter2vnicpod1  --ip-address $vnicippod1
                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $vrouter2vnicpod2  --ip-address $vnicippod2
                /bin/systemctl restart network
        fi
systemctl start pcsd.service
systemctl start pacemaker
systemctl start corosync

Verify PCS cluster status (pcs status command)

Test failover by stopping vRouter1 and verifying that the secondary IP addresses move to the new active router. If vRouter1 is active, force it to stop and check that the VIPs moved to vRouter2.

pcs cluster stop vrouter1
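
Then, on vRouter2, confirm that the cluster sees vrouter1 as offline and that the VIPs have moved. The OCI-side check is only a sketch with a placeholder VNIC OCID:

pcs status                                                  # vrouter1 should now show as offline
ip addr show | grep -E '172.20.136.140|1.1.1.10|2.2.2.10'   # VIPs should now be on vRouter2
/root/bin/oci network private-ip list --vnic-id ocid1.vnic.oc1..exampleVrouter2Pod2Vnic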

 

How to troubleshoot connectivity

  1. Verify that the new LPGs are in a "peered" / "established" relationship
  2. Go to the vRouters and run the "pcs status" command as root to see which router is active
  3. Ping from a management server to your customer host
  4. Turn on tcpdump on the active router (example below). If you can see ICMP flowing then you know the traffic is making it to your virtual router. If you see ICMP echo requests but no replies, verify that the route tables and security lists are updated as needed to reach the target and return back to the router.
# tcpdump -i ens5 host 
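
For example, assuming a customer host at 172.20.138.10 (a hypothetical address inside the new customer's CIDR) and the interface roles implied by the key files below (ens3 facing the management network, ens5 facing the POD), on the active vRouter:

# Watch ICMP to/from the customer host on the management-facing and POD-facing interfaces
tcpdump -ni ens3 icmp and host 172.20.138.10
tcpdump -ni ens5 icmp and host 172.20.138.10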

 

Additional references

 

Key log files

  • cat /var/log/cluster/corosync.log
  • cat /var/log/pacemaker.log
  • cat /var/log/pcsd/pcsd.log

To back out a broken cluster configuration, stop the services, move the existing configuration files aside, and restart pcsd before re-creating the cluster:

systemctl stop pcsd.service
systemctl stop pacemaker
systemctl stop corosync

mv /etc/corosync/corosync.conf /etc/corosync/corosync.bad
mv /etc/pacemaker/authkey /etc/pacemaker/authkey.bad

systemctl start pcsd.service

Key Files/Folders

/etc/sysctl.d/98-ip-forward.conf
/etc/sysctl.d/97-reverse-path-forwarding.conf
 
secondary_vnic_all_configure.sh
 
/etc/sysconfig/network-scripts
ifcfg-ens3
ifcfg-ens3:0
route-ens3
 
ifcfg-ens5:0
ifcfg-ens5
route-ens5
 
~/.oci/config
 
/usr/lib/ocf/resource.d/heartbeat/IPaddr2
/usr/lib/ocf/resource.d/heartbeat/IPaddr2.ORIG
 
/etc/corosync/corosync.conf
/etc/pacemaker/authkey
/var/log/cluster/corosync.log
/var/log/pacemaker.log
/var/log/pcsd/pcsd.log

 

Key Commands

systemctl stop pcsd.service
systemctl stop pacemaker
systemctl stop corosync
pcs status
pcs cluster stop vrouter1
pcs cluster start vrouter1
 
 
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload
systemctl stop firewalld
systemctl disable firewalld
 
 
systemctl restart network
 
oci setup config
 
ip route add
ip addr show
ip link show
ifconfig
 
 
tcpdump -i ens5 icmp
tcpdump -i ens5 host 1.1.1.1

 
