*Reposted with permission from Oracle’s Networking Blog and Neeraj Gupta
As the name suggests, the bonding driver creates a logical network interface on top of multiple physical network interfaces. There are various reasons to do so, including link aggregation for higher bandwidth, redundancy, and high availability. Upper layers communicate through the logical bond interface, which has the IP address, while the active physical interface(s) actually carry the traffic at layer 2. The bond thus provides transparency to upper layers by hiding the underlying physical interfaces.
Besides specifying which physical interfaces are part of the logical bond interface, we also specify how we want the bond to behave. There are several possible configurations, but I am only going to focus on one mode, called "active-backup", which has the numerical identifier 1. You can list all parameters of the kernel bonding driver installed on your system; look for the lines beginning with 'parm' below. These options can be set in /etc/modprobe.conf, and some of them can also be set directly under each bonding interface via /etc/sysconfig/network-scripts/ifcfg-bond1
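For reference, here is a minimal sketch of how these pieces are typically wired together on a RHEL 5-era system. The interface name bond1 matches the examples in this article; the IP address and slave interface details are illustrative assumptions, not taken from the original setup.

```shell
# /etc/modprobe.conf -- tell the kernel that bond1 uses the bonding driver
alias bond1 bonding

# /etc/sysconfig/network-scripts/ifcfg-bond1 -- the logical bond interface
# (IPADDR/NETMASK are illustrative; BONDING_OPTS sets per-bond driver options)
DEVICE=bond1
IPADDR=192.168.70.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100 updelay=5000 downdelay=5000"

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- enslave eth0 to bond1
# (an analogous ifcfg-eth1 enslaves the second interface)
DEVICE=eth0
MASTER=bond1
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```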
[root@hostA ~]# modinfo /lib/modules/2.6.32-100.23.80.el5/kernel/drivers/net/bonding/bonding.ko
filename:       /lib/modules/2.6.32-100.23.80.el5/kernel/drivers/net/bonding/bonding.ko
author:         Thomas Davis, firstname.lastname@example.org and many others
description:    Ethernet Channel Bonding Driver, v3.5.0
version:        3.5.0
license:        GPL
srcversion:     4D5495287BB364C8C5A5ABE
depends:        ipv6
vermagic:       2.6.32-100.23.80.el5 SMP mod_unload
parm:           max_bonds:Max number of bonded devices (int)
parm:           num_grat_arp:Number of gratuitous ARP packets to send on failover event (int)
parm:           num_unsol_na:Number of unsolicited IPv6 Neighbor Advertisements packets to send on failover event (int)
parm:           miimon:Link check interval in milliseconds (int)
parm:           updelay:Delay before considering link up, in milliseconds (int)
parm:           downdelay:Delay before considering link down, in milliseconds (int)
parm:           use_carrier:Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default) (int)
parm:           mode:Mode of operation : 0 for balance-rr, 1 for active-backup, 2 for balance-xor, 3 for broadcast, 4 for 802.3ad, 5 for balance-tlb, 6 for balance-alb (charp)
parm:           primary:Primary network device to use (charp)
parm:           lacp_rate:LACPDU tx rate to request from 802.3ad partner (slow/fast) (charp)
parm:           ad_select:803.ad aggregation selection logic: stable (0, default), bandwidth (1), count (2) (charp)
parm:           xmit_hash_policy:XOR hashing method: 0 for layer 2 (default), 1 for layer 3+4 (charp)
parm:           arp_interval:arp interval in milliseconds (int)
parm:           arp_ip_target:arp targets in n.n.n.n form (array of charp)
parm:           arp_validate:validate src/dst of ARP probes: none (default), active, backup or all (charp)
parm:           fail_over_mac:For active-backup, do not set all slaves to the same MAC. none (default), active or follow (charp)
Under active-backup mode, the most common configuration is link-based failure detection via a pair of parameters: miimon and use_carrier. Here is how it looks on a running system.
[root@hostA ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 5000
Down Delay (ms): 5000

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:21:28:4a:cd:80

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:21:28:4a:cd:81
[root@hostA ~]#
What we see here is that bond1 is set to active-backup mode with two physical interfaces, or slaves: eth0 and eth1. Their link status is monitored every 100 ms. If a link goes down, the bonding driver waits 5000 ms before actually declaring it DOWN. When the lost link recovers, the driver again waits 5000 ms before declaring it UP.
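To make the failover timing concrete, here is a quick arithmetic sketch of the worst-case detection time implied by these settings: the driver can take up to one full polling interval to notice the loss of carrier, and then waits out the down delay before declaring the link DOWN.

```shell
#!/bin/sh
# Worst-case link-down detection time, in milliseconds, for the
# settings shown above: one miimon polling interval plus downdelay.
miimon=100      # link check interval (ms)
downdelay=5000  # delay before declaring the link DOWN (ms)
echo "worst-case detection: $((miimon + downdelay)) ms"
# prints: worst-case detection: 5100 ms
```

So with these particular values, an outage on the active link can go unhandled for roughly 5.1 seconds before traffic moves to the backup slave.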
The option 'primary' is set to none, which means the bonding driver has no preference between eth0 and eth1 when both are UP at the same time. The link failure counter tracks how many times each link has failed since the host booted.
Now let's review the following topology diagram. Host 'A' has two physical interfaces, whose links are marked 1 and 2 respectively. They are connected to independent Ethernet switches for redundancy and high availability. These switches in turn connect into a bigger network that is external to us; it may be a corporate network or even the Internet. The uplinks from our local Ethernet switches are labeled 3 and 4 respectively.
Host A, with its bond1 interface, has eth0 as the currently active interface. It is expected to communicate with the external network as shown.
Scenario 1: When link number 1 goes out of service, the bonding driver detects it within the configured monitoring interval and down delay, and activates the backup interface eth1. Service is restored at this point.
Scenario 2: When link number 3 goes out of service, the bonding driver is completely unaware of it, because both of its local physical links are still fully in service. However, host A is unable to reach the external world, since link number 3 is down.
The bonding driver offers an alternate set of parameters to solve the problem illustrated above: instead of miimon, we use arp_ip_target and arp_interval.
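A sketch of the corresponding option change, again assuming RHEL-style network scripts. The interval and target values here match the running-system output shown in this article; the exact BONDING_OPTS line is my reconstruction, not taken verbatim from the original host.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond1 (relevant line only)
# arp_interval is in milliseconds, like miimon; arp_ip_target is
# typically the default gateway or another always-reachable address.
BONDING_OPTS="mode=active-backup arp_interval=60 arp_ip_target=192.168.70.1"
```

Note that miimon and arp_interval are alternatives for the same job here: enabling ARP monitoring means dropping the miimon/updelay/downdelay settings from the earlier configuration.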
With the modified configuration, the bond status looks like this.
[root@hostA ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 60
ARP IP target/s (n.n.n.n form): 192.168.70.1

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:21:28:4a:cd:80

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:21:28:4a:cd:81
[root@hostA ~]#
As you can see, the bonding driver is now monitoring reachability of 192.168.70.1 every 60 milliseconds (note that arp_interval, like miimon, is specified in milliseconds, not seconds). If a probe fails, the driver will fail over to eth1 irrespective of the local link status.
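The ARP monitoring settings can also be read programmatically from the bonding status file. As a sketch, the following extracts the polling interval and target; on a live system you would pipe in /proc/net/bonding/bond1, but here a here-document reproduces the relevant lines so the snippet runs anywhere.

```shell
#!/bin/sh
# Pull the ARP monitoring settings out of bonding status text.
get_arp_settings() {
    awk -F': ' '
        /^ARP Polling Interval/ {interval=$2}
        /^ARP IP target/        {target=$2}
        END {print interval, target}
    '
}

# Sample input, copied from the status output above.
get_arp_settings <<'EOF'
MII Polling Interval (ms): 0
ARP Polling Interval (ms): 60
ARP IP target/s (n.n.n.n form): 192.168.70.1
EOF
# prints: 60 192.168.70.1
```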
MII monitoring based bonding is ideal when you are communicating within a LAN and do not cross a router. IPoIB is a good example here, because InfiniBand networks are currently limited to a single broadcast domain; in other words, they do not traverse a layer 3 router.
ARP IP target based monitoring should be preferred if your setup is similar to the one just discussed. If the bonded interface is expected to communicate with the outside world across a router, then it is better to monitor the reachability of a set of external IP addresses instead of just the local link status. Client access networks created with EoIB are a good example here.
*About the author: Neeraj Gupta joined Oracle as part of the Sun Microsystems acquisition, where he spent the last 11 years specializing in InfiniBand, Ethernet, Security, HA and Telecom Computing Platforms. Prior to joining Sun, Neeraj spent 5 years in the Telecom industry focusing on Internet Services and GSM Cellular Networks. Currently Neeraj is part of Oracle’s Engineered Systems team focusing on Networking and Maximum Availability Architecture.