Deploying Load-Balancing Network Topology to GNU/Linux

Identify Bonding-Capable Interfaces

The first step is to identify the available network interfaces. To do that, run:

$ ip -br addr show
lo               UNKNOWN        127.0.0.1/8 ::1/128 
enp2s0           UP             192.168.0.XX1/24 ...
wlp3s0           UP             192.168.0.XX2/24 ...

The output shows a wireless interface (wlp3s0) and an Ethernet interface (enp2s0) available for use.
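As a rough sketch of how this listing could be filtered in a script, the helper below reads `ip -br addr show` style output on standard input and prints every interface name except the loopback. The function name list_candidates is made up for illustration, the sample addresses are placeholders, and the helper assumes the interface name is always the first column.

```shell
# Hypothetical helper: print every interface name except the loopback.
# Reads `ip -br addr show` style output (name in the first column) on stdin.
list_candidates() {
    awk 'NF > 0 && $1 != "lo" { print $1 }'
}

# Sample input mirroring the listing above (addresses are placeholders).
printf '%s\n' \
    'lo      UNKNOWN  127.0.0.1/8 ::1/128' \
    'enp2s0  UP       192.0.2.11/24' \
    'wlp3s0  UP       192.0.2.12/24' | list_candidates
```

On a live system, `ip -br addr show | list_candidates` would feed the real listing through the same filter.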

Install Bonding Kernel Module

The next step is to install bonding support through the ifenslave package. As root, run:

$ apt install ifenslave

Then, as root, load the module and check that it is loaded:

$ modprobe bonding
$ lsmod | grep bonding
bonding 163840 0
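If you want to script this check, one possible approach is to look for the module name in /proc/modules, which is the same data lsmod prints. The helper name is_module_loaded is hypothetical, and the optional second argument exists only so the function can be tried against a sample file.

```shell
# Hypothetical helper: succeed if module $1 is listed in a /proc/modules
# style file ($2, defaulting to /proc/modules on a live system).
is_module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Typical use (modprobe needs root):
#   is_module_loaded bonding || modprobe bonding
```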

Define Bonding Network Interface

The next step is to create the bonding network interface, named bond0 in this example. Create its configuration file in the /etc/network/interfaces.d directory:

/etc/network/interfaces.d/bond0

with the following:

allow-hotplug bond0
auto bond0
iface bond0 inet <connectionType>
        bond-mode <mode>
        bond-miimon <millisecond>
        bond-downdelay <millisecond, multiple of bond-miimon>
        bond-updelay <millisecond, multiple of bond-miimon>
        slaves <interface0> <interface1> ...

NOTE:

  1. For <connectionType>, use one of the following:
    • "static" for static IP configuration.
    • "dhcp" for automatic DHCP allocation (use this if you are unsure or are a home user).
  2. For the load-balancing <mode>, use one of the following (each is explained in its own subsection below):
    • "balance-rr" for round-robin load balancing.
    • "balance-xor" for link selection by XOR of the source and destination MAC addresses.
    • "balance-tlb" for adaptive transmit load balancing (only transmit is load-balanced; MII link monitoring only).
    • "balance-alb" for adaptive load balancing (receive is also load-balanced, through ARP negotiation).
  3. "bond-miimon" is the interval, in milliseconds, at which each link is checked for failure. A common value is 100.
  4. "bond-downdelay" is the time to wait, in milliseconds, before a failed link is treated as down. Set it to 0 to disable the delay.
  5. "bond-updelay" is the time to wait, in milliseconds, before a recovered link is used again. Set it to 0 to disable the delay.
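Since both delays are expected to be multiples of bond-miimon, a quick arithmetic check can catch a mistyped value before the file is written. This is only a sketch; the helper name is_valid_delay is made up for illustration.

```shell
# Hypothetical helper: succeed if delay $2 is a whole multiple of miimon $1.
# A delay of 0 (disabled) always passes; miimon must be greater than 0.
is_valid_delay() {
    [ "$1" -gt 0 ] && [ "$2" -ge 0 ] && [ $(( $2 % $1 )) -eq 0 ]
}

is_valid_delay 100 200 && echo "200 ms is a multiple of 100 ms"
is_valid_delay 100 250 || echo "250 ms is not a multiple of 100 ms"
```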

One example would be:

allow-hotplug bond0
auto bond0
iface bond0 inet dhcp
        bond-mode balance-rr
        bond-miimon 100
        bond-downdelay 0
        bond-updelay 0
        slaves enp2s0 wlp3s0

FOR YOUR INFORMATION: the optimal "bond-downdelay" and "bond-updelay" values are specific to each machine setup. Collect network performance data over time (for example, weekly) and tune both values until you reach the optimal timing.
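To try the same stanza with a different mode or slave list, it can help to generate it rather than retype it. The sketch below prints a DHCP-based bond0 stanza to standard output; the function name render_bond is hypothetical, and both delays are fixed at 0 as in the example above.

```shell
# Hypothetical helper: print a DHCP-based bond0 stanza to stdout.
# Usage: render_bond <mode> <miimon-ms> <slave> [<slave> ...]
render_bond() {
    mode=$1
    miimon=$2
    shift 2
    printf 'allow-hotplug bond0\n'
    printf 'auto bond0\n'
    printf 'iface bond0 inet dhcp\n'
    printf '        bond-mode %s\n' "$mode"
    printf '        bond-miimon %s\n' "$miimon"
    printf '        bond-downdelay 0\n'
    printf '        bond-updelay 0\n'
    printf '        slaves %s\n' "$*"
}

# Same interfaces as the example, but in active-backup mode.
render_bond active-backup 100 enp2s0 wlp3s0
```

Review the output before redirecting it into /etc/network/interfaces.d/bond0 as root.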

Round Robin Load Balancing (balance-rr, ID = 0)

This mode transmits packets over each slave link in turn, one packet per link, so all links are used in a round-robin manner.

The upside is that:

  1. It offers fault tolerance (able to switch between links on-the-fly).
  2. It offers load balancing across all links.
  3. Suitable for consumer laptops, desktops, and servers.
  4. Suitable for commercial-grade servers.

The downside is that:

  1. Packets sent over different links may arrive out of order. Hence, client software must be able to handle such scenarios.
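The rotation can be pictured with a small model: packet number N goes out on slave index N modulo the slave count. This is a simplified illustration, not the bonding driver's code, and the function name rr_slave_index is made up.

```shell
# Simplified model (not the driver's code): packet number $1 goes out on
# slave index ($1 modulo slave count $2).
rr_slave_index() {
    echo $(( $1 % $2 ))
}

# Six consecutive packets over two slaves (index 0 and 1).
for packet in 0 1 2 3 4 5; do
    echo "packet $packet -> slave $(rr_slave_index "$packet" 2)"
done
```

Consecutive packets of one connection alternate between links, which is why they may arrive out of order at the receiver.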

Active Backup (active-backup, ID = 1)

This mode uses the primary slave for both transmit and receive. If it fails, the next available link takes over its role.

The upside is that:

  1. It offers fault tolerance (able to switch between links on-the-fly).
  2. Suitable for consumer laptops, desktops, and servers.

The downside is that:

  1. Interfaces are not fully utilized (no load-balancing).

XOR Load Balancing (balance-xor, ID = 2)

This mode selects the transmit link from the pool of slave links by XORing the source MAC address with the destination MAC address.

The upside is that:

  1. It offers fault tolerance (able to switch between links on-the-fly).
  2. It offers load balancing across all links.
  3. Suitable for local area network connections with multiple exit nodes (routers).

The downside is that:

  1. Depending on the hardware, the XOR computation can be slow, and connections may appear to lag.
  2. Sub-optimal when every connection passes through a single router only.
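The selection can be sketched as (source MAC XOR destination MAC) modulo the slave count. The model below is deliberately simplified: it XORs only the last octet of each address and ignores anything else the driver may mix into the hash. The function name xor_slave_index and the MAC addresses are placeholders.

```shell
# Simplified model (not the driver's code): XOR the last octet of the source
# and destination MAC addresses, then take the result modulo the slave count.
# Usage: xor_slave_index <src-mac> <dst-mac> <slave-count>
xor_slave_index() {
    src_last=${1##*:}
    dst_last=${2##*:}
    echo $(( (0x$src_last ^ 0x$dst_last) % $3 ))
}

# Placeholder MAC addresses; the same pair always maps to the same slave.
xor_slave_index 02:00:00:00:00:55 02:00:00:00:00:bc 2
```

Because the result depends only on the address pair, all traffic to a single router lands on one link, which matches the second downside listed above.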

Broadcast (broadcast, ID = 3)

This mode transmits every packet on all slave interfaces.

The upside is that:

  1. It offers fault tolerance (other interfaces can still work should one of the slave links go down).
  2. High availability.

The downside is that:

  1. No load-balancing.
  2. May overload the entire network with over-communication.

802.3ad (802.3ad, ID = 4)

This mode provides dynamic link aggregation for IEEE 802.3ad compliant networks.

The upside is that:

  1. It offers fault tolerance (other interfaces can still work should one of the slave links go down).
  2. High availability.
  3. Autonomous configuration.
  4. It offers load balancing.

The downside is that:

  1. The connected network must be IEEE 802.3ad compliant. This requires additional setup outside of the machine (outside of this topic's coverage).
  2. All slave links must run at the same speed and duplex settings, per the standard.
  3. The network transmit/receive policy depends on the peers' 802.3ad network policy.
  4. ARP monitoring is not available; only MII monitoring is supported.

Transmit Load Balancing (balance-tlb, ID = 5)

This mode balances outgoing traffic across the links according to the computed relative load on each link. Incoming traffic is received by the current active slave only. Should the current active slave fail, another slave takes over the receive job.

The upside is that:

  1. It offers fault tolerance (able to switch between links on-the-fly).
  2. It offers load balancing across all links.
  3. Suitable for commercial grade server deployment (fixed hardware setup).

The downside is that:

  1. Only MII link monitoring is supported.
  2. The ethtool package must be installed and configured prior to use.
  3. Load balancing only affects transmit (upload).
  4. May not be compatible with an unstable wifi link.

Adaptive Load Balancing (balance-alb, ID = 6)

This mode balances both transmit and receive traffic across the links, using ARP negotiation for the receive side. The bonding driver intercepts outgoing ARP replies and overwrites the source hardware address with the unique hardware address of one of the slaves.

The upside is that:

  1. It offers fault tolerance (able to switch between links on-the-fly).
  2. It offers load balancing across all links.
  3. Suitable for commercial grade server deployment (fixed hardware setup).

The downside is that:

  1. Only MII link monitoring is supported.
  2. ARP broadcasts the hardware addresses in use across the network, which allows peers to learn the network layout (this can be a security concern).
  3. May not be compatible with an unstable wifi link.

Configure Each Slave Link

With the bond interface ready, you should now configure each slave link. For each link, you can create a configuration file named after the interface inside the /etc/network/interfaces.d/ directory. Two general link types are discussed here.

In General

In general, every slave link must now have the following statement:

bond-master <bond-link-ID>
  1. <bond-link-ID> is the name of the bond interface that the link joins. You can define the name yourself (bond0, bond1, ... are recommended). The slaves that share a bond-master essentially form a single network interface.

NOTE: since the slaves are already listed inside bond0, there is no need to define bond-primary in every single interface.

Secondly, the iface definition must use "manual" as its connection type. This is because the bonding driver manages the connection and disconnection of the slave links on its own. Example:

iface enp2s0 inet dhcp --> iface enp2s0 inet manual

Therefore, a common pattern would be:

iface <name> inet manual
        bond-master bond0

Example:

iface enp2s0 inet manual
        bond-master bond0
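A quick way to confirm that a slave's configuration file carries the statement is to search it for a bond-master line. The helper below is a sketch; the name has_bond_master is hypothetical, and it only checks for the line, not for the surrounding iface stanza.

```shell
# Hypothetical helper: succeed if config file $1 contains a
# "bond-master $2" line (leading indentation allowed).
has_bond_master() {
    grep -Eq "^[[:space:]]*bond-master[[:space:]]+$2[[:space:]]*\$" "$1"
}

# Typical use:
#   has_bond_master /etc/network/interfaces.d/enp2s0 bond0 \
#       || echo "enp2s0 is not assigned to bond0"
```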

Ethernet-Based Link Specific

Most operating systems set up the Ethernet link automatically during installation and write its settings into /etc/network/interfaces. Your job is to move those default settings into an independent /etc/network/interfaces.d/<name> config file instead. Here is an example of a working Ethernet configuration for the interface named "enp2s0".

allow-hotplug enp2s0
auto enp2s0

iface enp2s0 inet manual
        bond-master bond0

iface enp2s0 inet6 auto

You need to keep a few things:

  1. allow-hotplug, for on-the-fly cable attachment/removal.
  2. auto, for loading the interface on boot.
  3. iface enp2s0 inet6 auto, for automatic IPv6 configuration.

Wifi-Based Link Specific

A wifi interface usually has no configuration entry of its own because it is managed by a separate network application, such as the wpa-* family of programs. It is not advisable to configure wifi manually (unless your consumer laptop or PC uses a terminal-only interface). For wifi, let the appropriate wireless network program handle the connectivity.

The only thing you need to do is ensure that the wifi interface name is correctly listed in bond0's "slaves" line.

Install Dependencies

If your load-balancing mode is balance-tlb or balance-alb, you need to install and set up the ethtool package. Otherwise, you may skip this step. To install, run as root:

$ apt install ethtool -y

IEEE 802.3ad Compliant Network Setup

Unfortunately, setting up an IEEE 802.3ad compliant network is outside of this topic's coverage. Please research it on your own, or use another mode such as round-robin or XOR.

Load Module On Boot

This step is optional. If it is absolutely necessary for the bonding module to be loaded on boot, add it to /etc/modules:

# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
bonding
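When this step is scripted, it is worth making the append idempotent so that repeated runs do not add the line twice. The sketch below does that with an exact-line match; the helper name ensure_module_line is made up for illustration.

```shell
# Hypothetical helper: append module name $2 to file $1 unless an identical
# line is already present, so repeated runs do not create duplicates.
ensure_module_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Typical use (needs root):
#   ensure_module_line /etc/modules bonding
```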

Restart Networking Service

Now that the setup is ready, you can restart the networking service. As root, run:

$ /etc/init.d/networking restart

This will take some time. If there is any misconfiguration (e.g. accidentally using balance-tlb or balance-alb mode when their dependencies are not set up properly), the service restart will fail. You can use the following to read the error message:

$ service networking status

Should you encounter a wifi firmware failure, you might need to restart your machine.

Test and Verify

Upon a successful restart, and depending on the mode, you should get a speed boost. How to test the connection depends on the type of link.

Testing Ethernet Connectivity

To test the Ethernet connection, detach and re-attach the network cable. The connection should remain uninterrupted while the cable is detached.

NOTE:

  • DO NOT turn off the link layer (e.g. by disabling it via a connection manager), especially on the primary slave. Doing so may corrupt the bonding driver session, and you will need to restart the service again.

Testing Wifi Connectivity

To test the wifi connection, disconnect from and reconnect to your trusted wireless network. The connection should remain uninterrupted while the wifi is disconnected.

NOTE:

  • DO NOT turn off the wifi link layer or hardware (e.g. by powering the wifi chip off), especially on the primary slave. Doing so may corrupt the bonding driver session, and you will need to restart the service again.

Get Bonding Status

To check the bonding status, read the file /proc/net/bonding/<bond name>. Example:

$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: enp2s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 3
Permanent HW addr: XX:XX:XX:XX:XX:X
Slave queue ID: 0

Slave Interface: wlp3s0
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 2
Permanent HW addr: XX:XX:XX:XX:XX:X
Slave queue ID: 0
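For a quick per-slave summary, the status file can be reduced to one line per slave. The sketch below pairs each "Slave Interface" line with the "MII Status" line that follows it and skips the bond-level status at the top. The function name slave_status is hypothetical, and the parsing assumes the layout shown above.

```shell
# Hypothetical helper: print "<slave> <MII status>" for every slave found in
# /proc/net/bonding/<bond name> style text on stdin.
slave_status() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    '
}

# Typical use:
#   slave_status < /proc/net/bonding/bond0
```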

Restart Bond Network As Non-Root

To restart the bond network as a non-root user, you need to:

  1. Disable the primary slave.
  2. Restore the connections of all subsequent slaves (e.g. get wifi connected and stable).
  3. Enable the primary slave.

If you encounter a wifi firmware error, or the above does not work and the admin is not around, you have only one choice: restart the machine.

That's all for deploying a load-balancing network topology to GNU/Linux.