Unlike commonly used Active Queue Management approaches, Neighborhood Watch operates on the previous hop’s egress queue to control ingress queue congestion at a given router. The goal of Neighborhood Watch is to quickly reduce bufferbloat at a congested router’s ingress queue and prevent high latency and network jitter. Dropping packets at the previous hop’s egress queue allows for an overloaded router to focus on processing packets already queued and less on queuing incoming packets. Once a router has alleviated its congestion, the previous router is notified and allows more packets to flow by reducing its drop rate. Due to the packets dropped during congestion, TCP’s congestion control algorithms will have reduced the sending rate at each of the sending clients and traffic will not instantly overload the buffer at the next hop when the previous hop reduces its drop rate.
This approach is implemented by a userspace agent running on all routers participating in Neighborhood Watch. The userspace agent, with aid from existing Linux userspace tools, determines the current congestion level at a router and reports its value to agents running on neighboring participants. Once a neighbor’s congestion level is received, the drop probability on an egress queue is set to drop more packets if the neighbor is congested or drop fewer packets if the neighbor is not congested. Neighborhood Watch only operates at the next immediate hop and does not try to alert routers to drop packets that are two or more hops away. Although it is ideal to drop packets as soon as possible, attempts to drop packets at further upstream routers may cause packet loss in unrelated network flows, as discussed below.
The above Figure depicts an example topology with five routers managed by a single entity and two routers managed by other entities. To illustrate why Neighborhood Watch cannot propagate congestion information past a single hop, consider TCP flows ABE and ABD where router E is experiencing congestion and router B is able to fully handle the traffic being transported through it. When router E reports to router B its congestion level, the queueing discipline on the egress link BE will begin dropping packets at a higher rate. If E also reported this information to router A, then A would drop packets on the egress link AB. This is an undesirable consequence since packets in the flow ABD will be dropped at router A but the flow ABD does not participate in the congestion at router E. Thus Neighborhood Watch only considers the congestion at neighboring routers. In this example, congestion at router E is not propagated back to router B since it is assumed router B is capable of handling all the traffic from flows ABD and ABE. In fact, congestion is never propagated back due to Neighborhood Watch because if a next hop downstream router is congested, the current router is going to drop packets destined for the congested router. Since the current router is not congested at the time the next hop broadcasts its congestion and actions taken by the current router do not increase congestion (i.e., dropping packets instead of sending packets), the current router will remain uncongested unless traffic sent through it is increased, which is unrelated to the Neighborhood Watch functionality.
The final design component of Neighborhood Watch is the egress queuing discipline dropping policy itself. As argued above, dropping at the previous hop’s egress queue is more desirable than dropping at the congested router’s ingress queue. This is because the congested router will not need to process as many incoming packets and can focus more on processing the packets that is causing the congestion.
In addition, the drop probability set at the previous hop’s egress queue needs analysis for the best performance. In this work, the drop probability is calculated on a given interval based on the current congestion level of the neighbor’s egress queue. The calculation follows the formula used in Random Early Detection, that takes a minimum queue size ratio, maximum queue size threshold, and maximum drop probability as arguments in addition to a configurable interval setting. The minimum queue size ratio is used to define when the drop probability should be calculated and set, for example if the queue is less than 33% full then do not drop any packets. The maximum queue size ratio is used to define when the router should drop all packets, for example if the queue size is greater than 66% full all packets should be dropped. If the queue size is somewhere in between 33% and 66% then the drop probability is calculated according to the following formula:
((currentSize - minimumSizemaximumSize )/(maximumSize - minimumSize))*minimumSizemaxProbability
Where currentSize is the current size of the queue in packets, minimumSize is calculated as minimumQueueSizeRatio * absoluteBufferSize, and maximumsize is calculated as maximumQueueSizeRatio * absoluteBufferSize. The above formula returns a number between 0 and the maximum probability as a drop rate. For example, if the minimum queue size ratio is .33, maximum queue size ratio is .66, max probability is .1, the current queue size is 50 packets and the absolute queue size is 100 packets then we have the following calculation:
((50 - 33)/(66 - 33))*.1 = .05
The Figure above depicts the design of Neighborhood Watch in a traditional network. Neighborhood Watch is realized as a userspace application, shown as purple circle, running on each participating router, shown in green. Each Neighborhood Watch daemon listens on a local TCP socket and opens connections (dotted lines) to each of the daemons running on neighboring routers. Once connected, the daemons broadcast drop probabilities calculated from the current congestion level using the above formula for each of their interfaces on a configurable interval. To obtain the congestion information from the local router, the daemon parses output from the Linux tc userspace tool to obtain the number of packets in the backlog. Once a neighbor’s congestion information is received by the daemon on its listening socket, it first determines if the message is destined to itself and if it is the daemon uses a tc netem qdisc to set the drop probability parsed from the broadcast message.
Daemons running on routers with neighbors not participating in Neighborhood Watch, such as routers outside of the managed AS or client devices, do not attempt to open connections or send congestion information to these neighbors. If the daemon detects congestion on an interface with a neighbor not participating in Neighborhood Watch, it is unable to directly manage the neighbor’s egress queue so it must rely on traditional methods of managing congestion on its ingress queue.
The final figure above depicts Neighborhood Watch designed for a Software Defined Networking model, where the control logic is removed from the daemons and placed in a centralized controller. In this design the daemons only have the ability to read the current congestion level and report it to the controller when prompted and change the current drop probability on the egress queue when directed by the controller. All configuration and logic is performed by the controller, which has a direct TCP connection to each of the daemons. On a given interval, the controller polls each of the daemons for their current congestion level on each interface. When this message is received, the daemon gathers the congestion level as in the traditional network and reports back to the controller. For each reply, the controller calculates the drop probability for each device and sends a message to the appropriate daemon to set the drop probability for a specific device. Once this message is received by a daemon, the egress drop probability is set in the same manner as the traditional network. This model benefits from a more centralized approach that allows quick modifications at a network wide level to change how drop probabilities are calculated and less complex configuration at each individual daemon by having a larger centralized configuration at the controller.
From the traditional network and Software Defined Network designs above, the following major components are identified:
Network: The first component is the network testbed itself which is comprised of routers and clients.
Routers: Each router is a Linux virtual machine configured with paths to route traffic between clients. Further each router runs a Neighborhood Watch Daemon, which interacts with a congestion module that provides congestion information about the local router.
Clients: Similarly, each client is a Linux virtual machine configured with traffic generation tools.
Neighborhood Watch Daemon: The Neighborhood Watch Daemon is responsible for reporting congestion and setting egress drop probabilities according to a drop probability calculation. All actions are done by daemon communication over a Neighborhood Watch Protocol. The daemon’s peers can either be another Neighborhood Watch Daemon, in the case of a traditional network, or a centralized controller, as in the case of a Software Defined Network.
Congestion Module: The congestion module is responsible for reporting ingress queue congestion values to the daemon.
Traffic Generation Tools: End client traffic generation tools are used to create light network load, such as ping or file transfers over netcat and SSH, or stress test the network bed, such as iperf.
Drop Probability Calculation: The drop probability calculator takes congestion at a NIC as input and returns the egress drop probability to be set. The calculation takes three arguments, described previously, and operates on an interval, all of which are defined in a configuration file.
Daemon Communication: In the traditional design daemon communication is done by listening on a single inbound socket at each daemon and opening one outbound socket to each neighbors’ listening socket. Using the inbound socket the daemon sets the egress drop probability for a device based on information read from a neighbor and reports its congestion level to its neighbor on each device using the outbound sockets.
Neighborhood Watch Protocol: The Neighborhood Watch Protocol is used to inform a router of its neighbors’ congestion. There are two modes, peer mode used in traditional networks and master-slave mode used in software defined networks. The peer mode uses a congestion notification message sent by router to every participating neighbor and the master-slave mode uses request/response messages to get the current congestion information at a router and a command message to set the drop probability for an egress queue on a router.
Centralized Controller: The centralized controller is directly connected to all Neighborhood Watch Daemons via a TCP connection and uses the master-slave protocol instead of the peer mode protocol. In the master-slave protocol the controller is the master, which polls every slave for their congestion information, calculates drop probabilities for each of their queues and then sends command messages to each slave directing them to set their egress drop probability.
Configuration File: The final component is the configuration file which is used to define performance tuning information, network topology information, and debugging information. The drop probability calculation parameter is an example of performance tuning information, the network topology will describe neighbors participating in Neighborhood Watch, and debugging information will turn on verbose logging. This file will reside locally on every host participating in Neighborhood Watch.
In the Traditional Network approach the configuration file will contain information about each neighbor and tuning parameters in each daemon. Where in the SDN approach the file on each daemon will only contain debugging information and the centralized configuration file on the controller will contain all information about the topology and tuning parameters.