Written by : Deepaknath Tandur, April 2016
Network coding is a networking technique in which transmitted data is encoded and decoded by nodes of the network to improve the network performance and throughput. In traditional routing networks, packets are stored and forwarded one at a time. Therefore, if an intermediary node of the network has two packets from two different sources then it forwards them one after another even if both are headed in the same destination. This requires separate transmissions for each individual message, which can overall decrease the network efficiency. Whereas with network coding, algorithms are used to merge different messages and the accumulated result is then forwarded to the destination. After receiving the accumulated massage, the receiving node then decodes the message at the destination using the same algorithm.
Figure shows an example of linear network coding where the source node S sends packets to the receiver node R. Some of the intermediary nodes such as C, E and F encode the received packet with coded vectors and then forward the merged packet to the next destination in the network. Nodes A, B and D serve as traditional nodes that store and forward the received packets. The receiver node R at the end receives the two coded packets. Given that the receiver R knows the coded vectors, it can then retrieve the native packets x1 and x2 from the two received packets y1 and y2. It is to be noted that even if the link between nodes A and E breaks down, the receiver will still be able to determine both x1 and x2. Thus, the network shows increased resilience against bottlenecks or breakdowns in the network even if the received packet has only partial information. The loss of link somewhere earlier in the network will not affect the receiver as it receives the coded native packets from different parts of the network. Thus in an ideally coded network, retrials are not needed. This results in reduction of packet loss, which in turn results in improving the throughput of the network. The other benefit of this approach leads to potential improvement in robustness and latency of the network.
Network coding has found wide applicability in wireless communication networks. It is shown that a network coded TCP can significantly boost the performance of a lossy wireless network by up to 10 times. It has been widely used in in ad-hoc mesh or sensor networks. Some of the mesh routing protocols have already taken advantage of network coding. In the wired network scenario, network coding is applied in overlay networks, such as P2P (peer-to-peer) and distributed storage networks, etc.
P2P networks are a perfect place to apply network coding as the nodes in a P2P network are end hosts which can perform more complex operations such as decoding and encoding than simply storing and forwarding messages. Avalanche is Microsoft’s peer-to-peer (P2P) network created by Pablo Rodriguez and Christos Gkantsidis, who claim to offer improved scalability and bandwidth efficiency compared to existing P2P systems. The project studied how to enable a cost effective, internet scalable and very fast file distribution solution (e.g. for TV on-demand, patches, software distribution). Such an approach leverages desktop PCs to aid in the distribution process, relieving congested servers and network links from most of the traffic. Like BitTorrent, Avalanche splits the file to be distributed into small blocks. However, rather than peers simply transmitting the blocks, they transmit random linear combinations of the blocks along with the random coefficients of this linear combination. This technique removes the need for each peer to have complex knowledge of block distribution across the network. The Avalanche model fixes these problems using network coding. Instead of distributing the blocks of the file, peers produce linear combinations of the blocks they already hold. Such combinations are distributed together with a tag that describes the parameters in the combination. Any peer can generate new unique combinations from the combinations it already has. When a peer has enough independent combinations, it can decode and build the original file. Such encoding ensures that any piece uploaded by a given peer can be of use to any other peer. Peers do not need to find specific pieces in the system to complete, any subset encoded piece will suffice. This makes the system very robust as peers disconnect. Also, no peer becomes a bottleneck, since no block is more important than another. Finally, network bandwidth is efficiently utilized since the same information does not travel multiple times over bottleneck links. Microsoft in 2007 made Avalanche technology available as a public customer technology preview (CTP) of the resulting system, called 'Microsoft Secure Content Downloader' (MSCD), and since 2015 started delivering Windows Updates using P2P in Windows 10.
Similarly, the benefits of network coding can also be applied in distributed storage systems. In these systems, the data is typically stored in a large number of nodes distributed over a network. In conventional systems, if a storage node fails or leaves the system, then it becomes necessary to regenerate a new storage node in order to maintain the reliability level. This requires a significant download bandwidth that can slow down the entire system. However, with network coding the system can recreate the lost piece by downloading the linear combination of stored data from different nodes, resulting in reduced maintenance bandwidth by orders of magnitude.