Review on "A Survey and Comparison of Peer-to-Peer Overlay Network Schemes"

Post date: Sep 19, 2012 10:03:07 AM

This paper by Eng Keong Lua et. al. dated 2005, talks about the different network design for Peer-to-Peer Overlay Networks.

In summary, the network architecture can be classified into two categories, structured and unstructured. For structured type, there is a correspondence between data files and peers so that where data file is stored is deterministic and can be determined using a Distributed Hash Table (DHT). On the other hand, unstructured type maintains that there is no association between peers and files, and routing is done through flooding. The advantage of structured peer-to-peer networks include scalability, however, the overhead resources used for such type of network will be larger compared to those of the unstructured type. On the other hand, unstructured type issues include scalability. Moreover, while rare items will be easily searchable for network with structured design, in unstructured design, downloading of rare items shall potentially cause inefficiency.

Since the paper is a survey, it is meant to provide only important and unique details about the design of the network and not specifics of the protocol for the network. Thus, if one wants further details about how things work and how they were able to satisfy some desired design principles, other resources must be consulted. With this, the following summarizes what I have learned:

* On structured types

1. Content Addressable Networks (CAN) - A logical d-dimensional space is partitioned to peers so that each peer has a unique zone in this space. A data file will be assigned a d-dimensional identifier. In order to determine the location of data file with a given key, a uniform hash function will be used to map the key to a point P. The peer closest to point P will then be the one storing the file. Since the hash function is uniform, it will also be used by any peer to uniquely identify point P. In routing, the algorithm is to find the neighboring peer with the closest distance to the point P assigned to the sent file. In this way, regardless of how many peers the network has, we can get an upperbound on the performance of routing (O(d x (N)^(1/d))).

2. Chord - In this type of network, one of the main concept is the use of a consistent hashing to observe minimal interruption when peers join or leave the network. (Note that in CAN, there is a need to assign and re-assign a portion of the d-dimensional space when peers join and leave the system. This requires that all peers be disturbed in order to check if there is a need to update their routing tables.) To do this, the concept of a chord ring is employed, where all peer and data identifiers will be assigned an m-bit value. A chord ring will be used to determine to which peer will a key be assigned; this will be based on the first peer that is greater than or equal to the identifier input. When joining the system, some of the files assigned to the successor of a peer ID will be given to the joining peer, whereas, upon leaving the chord ring, the leaving peer will pass its data files to its successor.

In order to find the successor of a peer, each peer stores m entries in a routing table called a finger table, where the i^th entry corresponds to the successor of the identifier peerID+2^(i-1). While minimal interruption is achieved in this network, there will be a stabilization protocol running in the background in order to assure that the copies in each finger table is up-to-date, otherwise, false information will degrade network performance.

(description of other structured networks to be continued here)

* On unstructured types

1. Bittorrent - In this type of unstructured network, there is a central location that manages user downloads. In order for a peer to download a file, it needs to contact a tracker to where the peers downloading the file will be listed. After obtaining a list, user will contact the peers in the list and pieces of the file will be potentially originating from different peers. While downloading, a peer will also be able to share his/her pieces of his/her downloaded incomplete file. Since the protocol for bittorrent uses a tit-for-tat algorithm, the rate of download will be dependent on the rate of upload, and free-riders will be discouraged due to the concept of choking. The idea of how Bittorrent works is one of the many parts in the paper where I focused on because it will be the type of protocol we will study for our project. I specifically have questions in mind where I resulted to doing assumptions due to some parts that I think lacks further explanation in the paper. Some of those include:

      • It is not clear how much information the tracker knows about the file downloaded by a peer. It seems the tracker only needs to know whether a peer is a seeder or only a leecher, however, it is not specifically stated. Also, how the tracker decides how many peers will be given to a certain downloader is not given.

      • Since peer who has the complete file announces completion to all of its peers, doesn't the tracker also need to know of this? Should the tracker also be sent a copy of the notification? It seems so, but is not specifically stated.

2. Freenet - a loosely structured network used to provide more spaces for each peer by having other files stored in extra spaces of other peers. It uses several types of file key, Keyword-Signed Key, Subspace-Signed Key and Content-Hash Key which is used to name a file, allow different files to have same name provided that they are managed by different peers and update and split contents of a file in the network, respectively. The routing algorithm for freeness is the same as that in IP, wherein a peer maintains a routing table of neighboring peers, which includes both their addresses and the data keys of files stored in them. Querying involves a depth-first search like algorithm.

(description of other structured networks to be continued here)

On the side, I also found some typographical errors. For the chord section, p. 76, 20 must be 2^0, 26 must be 2^6 and 25 must be 2^5.