What are the contributions in "A survey and comparison of peer-to-peer overlay network schemes" ?

In this paper, the authors present a survey and comparison of various Structured and Unstructured P2P networks. The authors categorize the various schemes into these two groups in the design spectrum and discuss the application-level network performance of each group.

What have the authors stated for future works in "A survey and comparison of peer-to-peer overlay network schemes" ?

Finally, the authors close this survey with their thoughts on some directions for the future in P2P overlay networking research: • Future research would aim to reduce the stretch ( ratio of overlay path to underlying network path ) routing metric based on scalable and robust proximity calculations. The authors see the future of P2P overlay networks inexorably linked to the take-up and subsequent commercial success of P2P overlay computing, personal area and ad-hoc networking, mobile location-based services, mirrored content delivery, and networked file-sharing.

What is the advantage of de Bruijn graphs?

Since de Bruijn graphs give very short average routing distances and high resilience to peer failure, they are well suited for structured P2P overlay networks.

What is the purpose of the routing algorithm?

The routing algorithm for storing and retrieving data is designed to adaptively adjust routes over time and to provide efficient performance while using local knowledge, since peers only have knowledge of their immediate neighbors.

What is the protocol used by a downloader?

The trackers use a simple protocol layered on top of HTTP in which a downloader sends information about the file it is downloading and the port number.

How can the authors handle the authenticity of data objects?

The authenticity of the data objects can be handled by using cryptographic techniques through some cost-effective public keys and/or content hashes to securely link together different pieces of data objects.

What is the purpose of the P2P decentralized overlay network?

P2P decentralized overlay network is designed to handle the discovery and location of data and resources in a dynamic butterfly fashion.

How many malicious peers can be tolerated in a de Bruijn graph?

The technique can tolerate up to 25% of malicious peers while providing good performance when the number of compromised peers is small.

Why is the neighborhood set not used in the routing of messages?

The neighborhood set is not used in the routing of messages, but it is still kept fresh and up to date because the set plays an important role in exchanging information about nearby peers.

What is the future of P2P overlay networks?

The authors see the future of P2P overlay networks inexorably linked to the take-up and subsequent commercial success of P2P overlay computing, personal area and ad-hoc networking, mobile location-based services, mirrored content delivery, and networked file-sharing.

What is the effect of a deterministic short overlay path?

This can result in high network delay and unnecessary long-distance network traffic, from a deterministic short overlay path of O(logN) (where N is the number of peers).

Who proposed a mechanism to improve resiliency to network partitions?

Stoica et al. [6] demonstrate the advantage of recursive lookups over iterative lookups, but future improvement work is proposed to improve resiliency to network partitions using a small set of known peers, and to reduce the number of messages in lookups by increasing the size of each step around the ring with a larger number of fingers in each peer.

What is the average hopcount of the collapse point lookup query?

They used the collapse point lookup query rate (defined as the per node query rate at which the successful query rate falls below 90%) and the average hopcounts prior to collapse.

What is the argument for DHT-based systems?

The argument is that DHT-based systems, while more efficient at many tasks and having strong theoretical fundamentals that guarantee a key will be found if it exists, are not well suited for mass-market file sharing.

What is the comparison study done by Loguinov et al.?

A good comparison study was done by Loguinov et al. [50], who use the examples of Chord, CAN and de Bruijn graphs to study the routing performance and resilience of P2P overlay networks, including graph expansion and clustering properties.

What is the defense that prevents Eclipse attacks?

Singh et al. [55] proposes a defense that prevents Eclipse attacks for both Structured and Unstructured P2P overlay networks, by bounding degree of overlay peers, i.e. the in-degree of overlay peers is likely to be higher than the average in-degree of legitimate peers and legitimate peers choose their neighbors from a subset of overlay peers whose in-degree is below a threshold.

What is the defense for P2P overlay networks?

This defense restricts the flexibility necessary to implement optimizations such as proximity neighbor selection and only works in Structured P2P overlay networks.

What are the main improvements in the lookup properties of Unstructured P2P overlays?

In the research community, efforts are being made in improving the lookup properties of Unstructured P2P overlays to include flow control, dynamic geometric topology adaptation, one-hop replication, peer heterogeneity, etc.

What is the key used to sign the data file?

The private half of the asymmetric key pair is used to sign the data file, thus, providing a minimal integrity check that a retrieved data file matches its data file key.

What is the process of sending a PING message to a neighbor?

The PING message is then forwarded to its neighbors and initiates a back-propagated PONG message, which contains information about the peer such as the IP address, number and size of the data items.

(Open Access) A survey and comparison of peer-to-peer overlay network schemes (2005) | Eng Keong Lua

IEEE COMMUNICATIONS SURVEY AND TUTORIAL, MARCH 2004 1

A Survey and Comparison of Peer-to-Peer Overlay Network Schemes

Eng Keong Lua, Jon Crowcroft, Marcelo Pias, Ravi Sharma and Steven Lim

Abstract: Over the Internet today, computing and communications environments are significantly more complex and chaotic than classical distributed systems, lacking any centralized organization or hierarchical control. There has been much interest in emerging Peer-to-Peer (P2P) network overlays because they provide a good substrate for creating large-scale data sharing, content distribution and application-level multicast applications. These P2P networks try to provide a long list of features such as: selection of nearby peers, redundant storage, efficient search/location of data items, data permanence or guarantees, hierarchical naming, trust and authentication, and anonymity. P2P networks potentially offer an efficient routing architecture that is self-organizing, massively scalable, and robust in the wide-area, combining fault tolerance, load balancing and an explicit notion of locality. In this paper, we present a survey and comparison of various Structured and Unstructured P2P networks. We categorize the various schemes into these two groups in the design spectrum and discuss the application-level network performance of each group.

Index Terms: Peer-to-Peer, Distributed Scalable Algorithms, Lookup Protocols, Overlay Routing, Overlay Networks.

I. INTRODUCTION

PEER-TO-PEER (P2P) overlay networks are distributed systems in nature, without any hierarchical organization or centralized control. Peers form self-organizing overlay networks that are overlaid on the Internet Protocol (IP) networks, offering a mix of various features such as robust wide-area routing architecture, efficient search of data items, selection of nearby peers, redundant storage, permanence, hierarchical naming, trust and authentication, anonymity, massive scalability and fault tolerance. Peer-to-peer overlay systems go beyond services offered by client-server systems by having symmetry in roles, where a client may also be a server. Each peer allows access to its resources by other systems and supports resource-sharing, which requires fault-tolerance, self-organization and massive scalability properties. Unlike Grid systems, P2P overlay networks do not arise from the collaboration between established and connected groups of systems, and they lack a more reliable set of resources to share.

We can view P2P overlay network models spanning a wide

spectrum of the communication framework, which speciﬁes a

fully-distributed, cooperative network design with peers build-

ing a self-organizing system. Figure 1 shows an abstract P2P

overlay architecture, illustrating the components in the overlay

Manuscript received March 31, 2004; revised November 20, 2004.

Eng Keong Lua, Jon Crowcroft and Marcelo Pias are with the University

of Cambridge, Computer Laboratory.

Ravi Sharma is with the Nanyang Technological University.

Steven Lim is with Microsoft Asia.







































































































































































 


























































































































































































































Fig. 1. An Abstract P2P Overlay Network Architecture

communications framework. The Network Communications layer describes the network characteristics of desktop machines connected over the Internet or small wireless or sensor-based devices that are connected in an ad-hoc manner. The dynamic nature of peers poses challenges to the communication paradigm. The Overlay Nodes Management layer covers the management of peers, which includes discovery of peers and routing algorithms for optimization. The Features Management layer deals with the security, reliability, fault resiliency and aggregated resource availability aspects of maintaining the robustness of P2P systems. The Services Specific layer supports the underlying P2P infrastructure and the application-specific components through scheduling of parallel and computation-intensive tasks, content and file management. Meta-data describes the content stored across the P2P peers and the location information. The Application-level layer is concerned with tools, applications and services that are implemented with specific functionalities on top of the underlying P2P overlay infrastructure. There are two classes of P2P overlay networks: Structured and Unstructured.

The technical meaning of Structured is that the P2P overlay network topology is tightly controlled and content is placed not at random peers but at specified locations that will make subsequent queries more efficient. Such Structured P2P systems use the Distributed Hash Table (DHT) as a substrate, in which data object (or value) location information is placed deterministically, at the peers with identifiers corresponding to the data object's unique key. DHT-based systems have the property that they consistently assign uniform random NodeIDs to the set of peers from a large space of identifiers. Data objects are assigned unique identifiers called keys, chosen from the same identifier space. Keys are mapped by the overlay network protocol to a unique live peer in the overlay network. The P2P overlay networks support the scalable storage and retrieval of {key,value} pairs on the overlay network, as illustrated in Figure 2. Given a key, a store operation (put(key,value)) or a lookup retrieval operation (value=get(key)) can be invoked to store or retrieve the data object corresponding to the key, which involves routing requests to the peer corresponding to the key.

Fig. 2. Application Interface for Structured DHT-based P2P Overlay Systems
[Figure labels: Distributed Structured P2P Overlay Application; Distributed Hash Table; Peer; Value]
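The put/get interface described above can be sketched in a few lines. This is only an illustrative toy under stated assumptions: the class name `ToyDHT`, the 16-bit identifier space, and the rule "a key belongs to the first NodeID at or after its identifier, wrapping around" are choices made for the example (the rule is borrowed from Chord-style systems), and all peers live in one process so no real routing takes place.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # illustrative identifier-space size (an assumption)


def to_id(text: str) -> int:
    """Hash a peer name or a data key into the shared identifier space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


class ToyDHT:
    """In-process stand-in for a DHT: every key maps to exactly one peer."""

    def __init__(self, peer_names):
        # NodeIDs come from hashing peer names, so they are spread
        # roughly uniformly over the identifier space.
        self.ring = sorted({to_id(name) for name in peer_names})
        self.store = {node_id: {} for node_id in self.ring}

    def responsible_peer(self, key: str) -> int:
        # Assumed mapping rule: first NodeID >= key id, wrapping to the start.
        idx = bisect_left(self.ring, to_id(key))
        return self.ring[idx % len(self.ring)]

    def put(self, key: str, value) -> None:
        self.store[self.responsible_peer(key)][key] = value

    def get(self, key: str):
        return self.store[self.responsible_peer(key)].get(key)


dht = ToyDHT([f"peer-{i}" for i in range(8)])
dht.put("song.mp3", "10.0.0.7")
print(dht.get("song.mp3"))
```

Because `put` and `get` both derive the responsible peer from the key alone, any participant can locate a stored value without a central index.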

Each peer maintains a small routing table consisting of its

neighboring peers’ NodeIDs and IP addresses. Lookup queries

or message routing are forwarded across overlay paths to peers

in a progressive manner, with the NodeIDs that are closer to

the key in the identifier space. Different DHT-based systems will have different organization schemes for the data objects and their key space, and different routing strategies. In theory, DHT-based systems can guarantee that any data object can be located within a small number of overlay hops, O(logN) on average, where N is the number of peers in the system. The underlying network

path between two peers can be signiﬁcantly different from the

path on the DHT-based overlay network. Therefore, the lookup

latency in DHT-based P2P overlay networks can be quite high

and could adversely affect the performance of the applications

running over it. Plaxton et al. [1] provide an elegant algorithm that achieves nearly optimal latency on graphs that exhibit power-law expansion [2] while preserving the scalable routing properties of the DHT-based system. However,

this algorithm requires pair-wise probing between peers to

determine latencies and it is unlikely to scale to a large number

of peers in the overlay. DHT-based systems [3]–[7] are an

important class of P2P routing infrastructures. They support

the rapid development of a wide variety of Internet-scale

applications ranging from distributed ﬁle and naming systems

to application-layer multicast. They also enable scalable, wide-

area retrieval of shared information.

In 1999, Napster [8] pioneered the idea of a peer-to-peer file sharing system supporting a centralized file search facility. It was the first system to recognize that requests for popular content need not be sent to a central server but could instead be handled by the many peers that have the requested content. Such P2P file-sharing systems are self-scaling in that as more peers join the system, they add to the aggregate download capability. Napster achieved this

self-scaling behavior by using a centralized search facility

based on ﬁle lists provided by each peer, thus, it does not

require much bandwidth for the centralized search. Such a

system has the issue of a single point of failure due to

the centralized search mechanism. Moreover, a lawsuit filed by the Recording Industry Association of America (RIAA) forced Napster to shut down the file-sharing service of digital music, literally its killer application. However, the paradigm caught the imagination of platform providers and users alike.

Gnutella [9]–[11] is a decentralized system that distributes

both the search and download capabilities, establishing an

overlay network of peers. It is the ﬁrst system that makes use

of an Unstructured P2P overlay network. An Unstructured P2P

system is composed of peers joining the network with some

loose rules, without any prior knowledge of the topology. The

network uses ﬂooding as the mechanism to send queries across

the overlay with a limited scope. When a peer receives the

ﬂood query, it sends a list of all content matching the query

to the originating peer. While ﬂooding-based techniques are

effective for locating highly replicated items and are resilient

to peers joining and leaving the system, they are poorly suited

for locating rare items. Clearly this approach is not scalable as

the load on each peer grows linearly with the total number of

queries and the system size. Thus, Unstructured P2P networks

face one basic problem: peers readily become overloaded,

therefore, the system does not scale when handling a high

rate of aggregate queries and sudden increase in system size.
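The scope-limited flooding described above can be sketched as a breadth-first traversal with a time-to-live (TTL) counter. This is a minimal illustration, not the Gnutella wire protocol: the function name `flood_query`, the dictionary-based overlay, and the message counting are assumptions made for the example.

```python
from collections import deque


def flood_query(graph, content, origin, keyword, ttl):
    """Flood a query from `origin`, decrementing the TTL at every hop.

    graph   : dict mapping each peer to a list of its overlay neighbors
    content : dict mapping each peer to the set of items it shares
    Returns the peers holding `keyword` and the number of messages sent.
    """
    hits = []
    seen = {origin}            # peers that have already handled the query
    messages = 0
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if keyword in content.get(peer, set()):
            hits.append(peer)  # the match is reported back to the origin
        if remaining == 0:
            continue           # scope exhausted, do not forward further
        for neighbor in graph[peer]:
            messages += 1      # every forward costs a message, even duplicates
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


# A four-peer chain: A - B - C - D
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
content = {"B": {"popular"}, "C": {"popular"}, "D": {"rare"}}

print(flood_query(graph, content, "A", "popular", ttl=2))
print(flood_query(graph, content, "A", "rare", ttl=2))
print(flood_query(graph, content, "A", "rare", ttl=3))
```

In this toy overlay the replicated item is found within a small scope, while the rare item is missed until the TTL is raised, at the cost of more messages.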

Although Structured P2P networks can efﬁciently locate

rare items since the key-based routing is scalable, they incur

signiﬁcantly higher overheads than Unstructured P2P networks

for popular content. Consequently, over the Internet today, the

decentralized Unstructured P2P overlay networks are more

commonly used. However, there are recent efforts on Key-based Routing (KBR) API abstractions [12], which allow more application-specific functionality to be built over a common basic KBR API, and on OpenHash (an open, publicly accessible DHT service) [13], a unifying platform that provides developers with basic DHT service models running on a set of infrastructure hosts. OpenHash lets developers deploy DHT-based overlay applications without the burden of maintaining a DHT, and its ease of use is intended to spur the deployment of DHT-based applications. In contrast, Unstructured P2P overlay systems are ad hoc in nature, and do not present the possibility of being unified under a common platform for application development.

In Sections II and IV of the paper, we describe the key features of Structured P2P and Unstructured P2P overlay networks and their operation functionalities. After providing a basic understanding of the various overlay schemes in these two classes, we proceed to evaluate these overlay schemes in both classes and discuss their developments in Sections III and V. Then, we attempt to use the following taxonomy to make comparisons between the various discussed Structured and Unstructured P2P overlay schemes:

• Decentralization: examine whether the overlay system is distributed.
• Architecture: describe the overlay system architecture with respect to its operation.
• Lookup Protocol: the lookup query protocol adopted by the overlay system.
• System Parameters: the required system parameters for the overlay system operation.
• Routing Performance: the lookup routing protocol performance in overlay routing.
• Routing State: the routing state and scalability of the overlay system.
• Peers Join and Leave: describe the behavior of the overlay system when churn and self-organization occur.
• Security: look into the security vulnerabilities of the overlay system.
• Reliability and Fault Resiliency: examine how robust the overlay system is when subjected to faults.

Lastly, in section VI, we conclude with some thoughts

on the relative applicability of each class to some of the

research problems that arise in Ad-Hoc, location-based or

content delivery networks.

II. STRUCTURED P2P OVERLAY NETWORKS

In this category, the overlay network assigns keys to data

items and organizes its peers into a graph that maps each data

key to a peer. This structured graph enables efﬁcient discovery

of data items using the given keys. However, in its simple

form, this class of systems does not support complex queries

and it is necessary to store a copy or a pointer to each data

object (or value) at the peer responsible for the data object’s

key. In this section, we survey and compare the Structured

P2P overlay networks: Content Addressable Network (CAN)

[5], Tapestry [7], Chord [6], Pastry [4], Kademlia [14] and

Viceroy [15].

A. Content Addressable Network (CAN)

The Content Addressable Network (CAN) [5] is a dis-

tributed decentralized P2P infrastructure that provides hash-

table functionality on Internet-like scale. CAN is designed to

be scalable, fault-tolerant, and self-organizing. The architec-

tural design is a virtual multi-dimensional Cartesian coordinate

space on a multi-torus. This d-dimensional coordinate space is

completely logical. The entire coordinate space is dynamically

partitioned among all the peers (N number of peers) in the

system such that every peer possesses its individual, distinct

zone within the overall space. A CAN peer maintains a

routing table that holds the IP address and virtual coordinate

zone of each of its neighbors in the coordinate space. A

CAN message includes the destination coordinates. Using the

neighbor coordinates, a peer routes a message towards its

destination using a simple greedy forwarding to the neighbor

peer that is closest to the destination coordinates. CAN has a routing performance of $O(d \cdot N^{1/d})$ and its routing state is bounded by $2d$. As shown in Figure 3 which we adapted from the

Fig. 3. Example of 2-d space CAN before and after Peer Z joins

CAN paper [5], the virtual coordinate space is used to store

{key,value} pairs as follows: to store a pair {K,V}, key K

is deterministically mapped onto a point P in the coordinate

space using a uniform hash function. To retrieve the entry corresponding to key K, any peer can apply the same deterministic hash function to map K onto point P and then retrieve the corresponding value V from the point P. If the requesting peer or its immediate neighbors do not own the point P, the request must be routed through the CAN infrastructure until it reaches the peer in whose zone P lies. A peer

maintains the IP addresses of those peers that hold coordinate

zones adjoining its zone. This set of immediate neighbors in

the coordinate space serves as a coordinate routing table that

enables efﬁcient routing between points in this space.
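The mapping of a key onto a point and the greedy forwarding between neighboring zones can be sketched as follows. This is a simplified illustration under several assumptions: the space is a 2-d unit torus cut into a fixed 4 x 4 grid of equal zones (one peer per zone), SHA-1 stands in for the uniform hash function, and "closest to the destination" is measured in zone hops on the torus rather than by coordinate distance. The helper names are hypothetical.

```python
import hashlib

GRID = 4  # 4 x 4 equal zones on the unit torus (an assumption)


def key_to_point(key: str):
    """Deterministically hash a key onto a point (x, y) in [0, 1) x [0, 1)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    x = int.from_bytes(digest[:4], "big") / 2**32
    y = int.from_bytes(digest[4:8], "big") / 2**32
    return (x, y)


def zone_of(point):
    """Grid cell (zone) that owns a point."""
    return (int(point[0] * GRID), int(point[1] * GRID))


def neighbors(zone):
    """The 2d = 4 adjoining zones of a zone on the 2-d torus."""
    i, j = zone
    return [((i + 1) % GRID, j), ((i - 1) % GRID, j),
            (i, (j + 1) % GRID), (i, (j - 1) % GRID)]


def torus_hops(a, b):
    """Zone-to-zone hop distance with wrap-around in each dimension."""
    total = 0
    for p, q in zip(a, b):
        diff = abs(p - q)
        total += min(diff, GRID - diff)
    return total


def route(start_zone, key):
    """Greedy forwarding: always step to the neighbor nearest the target zone."""
    target = zone_of(key_to_point(key))
    path = [start_zone]
    current = start_zone
    while current != target:
        current = min(neighbors(current), key=lambda z: torus_hops(z, target))
        path.append(current)
    return path


print(route((0, 0), "some-key"))
```

Each peer only consults its own neighbor list, which is why the per-peer state stays at 2d entries regardless of how many zones exist.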

A new peer that joins the system must have its own portion

of the coordinate space allocated. This can be achieved by

splitting an existing peer's zone in half, retaining half for that peer and allocating the other half to the new peer. CAN has an associated DNS domain name which is resolved into the IP address of one or more CAN bootstrap peers (which maintain a partial list of CAN peers). For a new peer to join the CAN


network, the peer looks up in the DNS a CAN domain

name to retrieve a bootstrap peer’s IP address, similar to the

bootstrap mechanism in [16]. The bootstrap peer supplies the

IP addresses of some randomly chosen peers in the system.

The new peer randomly chooses a point P and sends a JOIN

request destined for point P. Each CAN peer uses the CAN routing mechanism to forward the message until it reaches the peer in whose zone P lies. The current peer in that zone then splits its zone in half and assigns the other half to the new peer. For example, in a 2-dimensional (2-d) space, a zone would first be split along the X dimension, then the Y, and so on. The {K,V} pairs from the half zone to be handed over are also transferred to the new peer. After obtaining its zone, the new peer learns the IP addresses of its neighbor set from the previous peer that owned point P, and adds that previous peer itself to the set.
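The zone split performed during a join can be sketched as below. This is an illustrative model only: the `Zone` class, the use of a split counter to alternate dimensions (X first, then Y), and the choice to hand the upper half to the newcomer are assumptions made for the example; neighbor-set updates are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    lo: tuple                 # lower corner of the zone, inclusive
    hi: tuple                 # upper corner of the zone, exclusive
    depth: int = 0            # splits so far; selects the next split dimension
    store: dict = field(default_factory=dict)  # point -> (key, value)

    def contains(self, point) -> bool:
        return all(l <= c < h for l, c, h in zip(self.lo, point, self.hi))

    def split(self) -> "Zone":
        """Halve this zone and return the half handed to the joining peer."""
        dim = self.depth % len(self.lo)          # X first, then Y, and so on
        mid = (self.lo[dim] + self.hi[dim]) / 2
        new_lo = list(self.lo)
        new_lo[dim] = mid
        kept_hi = list(self.hi)
        kept_hi[dim] = mid
        # Assumption: the newcomer receives the upper half.
        new_zone = Zone(tuple(new_lo), self.hi, self.depth + 1)
        self.hi = tuple(kept_hi)
        self.depth += 1
        # Transfer the {K,V} pairs whose points now fall in the new zone.
        for point in [p for p in self.store if new_zone.contains(p)]:
            new_zone.store[point] = self.store.pop(point)
        return new_zone


whole = Zone((0.0, 0.0), (1.0, 1.0))
whole.store[(0.25, 0.5)] = ("K1", "V1")
whole.store[(0.75, 0.5)] = ("K2", "V2")
joined = whole.split()
print(whole.lo, whole.hi, list(whole.store.values()))
print(joined.lo, joined.hi, list(joined.store.values()))
```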

When a peer leaves the CAN network, an immediate

takeover algorithm ensures that one of the failed peer’s neigh-

bors takes over the zone and starts a takeover timer. The peer

updates its neighbor set to eliminate those peers that are no

longer its neighbors. Every peer in the system then sends

soft-state updates to ensure that all of their neighbors will

learn about the change and update their own neighbor sets.

The number of neighbors a peer maintains depends only on

the dimensionality of the coordinate space (i.e. $2d$) and it is

independent of the total number of peers in the system.

The Figure 3 example illustrates a simple routing path from peer X to point E and a new peer Z joining the CAN network. For a d-dimensional space partitioned into n equal zones, the average routing path length is $(d/4) \cdot n^{1/d}$ hops and individual peers maintain a list of $2d$ neighbors. Thus, the growth of peers (or zones) can be achieved without increasing per peer state while the average path length grows as $O(n^{1/d})$.

Since there are many different paths between two points in the

space, when one or more of a peer’s neighbors fail, this peer

can still route along the next best available path.
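As a quick worked example of the scaling behavior, the snippet below evaluates the average path length (d/4)·n^(1/d) and the per-peer neighbor count 2d for a few settings. The function names and the sample values of d and n are chosen for illustration only.

```python
def can_avg_path_length(d: int, n: int) -> float:
    """Average routing path length (in hops) for n equal zones in d dimensions."""
    return (d / 4) * n ** (1 / d)


def can_neighbor_count(d: int) -> int:
    """Number of neighbors each peer tracks; independent of n."""
    return 2 * d


for d in (2, 3, 4):
    for n in (10_000, 1_000_000):
        hops = can_avg_path_length(d, n)
        print(f"d={d} n={n}: about {hops:.1f} hops, {can_neighbor_count(d)} neighbors")
```

Raising d shortens paths for a fixed n while adding only two neighbors per extra dimension, which is the trade-off the formula expresses.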

Improvement to the CAN algorithm can be done by maintaining multiple, independent coordinate spaces, each called a reality, with each peer in the system being assigned a different zone in each coordinate space. For a CAN with r realities, a

single peer is assigned r coordinate zones, one on each reality

available, and this peer holds r independent neighbor sets.

The contents of the hash table are replicated on every reality,

thus improving data availability. For further data availability

improvement, CAN could use k different hash functions to

map a given key onto k points in the coordinate space.

This results in the replication of a single {key,value} pair

at k distinct peers in the system. A {key,value} pair is then

unavailable only when all the k replicas are simultaneously

unavailable. Thus, queries for a particular hash table entry

could be forwarded to all k peers in parallel thereby reducing

the average query latency, and reliability and fault resiliency

properties are enhanced.
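The use of k hash functions to place k replicas can be sketched as follows. The k "different" hash functions are simulated here by prefixing the key with an index before applying SHA-1; that construction, the 2-d unit space, and the function name `replica_points` are assumptions made for the illustration and are not taken from the CAN design.

```python
import hashlib


def replica_points(key: str, k: int):
    """Map one key onto k points of the 2-d unit space, one per hash function."""
    points = []
    for i in range(k):
        # Salting with the index i stands in for the i-th hash function.
        digest = hashlib.sha1(f"{i}:{key}".encode("utf-8")).digest()
        x = int.from_bytes(digest[:4], "big") / 2**32
        y = int.from_bytes(digest[4:8], "big") / 2**32
        points.append((x, y))
    return points


for point in replica_points("K", k=3):
    print(point)
```

A reader can then issue the same query toward all k points in parallel and accept the first reply; the entry is lost only if the owners of all k points are unavailable at once.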

CAN could be used in large scale storage management

systems such as the OceanStore [17], Farsite [18], and Publius

[19]. These systems require efﬁcient insert and retrieval of

content in a large distributed storage network with a scalable

indexing mechanism. Another potential application for CANs

is in the construction of wide-area name resolution services

that decouple the naming scheme from the name resolution

process. This enables an arbitrary and location-independent

naming scheme.

B. Chord

Chord [6] uses consistent hashing [20] to assign keys to its

peers. Consistent hashing is designed to let peers enter and

leave the network with minimal interruption. This decentral-

ized scheme tends to balance the load on the system, since

each peer receives roughly the same number of keys, and

there is little movement of keys when peers join and leave the

system. In a steady state, for N peers in the system, each peer maintains routing state information for only about O(logN) other peers. This is efficient, and performance degrades gracefully when that information is out-of-date.

The consistent hash functions assign peers and data keys an

m-bit identiﬁer using SHA-1 [21] as the base hash function.

A peer’s identiﬁer is chosen by hashing the peer’s IP address,

while a key identiﬁer is produced by hashing the data key. The

length of the identiﬁer m must be large enough to make the

probability of keys hashing to the same identiﬁer negligible.

Identifiers are ordered on an identifier circle modulo $2^m$. Key k is assigned to the first peer whose identifier is equal to or follows k in the identifier space. This peer is called the successor peer of key k, denoted by successor(k). If identifiers are represented as a circle of numbers from 0 to $2^m - 1$, then successor(k) is the first peer clockwise from k. The identifier circle is termed the Chord ring. To maintain the consistent hashing mapping when a peer n joins the network, certain keys previously assigned to n's successor now need to be reassigned to n. When peer n leaves the Chord system, all of its assigned keys are reassigned to n's successor. Therefore, peers join and leave the system with $(\log N)^2$ performance. No other changes in the assignment of keys to peers need to occur. In

Figure 4 (adapted from [6]), the Chord ring is depicted with

m = 6. This particular ring has ten peers and stores ﬁve keys.

The successor of the identiﬁer 10 is peer 14, so key 10 will

be located at NodeID 14. Similarly, if a peer were to join with identifier 26, it would take over the key with identifier 24 from the peer with identifier 32.
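The successor rule can be checked with a short sketch. The text names only some of the ten peers on the example ring, so the full peer list below is an assumption chosen to be consistent with the identifiers mentioned (8, 14, 32, 38, 42, 56); the helper names `chord_id` and `successor` are likewise illustrative.

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier bits, as in the m = 6 example ring


def chord_id(text: str, m: int = M) -> int:
    """m-bit identifier from SHA-1, for a peer's IP address or a data key."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(peers, k: int, m: int = M) -> int:
    """First peer whose identifier is equal to or follows k on the circle."""
    ring = sorted(peers)
    idx = bisect_left(ring, k % (2 ** m))
    return ring[idx % len(ring)]  # wrap past the largest identifier


# Assumed ten-peer ring consistent with the identifiers named in the text.
PEERS = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

print(successor(PEERS, 10))          # key 10 is stored at peer 14
print(successor(PEERS, 24))          # key 24 is stored at peer 32
print(successor(PEERS + [26], 24))   # after peer 26 joins, it holds key 24
print(chord_id("192.0.2.10"))        # identifier a peer would get from its address
```

Only the keys between the new peer's predecessor and the new peer change hands, which is the "little movement of keys" property of consistent hashing.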

Each peer in the Chord ring needs to know how to contact

its current successor peer on the identiﬁer circle. Lookup

queries involve the matching of key and NodeID. Queries for a given identifier could be passed around the circle via these successor pointers until they encounter a pair of peers that straddle the desired identifier; the second peer in the pair is the peer the query maps to. An example is presented in Figure 4, whereby peer 8 performs a lookup for key 54. Peer 8 invokes the find_successor operation for this key, which eventually returns the successor of that key, i.e. peer 56. The query visits every

peer on the circle between peer 8 and peer 56. The response

is returned along the reverse of the path.

Fig. 4. Chord ring with identifier circle consisting of ten peers and five data keys. It shows the path followed by a query originated at peer 8 for the lookup of key 54, and the finger table entries for peer 8.

As m is the number of bits in the key/NodeID space, each peer n maintains a routing table with up to m entries, called the finger table. The i-th entry in the table at peer n contains the identity of the first peer s that succeeds n by at least $2^{i-1}$ on the identifier circle, i.e. $s = \mathrm{successor}(n + 2^{i-1})$, where $1 \le i \le m$. Peer s is the i-th finger of peer n (n.finger[i]).

A ﬁnger table entry includes both the Chord identiﬁer and the

IP address (and port number) of the relevant peer. Figure 4 shows the finger table of peer 8, and the first finger entry for this peer points to peer 14, as the latter is the first peer that succeeds $(8 + 2^0) \bmod 2^6 = 9$. Similarly, the last finger of peer 8 points to peer 42, i.e. the first peer that succeeds $(8 + 2^5) \bmod 2^6 = 40$. In this way, peers store information about only a small number of other peers, and know more about peers closely following them on the identifier circle than about other peers.

Also, a peer’s ﬁnger table does not contain enough information

to directly determine the successor of an arbitrary key k. For

example, peer 8 cannot determine the successor of key 34 by

itself, as successor of this key (peer 38) is not present in peer

8’s ﬁnger table.
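The finger table construction for peer 8 can be reproduced with a few lines. As before, the complete ten-peer list is an assumption consistent with the identifiers the text mentions; only the finger definition s = successor(n + 2^(i-1)) comes from the description above.

```python
from bisect import bisect_left

M = 6  # identifier bits
# Assumed ten-peer ring consistent with the identifiers named in the text.
PEERS = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def successor(k: int) -> int:
    """First peer at or after identifier k on the circle modulo 2**M."""
    idx = bisect_left(PEERS, k % 2 ** M)
    return PEERS[idx % len(PEERS)]


def finger_table(n: int):
    """Entries i = 1..M, where finger[i] = successor(n + 2**(i - 1))."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]


fingers = finger_table(8)
for i, peer in enumerate(fingers, start=1):
    print(f"finger[{i}]: start {(8 + 2 ** (i - 1)) % 2 ** M:2d} -> peer {peer}")

# The successor of key 34 is peer 38, which peer 8 does not know directly.
print(successor(34), successor(34) in fingers)
```

The first fingers cluster on nearby peers while the last ones jump far around the ring, which is why a peer knows more about the region immediately following it.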

When a peer joins the system, the successor pointers of

some peers need to be changed. It is important that the

successor pointers are up to date at any time because the

correctness of lookups is not guaranteed otherwise. The Chord

protocol uses a stabilization protocol [6] running periodically

in the background to update the successor pointers and the

entries in the ﬁnger table. The correctness of the Chord

protocol relies on the fact that each peer is aware of its

successors. When peers fail, it is possible that a peer does

not know its new successor, and that it has no chance to learn

about it. To avoid this situation, peers maintain a successor

list of size r, which contains the peer’s ﬁrst r successors.

When the successor peer does not respond, the peer simply

contacts the next peer on its successor list. Assuming that peer failures occur independently with probability p, the probability that every peer on the successor list will fail is $p^r$. Increasing r makes the system more robust. By tuning this parameter, any degree of robustness with good reliability and fault resiliency may be achieved.
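To make the tuning of r concrete, the sketch below computes the probability that an entire successor list fails and the smallest r that meets a target. It assumes independent failures with the same probability p for every peer; the function names and the sample numbers are illustrative.

```python
def all_successors_fail(p: float, r: int) -> float:
    """Probability that all r peers on the successor list have failed."""
    return p ** r


def min_successor_list_size(p: float, target: float) -> int:
    """Smallest r with p**r <= target (requires 0 < p < 1 and target > 0)."""
    r = 1
    while p ** r > target:
        r += 1
    return r


print(all_successors_fail(0.5, 4))
print(min_successor_list_size(0.5, 1e-3))
```

Even with half of all peers failing, a modest list length drives the chance of losing every successor below one in a thousand.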

The following applications are examples of how Chord

could be used:

• Cooperative mirroring or Cooperative File System (CFS)

[22], in which multiple providers of content cooperate

to store and serve each other's data. Spreading the total

load evenly over all participant hosts lowers the total cost

of the system, since each participant needs to provide

capacity only for the average load, not for the peak

load. There are two layers in CFS. The DHash (Distributed

Hash) layer performs block fetches for the peer,

distributes the blocks among the servers, and maintains

cached and replicated copies. The Chord layer's distributed

lookup system is used to locate the servers responsible

for a block.

• Chord-based DNS [23] provides a lookup service, with

host names as keys and IP addresses (and other host

information) as values. Chord could provide a DNS-like

service by hashing each host name to a key [20]. Chord-based

DNS would require no special servers, while ordinary

DNS systems rely on a set of special root servers.

DNS also requires manual management of the routing

information (DNS records) that allows clients to navigate

the name server hierarchy; Chord automatically maintains

the correctness of the analogous routing information.

DNS only works well when host names are hierarchically

structured to reﬂect administrative boundaries; Chord

imposes no naming structure. DNS is specialized to the

task of ﬁnding named hosts or services, while Chord can

also be used to ﬁnd data object values that are not tied

to particular machines.
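
The DNS-like service in the second example can be sketched as a toy key-value store in which each host name is hashed to a key and the record is kept at the key's successor. This is an illustration under stated assumptions, not the design of [23]: the `ToyRing` class, the peer names, and the sample address are hypothetical, and the successor is found with global knowledge of all peers rather than by routing through finger tables.

```python
import hashlib
from bisect import bisect_left

M = 160  # identifier bits; SHA-1 produces 160-bit digests


def chord_id(name, m=M):
    """Hash a string to an m-bit identifier using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


class ToyRing:
    """Peers on an identifier circle; each peer stores the keys it succeeds."""

    def __init__(self, peer_names):
        self.ids = sorted(chord_id(name) for name in peer_names)
        self.store = {peer_id: {} for peer_id in self.ids}

    def successor(self, key):
        """First peer identifier >= key, wrapping around the circle."""
        i = bisect_left(self.ids, key)
        return self.ids[i % len(self.ids)]

    def put(self, host_name, record):
        key = chord_id(host_name)
        self.store[self.successor(key)][key] = record

    def get(self, host_name):
        key = chord_id(host_name)
        return self.store[self.successor(key)].get(key)


if __name__ == "__main__":
    ring = ToyRing(["peer-a", "peer-b", "peer-c", "peer-d"])
    ring.put("www.example.org", "192.0.2.10")  # documentation-range address
    print(ring.get("www.example.org"))
```

No peer plays a special role here: any participant can compute the key from the host name alone, which is the property that lets such a service do without dedicated root servers.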

C. Tapestry

Sharing similar properties with Pastry, Tapestry [7] employs

decentralized randomness to achieve both load distribution and

routing locality. The difference between Pastry and Tapestry

is the handling of network locality and data object replication,

and this difference will become more apparent, as described

in the Pastry section. Tapestry's architecture uses a variant of the

Plaxton et al. [1] distributed search technique, with additional

mechanisms to provide availability, scalability, and adaptation

in the presence of failures and attacks. Plaxton et al. propose

a distributed data structure, known as the Plaxton mesh,

optimized to support a network overlay for locating named

data objects, each of which is connected to one root peer. On the other

hand, Tapestry uses multiple roots for each data object to avoid

a single point of failure. In the Plaxton mesh, peers can take on

the roles of servers (where data objects are stored), routers

(which forward messages), and clients (which originate requests). It uses

