A survey and comparison of peer-to-peer overlay network schemes

TLDR
A survey and comparison of various Structured and Unstructured P2P overlay networks is presented. The schemes are categorized into these two groups in the design spectrum, and the application-level network performance of each group is discussed.
Abstract
Over the Internet today, computing and communications environments are significantly more complex and chaotic than classical distributed systems, lacking any centralized organization or hierarchical control. There has been much interest in emerging Peer-to-Peer (P2P) network overlays because they provide a good substrate for creating large-scale data sharing, content distribution, and application-level multicast applications. These P2P overlay networks attempt to provide a long list of features, such as: selection of nearby peers, redundant storage, efficient search/location of data items, data permanence or guarantees, hierarchical naming, trust and authentication, and anonymity. P2P networks potentially offer an efficient routing architecture that is self-organizing, massively scalable, and robust in the wide-area, combining fault tolerance, load balancing, and explicit notion of locality. In this article we present a survey and comparison of various Structured and Unstructured P2P overlay networks. We categorize the various schemes into these two groups in the design spectrum, and discuss the application-level network performance of each group.


IEEE COMMUNICATIONS SURVEY AND TUTORIAL, MARCH 2004 1
A Survey and Comparison of
Peer-to-Peer Overlay Network Schemes
Eng Keong Lua, Jon Crowcroft, Marcelo Pias, Ravi Sharma and Steven Lim
Index Terms: Peer-to-Peer, Distributed Scalable Algorithms,
Lookup Protocols, Overlay Routing, Overlay Networks.
I. INTRODUCTION
PEER-TO-PEER (P2P) overlay networks are distributed
systems by nature, without any hierarchical organization
or centralized control. Peers form self-organizing overlay
networks that are overlaid on the Internet Protocol (IP)
networks, offering a mix of features such as a robust
wide-area routing architecture, efficient search of data items,
selection of nearby peers, redundant storage, permanence,
hierarchical naming, trust and authentication, anonymity, massive
scalability and fault tolerance. Peer-to-peer overlay systems
go beyond the services offered by client-server systems by having
symmetry in roles, where a client may also be a server. Each peer
allows other systems to access its resources and supports
resource sharing, which requires fault tolerance, self-organization
and massive scalability. Unlike Grid systems, P2P overlay
networks do not arise from collaboration between
established and connected groups of systems, and they lack a
more reliable set of resources to share.
Manuscript received March 31, 2004; revised November 20, 2004.
Eng Keong Lua, Jon Crowcroft and Marcelo Pias are with the University
of Cambridge, Computer Laboratory.
Ravi Sharma is with the Nanyang Technological University.
Steven Lim is with Microsoft Asia.
Fig. 1. An Abstract P2P Overlay Network Architecture
We can view P2P overlay network models as spanning a wide
spectrum of the communication framework, which specifies a
fully-distributed, cooperative network design with peers building
a self-organizing system. Figure 1 shows an abstract P2P
overlay architecture, illustrating the components in the overlay
communications framework. The Network Communications
layer describes the network characteristics of desktop machines
connected over the Internet, or of small wireless or sensor-based
devices that are connected in an ad-hoc manner. The
dynamic nature of peers poses challenges for the communication
paradigm. The Overlay Nodes Management layer covers the
management of peers, which includes discovery of peers and
routing algorithms for optimization. The Features Management
layer deals with the security, reliability, fault resiliency and
aggregated resource availability aspects of maintaining the
robustness of P2P systems. The Services Specific layer supports
the underlying P2P infrastructure and the application-specific
components through scheduling of parallel and
computation-intensive tasks, and through content and file
management. Meta-data describes the content stored across the
P2P peers and the location information. The Application-level
layer is concerned with tools, applications and services that are
implemented with specific functionalities on top of the underlying
P2P overlay infrastructure. P2P overlay networks fall into two
classes: Structured and Unstructured.
The technical meaning of Structured is that the P2P overlay
network topology is tightly controlled and content is placed
not at random peers but at specified locations that will make
subsequent queries more efficient. Such Structured P2P systems
use the Distributed Hash Table (DHT) as a substrate,
in which data object (or value) location information is placed
deterministically, at the peers with identifiers corresponding
to the data object’s unique key. DHT-based systems have the
property that they consistently assign uniform random NodeIDs
to the set of peers from a large space of identifiers. Data objects
are assigned unique identifiers called keys, chosen from the
same identifier space. Keys are mapped by the overlay network
protocol to a unique live peer in the overlay network. The P2P
overlay networks support the scalable storage and retrieval
of {key,value} pairs on the overlay network, as illustrated
in Figure 2. Given a key, a store operation (put(key,value))
or a lookup retrieval operation (value=get(key)) can be invoked to
store or retrieve the data object corresponding to the key,
which involves routing requests to the peer corresponding to
the key.
Fig. 2. Application Interface for Structured DHT-based P2P Overlay Systems
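The put/get interface described above can be sketched in a few lines of Python. This is only an illustrative, single-process stand-in: the class name `ToyDHT`, the 16-bit identifier space, the peer names, and the rule of picking the numerically closest NodeID are all assumptions made for the example, and a real DHT would reach the responsible peer through multi-hop overlay routing rather than a global peer list.

```python
import hashlib

ID_BITS = 16  # assumed small identifier space, for readability


def to_id(text: str) -> int:
    """Hash a string into the shared identifier space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


class ToyDHT:
    """Single-process stand-in for a DHT: no network, no routing hops."""

    def __init__(self, peer_names):
        # NodeID -> local {key: value} store held by that peer
        self.stores = {to_id(name): {} for name in peer_names}

    def responsible_peer(self, key: str) -> int:
        """Return the NodeID numerically closest to the key's identifier."""
        key_id = to_id(key)
        return min(self.stores, key=lambda node_id: abs(node_id - key_id))

    def put(self, key: str, value) -> None:
        self.stores[self.responsible_peer(key)][key] = value

    def get(self, key: str):
        return self.stores[self.responsible_peer(key)].get(key)


# Hypothetical peers and data, for illustration only.
dht = ToyDHT(["peer-a", "peer-b", "peer-c", "peer-d"])
dht.put("song.mp3", "10.0.0.7")
print(dht.get("song.mp3"))
```

Because peers and keys are hashed into the same space, any peer that applies the same hash function arrives at the same responsible peer, which is what makes the placement deterministic.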
Each peer maintains a small routing table consisting of its
neighboring peers’ NodeIDs and IP addresses. Lookup queries
or routed messages are forwarded across overlay paths
progressively, to peers whose NodeIDs are closer to
the key in the identifier space. Different DHT-based systems
have different organization schemes for the data objects
and their key space, and different routing strategies. In theory,
DHT-based systems can guarantee that any data object can be
located in O(logN) overlay hops on average, where N is
the number of peers in the system. However, the underlying
network path between two peers can be significantly different
from the path on the DHT-based overlay network. Therefore, the
lookup latency in DHT-based P2P overlay networks can be quite high
and could adversely affect the performance of the applications
running over it. Plaxton et al. [1] provide an elegant algorithm
that achieves nearly optimal latency on graphs that exhibit
power-law expansion [2] while preserving the
scalable routing properties of the DHT-based system. However,
this algorithm requires pair-wise probing between peers to
determine latencies, and it is unlikely to scale to a large number
of peers in the overlay. DHT-based systems [3]–[7] are an
important class of P2P routing infrastructures. They support
the rapid development of a wide variety of Internet-scale
applications ranging from distributed file and naming systems
to application-layer multicast. They also enable scalable,
wide-area retrieval of shared information.
In 1999, Napster [8] pioneered the idea of a peer-to-peer
file sharing system supporting a centralized file search
facility. It was the first system to recognize that requests for
popular content need not be sent to a central server but
could instead be handled by the many peers that have the
requested content. Such P2P file-sharing systems are
self-scaling in that as more peers join the system, they add
to the aggregate download capability. Napster achieved this
self-scaling behavior by using a centralized search facility
based on file lists provided by each peer; thus, it does not
require much bandwidth for the centralized search. Such a
system has a single point of failure due to
the centralized search mechanism. A lawsuit filed
by the Recording Industry Association of America (RIAA)
forced Napster to shut down its file-sharing service for digital
music, which was literally its killer application. However, the
paradigm caught the imagination of platform providers and users alike.
Gnutella [9]–[11] is a decentralized system that distributes
both the search and download capabilities, establishing an
overlay network of peers. It is the first system that makes use
of an Unstructured P2P overlay network. An Unstructured P2P
system is composed of peers joining the network with some
loose rules, without any prior knowledge of the topology. The
network uses flooding as the mechanism to send queries across
the overlay with a limited scope. When a peer receives the
flood query, it sends a list of all content matching the query
to the originating peer. While flooding-based techniques are
effective for locating highly replicated items and are resilient
to peers joining and leaving the system, they are poorly suited
for locating rare items. Clearly this approach is not scalable, as
the load on each peer grows linearly with the total number of
queries and the system size. Thus, Unstructured P2P networks
face one basic problem: peers readily become overloaded,
so the system does not scale when handling a high
rate of aggregate queries or a sudden increase in system size.
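To make the cost of scoped flooding concrete, the following Python sketch floods a query over a small overlay with a hop limit and counts the messages sent. It is a simplified illustration, not the Gnutella wire protocol: the six-peer topology, the file names, and the function name `flood_query` are hypothetical, and peers are assumed to drop duplicate copies of a query they have already seen.

```python
from collections import deque


def flood_query(overlay, content, origin, wanted, ttl):
    """Flood a query from `origin`, at most `ttl` overlay hops away.

    Returns (peers holding `wanted`, number of query messages sent).
    """
    seen = {origin}
    hits = []
    messages = 0
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if wanted in content.get(peer, set()):
            hits.append(peer)
        if remaining == 0:
            continue  # scope exhausted: this peer does not forward
        for neighbor in overlay[peer]:
            messages += 1  # every forwarded copy costs one message
            if neighbor not in seen:  # duplicates are dropped on arrival
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical six-peer overlay (adjacency lists) and shared content.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
content = {"D": {"popular.mp3"}, "F": {"rare.mp3"}}

print(flood_query(overlay, content, "A", "popular.mp3", ttl=2))
print(flood_query(overlay, content, "A", "rare.mp3", ttl=2))
print(flood_query(overlay, content, "A", "rare.mp3", ttl=4))
```

In this toy overlay the nearby item is found within the small scope, while the item held only by a distant peer is missed until the scope, and with it the message count, is increased.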
Although Structured P2P networks can efficiently locate
rare items since the key-based routing is scalable, they incur
significantly higher overheads than Unstructured P2P networks
for popular content. Consequently, over the Internet today, the
decentralized Unstructured P2P overlay networks are more
commonly used. However, there are recent efforts on
Key-based Routing (KBR) API abstractions [12], which allow more
application-specific functionality to be built over a common
basic KBR API, and on OpenHash (an open, publicly
accessible DHT service) [13], a unification platform that
provides developers with basic DHT service models
running on a set of infrastructure hosts. OpenHash lets developers
deploy DHT-based overlay applications without the burden of
maintaining a DHT, and its ease of use is intended to spur the
deployment of DHT-based applications. In contrast, Unstructured
P2P overlay systems are Ad-Hoc in nature, and do not present the
possibility of being unified under a common platform for
application development.
In sections II and IV of the paper, we describe
the key features of Structured and Unstructured P2P overlay
networks and their operational functionalities. After providing a
basic understanding of the various overlay schemes in these
two classes, we evaluate the schemes in both classes and
discuss their developments in sections III and V. Then, we use
the following taxonomy to make comparisons between the various
Structured and Unstructured P2P overlay schemes discussed:
• Decentralization: examine whether the overlay system
is distributed.
• Architecture: describe the overlay system architecture
with respect to its operation.
• Lookup Protocol: the lookup query protocol adopted
by the overlay system.
• System Parameters: the required system parameters for
the overlay system operation.
• Routing Performance: the lookup routing protocol
performance in overlay routing.
• Routing State: the routing state and scalability of the
overlay system.
• Peers Join and Leave: describe the behavior of the
overlay system when churn and self-organization occur.
• Security: look into the security vulnerabilities of the
overlay system.
• Reliability and Fault Resiliency: examine how robust
the overlay system is when subjected to faults.
Lastly, in section VI, we conclude with some thoughts
on the relative applicability of each class to some of the
research problems that arise in Ad-Hoc, location-based or
content delivery networks.
II. STRUCTURED P2P OVERLAY NETWORKS
In this category, the overlay network assigns keys to data
items and organizes its peers into a graph that maps each data
key to a peer. This structured graph enables efficient discovery
of data items using the given keys. However, in its simple
form, this class of systems does not support complex queries
and it is necessary to store a copy or a pointer to each data
object (or value) at the peer responsible for the data object’s
key. In this section, we survey and compare the Structured
P2P overlay networks: Content Addressable Network (CAN)
[5], Tapestry [7], Chord [6], Pastry [4], Kademlia [14] and
Viceroy [15].
A. Content Addressable Network (CAN)
The Content Addressable Network (CAN) [5] is a dis-
tributed decentralized P2P infrastructure that provides hash-
table functionality on Internet-like scale. CAN is designed to
be scalable, fault-tolerant, and self-organizing. The architec-
tural design is a virtual multi-dimensional Cartesian coordinate
space on a multi-torus. This d-dimensional coordinate space is
completely logical. The entire coordinate space is dynamically
partitioned among all the peers (N number of peers) in the
system such that every peer possesses its individual, distinct
zone within the overall space. A CAN peer maintains a
routing table that holds the IP address and virtual coordinate
zone of each of its neighbors in the coordinate space. A
CAN message includes the destination coordinates. Using the
neighbor coordinates, a peer routes a message towards its
destination using a simple greedy forwarding to the neighbor
peer that is closest to the destination coordinates. CAN has a
routing performance of $O(d \cdot N^{1/d})$ and its routing state is
bounded by $2d$. As shown in Figure 3, which we adapted from the
CAN paper [5], the virtual coordinate space is used to store
{key,value} pairs as follows: to store a pair {K,V}, key K
is deterministically mapped onto a point P in the coordinate
space using a uniform hash function. To retrieve the entry
corresponding to key K, any peer can apply
the same deterministic hash function to map K onto point P
and then retrieve the corresponding value V from the point
P. If the requesting peer or its immediate neighbors do not
own the point P, the request must be routed through the CAN
infrastructure until it reaches the peer in whose zone P lies. A peer
maintains the IP addresses of those peers that hold coordinate
zones adjoining its zone. This set of immediate neighbors in
the coordinate space serves as a coordinate routing table that
enables efficient routing between points in this space.
Fig. 3. Example of 2-d space CAN before and after Peer Z joins
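As a rough sketch of the two steps just described (hashing a key to a point, then greedy forwarding towards that point), consider the Python example below. The four-zone layout, the peer names A to D, and the use of zone centers to measure closeness are assumptions made for illustration; the torus wrap-around of a real CAN is omitted for brevity.

```python
import hashlib
import math

# Hypothetical 2-d CAN over the unit square, split into four equal zones.
# Each zone is (x_min, x_max, y_min, y_max); torus wrap-around is omitted.
ZONES = {
    "A": (0.0, 0.5, 0.0, 0.5),
    "B": (0.5, 1.0, 0.0, 0.5),
    "C": (0.0, 0.5, 0.5, 1.0),
    "D": (0.5, 1.0, 0.5, 1.0),
}
NEIGHBORS = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}


def key_to_point(key):
    """Deterministically map a key to a point P in [0, 1) x [0, 1)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    x = int.from_bytes(digest[:4], "big") / 2**32
    y = int.from_bytes(digest[4:8], "big") / 2**32
    return (x, y)


def owns(peer, point):
    """True if `point` lies inside the zone held by `peer`."""
    x_min, x_max, y_min, y_max = ZONES[peer]
    return x_min <= point[0] < x_max and y_min <= point[1] < y_max


def center(peer):
    x_min, x_max, y_min, y_max = ZONES[peer]
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def route(start, point):
    """Greedily forward towards `point`; return the list of peers visited."""
    path = [start]
    while not owns(path[-1], point):
        nxt = min(NEIGHBORS[path[-1]], key=lambda p: math.dist(center(p), point))
        path.append(nxt)
    return path


print(route("A", (0.9, 0.6)))          # request for a fixed point
print(route("A", key_to_point("K")))   # request for a hashed key
```

Each peer consults only its own neighbor list, so the per-peer state stays fixed no matter how many zones exist.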
A new peer that joins the system must have its own portion
of the coordinate space allocated. This can be achieved by
splitting an existing peer’s zone in half, retaining half for the
existing peer and allocating the other half to the new peer. CAN
has an associated DNS domain name which is resolved into the IP
address of one or more CAN bootstrap peers (which maintain
a partial list of CAN peers). For a new peer to join the CAN
network, the peer looks up a CAN domain name in the DNS
to retrieve a bootstrap peer’s IP address, similar to the
bootstrap mechanism in [16]. The bootstrap peer supplies the
IP addresses of some randomly chosen peers in the system.
The new peer randomly chooses a point P and sends a JOIN
request destined for point P. Each CAN peer uses the CAN
routing mechanism to forward the message until it reaches
the peer in whose zone P lies. The current peer in that zone then
splits its zone in half and assigns the other half to the new peer.
For example, in a 2-dimensional (2-d) space, a zone would
first be split along the X dimension, then the Y, and so on.
The {K,V} pairs from the half zone to be handed over are
also transferred to the new peer. After obtaining its zone, the
new peer learns the IP addresses of its neighbor set from
the previous owner of point P, and adds that previous owner
itself to the set.
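The zone split performed on a join can be sketched as follows. This Python example covers only the split and the hand-over of {K,V} pairs in a 2-d space; the `Zone` class, its field names, and the choice to store pairs under their hashed points are assumptions for illustration, and bootstrap, JOIN routing, and neighbor-set updates are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    splits: int = 0  # how many times this zone's lineage has been split
    store: dict = field(default_factory=dict)  # point (x, y) -> value

    def contains(self, point):
        return (self.x_min <= point[0] < self.x_max
                and self.y_min <= point[1] < self.y_max)

    def split(self):
        """Halve this zone in place and return the half for the new peer."""
        if self.splits % 2 == 0:  # even depth: split along X
            mid = (self.x_min + self.x_max) / 2
            new = Zone(mid, self.x_max, self.y_min, self.y_max, self.splits + 1)
            self.x_max = mid
        else:  # odd depth: split along Y
            mid = (self.y_min + self.y_max) / 2
            new = Zone(self.x_min, self.x_max, mid, self.y_max, self.splits + 1)
            self.y_max = mid
        self.splits += 1
        # Hand over the pairs whose points now fall in the new peer's half.
        for point in [p for p in self.store if new.contains(p)]:
            new.store[point] = self.store.pop(point)
        return new


# Hypothetical usage: one peer owns the whole space, then two peers join.
owner = Zone(0.0, 1.0, 0.0, 1.0)
owner.store[(0.2, 0.7)] = "V1"
owner.store[(0.8, 0.3)] = "V2"
first = owner.split()   # split along X
second = owner.split()  # split of the owner's remaining half along Y
print(owner, first, second, sep="\n")
```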
When a peer leaves the CAN network, an immediate
takeover algorithm ensures that one of the failed peer’s neigh-
bors takes over the zone and starts a takeover timer. The peer
updates its neighbor set to eliminate those peers that are no
longer its neighbors. Every peer in the system then sends
soft-state updates to ensure that all of their neighbors will
learn about the change and update their own neighbor sets.
The number of neighbors a peer maintains depends only on
the dimensionality of the coordinate space (i.e. $2d$) and it is
independent of the total number of peers in the system.
The Figure 3 example illustrated a simple routing path from
peer X to point E and a new peer Z joining the CAN network.
For a d-dimensional space partitioned into n equal zones,
the average routing path length is $(d/4) \cdot n^{1/d}$ hops and
individual peers maintain a list of $2d$ neighbors. Thus, the
growth of peers (or zones) can be achieved without increasing
per peer state while the average path length grows as $O(n^{1/d})$.
Since there are many different paths between two points in the
space, when one or more of a peer’s neighbors fail, this peer
can still route along the next best available path.
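The path-length and routing-state expressions for equal zones are easy to evaluate numerically. The short Python sketch below does so for a few values of d and n; the function names and the sample values are chosen here purely for illustration.

```python
def can_avg_path_length(d: int, n: int) -> float:
    """Average CAN routing path length in hops: (d/4) * n**(1/d)."""
    return (d / 4) * n ** (1 / d)


def can_neighbors(d: int) -> int:
    """Number of neighbors each peer keeps: 2 * d, independent of n."""
    return 2 * d


# Sample values (assumed for illustration): growing n with d fixed, then
# growing d with n fixed.
for d, n in [(2, 1024), (2, 1_000_000), (4, 1_000_000)]:
    hops = can_avg_path_length(d, n)
    print(f"d={d}, n={n}: about {hops:.1f} hops, {can_neighbors(d)} neighbors")
```

Raising d shortens paths at the price of more neighbors per peer, while raising n with d fixed lengthens paths but leaves the per-peer state unchanged.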
The CAN algorithm can be improved by maintaining multiple,
independent coordinate spaces, each called a reality, with each
peer in the system being assigned a different zone in each
coordinate space. For a CAN with r realities, a
single peer is assigned r coordinate zones, one in each available
reality, and this peer holds r independent neighbor sets.
The contents of the hash table are replicated on every reality,
thus improving data availability. For further data availability
improvement, CAN could use k different hash functions to
map a given key onto k points in the coordinate space.
This results in the replication of a single {key,value} pair
at k distinct peers in the system. A {key,value} pair is then
unavailable only when all the k replicas are simultaneously
unavailable. Thus, queries for a particular hash table entry
could be forwarded to all k peers in parallel thereby reducing
the average query latency, and reliability and fault resiliency
properties are enhanced.
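Replication with k hash functions can be sketched by salting one base hash with the replica index. In the Python example below, the salting scheme, the function names, and the assumption that replicas become unavailable independently with the same probability are all illustrative choices, not details taken from CAN itself.

```python
import hashlib


def replica_points(key, k):
    """Map `key` to k points in the unit square using k salted SHA-1 hashes."""
    points = []
    for i in range(k):
        digest = hashlib.sha1(f"{i}:{key}".encode("utf-8")).digest()
        x = int.from_bytes(digest[:4], "big") / 2**32
        y = int.from_bytes(digest[4:8], "big") / 2**32
        points.append((x, y))
    return points


def availability(p_unavailable, k):
    """Probability that at least one of k replicas is reachable.

    Assumes replicas are unavailable independently with probability
    `p_unavailable` each (an assumption of this sketch).
    """
    return 1 - p_unavailable ** k


print(replica_points("K", 3))
print(availability(0.1, 3))
```

A query can then be sent towards all k points in parallel, and the pair is lost only if every replica is unavailable at the same time.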
CAN could be used in large scale storage management
systems such as the OceanStore [17], Farsite [18], and Publius
[19]. These systems require efficient insert and retrieval of
content in a large distributed storage network with a scalable
indexing mechanism. Another potential application for CANs
is in the construction of wide-area name resolution services
that decouple the naming scheme from the name resolution
process. This enables an arbitrary and location-independent
naming scheme.
B. Chord
Chord [6] uses consistent hashing [20] to assign keys to its
peers. Consistent hashing is designed to let peers enter and
leave the network with minimal interruption. This decentral-
ized scheme tends to balance the load on the system, since
each peer receives roughly the same number of keys, and
there is little movement of keys when peers join and leave the
system. In a steady state, each peer maintains routing state
information for only about O(logN) other peers, where N is
the number of peers in the system. This is efficient, and
performance degrades gracefully when that
information is out-of-date.
The consistent hash function assigns peers and data keys an
m-bit identifier using SHA-1 [21] as the base hash function.
A peer’s identifier is chosen by hashing the peer’s IP address,
while a key identifier is produced by hashing the data key. The
length of the identifier m must be large enough to make the
probability of keys hashing to the same identifier negligible.
Identifiers are ordered on an identifier circle modulo $2^m$.
Key k is assigned to the first peer whose identifier is equal
to or follows k in the identifier space. This peer is called
the successor peer of key k, denoted by successor(k). If
identifiers are represented as a circle of numbers from $0$ to
$2^m - 1$, then successor(k) is the first peer clockwise from k.
The identifier circle is termed the Chord ring. To maintain the
consistent hashing mapping when a peer n joins the network,
certain keys previously assigned to n’s successor now need to
be reassigned to n. When peer n leaves the Chord system, all
of its assigned keys are reassigned to n’s successor. Therefore,
peers join and leave the system with $(\log N)^2$ performance. No
other changes in the assignment of keys to peers need to occur. In
Figure 4 (adapted from [6]), the Chord ring is depicted with
m = 6. This particular ring has ten peers and stores five keys.
The successor of the identifier 10 is peer 14, so key 10 will
be located at NodeID 14. Similarly, if a peer were to join with
identifier 26, it would take over the key with identifier 24 from
the peer with identifier 32.
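The successor rule can be checked against the numbers in this example with a short Python sketch. Peers 8, 14, 32, 38, 42 and 56 are named in the text; the remaining identifiers in `PEERS` (1, 21, 48, 51) are assumed here to fill out a ten-peer ring and may not match the figure exactly. The helper `chord_id` shows one plausible way to derive an m-bit identifier from SHA-1 and is likewise an illustration.

```python
import bisect
import hashlib

M = 6  # identifier bits, as in the m = 6 example


def chord_id(text, m=M):
    """Hash a peer address or data key to an m-bit identifier with SHA-1."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(peers, k):
    """First peer whose identifier is equal to or follows k on the ring."""
    ring = sorted(peers)
    i = bisect.bisect_left(ring, k)
    return ring[i % len(ring)]  # wrap around past the largest identifier


# Ten-peer ring; identifiers 1, 21, 48 and 51 are assumed for illustration.
PEERS = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

print(successor(PEERS, 10))         # key 10 is stored at peer 14
print(successor(PEERS, 24))         # key 24 is stored at peer 32
print(successor(PEERS + [26], 24))  # after peer 26 joins, it holds key 24
print(successor(PEERS, 60))         # wraps around the ring to peer 1
```

Only the keys between the new peer and its predecessor change hands on a join, which is why consistent hashing causes little key movement.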
Each peer in the Chord ring needs to know how to contact
its current successor peer on the identifier circle. Lookup
queries involve the matching of key and NodeID. Queries for a
given identifier can be passed around the circle via these successor
pointers until they encounter a pair of peers that enclose the
desired identifier; the second peer in the pair is the peer the
query maps to. An example is presented in Figure 4, whereby
peer 8 performs a lookup for key 54. Peer 8 invokes the
find_successor operation for this key, which eventually returns
the successor of that key, i.e. peer 56. With successor pointers
alone, the query visits every peer on the circle between peer 8
and peer 56. The response is returned along the reverse of the path.
As m is the number of bits in the key/NodeID space, each
peer n maintains a routing table with up to m entries, called
the finger table. The i-th entry in the table at peer n contains
the identity of the first peer s that succeeds n by at least $2^{i-1}$
on the identifier circle, i.e. $s = \mathrm{successor}(n + 2^{i-1})$, where
$1 \le i \le m$. Peer s is the i-th finger of peer n (n.finger[i]).
A finger table entry includes both the Chord identifier and the
IP address (and port number) of the relevant peer. Figure 4
shows the finger table of peer 8, and the first finger entry for
this peer points to peer 14, as the latter is the first peer that
succeeds $(8 + 2^0) \bmod 2^6 = 9$. Similarly, the last finger of peer
8 points to peer 42, i.e. the first peer that succeeds
$(8 + 2^5) \bmod 2^6 = 40$. In this way, peers store information about only
a small number of other peers, and know more about peers
closely following them on the identifier circle than about other peers.
Also, a peer’s finger table does not contain enough information
to directly determine the successor of an arbitrary key k. For
example, peer 8 cannot determine the successor of key 34 by
itself, as the successor of this key (peer 38) is not present in peer
8’s finger table.
Fig. 4. Chord ring with identifier circle consisting of ten peers and five
data keys. It shows the path followed by a query originated at peer 8 for the
lookup of key 54. Finger table entries for peer 8.
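The finger-table rule and the resulting lookup can be sketched in Python as below. This is a simplified, centralized simulation rather than the distributed protocol: every finger table is computed from a global peer list, and the peer identifiers 1, 21, 48 and 51 are assumed in order to complete a ten-peer ring with m = 6 (only 8, 14, 32, 38, 42 and 56 appear in the text).

```python
import bisect

M = 6
# Ten-peer ring; identifiers 1, 21, 48 and 51 are assumed for illustration.
PEERS = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def successor(peers, k):
    """First peer whose identifier is equal to or follows k on the ring."""
    ring = sorted(peers)
    return ring[bisect.bisect_left(ring, k % 2 ** M) % len(ring)]


def finger_table(peers, n):
    """finger[i] = successor(n + 2^(i-1)), 1 <= i <= m (0-based list)."""
    return [successor(peers, n + 2 ** (i - 1)) for i in range(1, M + 1)]


def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def find_successor(peers, start, key):
    """Return (peer responsible for key, list of peers on the query path)."""
    path = [start]
    n = start
    while True:
        succ = successor(peers, n + 1)
        if in_interval(key, n, succ, inclusive_right=True):
            return succ, path + [succ]
        # Otherwise jump to the closest finger that precedes the key.
        preceding = [f for f in reversed(finger_table(peers, n))
                     if in_interval(f, n, key)]
        n = preceding[0] if preceding else succ
        path.append(n)


print(finger_table(PEERS, 8))
print(find_successor(PEERS, 8, 54))
print(find_successor(PEERS, 8, 34))
```

Each hop uses the farthest finger that does not overshoot the key, which is why far fewer peers are contacted than when following successor pointers one by one.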
When a peer joins the system, the successor pointers of
some peers need to be changed. It is important that the
successor pointers are up to date at any time because the
correctness of lookups is not guaranteed otherwise. The Chord
protocol uses a stabilization protocol [6] running periodically
in the background to update the successor pointers and the
entries in the finger table. The correctness of the Chord
protocol relies on the fact that each peer is aware of its
successors. When peers fail, it is possible that a peer does
not know its new successor, and that it has no chance to learn
about it. To avoid this situation, peers maintain a successor
list of size r, which contains the peer’s first r successors.
When the successor peer does not respond, the peer simply
contacts the next peer on its successor list. Assuming that
peer failures occur with a probability p, the probability that
every peer on the successor list will fail is $p^r$. Increasing r
makes the system more robust. By tuning this parameter, any
degree of robustness with good reliability and fault resiliency
may be achieved.
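The effect of the successor-list size r can be illustrated numerically. The Python sketch below evaluates the probability that all r successors fail and finds the smallest r that meets a target; the function names and the sample values of p and the target are assumptions for illustration, and failures are treated as independent.

```python
def all_successors_fail(p: float, r: int) -> float:
    """Probability that every peer on a successor list of size r fails."""
    return p ** r


def min_list_size(p: float, target: float) -> int:
    """Smallest r with p**r <= target, for 0 < p < 1 and 0 < target < 1."""
    r = 1
    while p ** r > target:
        r += 1
    return r


# Sample values, assumed for illustration only.
print(all_successors_fail(0.5, 4))
print(min_list_size(0.5, 1e-6))
```

Because the failure probability shrinks geometrically in r, a modest successor list already makes it very unlikely that a peer loses all of its successors at once.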
The following applications are examples of how Chord
could be used:
• Cooperative mirroring or Cooperative File System (CFS)
[22], in which multiple providers of content cooperate
to store and serve each other’s data. Spreading the total
load evenly over all participant hosts lowers the total cost
of the system, since each participant needs to provide
capacity only for the average load, not for the peak
load. There are two layers in CFS. The DHash
(Distributed Hash) layer performs block fetches for the peer,
distributes the blocks among the servers, and maintains
cached and replicated copies. The Chord layer distributed
lookup system is used to locate the servers responsible
for a block.
• Chord-based DNS [23] provides a lookup service, with
host names as keys and IP addresses (and other host
information) as values. Chord could provide a DNS-like
service by hashing each host name to a key [20].
Chord-based DNS would require no special servers, while
ordinary DNS systems rely on a set of special root servers.
DNS also requires manual management of the routing
information (DNS records) that allows clients to navigate
the name server hierarchy; Chord automatically maintains
the correctness of the analogous routing information.
DNS only works well when host names are hierarchically
structured to reflect administrative boundaries; Chord
imposes no naming structure. DNS is specialized to the
task of finding named hosts or services, while Chord can
also be used to find data object values that are not tied
to particular machines.
C. Tapestry
Sharing similar properties with Pastry, Tapestry [7] employs
decentralized randomness to achieve both load distribution and
routing locality. The difference between Pastry and Tapestry
is the handling of network locality and data object replication,
and this difference will be more apparent as described
in the Pastry section. Tapestry’s architecture uses a variant of the
Plaxton et al. [1] distributed search technique, with additional
mechanisms to provide availability, scalability, and adaptation
in the presence of failures and attacks. Plaxton et al. propose
a distributed data structure, known as the Plaxton mesh,
optimized to support a network overlay for locating named
data objects which are connected to one root peer. On the other
hand, Tapestry uses multiple roots for each data object to avoid
a single point of failure. In the Plaxton mesh, peers can take on
the roles of servers (where data objects are stored), routers
(which forward messages), and clients (which originate requests).
It uses

Frequently Asked Questions (21)
Q1. What are the contributions in "A survey and comparison of peer-to-peer overlay network schemes" ?

In this paper, the authors present a survey and comparison of various Structured and Unstructured P2P networks. The authors categorize the various schemes into these two groups in the design spectrum and discuss the application-level network performance of each group. 

Finally, the authors close this survey with their thoughts on some directions for the future in P2P overlay networking research: • Future research would aim to reduce the stretch ( ratio of overlay path to underlying network path ) routing metric based on scalable and robust proximity calculations. The authors see the future of P2P overlay networks inexorably linked to the take-up and subsequent commercial success of P2P overlay computing, personal area and ad-hoc networking, mobile location-based services, mirrored content delivery, and networked file-sharing. 

Since de Bruijn graphs give very short average routing distances and high resilience to peer failure, they are well suited for structured P2P overlay networks. 

The routing algorithm for storing and retrieving data is designed to adaptively adjust routes over time and to provide efficient performance while using local knowledge, since peers only have knowledge of their immediate neighbors. 

The Chord protocol uses a stabilization protocol [6] running periodically in the background to update the successor pointers and the entries in the finger table. 
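
The periodic stabilization described above can be sketched as follows. This is a minimal single-process illustration, not the authors' implementation: the class name `Node`, the helper `between`, the 6-bit identifier space `M`, and the linear successor walk in `find_successor` are all assumptions made for brevity.

```python
from __future__ import annotations

M = 6  # identifier bits (assumed); the ring holds 2**M identifiers


def between(x: int, a: int, b: int) -> bool:
    """Return True if x lies in the open ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero; a == b means "all but a"


class Node:
    def __init__(self, ident: int) -> None:
        self.id = ident
        self.successor: Node = self
        self.predecessor: Node | None = None
        self.finger: list[Node] = [self] * M

    def find_successor(self, key: int) -> Node:
        # Linear walk along successor pointers, kept simple on purpose;
        # real Chord uses the finger table to reach O(log N) hops.
        node = self
        while not (key == node.successor.id
                   or between(key, node.id, node.successor.id)):
            node = node.successor
        return node.successor

    def join(self, known: Node) -> None:
        # A joining peer only learns its successor; the rest is repaired later.
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self) -> None:
        # Ask the successor for its predecessor and adopt it if it sits closer.
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate: Node) -> None:
        # The candidate believes it might be our predecessor.
        if (self.predecessor is None
                or between(candidate.id, self.predecessor.id, self.id)):
            self.predecessor = candidate

    def fix_fingers(self) -> None:
        # Refresh finger i so that it points at successor(id + 2**i).
        for i in range(M):
            self.finger[i] = self.find_successor((self.id + 2**i) % 2**M)
```

In a deployment each peer would call `stabilize()` and `fix_fingers()` on a timer; in this sketch, calling them in a loop over all nodes for a few rounds is enough for the successor pointers to converge to the sorted ring.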

The trackers use a simple protocol layered on top of HTTP in which a downloader sends information about the file it is downloading and the port number. 
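
As a rough illustration of such a request, the sketch below builds an announce URL carrying the file identifier and the listening port. The query parameter names follow the public BitTorrent specification rather than this survey, and the tracker URL, the peer ID, and the function name `announce_url` are hypothetical.

```python
import hashlib
from urllib.parse import urlencode


def announce_url(tracker: str, info_hash: bytes, peer_id: bytes,
                 port: int, left: int) -> str:
    """Build the HTTP GET URL a downloader would send to a tracker."""
    query = urlencode({
        "info_hash": info_hash,  # 20-byte SHA-1 identifying the file
        "peer_id": peer_id,      # 20-byte identifier chosen by the downloader
        "port": port,            # port the downloader listens on
        "uploaded": 0,
        "downloaded": 0,
        "left": left,            # bytes still missing
    })
    return f"{tracker}?{query}"


# Hypothetical values, for illustration only.
url = announce_url(
    "http://tracker.example/announce",
    hashlib.sha1(b"demo").digest(),
    b"-XX0001-123456789012",
    6881,
    1024,
)
```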

The authenticity of the data objects can be handled by using cryptographic techniques through some cost-effective public keys and/or content hashes to securely link together different pieces of data objects. 
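
One way to picture the content-hash part of this idea is sketched below: each piece is keyed by its hash, and a manifest of piece hashes is itself hashed to give a single root key. The choice of SHA-256, the newline-separated manifest format, and the function names are assumptions for illustration, not details taken from the survey.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Key a piece of data by the hex digest of its content."""
    return hashlib.sha256(data).hexdigest()


def split_and_link(data: bytes, chunk_size: int) -> tuple[str, bytes, list[bytes]]:
    """Split data into chunks and link them through a hashed manifest."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    manifest = "\n".join(content_key(c) for c in chunks).encode()
    return content_key(manifest), manifest, chunks


def verify_linked(root_key: str, manifest: bytes, chunks: list[bytes]) -> bool:
    """Check the manifest against the root key, then each chunk against the manifest."""
    if content_key(manifest) != root_key:
        return False
    keys = manifest.decode().splitlines()
    return len(keys) == len(chunks) and all(
        content_key(c) == k for c, k in zip(chunks, keys)
    )
```

A peer that trusts only the root key can then detect a modified or substituted chunk, because the chunk's hash no longer matches its manifest entry.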

The Viceroy P2P decentralized overlay network is designed to handle the discovery and location of data and resources in a dynamic butterfly fashion. 

The technique can tolerate up to 25% of malicious peers while providing good performance when the number of compromised peers is small. 

The neighborhood set is not used in the routing of messages, but it is still kept fresh and up to date because the set plays an important role in exchanging information about nearby peers. 

This can result in high network delay and unnecessary long-distance network traffic, even though the overlay path is deterministically short at O(log N) hops (where N is the number of peers). 

Stoica et al. [6] demonstrate the advantage of recursive lookups over iterative lookups, but propose future work to improve resiliency to network partitions using a small set of known peers, and to reduce the number of messages in lookups by increasing the size of each step around the ring with a larger number of fingers in each peer. 

They used the collapse point lookup query rate (defined as the per-node query rate at which the successful query rate falls below 90%) and the average hop counts prior to collapse. 

The argument is that DHT-based systems, while more efficient at many tasks and backed by strong theoretical fundamentals that guarantee a key will be found if it exists, are not well suited for mass-market file sharing. 

A good comparison study was done by Loguinov et al. [50], who use the examples of Chord, CAN and de Bruijn graphs to study the routing performance and resilience of P2P overlay networks, including graph expansion and clustering properties. 

Singh et al. [55] propose a defense that prevents Eclipse attacks for both Structured and Unstructured P2P overlay networks by bounding the degree of overlay peers: the in-degree of attacker peers is likely to be higher than the average in-degree of legitimate peers, so legitimate peers choose their neighbors from a subset of overlay peers whose in-degree is below a threshold. 
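
The neighbor-selection rule can be illustrated with a small sketch. It assumes a global view of every peer's neighbor set, which a real overlay would not have (the degree information would have to be gathered and checked in a distributed way), and the function names and the threshold value are hypothetical.

```python
from collections import Counter


def in_degrees(neighbor_sets: dict[str, set[str]]) -> Counter:
    """Count how many peers list each peer as a neighbor."""
    degree: Counter = Counter()
    for neighbors in neighbor_sets.values():
        for n in neighbors:
            degree[n] += 1
    return degree


def eligible_neighbors(candidates: list[str], degree: Counter,
                       threshold: int) -> list[str]:
    """Keep only candidates whose in-degree is below the threshold."""
    return [c for c in candidates if degree[c] < threshold]


# Toy overlay: peer "m" has inserted itself into every other neighbor set.
overlay = {
    "a": {"b", "m"},
    "b": {"c", "m"},
    "c": {"a", "m"},
    "m": {"a"},
}
degree = in_degrees(overlay)
allowed = eligible_neighbors(sorted(overlay), degree, threshold=3)
```

Here `m` reaches an in-degree of 3 and is filtered out, while the legitimate peers stay eligible.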

This defense restricts the flexibility necessary to implement optimizations such as proximity neighbor selection and only works in Structured P2P overlay networks. 

In the research community, efforts are being made to improve the lookup properties of Unstructured P2P overlays to include flow control, dynamic geometric topology adaptation, one-hop replication, peer heterogeneity, etc. 

The private half of the asymmetric key pair is used to sign the data file, thus providing a minimal integrity check that a retrieved data file matches its data file key. 

The PING message is then forwarded to its neighbors and initiates a back-propagated PONG message, which contains information about the peer such as the IP address, number and size of the data items.
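
The PING/PONG exchange can be sketched as a small in-memory simulation. The `Peer` class, the TTL handling, and the dictionary layout of a PONG are assumptions for illustration; the hop-by-hop return of each PONG along the reverse path is collapsed here into a list handed back to the originator.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Peer:
    ip: str
    files: dict[str, int] = field(default_factory=dict)  # name -> size in bytes
    neighbors: list["Peer"] = field(default_factory=list)

    def pong(self) -> dict:
        # A PONG describes the responding peer: address, item count, total size.
        return {
            "ip": self.ip,
            "num_items": len(self.files),
            "total_size": sum(self.files.values()),
        }


def ping(origin: Peer, ttl: int) -> list[dict]:
    """Flood a PING up to ttl hops from origin and collect the PONGs."""
    seen = {origin.ip}  # stands in for duplicate-message suppression
    queue = deque((n, ttl) for n in origin.neighbors)
    pongs = []
    while queue:
        peer, hops_left = queue.popleft()
        if peer.ip in seen or hops_left <= 0:
            continue
        seen.add(peer.ip)
        pongs.append(peer.pong())
        # Forward the PING to this peer's neighbors with one hop fewer.
        for n in peer.neighbors:
            queue.append((n, hops_left - 1))
    return pongs
```

With a chain of four peers and `ttl=2`, the originator hears back from the two nearest peers only; raising the TTL widens the flood at the cost of more messages.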