Gossip-based aggregation in large dynamic networks

TLDR
This work proposes a gossip-based protocol for computing aggregate values over network components in a fully decentralized fashion and demonstrates the efficiency and robustness of the protocol both theoretically and experimentally under a variety of scenarios including node and communication failures.
Abstract
As computer networks increase in size, become more heterogeneous and span greater geographic distances, applications must be designed to cope with the very large scale, poor reliability, and often, with the extreme dynamism of the underlying network. Aggregation is a key functional building block for such applications: it refers to a set of functions that provide components of a distributed system access to global information including network size, average load, average uptime, location and description of hotspots, and so on. Local access to global information is often very useful, if not indispensable for building applications that are robust and adaptive. For example, in an industrial control application, some aggregate value reaching a threshold may trigger the execution of certain actions; a distributed storage system will want to know the total available free space; load-balancing protocols may benefit from knowing the target average load so as to minimize the load they transfer. We propose a gossip-based protocol for computing aggregate values over network components in a fully decentralized fashion. The class of aggregate functions we can compute is very broad and includes many useful special cases such as counting, averages, sums, products, and extremal values. The protocol is suitable for extremely large and highly dynamic systems due to its proactive structure---all nodes receive the aggregate value continuously, thus being able to track any changes in the system. The protocol is also extremely lightweight, making it suitable for many distributed applications including peer-to-peer and grid computing systems. We demonstrate the efficiency and robustness of our gossip-based protocol both theoretically and experimentally under a variety of scenarios including node and communication failures.

Gossip-based Aggregation in Large Dynamic Networks
Márk Jelasity, Alberto Montresor and Ozalp Babaoglu
Università di Bologna
1 Introduction
Computer networks in general, and the Internet in particular, are experiencing explosive growth
in many dimensions, including size, performance, user base and geographical span. The potential
for communication and access to computational resources has improved dramatically, both
quantitatively and qualitatively, in a relatively short time. New design paradigms such as
peer-to-peer (P2P) [18] and grid computing [14] have emerged in response to these trends. The
Internet, and all similar networks, pose special challenges for builders of large-scale, reliable,
distributed applications. The “best-effort” design philosophy that characterizes such networks
renders the communication channels inherently unreliable, and the continuous flux of nodes
joining and leaving the network makes them highly dynamic. Control and monitoring in such
systems are particularly challenging: performing global computations requires orchestrating a
huge number of nodes.
In this paper, we focus on aggregation, which is a useful building block in large, unreliable
and dynamic systems [25]. Aggregation is a common name for a set of functions that provide a
summary of some global system property. In other words, they allow local access to global
information in order to simplify the task of controlling, monitoring and optimization in
distributed applications. Examples of aggregation functions include network size, total free
storage, maximum load, average uptime, location and intensity of hotspots, etc. Furthermore,
simple aggregation functions can be used as building blocks to support more complex protocols.
For example, the knowledge of average load in a system can be exploited to implement
near-optimal load-balancing schemes [12].
© ACM, 2005. This is the author’s version of the work. It is posted here by permission of ACM for your
personal use. Not for redistribution. The definitive version was published in ACM Transactions on Computer Systems,
23(3):219–252, August 2005. http://doi.acm.org/10.1145/1082469.1082470
We distinguish reactive and proactive protocols for computing aggregation functions. Re-
active protocols respond to specific queries issued by nodes in the network. The answers are
returned directly to the issuer of the query while the rest of the nodes may or may not learn about
the answer. Proactive protocols, on the other hand, continuously provide the value of some ag-
gregate function to all nodes in the system in an adaptive fashion. By adaptive we mean that
if the aggregate changes due to network dynamism or because of variations in the input values,
the output of the aggregation protocol should track these changes reasonably quickly. Proactive
protocols are often useful when aggregation is used as a building block for completely decen-
tralized solutions to complex tasks. For example, in the load-balancing scheme cited above, the
knowledge of the global average load is used by each node to decide if and when it should transfer
load [12].
Contribution In this paper we introduce a robust and adaptive protocol for calculating aggre-
gates in a proactive manner. We assume that each node maintains a local approximate of the
aggregate value. The core of the protocol is a simple gossip-based communication scheme in
which each node periodically selects some other random node to communicate with. During this
communication the nodes update their local approximate values by performing some aggregation-
specific and strictly local computation based on their previous approximate values. This local
pairwise interaction is designed in such a way that all approximate values in the system will
quickly converge to the desired aggregate value.
In addition to introducing our gossip-based protocol, the contributions of this paper are three-
fold. First, we present a full-fledged practical solution for proactive aggregation in dynamic
environments, complete with mechanisms for adaptivity, robustness and topology management.
Second, we show how our approach can be extended to compute complex aggregates such as vari-
ances and different means. Third, we present theoretical and experimental evidence supporting
the efficiency of the protocol and illustrating its robustness with respect to node and link failures
and message loss.
Outline In Section 2 we define the system model. Section 3 describes the core idea of the protocol
and presents theoretical and simulation results of its performance. In Section 4 we discuss the
extensions necessary for practical applications. Section 5 introduces novel algorithms for com-
puting statistical functions including several means, network size and variance. Sections 6 and 7
present analytical and experimental evidence on the high robustness of our protocol. Section 8
describes the prototype implementation of our protocol on PlanetLab and gives experimental re-
sults of its performance. Section 9 discusses related work. Finally, conclusions are drawn in
Section 10.
2 System Model
We consider a network consisting of a large collection of nodes that are assigned unique
identifiers and that communicate through message exchanges. The network is highly dynamic; new
nodes may join at any time, and existing nodes may leave, either voluntarily or by crashing. Our
approach does not require any mechanism specific to leaves: spontaneous crashes and voluntary
leaves are treated uniformly. Thus, in the following, we limit our discussion to node crashes.
Byzantine failures, with nodes behaving arbitrarily, are excluded from the present discussion (but
see [11]).

(a) active thread
    do exactly once in each consecutive δ time units at a randomly picked time
        q ← GETNEIGHBOR()
        send s_p to q
        s_q ← receive(q)
        s_p ← UPDATE(s_p, s_q)

(b) passive thread
    do forever
        s_q ← receive(*)
        send s_p to sender(s_q)
        s_p ← UPDATE(s_p, s_q)

Figure 1: Push-pull gossip protocol executed by node p. The local state of p is denoted as $s_p$.
We assume that nodes are connected through an existing routed network, such as the Internet,
where every node can potentially communicate with every other node. To actually communicate,
a node has to know the identifiers of a set of other nodes, called its neighbors. This neighborhood
relation over the nodes defines the topology of an overlay network. Given the large scale and
the dynamicity of our envisioned system, neighborhoods are typically limited to small subsets
of the entire network. The set of neighbors of a node (thus the overlay network topology) can
change dynamically. Communication incurs unpredictable delays and is subject to failures. Single
messages may be lost, and links between pairs of nodes may break. Occasional performance failures
(e.g., delay in receiving or sending a message in time) can be seen as general communication
failures, and are treated as such. Nodes have access to local clocks that can measure the passage
of real time with reasonable accuracy, that is, with small short-term drift.
In this paper we focus on node and communication failures. Some other aspects of the model
that are outside of the scope of the present analysis (such as clock drift and message delays) are
discussed only informally in Section 4.
3 Gossip-based Aggregation
We assume that each node in the network holds a numeric value. In a practical setting, this value
can characterize any (possibly dynamic) aspect of the node or its environment (e.g., the load at
the node, available storage space, temperature measured by a sensor network, etc.). The task of a
proactive protocol is to continuously provide all nodes with an up-to-date estimate of an aggregate
function, computed over the values held by the current set of nodes.
3.1 The Basic Aggregation Protocol
Our basic aggregation protocol is based on the “push-pull gossiping” scheme illustrated in
Figure 1. Each node p executes two different threads. The active thread periodically initiates an
information exchange with a random neighbor q by sending it a message containing the local
state $s_p$ and waiting for a response with the remote state $s_q$. The passive thread waits for
messages sent by an initiator and replies with the local state. The term push-pull refers to the
fact that each information exchange is performed in a symmetric manner: both participants send
and receive their states.
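The two threads described above can be sketched in a single process as follows. This is only a minimal illustration under simplifying assumptions, not the authors' implementation: the `Node` class, its method names and the synchronous in-memory call that stands in for message passing are all hypothetical, timing (the period δ) and failures are ignored, and the update rule is passed in as a plain function.

```python
import random
from typing import Callable, List


class Node:
    """Hypothetical in-memory stand-in for a node p with local state s_p."""

    def __init__(self, state: float, update: Callable[[float, float], float]):
        self.state = state
        self.update = update
        self.neighbors: List["Node"] = []

    def get_neighbor(self) -> "Node":
        # Assumption: sample uniformly from the locally known neighbors.
        return random.choice(self.neighbors)

    def passive_step(self, remote_state: float) -> float:
        """Passive thread body: reply with s_p, then update it using s_q."""
        reply = self.state
        self.state = self.update(self.state, remote_state)
        return reply

    def active_step(self) -> None:
        """Active thread body, intended to run once per cycle."""
        q = self.get_neighbor()
        sent = self.state                    # push: send s_p to q
        received = q.passive_step(sent)      # pull: receive s_q from q
        self.state = self.update(sent, received)
```

Because both sides apply the same update function to the same pair of states, a symmetric update rule leaves the two participants with identical estimates after one exchange.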

Even though the system is not synchronous, we find it convenient to describe the protocol
execution in terms of consecutive real time intervals of length δ called cycles that are enumerated
starting from some convenient point.
Method GETNEIGHBOR can be thought of as an underlying service to the aggregation proto-
col, which is normally (but not necessarily) implemented by sampling a locally available set of
neighbors. In other words, an overlay network is applied to find communication partners. In
Section 3.2 we will assume that GETNEIGHBOR returns a uniform random sample over the entire
set of nodes. In Section 4.4 we revisit this service from a practical point of view, by looking at
realistic implementations based on non-uniform or dynamically changing overlay topologies.
Method UPDATE computes a new local state based on the current local state and the remote
state received during the information exchange. The output of UPDATE and the semantics of the
node state depend on the specific aggregation function being implemented by the protocol. In
this section, we limit the discussion to computing the average over the set of numbers distributed
among the nodes. Additional functions (most of them derived from the averaging protocol) are
described in Section 5.
In the case of computing the average, each node stores a single numeric value representing the
current estimate of the final aggregation output, which is the global average. Each node initializes
the estimate with the local value it holds. Method UPDATE($s_p$, $s_q$), where $s_p$ and $s_q$ are
the estimates exchanged by p and q, returns $(s_p + s_q)/2$. After one exchange, the sum of the two
local estimates remains unchanged since method UPDATE simply redistributes the initial sum equally
among the two nodes. So, the operation does not change the global average but it decreases the
variance over the set of all estimates in the system.
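A small numeric check makes the two properties concrete. The sketch below is illustrative only: the four initial values are made up, and the sample variance from Python's `statistics` module is used as the measure of spread.

```python
from statistics import variance


def update(s_p: float, s_q: float) -> float:
    """Averaging UPDATE: both nodes adopt the mean of their two estimates."""
    return (s_p + s_q) / 2


# Hypothetical initial local values held by four nodes.
estimates = [2.0, 10.0, 4.0, 8.0]
sum_before, var_before = sum(estimates), variance(estimates)

# One push-pull exchange between nodes 0 and 1.
estimates[0] = estimates[1] = update(estimates[0], estimates[1])
sum_after, var_after = sum(estimates), variance(estimates)

print(sum_before, sum_after)  # the sum (hence the average) is unchanged
print(var_before, var_after)  # the variance of the estimates has dropped
```

If the two exchanged estimates are already equal, the step changes nothing, so a single exchange never increases the variance.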
It is easy to see that the variance tends to zero, that is, the value at each node will converge
to the true global average, as long as the network of nodes is not partitioned into disjoint clusters.
To see this, one should consider the minimal value in the system. It can be proven that there
is a positive probability in each cycle that either the number of instances of the minimal value
decreases or the global minimum increases if there are different values from the minimal value
(otherwise we are done because all values are equal). The idea is that if there is at least one
different value, then at least one of the instances of the minimal value will have a neighbor with
a different (thus larger) value and so it will have a positive probability to be matched with this
neighbor.
In the following, we give basic theoretical results that characterize the speed of the conver-
gence of the variance. We will show that each cycle results in a reduction of the variance by a
constant factor, which provides exponential convergence. We will assume that no failures oc-
cur and that the starting point of the protocol is synchronized. Later in the paper, all of these
assumptions will be relaxed.
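To make the link between a constant per-cycle reduction and exponential convergence explicit, suppose, purely for illustration, that $\sigma_i^2$ denotes the variance of the estimates after cycle i and that each cycle shrinks its expected value by a factor $\rho$ with $0 < \rho < 1$. Both symbols are introduced here only as assumptions for this sketch. Then

$$E(\sigma_{i+1}^2 \mid \sigma_i^2) = \rho\, \sigma_i^2 \quad \Longrightarrow \quad E(\sigma_i^2) = \rho^i\, \sigma_0^2,$$

so reducing the initial variance by a factor $\epsilon$ takes about $\log \epsilon / \log \rho$ cycles. With the illustrative value $\rho = 1/2$, a thousandfold reduction ($\epsilon = 10^{-3}$) needs about 10 cycles.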
3.2 Theoretical Analysis of Gossip-based Aggregation
We begin by introducing the conceptual framework and notations to be used for the purpose of
the mathematical analysis. We proceed by calculating convergence rates for various algorithms.
Our results are validated and illustrated by numerical simulation when necessary.
We will treat the averaging protocol as an iterative variance reduction algorithm over a vector
of numbers. In this framework, we can formulate our approach as follows. We are given an initial
vector of numbers $w_0 = (w_{0,1}, \ldots, w_{0,N})$. The elements of this vector correspond to the
initial values at the nodes. We shall model this vector by assuming that $w_{0,1}, \ldots, w_{0,N}$
are independent random variables with identical expected values and a finite variance.
    // vector w is the input
    do N times
        (i, j) = GETPAIR()
        // perform elementary variance reduction step
        w_i = w_j = (w_i + w_j)/2
    return w

Figure 2: Skeleton of global algorithm AVG used to model the distributed protocol of Figure 1.
The assumption of identical expected values is not as restrictive as it may seem. To see
this, observe that after any permutation of the initial values, the statistical behavior of the system
remains unchanged since the protocol causes nodes to communicate in random order. This means
that if we analyze the model in which we first apply a random permutation over the variables,
we will obtain identical predictions for convergence. But if we apply a permutation, then we
essentially transform the original vector of variables into another vector in which all variables
have identical distribution, so the assumption of identical expected values holds.
In more detail, starting with random variables $w_{0,1}, \ldots, w_{0,N}$ with arbitrary expected
values, after a random permutation, the new value at index i, denoted $b_i$, will have the distribution
$$P(b_i < x) = \frac{1}{N} \sum_{j=1}^{N} P(w_j < x) \qquad (1)$$
since all variables can be shifted to any position with equal probability. That is, while obtaining an
equivalent probability model as mentioned above, the distributions of random variables
$b_0, \ldots, b_N$ are now identical. Note that the assumption of independence is technically
violated (variables $b_0, \ldots, b_N$ are not independent), but in the case of large networks, the
consequences will be insignificant.
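As a quick check of why identical distributions give identical expected values, one can take expectations under (1). The step below is a standard property of mixture distributions rather than something spelled out in the text, and it assumes that each $w_j$ has a finite expected value:

$$E(b_i) = \frac{1}{N} \sum_{j=1}^{N} E(w_j) \quad \text{for every index } i,$$

so every $b_i$ shares the same expected value, namely the average of the original expected values.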
When considering the network as a whole, one cycle of the averaging protocol can be seen
as a variance reduction algorithm (let us call it AVG) which takes a vector w of length N as a
parameter and produces a new vector $w' = \mathrm{AVG}(w)$ of the same length. In other words, AVG is
a single, central algorithm operating globally on the distributed state of the system, as opposed to
the distributed protocol of Figure 1. This centralized view of the protocol serves to simplify our
theoretical analysis of its behavior.
The consecutive cycles of the protocol result in a series of vectors $w_1, w_2, \ldots$, where
$w_{i+1} = \mathrm{AVG}(w_i)$. The elements of vector $w_i$ are denoted as
$w_i = (w_{i,1}, \ldots, w_{i,N})$. Algorithm AVG is illustrated in Figure 2 and takes w as a
parameter and modifies it in place producing a new vector. The behavior of our distributed
gossip-based protocol can be reproduced by an appropriate implementation of GETPAIR. In addition,
other implementations of GETPAIR are possible that do not necessarily map to any distributed
protocol but are of theoretical interest. We will discuss some important special cases as part of
our analysis.
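The skeleton of AVG can be turned into a short simulation. The sketch below is one possible reading under stated assumptions: `get_pair_random` is a hypothetical GETPAIR that draws two distinct indices uniformly at random (only one of many possible choices, and not claimed to match the distributed protocol exactly), and the network size, value range and number of cycles are arbitrary.

```python
import random
from statistics import mean, variance
from typing import Callable, List, Tuple

PairSelector = Callable[[int, random.Random], Tuple[int, int]]


def get_pair_random(n: int, rng: random.Random) -> Tuple[int, int]:
    """Assumed GETPAIR: two distinct indices drawn uniformly at random."""
    i, j = rng.sample(range(n), 2)
    return i, j


def avg(w: List[float], get_pair: PairSelector, rng: random.Random) -> List[float]:
    """One cycle of AVG: N elementary variance reduction steps, in place."""
    n = len(w)
    for _ in range(n):
        i, j = get_pair(n, rng)
        w[i] = w[j] = (w[i] + w[j]) / 2
    return w


rng = random.Random(0)                        # fixed seed for repeatability
w = [rng.uniform(0, 100) for _ in range(1000)]
target = mean(w)                              # the value all nodes should reach
history = [variance(w)]                       # variance before the first cycle
for cycle in range(10):
    avg(w, get_pair_random, rng)
    history.append(variance(w))
```

Inspecting `history` shows how quickly the spread of the estimates shrinks from cycle to cycle, while `mean(w)` stays at `target` up to floating-point rounding.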
We introduce the following empirical statistics for characterizing the state of the system in
cycle i:
$$\bar{w}_i = \frac{1}{N} \sum_{k=1}^{N} w_{i,k} \qquad (2)$$
$$\sigma_i^2 = \sigma_{w_i}^2 = \frac{1}{N-1} \sum_{k=1}^{N} (w_{i,k} - \bar{w}_i)^2 \qquad (3)$$
where $\bar{w}_i$ is the target value of the protocol and $\sigma_i^2$ is a variance-like measure of
homogeneity that characterizes the quality of local approximations. In other words, it expresses
the deviation