What are the contributions mentioned in the paper "Resilient asymptotic consensus in robust networks" ?

This paper addresses the problem of resilient in-network consensus in the presence of misbehaving nodes. Secure and fault-tolerant consensus algorithms typically assume knowledge of nonlocal information; however, this assumption is not suitable for large-scale dynamic networks. To remedy this, the authors focus on local strategies that provide resilience to faults and compromised nodes. The authors provide necessary and sufficient conditions for the normal nodes to reach asymptotic consensus despite the influence of the misbehaving nodes under different threat assumptions. The authors show that traditional metrics such as connectivity are not adequate to characterize the behavior of such algorithms, and develop a novel graph-theoretic property referred to as network robustness.

What is the key metric for studying robustness of distributed algorithms?

Network connectivity has been the key metric for studying robustness of distributed algorithms because it formalizes the notion of redundant information flow across the network through independent paths.

What is the main idea of graph connectivity?

The notion of graph connectivity has long been the backbone of investigations into fault tolerant and secure distributed algorithms.

What is the key metric in determining whether a fixed number of adversaries can be overcome?

Under the assumption of full knowledge of the network topology, connectivity is the key metric in determining whether a fixed number of malicious adversaries can be overcome.

What is the role of robustness in the investigation of purely local algorithms?

Just as connectivity has played a central role in the existing analysis of reliable distributed algorithms with global topological knowledge, the authors believe that robustness will play an important role in the investigation of purely local algorithms.

What is the definition of (r, s)-robustness?

The definition of (r, s)-robustness aims to capture the idea that “enough” nodes in every pair of nonempty, disjoint sets S1,S2 ⊂ V have at least r neighbors outside of their respective sets.

What is the definition of p-fraction reachable set?

Definition 9 (p-fraction reachable set): Given a digraph D and a nonempty subset S of nodes of D, the authors say S is a p-fraction reachable set if ∃i ∈ S such that |V_i| > 0 and |V_i \ S| ≥ p|V_i|, where 0 ≤ p ≤ 1.
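The definition above can be checked directly on a small digraph. The following is a minimal sketch, assuming the network is stored as a dictionary from each node to its set of in-neighbors; the function name `is_p_fraction_reachable` and the example graph are hypothetical and not taken from the paper.

```python
def is_p_fraction_reachable(in_neighbors, S, p):
    """Return True if some node i in S has at least one in-neighbor and
    at least a fraction p of its in-neighbors lie outside S."""
    S = set(S)
    for i in S:
        V_i = set(in_neighbors.get(i, ()))
        if V_i and len(V_i - S) >= p * len(V_i):
            return True
    return False


# Hypothetical 4-node digraph: node -> set of in-neighbors.
example = {1: {2, 3}, 2: {1}, 3: {1, 4}, 4: {3}}

half = is_p_fraction_reachable(example, {1, 2}, 0.5)  # node 1 has 1 of 2 in-neighbors outside S
full = is_p_fraction_reachable(example, {1, 2}, 1.0)  # no node in S has all in-neighbors outside S
```

With S = {1, 2}, node 1 has one of its two in-neighbors outside S, so the set qualifies for p = 0.5 but not for p = 1.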

What is the way to compare the robustness of different networks?

In general, the parameter r in (r, s)-robustness takes precedence in the partial order that determines relative robustness, and the maximal s is used for ordering the robustness of networks with the same value of r.

What is the condition for asymptotic consensus in discrete-time networks?

A more general condition referred to as the infinite flow property has been shown to be both necessary and sufficient for asymptotic consensus for a class of discrete-time stochastic models [42].

What is the definition of a (r, s)-reachable set?

Definition 12 ((r, s)-reachable set): Given a digraph D and a nonempty subset of nodes S, the authors say that S is an (r, s)-reachable set if there are at least s nodes in S, each of which has at least r neighbors outside of S, where r, s ∈ Z≥0; i.e., given X_S^r = {i ∈ S : |V_i \ S| ≥ r}, then |X_S^r| ≥ s. An illustration of an (r, s)-reachable set of nodes is shown in Fig.
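As a rough illustration of this definition, the sketch below counts the nodes of S that have at least r in-neighbors outside S. It assumes the same dictionary-of-in-neighbors representation; the name `is_rs_reachable` and the sample graph are hypothetical.

```python
def is_rs_reachable(in_neighbors, S, r, s):
    """Return True if at least s nodes in S each have at least r
    in-neighbors outside S."""
    S = set(S)
    # X collects the nodes of S with enough in-neighbors outside S.
    X = {i for i in S if len(set(in_neighbors.get(i, ())) - S) >= r}
    return len(X) >= s


# Hypothetical digraph: node -> set of in-neighbors.
graph = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2}, 4: {1, 2}}

two_two = is_rs_reachable(graph, {1, 2}, r=2, s=2)   # nodes 1 and 2 each see {3, 4} outside S
three_one = is_rs_reachable(graph, {1, 2}, r=3, s=1)  # nobody has 3 outside in-neighbors
```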

What is the reason for the failure of consensus?

Taking a closer look at the graph in Fig. 1, the authors see that the reason for the failure of consensus is that no node has enough neighbors in the opposite set; this causes every node to throw away all useful information from outside of its set, and prevents consensus.

(Open Access) Resilient Asymptotic Consensus in Robust Networks (2013) | Heath J. LeBlanc

Q: What is the condition for reaching asymptotic consensus in dynamic networks?

In this case, under the conditions stated above, a sufficient condition for reaching asymptotic consensus is that there exists a uniformly bounded sequence of contiguous time intervals such that the union of digraphs across each interval has a rooted out-branching [40].

Q: What is the condition for reaching asymptotic consensus in time-invariant networks?

Given these conditions, a necessary and sufficient condition for reaching asymptotic consensus in time-invariant networks is that the digraph has a rooted out-branching, also called a rooted directed spanning tree [38].

766 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 4, APRIL 2013

Resilient Asymptotic Consensus in Robust Networks

Heath J. LeBlanc, Member, IEEE, Haotian Zhang, Student Member, IEEE,

Xenofon Koutsoukos, Senior Member, IEEE, Shreyas Sundaram, Member, IEEE

Abstract—This paper addresses the problem of resilient in-network consensus in the presence of misbehaving nodes. Secure and fault-tolerant consensus algorithms typically assume knowledge of nonlocal information; however, this assumption is not suitable for large-scale dynamic networks. To remedy this, we focus on local strategies that provide resilience to faults and compromised nodes. We design a consensus protocol based on local information that is resilient to worst-case security breaches, assuming the compromised nodes have full knowledge of the network and the intentions of the other nodes. We provide necessary and sufficient conditions for the normal nodes to reach asymptotic consensus despite the influence of the misbehaving nodes under different threat assumptions. We show that traditional metrics such as connectivity are not adequate to characterize the behavior of such algorithms, and develop a novel graph-theoretic property referred to as network robustness. Network robustness formalizes the notion of redundancy of direct information exchange between subsets of nodes in the network, and is a fundamental property for analyzing the behavior of certain distributed algorithms that use only local information.

Index Terms—Consensus; In-Network Computation; Robust Networks; Resilience; Byzantine; Adversary; Distributed Algorithms.

I. INTRODUCTION

Engineered systems have undergone a paradigm shift

from centralized to distributed, propelled by advances

in networking and low-cost, high performance embedded

devices. These advances have enabled a transition from end-to-

end routing of information in large-scale networked systems to

in-network computation of aggregate quantities of interest [3].

In-network computing offers certain performance advantages,

including reduced latency, smaller communication overhead,

and greater robustness to node and link failures.

A fundamental challenge of in-network computation is that

the quantities of interest must be calculated using only local

information, i.e., information obtained by each node through

sensor measurements, calculations, or communication only

with neighbors in the network. Another important challenge is that large-scale distributed systems have many potential vulnerable points for failures or attacks. To obtain the desired computational results, it is important to design the in-network algorithms to be able to withstand the compromise of a subset of the nodes and still ensure some notion of correctness (possibly at a degraded level of performance). We refer to such a networked system as being resilient to adversaries. Given the growing threat of malicious attacks in large-scale cyber-physical systems, this is an important and challenging problem [4].

Manuscript received April 9, 2012; revised December 30, 2012. Some of the results in this paper were presented in preliminary form at the First Conference on High-Confidence Networked Systems (HiCoNS 2012) [1] and at the 2012 American Control Conference [2].

H. LeBlanc is with the Electrical & Computer Engineering and Computer Science Department, Ohio Northern University, Ada, OH, 45810 USA (e-mail: h-leblanc@onu.edu).

H. Zhang and S. Sundaram are with the Department of Electrical and Computer Engineering at the University of Waterloo, Waterloo, Ontario, Canada (e-mail: {h223zhan,ssundara}@uwaterloo.ca).

X. Koutsoukos is with the Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA (e-mail: xenofon.koutsoukos@vanderbilt.edu).

Digital Object Identifier 10.1109/JSAC.2013.130413.

One of the most important objectives in networked sys-

tems is to reach consensus on a quantity of interest [5]–

[8]. Consensus is fundamental to diverse applications such as

data aggregation [9], distributed estimation [10], distributed

optimization [11], distributed classiﬁcation [12], and ﬂock-

ing [13]. Reaching consensus (and more generally, trans-

mitting information) resiliently in the presence of faulty or

misbehaving nodes has been studied extensively in distributed

computing [5], [14]–[18], communication networks [19], [20],

and mobile robotics [21]–[23]. Among other things, it has

been shown that given F (worst-case) adversarial nodes,

there exists a strategy for these nodes to disrupt consensus

if the network connectivity is 2F or less. Conversely, if

the network connectivity is at least 2F + 1, then there exist

strategies for the normal nodes to use that ensure consensus

is reached (under the local broadcast model of communica-

tion) [5], [24], [25]. However, these consensus algorithms

either require that normal nodes have at least some nonlocal

information (e.g., knowledge of multiple independent paths in

the network between themselves and other nodes) or assume

that the network is complete, i.e., all-to-all communication or

sensing [14], [21]–[23], [26]. Moreover, these algorithms tend

to be computationally expensive. Therefore, there is a need for

resilient consensus algorithms that have low complexity and

operate using only local information (i.e., without knowledge

of the network topology and the identities of non-neighboring

nodes). A key challenge is to characterize fundamental topo-

logical properties that allow the normal nodes to compute an

appropriate consensus value, despite the inﬂuence of misbe-

having nodes.

The faulty or misbehaving nodes can be characterized by threat models and scope of threat assumptions. Examples of fault or threat models include non-colluding [25], malicious [24]–[26], Byzantine [14], [21], [27], [28], or crashed [21], [22] nodes. Typically, the scope of the faults or threats is assumed to be bounded by a constant, i.e., at most F out of n nodes fail or are compromised. We refer to this as the F-total model. Alternatively, the scope may be local; e.g., at most F neighbors of any normal node fail (F-local model), or at most a fraction f of neighbors are compromised (f-fraction local model).

Footnote (network connectivity): The network connectivity is defined as the smaller of the two following values: (i) the size of a minimal vertex cut and (ii) n − 1, where n is the number of nodes in the network.

0733-8716/13/$31.00 © 2013 IEEE

A. Previous Work on Consensus With Only Local Information

In [29], the authors introduced the Approximate Byzantine

Consensus problem, in which the normal nodes are required

to achieve approximate agreement

(i.e., they should converge

to a relatively small convex set contained in the convex hull of

their initial values) in the presence of F -total Byzantine faults

in ﬁnite time. They consider only complete networks (where

there is a direct connection between every pair of nodes), and

they propose the following algorithm: each node disregards

the largest and smallest F values received from its neighbors

and updates its state to be the average of a carefully chosen

subset of the remaining values. This algorithm was extended

to a family of algorithms, named the Mean-Subsequence-

Reduced (MSR) algorithms, in [30]. Although the research on

Approximate Byzantine Consensus for complete networks is

mature, there are few papers that have attempted to analyze

this algorithm in more general topologies [31], and even then,

only certain special networks have been investigated.
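The trimming idea described above can be sketched in a few lines. This is a simplified illustration, not the algorithm of [29] or [30]: it assumes the node averages all values that survive the trimming, whereas the text says only a carefully chosen subset is averaged. The function name `trimmed_mean` is hypothetical.

```python
def trimmed_mean(values, F):
    """Discard the F largest and F smallest values, then average the rest.

    Simplifying assumption: every surviving value is averaged.
    """
    if len(values) <= 2 * F:
        raise ValueError("need more than 2F values to trim F from each end")
    kept = sorted(values)[F:len(values) - F]
    return sum(kept) / len(kept)


# Two extreme (possibly faulty) reports are discarded when F = 1.
result = trimmed_mean([1.0, 2.0, 3.0, 1000.0, -1000.0], F=1)
```

In this example the values 1000 and -1000 are removed, and the update is the mean of the remaining three values.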

Recently, we have studied resilient algorithms in the pres-

ence of misbehaving nodes [32], [33]. In [26], we proposed

a continuous-time variation of the MSR algorithms, named

the Adversarial Robust Consensus Protocol (ARC-P), to solve

asymptotic consensus under the F -total malicious model. The

results of [26] were extended to both malicious and Byzantine

threat models in networks with constrained information ﬂow

and dynamic network topology in [27]. The sufﬁcient condi-

tions studied in [27] are stated in terms of in-degrees and out-

degrees of nodes in the network and are shown to be sharp, i.e.,

if the conditions are relaxed, even minimally, then there are

examples in which the relaxed conditions are not sufﬁcient. In

[2], we generalized the MSR algorithm to the Weighted-Mean-

Subsequence-Reduced (W-MSR) algorithm and studied general

distributed algorithms with F -local malicious adversaries.

In a recent paper, developed independently of our work,

Vaidya et al. have characterized tight conditions for resilient

consensus using the MSR algorithm when the threat model

is Byzantine and the scope is F -total [28]. The network

constructions used in [28] are very similar to the robust

digraphs presented here. In particular, the networks in [28] also

require redundancy of direct information exchange between

subsets of nodes in the network.

In contrast to the deterministic approach taken here, gossip algorithms have been studied for in-network computation of aggregate functions such as sums, averages, and quantiles [9]. In such algorithms, each node chooses at random a single neighbor to communicate with in each round. This scheme limits the required computational, communication, and energy resources, and provides some robustness against time-varying topologies and random node and link failures [34]. However, we are not aware of any work that studies the resilience of gossip-based algorithms to malicious attacks.

Footnote (approximate agreement): If the network is synchronous, and if one allows t → ∞, then approximate agreement is equivalent to asymptotic consensus.
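For readers unfamiliar with gossip schemes, the following sketch shows one common variant, pairwise averaging, in which each node averages its value with one randomly chosen neighbor per round. The averaging rule, the function name `gossip_round`, the ring topology, and the fixed seed are all assumptions made for illustration; the text only states that each node picks a single random neighbor.

```python
import random


def gossip_round(x, neighbors, rng):
    """Each node, in turn, picks one random neighbor and both adopt
    the average of their two values (pairwise-averaging gossip)."""
    x = dict(x)
    for i in sorted(x):
        j = rng.choice(sorted(neighbors[i]))
        avg = (x[i] + x[j]) / 2.0
        x[i] = avg
        x[j] = avg
    return x


# Hypothetical undirected ring of four nodes.
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
values = {0: 0.0, 1: 4.0, 2: 8.0, 3: 12.0}
rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(200):
    values = gossip_round(values, ring, rng)
```

Pairwise averaging leaves the sum of the values unchanged, so in a fault-free run the nodes drift toward the average of the initial values. Nothing in this sketch protects against a node that reports false values.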

B. Contributions

In this paper, we show that traditional graph theoretic prop-

erties such as connectivity and minimum degree, which have

played a vital role in characterizing the resilience of distributed

algorithms (see [5], [24]), are not adequate when the nodes

make purely local decisions (i.e., without knowing nonlocal

aspects of the network topology). Instead, we introduce a novel

topological property, referred to as network robustness, and

show that this concept is the key property for reasoning about

the ability of purely local algorithms to succeed. In particular,

we provide a comprehensive characterization of the network

topologies where algorithms such as W-MSR (which uses only

local information and operates in synchronous networks) can

succeed despite the presence of a broad class of adversaries.

We establish results for both malicious and Byzantine threats,

where the scope is F -total, F -local, and f -fraction local, and

the network is time-invariant or time-varying. For the case of

time-invariant networks, we provide, for the ﬁrst time, a tight

condition for the W-MSR algorithm to succeed under the F -

total malicious model. Furthermore, we give tight conditions

for F -local and F -total Byzantine threats (the proof for the F -

total Byzantine model is different than the proof given in [28],

and is stated for the more general W-MSR algorithm and in

terms of network robustness). We prove separate necessary

and sufﬁcient conditions for the W-MSR algorithm under

the F -local malicious, f -fraction local malicious, and f -

fraction local Byzantine threat models. For all threat models,

we provide sufﬁcient conditions for the case of time-varying

networks.

In addition to the results on resilient asymptotic consensus,

we also examine properties of robust digraphs. We demon-

strate the connectivity and degree properties of robust di-

graphs, explore the robustness maintained after edge removal,

and describe how to compare the relative robustness of dif-

ferent digraphs. Finally, we provide a method that enables the

construction of robust networks and show that the preferential

attachment mechanism for generating complex networks is a

special case of this method (and therefore produces robust

networks).

The rest of the paper is organized as follows. Section II

introduces the problem of resilient consensus. Section III

presents the W-MSR algorithm. Section IV demonstrates

the inadequacy of connectivity as a metric to analyze the

behavior of the W-MSR algorithm, and formally introduces

the notion of network robustness. The main results are given

in Section V. A simulation example is presented in Section VI.

In Section VII, we discuss properties of network robustness

and provide a construction for robust networks. Finally, some

conclusions are given in Section VIII.

C. Notation and Graph Terminology

Throughout this paper, we denote the set of integers by $\mathbb{Z}$ and the set of real numbers by $\mathbb{R}$. The set of integers greater than or equal to some integer $q \in \mathbb{Z}$ is denoted $\mathbb{Z}_{\ge q}$. The cardinality of a set $\mathcal{S}$ is denoted by $|\mathcal{S}|$. Given sets $\mathcal{S}_1$, $\mathcal{S}_2$, the reduction of $\mathcal{S}_1$ by $\mathcal{S}_2$ is denoted $\mathcal{S}_1 \setminus \mathcal{S}_2 = \{x \in \mathcal{S}_1 : x \notin \mathcal{S}_2\}$.

A finite simple directed graph, or just digraph, is denoted $\mathcal{D} = (\mathcal{V}, \mathcal{E})$, in which $\mathcal{V}$ is the node set and $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$ is the directed edge set. With a slight abuse of terminology, we often refer to the network and the digraph that models the topology of the network synonymously. The underlying graph $G(\mathcal{D})$ is defined by replacing directed edges of $\mathcal{D}$ by undirected ones, resulting in the corresponding undirected edge set. A digraph $\mathcal{D}' = (\mathcal{V}', \mathcal{E}')$ is a subdigraph of $\mathcal{D}$, written $\mathcal{D}' \subseteq \mathcal{D}$, if $\mathcal{V}' \subseteq \mathcal{V}$ and $\mathcal{E}' \subseteq \mathcal{E}$. A path is a sequence of distinct vertices $i_0, \ldots, i_k$ such that $(i_j, i_{j+1}) \in \mathcal{E}$, $j = 0, 1, \ldots, k-1$. We say that $\mathcal{D}$ is strongly connected if for every $i, j \in \mathcal{V}$, there exists a path starting at $i$ and ending at $j$. If the underlying graph is connected, then $\mathcal{D}$ is weakly connected. Alternatively, if the underlying graph is disconnected, then $\mathcal{D}$ is disconnected. A digraph has a rooted out-branching if there exists a node $r$, the root, such that for each $i \in \mathcal{V}$, there exists a path from $r$ to $i$.
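The graph properties just defined are easy to test on small examples with a breadth-first search. The sketch below is one possible check, assuming edges are given as pairs (i, j) meaning a directed edge from i to j; the helper names are hypothetical.

```python
from collections import deque


def reachable_from(out_neighbors, root):
    """Set of nodes reachable from root by following directed edges."""
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in out_neighbors.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def _out_neighbors(nodes, edges):
    out = {v: set() for v in nodes}
    for i, j in edges:  # edge directed from i to j
        out[i].add(j)
    return out


def has_rooted_out_branching(nodes, edges):
    """True if some node r reaches every node of the digraph."""
    out = _out_neighbors(nodes, edges)
    return any(reachable_from(out, r) == set(nodes) for r in nodes)


def is_strongly_connected(nodes, edges):
    """True if every node reaches every other node."""
    out = _out_neighbors(nodes, edges)
    return all(reachable_from(out, r) == set(nodes) for r in nodes)


chain = [(1, 2), (2, 3)]   # 1 -> 2 -> 3: rooted at node 1, not strongly connected
sinks = [(1, 2), (3, 2)]   # 1 -> 2 <- 3: no single root reaches all nodes
```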

II. PROBLEM FORMULATION

Consider a time-varying network modeled by the digraph $\mathcal{D}[t] = (\mathcal{V}, \mathcal{E}[t])$, where $\mathcal{V} = \{1, \ldots, n\}$ is the node set and $\mathcal{E}[t] \subset \mathcal{V} \times \mathcal{V}$ is the directed edge set at time-step $t \in \mathbb{Z}_{\ge 0}$. The node set is partitioned into a set of normal nodes $\mathcal{N}$ and a set of adversary nodes $\mathcal{A}$ which is unknown a priori to the normal nodes. Each directed edge $(j, i) \in \mathcal{E}[t]$ models information flow and indicates that node $i$ can be influenced by (or receive information from) node $j$ at time-step $t$. The set of in-neighbors, or just neighbors, of node $i$ at time-step $t$ is defined as $\mathcal{V}_i[t] = \{j \in \mathcal{V} : (j, i) \in \mathcal{E}[t]\}$ and the (in-)degree of $i$ is denoted $d_i[t] = |\mathcal{V}_i[t]|$. Likewise, the set of out-neighbors of node $i$ at time-step $t$ is defined as $\mathcal{V}_i^{\mathrm{out}}[t] = \{j \in \mathcal{V} : (i, j) \in \mathcal{E}[t]\}$. Because each node has access to its own state at time-step $t$, we also consider the inclusive neighbors of node $i$, denoted $\mathcal{J}_i[t] = \mathcal{V}_i[t] \cup \{i\}$. Time-invariant networks are represented by dropping the dependence on $t$.
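To make the neighbor notation concrete, here is a small sketch that builds the in-neighbor, out-neighbor, and inclusive-neighbor sets and the in-degrees from an edge list at a single time-step. The function name `neighbor_sets` and the sample edges are hypothetical.

```python
def neighbor_sets(nodes, edges):
    """Build in-neighbors, out-neighbors, inclusive neighbors, and
    in-degrees from a list of directed edges (j, i), where information
    flows from j to i."""
    in_nbrs = {i: set() for i in nodes}
    out_nbrs = {i: set() for i in nodes}
    for j, i in edges:
        in_nbrs[i].add(j)
        out_nbrs[j].add(i)
    inclusive = {i: in_nbrs[i] | {i} for i in nodes}
    degree = {i: len(in_nbrs[i]) for i in nodes}
    return in_nbrs, out_nbrs, inclusive, degree


# Node 2 hears from nodes 1 and 3; node 1 hears from node 2.
in_nbrs, out_nbrs, inclusive, degree = neighbor_sets(
    [1, 2, 3], [(1, 2), (3, 2), (2, 1)]
)
```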

A. Update Model

Suppose that each node $i \in \mathcal{N}$ begins with some private value $x_i[0] \in \mathbb{R}$ (which could represent a measurement, optimization variable, vote, etc.). The nodes interact synchronously by conveying their value to (out-)neighbors in the network. Each normal node updates its own value over time according to a prescribed rule, which is modeled as

$$x_i[t+1] = f_i(\{x_j^i[t]\}), \quad j \in \mathcal{J}_i[t],\ i \in \mathcal{N},\ t \in \mathbb{Z}_{\ge 0},$$

where $x_j^i[t]$ is the value sent from node $j$ to node $i$ at time-step $t$, and $x_i^i[t] = x_i[t]$. The update rule $f_i(\cdot)$ can be an arbitrary function, and may be different for each node, depending on its role in the network. These functions are designed a priori so that the normal nodes compute some desired function. However, some of the nodes may not follow the prescribed strategy if they are compromised by an adversary. Such misbehaving nodes threaten the group objective, and it is important to design the $f_i(\cdot)$'s in such a way that the influence of such nodes can be eliminated or reduced without prior knowledge about their identities.

B. Threat Model

Definition 1: A node $i \in \mathcal{A}$ is said to be Byzantine if it does not send the same value to all of its neighbors at some time-step, or if it applies some other function $f_i'(\cdot)$ at some time-step.

Definition 2: A node $i \in \mathcal{A}$ is said to be malicious if it sends $x_i[t]$ to all of its neighbors at each time-step, but applies some other function $f_i'(\cdot)$ at some time-step.

Note that both malicious and Byzantine nodes are allowed

to update their states arbitrarily (perhaps colluding with other

malicious or Byzantine nodes to do so). The only difference

is in their capacity for duplicity. If the network is realized

through sensing or broadcast communication, it is natural to

assume that the out-neighbors receive the same information,

thus motivating the deﬁnition of a malicious node. If the

network is point-to-point, however, Byzantine behavior is

possible. Note that all malicious nodes are Byzantine, but not

vice versa. When we do not need to explicitly distinguish

between Byzantine and malicious threats, we simply say those

nodes are misbehaving.

C. Scope of Threats

Having deﬁned the types of misbehavior in the system,

it is necessary to deﬁne the number of misbehaving nodes.

While there are various stochastic models that could be used to

formalize the scope of threats, we use a deterministic approach

and consider upper bounds on the number of compromised

nodes either in the network (F -total) or in each node’s

neighborhood (F -local). To account for varying degrees of

different nodes, we also introduce a fault model that considers

an upper bound on the fraction of compromised nodes in any

node’s neighborhood.

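Definition 3 (F-total set): A set $\mathcal{S} \subset \mathcal{V}$ is F-total if it contains at most $F$ nodes in the network, i.e., $|\mathcal{S}| \le F$, $F \in \mathbb{Z}_{\ge 0}$.

Definition 4 (F-local set): A set $\mathcal{S} \subset \mathcal{V}$ is F-local if it contains at most $F$ nodes in the neighborhood of the other nodes for all $t$, i.e., $|\mathcal{V}_i[t] \cap \mathcal{S}| \le F$, $\forall i \in \mathcal{V} \setminus \mathcal{S}$, $\forall t \in \mathbb{Z}_{\ge 0}$, $F \in \mathbb{Z}_{\ge 0}$.

Definition 5 (f-fraction local set): A set $\mathcal{S} \subset \mathcal{V}$ is f-fraction local if it contains at most a fraction $f$ of nodes in the neighborhood of the other nodes for all $t$, i.e., $|\mathcal{V}_i[t] \cap \mathcal{S}| \le f|\mathcal{V}_i[t]|$, $\forall i \in \mathcal{V} \setminus \mathcal{S}$, $\forall t \in \mathbb{Z}_{\ge 0}$, $0 \le f \le 1$.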
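The three scope definitions can be compared on a single example. The sketch below checks them for a time-invariant network (a single time-step), assuming a dictionary from each node to its in-neighbor set; the function names and the example network are hypothetical.

```python
def is_F_total(S, F):
    """S contains at most F nodes overall."""
    return len(set(S)) <= F


def is_F_local(in_neighbors, S, F):
    """Every node outside S has at most F in-neighbors in S."""
    S = set(S)
    return all(len(nbrs & S) <= F
               for i, nbrs in in_neighbors.items() if i not in S)


def is_f_fraction_local(in_neighbors, S, f):
    """Every node outside S has at most a fraction f of its in-neighbors in S."""
    S = set(S)
    return all(len(nbrs & S) <= f * len(nbrs)
               for i, nbrs in in_neighbors.items() if i not in S)


# Hypothetical 5-node network: node -> set of in-neighbors.
net = {1: {2, 3, 4}, 2: {1, 5}, 3: {1, 4}, 4: {1, 5}, 5: {2, 4}}
S = {4, 5}  # candidate adversary set
```

Here S has two members, so it is 2-total but not 1-total. Each node outside S sees exactly one member of S, so S is 1-local even though it is not 1-total, which illustrates how the local scope can admit more adversaries network-wide than the total scope with the same bound.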

It should be noted that in time-varying network topologies,

the local properties deﬁning an F -local set (or an f -fraction

local set) must hold at all time instances. These deﬁnitions

facilitate the following scope of threat models.

Definition 6: A set of adversary nodes is F-totally bounded, F-locally bounded, or f-fraction locally bounded if it is an F-total set, F-local set, or f-fraction local set, respectively. We refer to these threat scopes as the F-total, F-local, and f-fraction local models, respectively.

F -totally bounded faults have been studied in distributed

computing [5], [14], [28] and mobile robotics [21]–[23] for

both stopping (or crash) failures and Byzantine failures. The

F -locally bounded fault model has been studied in the context

of fault-tolerant broadcasting [35], [36]. However, to the best

of our knowledge, there are no prior works discussing the f -

fraction local model; our investigation of this model is inspired


by ideas pertaining to contagion in social and economic

networks [37], where a node will accept some new information

(behavior or technology) if more than a certain fraction of its

neighbors has adopted it. However, these previous works do

not consider faulty or malicious behavior, and our deﬁnition

is a natural extension to the existing interpretations.

D. Resilient Asymptotic Consensus

Given the threat model and scope of threats, we formally

deﬁne resilient asymptotic consensus. Let M [t] and m[t] be

the maximum and minimum values of the normal nodes at

time-step t, respectively.

Definition 7 (Resilient Asymptotic Consensus): The normal nodes are said to achieve resilient asymptotic consensus in the presence of (a) F-totally bounded, (b) F-locally bounded, or (c) f-fraction locally bounded misbehaving (Byzantine or malicious) nodes if

• $\exists L \in \mathbb{R}$ such that $\lim_{t \to \infty} x_i[t] = L$ for all $i \in \mathcal{N}$, and

• $[m[0], M[0]]$ is an invariant set (i.e., the normal values remain in the interval for all $t$),

for any choice of initial values. Whenever the scope of threat is understood, we simply say that the normal nodes reach asymptotic consensus.
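The two conditions in this definition can be monitored on a simulated run. The sketch below is only a finite-horizon proxy: it assumes a recorded trajectory of normal-node values and treats "the final spread is below a tolerance" as a stand-in for the limit, which a finite run cannot actually prove. The function name and tolerance are hypothetical.

```python
def check_safety_and_agreement(trajectory, tol=1e-6):
    """trajectory: list of dicts {normal node: value}, one per time-step.

    Returns (safe, agreed):
      safe   - all normal values stayed inside [m[0], M[0]];
      agreed - the normal values at the last step differ by at most tol.
    """
    m0 = min(trajectory[0].values())
    M0 = max(trajectory[0].values())
    safe = all(m0 <= v <= M0 for step in trajectory for v in step.values())
    last = list(trajectory[-1].values())
    agreed = max(last) - min(last) <= tol
    return safe, agreed


good = [{1: 0.0, 2: 1.0}, {1: 0.5, 2: 0.5}]
hijacked = [{1: 0.0, 2: 1.0}, {1: 2.0, 2: 2.0}]  # agreement, but outside [0, 1]
```

The second trajectory shows why both conditions matter: the normal nodes agree, yet on a value outside the initial interval.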

The resilient asymptotic consensus problem has two impor-

tant conditions. First, the normal nodes must reach asymptotic

consensus in the presence of misbehaving nodes given a

particular threat model (e.g., malicious) and scope of threat

(e.g., F -total). This is a condition on agreement. Additionally,

it is required that the interval containing the initial values of

the normal nodes is an invariant set for the normal nodes;

this is a safety condition. This condition is important in safety

critical processes where the interval [m[0],M[0]] is known to

be safe. The agreement and safety conditions, when combined,

imply a third condition on validity: the consensus quantity that

the values of the normal nodes converge to must lie within the

range of initial values of the normal nodes.

The validity condition is reasonable in applications where

any value in the range of initial values of normal nodes is

acceptable to select as the consensus value. For instance,

consider a large sensor network where every sensor takes a

measurement of its environment, captured as a real number.

Suppose that at the time of measurement, all values taken

by correct sensors fall within a range [a, b], and that all

sensors are required to come to an agreement on a common

measurement value. If the range of measurements taken by

the normal sensors is relatively small, it will likely be the

case that reaching agreement on a value within that range will

form a reasonable estimate of the measurements taken by all

sensors. However, if a set of malicious nodes is capable of

biasing the consensus value outside of this range, the error in

the measurements could be arbitrarily large.

More generally, suppose the nodes are trying to distributively minimize $\sum_i h_i(\theta)$, where each of the $h_i$'s is a local convex function and $\theta$ is the optimization variable. If the initial value of each node $i$ represents the value of $\theta$ that minimizes $h_i$, a convex combination of these initial values will represent an estimate of the optimal $\theta$, within some bounded error.

On the other hand, if an adversary is capable of biasing the

consensus value arbitrarily, the resulting value of the objective

function will also be arbitrarily far away from its minimum

value. One can formulate similar motivating examples for the

validity condition in other applications as well; for instance, a

swarm of robots that are trying to ﬂock should not be pulled

in arbitrary directions by a malicious agent in the network.

III. CONSENSUS ALGORITHM

While there are various approaches to facilitate consensus, a class of linear algorithms have attracted significant interest in recent years [6], [38], due to their applicability in a variety of contexts. In such strategies, at time $t$, each node senses or receives information from its neighbors, and changes its value according to

$$x_i[t+1] = \sum_{j \in \mathcal{J}_i[t]} w_{ij}[t]\, x_j^i[t], \qquad (1)$$

where $w_{ij}[t]$ is the weight assigned to node $j$'s value by node $i$ at time-step $t$. The above strategy is the so-called Linear Consensus Protocol (LCP).

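Different conditions have been reported in the literature to ensure asymptotic consensus is reached [7], [13], [39]–[41]. In discrete time, it is common to assume that there exists a constant $\alpha \in \mathbb{R}$, $0 < \alpha < 1$, such that all of the following conditions hold:

• $w_{ij}[t] = 0$ whenever $j \notin \mathcal{J}_i[t]$, $i \in \mathcal{N}$, $t \in \mathbb{Z}_{\ge 0}$;

• $w_{ij}[t] \ge \alpha$, $\forall j \in \mathcal{J}_i[t]$, $i \in \mathcal{N}$, $t \in \mathbb{Z}_{\ge 0}$;

• $\sum_{j=1}^{n} w_{ij}[t] = 1$, $\forall i \in \mathcal{N}$, $t \in \mathbb{Z}_{\ge 0}$.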

Given these conditions, a necessary and sufﬁcient condition

for reaching asymptotic consensus in time-invariant networks

is that the digraph has a rooted out-branching, also called

a rooted directed spanning tree [38]. The case of dynamic

networks is not quite as straightforward. In this case, under

the conditions stated above, a sufﬁcient condition for reaching

asymptotic consensus is that there exists a uniformly bounded

sequence of contiguous time intervals such that the union of

digraphs across each interval has a rooted out-branching [40].

Recently, a more general condition referred to as the inﬁnite

ﬂow property has been shown to be both necessary and

sufﬁcient for asymptotic consensus for a class of discrete-time

stochastic models [42]. Finally, the lower bound on the weights

is needed because there are examples of asymptotically van-

ishing weights in which consensus is not reached [43].

Given a fixed, bidirectional network topology, the selection of the optimal weights in (1) with respect to the speed of the consensus process can be done by solving a semidefinite program (SDP) [39]. However, this SDP is solved at design time with global knowledge of the network topology. A simple suboptimal choice of weights that requires only local information is to let $w_{ij}[t] = 1/(1 + d_i[t])$ for $j \in \mathcal{J}_i[t]$.
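The linear update (1) with the local weight choice 1/(1 + d_i) amounts to each node averaging its own value with those of its in-neighbors. A minimal sketch on a fixed network follows, assuming no self-loops so that the inclusive neighborhood has 1 + d_i members; the function name `lcp_step` and the path graph are hypothetical.

```python
def lcp_step(x, in_neighbors):
    """One synchronous step of the linear update with equal weights
    1 / (1 + d_i) over each node's inclusive neighborhood."""
    new_x = {}
    for i in x:
        inclusive = set(in_neighbors[i]) | {i}
        weight = 1.0 / len(inclusive)  # equals 1 / (1 + d_i) without self-loops
        new_x[i] = sum(weight * x[j] for j in inclusive)
    return new_x


# Hypothetical bidirectional path 1 - 2 - 3.
path = {1: {2}, 2: {1, 3}, 3: {2}}
x0 = {1: 0.0, 2: 3.0, 3: 6.0}

x1 = lcp_step(x0, path)
x = dict(x0)
for _ in range(200):
    x = lcp_step(x, path)
```

After one step node 1 moves to 1.5 and node 3 to 4.5; repeated steps shrink the spread toward zero while every value stays inside the initial range [0, 6].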

One problem with the linear update given in (1) is that

it is not resilient to misbehaving nodes. In fact, it was

shown in [13], [44] that a single ‘leader’ node can cause

all agents to reach consensus on an arbitrary value of its

choosing (potentially resulting in a dangerous situation in

physical systems) simply by holding its value constant. Thus,

by themselves, the dynamics given by (1) do not facilitate

resilient asymptotic consensus for any of the fault models.
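The vulnerability described above is easy to reproduce in a small simulation. The sketch below assumes equal weights over the inclusive neighborhood and a single node that simply holds its value; the function name, the topology, and the value 100 are hypothetical choices for illustration.

```python
def lcp_step_with_leader(x, in_neighbors, leader):
    """One linear-consensus step in which `leader` ignores the update
    rule and keeps its value constant."""
    new_x = {}
    for i in x:
        if i == leader:
            new_x[i] = x[i]  # misbehaving node holds its value
            continue
        inclusive = set(in_neighbors[i]) | {i}
        new_x[i] = sum(x[j] for j in inclusive) / len(inclusive)
    return new_x


# Hypothetical path 1 - 2 - 3 where node 3 misbehaves.
path = {1: {2}, 2: {1, 3}, 3: {2}}
x = {1: 0.0, 2: 1.0, 3: 100.0}
for _ in range(500):
    x = lcp_step_with_leader(x, path, leader=3)
```

The normal nodes start in [0, 1] but are dragged to the leader's value of 100, so both the safety and validity conditions fail.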


We now describe a simple modiﬁcation to the update rule,

and then provide a comprehensive characterization of network

topologies in which resilient asymptotic consensus is reached

under such dynamics.

A. Description of W-MSR

At every time-step t, each normal node i obtains the values

of other nodes in its neighborhood. At most F of node i’s

neighbors may be misbehaving; however, node i is unsure of

which neighbors may be compromised. To ensure that node

i updates its value in a safe manner, we consider a protocol

where each node removes the extreme values with respect to

its own value. More speciﬁcally:

1) At each time-step $t$, each normal node $i$ obtains the values of its neighbors, and forms a sorted list.

2) If there are fewer than $F$ values strictly larger than its own value, $x_i[t]$, then normal node $i$ removes all values that are strictly larger than its own. Otherwise, it removes precisely the largest $F$ values in the sorted list (breaking ties arbitrarily). Likewise, if there are fewer than $F$ values strictly smaller than its own value, then node $i$ removes all values that are strictly smaller than its own. Otherwise, it removes precisely the smallest $F$ values.

3) Let $\mathcal{R}_i[t]$ denote the set of nodes whose values were removed by normal node $i$ in step 2 at time-step $t$. Each normal node $i$ applies the update

$$x_i[t+1] = \sum_{j \in \mathcal{J}_i[t] \setminus \mathcal{R}_i[t]} w_{ij}[t]\, x_j^i[t], \qquad (2)$$

where the weights $w_{ij}[t]$ satisfy the conditions stated above, but with $\mathcal{J}_i[t]$ replaced by $\mathcal{J}_i[t] \setminus \mathcal{R}_i[t]$.
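The three steps above can be sketched for a single normal node as follows. This is an illustrative sketch rather than the authors' implementation: it assumes equal weights over the values that remain (one admissible choice, since all such weights are positive and sum to one), and the function name `wmsr_step` is hypothetical.

```python
def wmsr_step(x_i, received, F):
    """One W-MSR update for a normal node.

    x_i      - the node's own current value
    received - list of values received from its in-neighbors
    F        - parameter of the algorithm (non-negative integer)
    """
    larger = sorted(v for v in received if v > x_i)
    smaller = sorted(v for v in received if v < x_i)
    equal = [v for v in received if v == x_i]

    # Step 2: with fewer than F strictly larger values, drop them all;
    # otherwise drop exactly the F largest. Same rule on the smaller side.
    kept_larger = larger[:max(0, len(larger) - F)]
    kept_smaller = smaller[min(F, len(smaller)):]

    # Step 3: average the surviving values together with the node's own value.
    kept = kept_smaller + equal + kept_larger + [x_i]
    return sum(kept) / len(kept)


# The outlier 100 and the smallest value 1 are discarded when F = 1.
updated = wmsr_step(5.0, [1.0, 4.0, 6.0, 100.0], F=1)
```

Because values are removed relative to the node's own value, the node never discards its own state, and an adversarial value can only survive if it lies between values reported by other neighbors or the node itself.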

To accommodate the f-fraction local model, the parameter F in step 2 above is replaced by $F_i[t] = \lfloor f d_i[t] \rfloor$. As a matter of terminology, we refer to the bound on the number (or fraction) of larger or smaller values that could be thrown away as the parameter of the algorithm. Above, the parameter of W-MSR with the F-local and F-total models is F, whereas the parameter with the f-fraction local model is f, and the meaning of the parameter will be clear from the context.
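As a small worked example of the f-fraction case, the per-node removal count can be computed from the node's current degree. The sketch below assumes that the product of f and the degree is rounded down to a whole number of values (a count of removed values must be an integer); the helper name `fraction_parameter` is hypothetical, and exact fractions are used to avoid floating-point rounding surprises.

```python
from fractions import Fraction
import math


def fraction_parameter(f, degree):
    """Number of values removed on each side for fraction f and a given degree.

    Assumption: the product f * degree is rounded down to an integer.
    """
    return math.floor(f * degree)


# Hypothetical node with f = 1/4 whose degree changes over time.
for degree in (3, 4, 10):
    print(degree, fraction_parameter(Fraction(1, 4), degree))
# Prints: 3 0, then 4 1, then 10 2
```

With a time-varying neighborhood the count changes with the degree, so the same node may discard a different number of values at different time-steps.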

Observe that the set of nodes removed by normal node i, $R_i[t]$, is possibly time-varying. Hence, even if the underlying network topology is fixed, the W-MSR algorithm effectively induces switching behavior, and can be viewed as the linear update of (1) with a specific rule for state-dependent switching (the rule given in step 2).

The above algorithm is extremely lightweight, and does not

require any normal node to have any knowledge of the network

topology or of the identities of non-neighbor nodes. Given

these highly desirable properties, the question that we answer

in this paper is: in what networks will the above algorithm

facilitate resilient asymptotic consensus?

Note on the weights in (2): a simple choice is to let $w_{ij}[t] = 1/(1 + d_i[t] - |R_i[t]|)$ for $j \in J_i[t] \setminus R_i[t]$.

B. Use of Related Algorithms in Previous Work

As mentioned in the introduction, similar algorithms that remove extreme values and then form an average from a subset of the remaining values have been studied for decades. In [29], functions that perform this type of operation are referred to as approximation functions, and

both synchronous and asynchronous algorithms are studied

that use these approximation functions in complete networks

for resilience to F-total Byzantine faults. These approximation functions are used in the family of Mean-Subsequence-Reduced (MSR) algorithms [30]. There are a few key differences between the operations used in the W-MSR algorithm

and the MSR algorithm of [30]. First, W-MSR does not

always remove the largest and smallest F values as in the

MSR algorithm [30]. Instead, only the extreme values that are

strictly larger or strictly smaller than the given node’s value

are removed. Since the node’s own value may be one of the

F extreme values, the MSR algorithm may throw away this

useful (correct) information. Second, W-MSR uses all values

retained after removing the extreme values. MSR, on the other

hand, may select only a subsequence of the remaining values

to use in the update. Finally, MSR averages the remaining values instead of allowing for weighted averages as in W-MSR.
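The first difference can be made concrete with a short comparison. This is a simplified sketch: `msr_style_update` is a hypothetical stand-in that always trims the F largest and F smallest entries of the full list (own value included) and takes a plain mean of everything left, without the subsequence selection that MSR algorithms may apply; `wmsr_style_update` uses equal weights as an assumed choice.

```python
def msr_style_update(own_value, neighbor_values, F):
    """Simplified MSR-style step: always trim F values from each end."""
    ordered = sorted([own_value] + list(neighbor_values))
    kept = ordered[F:len(ordered) - F]  # the node's own value may be trimmed
    return sum(kept) / len(kept)


def wmsr_style_update(own_value, neighbor_values, F):
    """W-MSR-style step: trim only values strictly beyond the node's own value."""
    larger = sorted(v for v in neighbor_values if v > own_value)
    smaller = sorted(v for v in neighbor_values if v < own_value)
    equal = [v for v in neighbor_values if v == own_value]
    kept_larger = larger[:len(larger) - F] if len(larger) >= F else []
    kept_smaller = smaller[F:] if len(smaller) >= F else []
    kept = kept_smaller + equal + [own_value] + kept_larger
    return sum(kept) / len(kept)


# The node holds the smallest value in its neighborhood (hypothetical data).
own, received, F = 0.0, [3.0, 4.0, 5.0, 6.0], 1

print(msr_style_update(own, received, F))   # 4.0: own value 0 was discarded
print(wmsr_style_update(own, received, F))  # 3.0: own value 0 was retained
```

In this example the MSR-style rule discards the node's own (correct) value because it is the smallest in the list, while the W-MSR-style rule keeps it and removes only the largest received value.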

MSR algorithms have also been used for Byzantine point

convergence of mobile robots in complete networks [23].

Besides Byzantine faults, some works also consider other

threat models [30]. However, few papers have addressed

the convergence of MSR algorithms in less restrictive (non-complete) networks. Some exceptions include [31], [45], [46]. In [31], the authors studied local convergence (convergence of a subset of nodes) in undirected regular graphs; the results are extended to asynchronous networks in [46] and to global convergence of a class of undirected regular graphs, named Partially Fully Connected Networks (PFCN), in [45]. More

recently, [28] provides necessary and sufficient conditions on

the network topology required for a special case of the MSR

algorithm (which retains all of the values after removing

the extreme ones) to achieve consensus in the presence of

F-total Byzantine faults. In the following sections, we will

develop a novel topological property and show that this

property is essential for studying MSR (and more generally,

W-MSR) algorithms in arbitrary networks for the broad class

of adversarial models described in Section II.

Finally, it is worth noting the relationship between the W-MSR algorithm and robust consensus algorithms designed to

withstand outliers [47], [48]. The problem of robust consensus

to outliers does not assume a threat model, such as malicious

or Byzantine nodes. Instead, some measurements may be

statistical outliers (caused by noise) and the goal is to reach

consensus on the measurements in a manner that reduces the

error introduced by the outliers. In these works the nodes with

outlier measurements are cooperative in the consensus process.

Therefore, such techniques are not designed to work in the

presence of misbehaving nodes. Furthermore, the W-MSR algorithm will also handle the case where the misbehaving nodes

change their initial values, but behave normally otherwise.

A regular graph is a graph where each vertex has the same number of

neighbors.
