
Path-Based Epidemic Spreading in Networks

01 Feb 2017-IEEE ACM Transactions on Networking (IEEE)-Vol. 25, Iss: 1, pp 565-578
TL;DR: This work uses continuous-time Markov chain analysis to model the influence of the infectious agent and routing paths on the spreading behavior by taking into account the state transitions of each node individually, rather than the mean aggregated behavior of all nodes.
Abstract: Conventional epidemic models assume omni-directional contact-based infection. This strongly associates the epidemic spreading process with node degrees. The role of the infection transmission medium is often neglected. In real-world networks, however, the infectious agent as the physical contagion medium usually flows from one node to another via specific directed routes (path-based infection). Here, we use continuous-time Markov chain analysis to model the influence of the infectious agent and routing paths on the spreading behavior by taking into account the state transitions of each node individually, rather than the mean aggregated behavior of all nodes. By applying a mean field approximation, the analysis complexity of the path-based infection mechanics is reduced from exponential to polynomial. We show that the structure of the topology plays a secondary role in determining the size of the epidemic. Instead, it is the routing algorithm and traffic intensity that determine the survivability and the steady-state of the epidemic. We define an infection characterization matrix that encodes both the routing and the traffic information. Based on this, we derive the critical path-based epidemic threshold below which the epidemic will die off, as well as conditional bounds of this threshold which network operators may use to promote/suppress path-based spreading in their networks. Finally, besides artificially generated random and scale-free graphs, we also use real-world networks and traffic, as case studies, in order to compare the behaviors of contact- and path-based epidemics. Our results further corroborate the recent empirical observations that epidemics in communication networks are highly persistent.

Summary (3 min read)

Introduction

  • Motivated by these observations, the authors model path-based information spreading by advancing the state of the art of epidemic theory to account for the directional effect caused by the paths constructed by different routing protocols as well as by the role played by the infectious agent as the “infection carrier” that spreads the epidemic.
  • The authors focus on communication networks that transport data traffic via paths/routes.
  • The authors' path-based spreading analytical framework is described in Section III-C.

A. General Path-based Spreading Process

  • The authors use the Markov chain diagrams of SI, SIS and SIR models in Fig. 2 to illustrate the key departure of path-based spreading process from the conventional contact-based one.
  • Note that a direct transition from state SIS to state III is not possible for a line graph because the flow of packets and thus infection is directional.
  • An infected node in the middle can only infect nodes either to its left or right at any time.
  • Therefore, it is already obvious that while the topology still plays a role in influencing the spread of the infection, it is the routing protocol that finally governs the actual infection dynamics.
  • Packets traverse from one node to another via a path with certain traffic arrival rate, λ.

B. Modelling the Infectious Agent

  • For their purpose, the authors depart from the conventional routing matrix that maps traffic to links (e.g., [44]) and instead, encode the traffic to nodes it traverses or is destined to.
  • $R_n$ describes the node involvement in delivering the traffic originating from node n.
  • By applying Markov theory, the infinitesimal generator $Q_n(t)$ of this two-state continuous-time Markov chain can be written as $Q_n(t) = \begin{bmatrix} -q_{1;n} & q_{1;n} \\ q_{2;n} & -q_{2;n} \end{bmatrix}$ (Eq. 10), where the transitions involving the curing process are independent of the states of other nodes and thus $q_{2;n} = \delta$ (see Section V for discussion).
  • The instantaneous fraction of infected nodes in the network can then be written explicitly. (Footnote 6: The implications of this approximation are discussed in [20].)
  • Furthermore, Fig. 5 shows the steady state for sample networks of different sizes obtained both from their model and simulation runs.

IV. PATH-BASED EPIDEMIC THRESHOLD

  • As briefly mentioned in Section II, in previous studies (e.g., [18]), a theoretical critical threshold, τc, has been found below which the epidemic will almost certainly die off and vice versa.
  • From Eq. 26, the authors note that the critical threshold is independent of the initial state of the system.
  • The authors begin their comparison of the two types of epidemic spreading based on their corresponding exact $2^N$-state Markov chain of the network states.
  • Therefore, QpathU4 will have more non-zero elements than QcontactU4 and thus, Qpath is denser than Qcontact.
  • For such matrices, the largest eigenvalue in magnitude is a real number and for this eigenvalue, the real part is the largest amongst all the eigenvalues [51].

VI. EFFECT OF ROUTING PROTOCOL, TRAFFIC AND NETWORK TOPOLOGY

  • The authors investigate the role of the network topology (i.e., A), the routing protocol (i.e., R) and the traffic load in the system (i.e., Γ) in determining the path-based spreading of an epidemic (i.e., τc).
  • Essentially, changing the link weight distribution results in a different set of shortest paths for the same graphs and thus a different disorder limit is achieved.
  • When the network topology, A, and traffic load, Γ, are known, then $\tau_c = \tau_c^{\max}$ when R = R∪SPT, where R∪SPT is the routing matrix of the shortest paths by hop count between all possible node pairs in A.
  • This implies that the total node involvement for delivering infectious agent in the network is inflated since by not using the shortest paths, more nodes are involved in delivering the same amount of infectious agents (i.e., ρ∪SPT ≤ ρ∗).
  • This theorem is especially useful for cases where the network topology can be flexibly constructed either following certain requirements (e.g., data center networks) or specific rules/guidelines (e.g., self-organizing wireless sensor networks).

VII. CASE STUDIES

  • The authors apply their path-based epidemic analytical framework to three real world Tier-1 networks (i.e., Level3 (AS1), Sprint (AS1239) and AT&T (AS7018) at point-of-presence (POP)-level based on the data from [21]) to investigate how conducive they are regarding path-based epidemic spreading.
  • Two sets of values are computed for parameters related to paths, (1) non-weighted and (2) weighted links.
  • For a contact-based epidemic, the probability of a node being infected is strongly correlated with its degree (i.e., compare columns 2 and 3).
  • For the Level-3 network, the operator should protect Washington, Denver and Indianapolis against contact-based, path-based unweighted and path-based weighted epidemics respectively.

VIII. CONCLUSIONS

  • Footnote 8: Bracketed values in columns 2-6 indicate the node degrees.
  • Infection permeation in contact-based epidemic is largely determined by the topology structure, A. This is not the case for path-based epidemic as the primary factors for infection spreading are now related to the traffic load and the way this is routed to destinations.
  • In addition, since the authors consider each node separately, they can also easily identify/rank nodes within the network that are the most conducive to spreading the infection.
  • Based on their model, the critical epidemic thresholds are diminishingly small with λ and this re-affirms the observations reported in [29], [30] that epidemics in communication networks are extremely robust to extinction.


IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 1, NO. 1, JULY 2016 1
Path-based Epidemic Spreading in Networks
Wei Koong Chai, Member, IEEE, and George Pavlou, Senior Member, IEEE
Abstract—Conventional epidemic models assume omni-
directional contact-based infection. This strongly associates the
epidemic spreading process with node degrees. The role of
the infection transmission medium is often neglected. In real-
world networks, however, the infectious agent as the physical
contagion medium usually flows from one node to another via
specific directed routes (i.e., path-based infection). Here, we use
continuous-time Markov chain analysis to model the influence of
the infectious agent and routing paths on the spreading behavior
by taking into account the state transitions of each node individ-
ually, rather than the mean aggregated behavior of all nodes. By
applying a mean field approximation, the analysis complexity of
the path-based infection mechanics is reduced from exponential
to polynomial. We show that the structure of the topology plays
a secondary role in determining the size of the epidemic. Instead,
it is the routing algorithm and traffic intensity that determine the
survivability and the steady-state of the epidemic. We define an
infection characterization matrix that encodes both the routing
and traffic information. Based on this, we derive the critical path-
based epidemic threshold below which the epidemic will die off,
as well as conditional bounds of this threshold which network
operators may use to promote/suppress path-based spreading in
their networks. Finally, besides artificially generated random and
scale-free graphs, we also use real-world networks and traffic,
as case studies, in order to compare the behaviors of contact-
and path-based epidemics. Our results further corroborate the
recent empirical observations that epidemics in communication
networks are highly persistent.
Index Terms—Epidemic spreading, routing paths, Markov
theory, mean field theory, complex networks.
I. INTRODUCTION
Originated as part of epidemiology in biology studies
for modelling disease spreading [1], epidemic theory
has found applications in various scientific fields, ranging
from natural networks (e.g., hub protein and human brain
structure [2], (online) social networks [3], etc.) to manmade
infrastructures (e.g., transportation systems [4], [5], power
grid [6], telecommunication and computer networks [7], [8],
[9], etc.). Epidemic theory, given its vast cross-disciplinary
applicability, is now considered as part of network science.
In many real-world networks, the propagation of informa-
tion follows specific paths. In computer networks, informa-
tion messages are routed following routing protocol informa-
tion from one host to another via a path. In the emerging
information-centric networking (ICN) paradigm [10], content
is cached along the path the content traverses (e.g., [11], [12]).
In cyber physical systems such as the smart grid, assessing
the vulnerability of the power network requires understanding
the path cascading failures will take when some nodes are
attacked or fail (e.g., [13], [14]). The information spreading
process also forms paths in vehicular networks (e.g., [15],
[16]). Viral marketing/content spreading is another new area in
which information is propagated from one “friend” to another
in social media, following a self-perpetuating or time/distance
diminishing spreading rate [17]. Finally, many networks (e.g.,
delay-tolerant networks) are time-varying as not all nodes
are active/connected at the same time, causing infection to
spread in a path-like manner. In these cases, the current
epidemic models, which assume contact-based diffusion, do
not capture and thus, provide no explicit insights into the
epidemic pathways driven by traffic flows.
W. K. Chai and G. Pavlou are with the Department of Electronic and Electrical Engineering, University College London, WC1E 7JE, Torrington Place, London, United Kingdom; e-mail: w.chai@ucl.ac.uk, g.pavlou@ucl.ac.uk.
Currently, theoretical epidemic models largely assume that
infection propagation is based on contact in the sense that as
long as there exists a link/contact between two nodes, there
will be a fixed infection probability at all times. As such, each
infected node constantly infects all its immediate neighbors
even though not all nodes may be active at all times (e.g., sensor
nodes which often stay in “sleep” mode to save battery usage).
Such a directionless reactive contact-based contagion process
fails to capture two aspects of the spreading dynamics. First,
as we mentioned, in many cases spreading follows certain
paths and hence, each neighbor may be infected with different
probabilities. Second, the reactionary infection process based
on contacts does not take into account the need of an infectious
agent to physically transfer the infection to another node. It
implies that infection can still pass between nodes even when
there are no actual interactions taking place.
Motivated by these observations, we model path-based
information spreading by advancing the state of the art of
epidemic theory to account for the directional effect caused
by the paths constructed by different routing protocols as well
as by the role played by the infectious agent¹ as the “infection
carrier” that spreads the epidemic. Specifically, we first model
the infectious agent by taking as input the network topology,
routing protocol and traffic distribution. We then employ
continuous time Markov chain analysis to model the path-
based infection mechanics for one of the most representative
epidemic models (i.e., SIS model). We further apply a mean
field approximation to reduce the overall analysis complexity
from exponential to polynomial.
Our work advances the current contact-based epidemic
modelling approach to additionally account for the above
mentioned important factors. We focus on communication net-
works that transport data traffic via paths/routes. We consider
that the infection must be carried by an infectious agent and
explicitly model its role on the spreading dynamics. With
this taken into account, the susceptible nodes in the network
will possess different chances of getting infected, since now,
the potential to be infected is governed by the amount of
interactions in the system. A node having a higher volume of
infectious agents destined to or traversing it will have a
proportionally higher probability of getting infected. Fig. 1 provides a
simple illustration comparing the contact-based and our path-
based epidemic spreading process. In the contact-based model,
with node 0 as the initial infected node, only nodes 1, 2, 3 and
4 are immediately susceptible to infection. Other nodes may be
susceptible in the future, once they have at least one infected
direct neighbor. In contrast, assuming interactions between
nodes 0 and 6, the path-based model has a different set of
susceptible nodes (i.e., nodes 3, 5 and 6), determined by the
information exchanges in the system. In this case, nodes 1,
2 and 4 are not in danger of infection despite being direct
neighbors of the infected node.
¹ In biology, the term “pathogen” is often used in place of “infectious agent”.
Fig. 1. (Color online) Contact-based vs. path-based: Node 0 is the source of infection. Grey nodes are susceptible nodes; (left) Contact-based epidemics only infect immediate neighbors in all directions; (right) Path-based epidemics infect all nodes along the paths that infectious agents traverse, and neighbors having no interaction with the infected node are not prone to infection.
The main contributions of this paper are four-fold.
  • We model the spreading dynamics of information based on paths by taking into account the role of the infectious agent in the system. We contribute to the theoretical development of the general epidemic theory whereby we break from the conventional assumption of the contact-based infection process and account for the added dimension of direction of information flows. In our case, a host will still have non-zero probability of being infected even when it has no direct link to an infected node. Spreading is now based on whether a host lies in the routing path (usually the shortest path computed based on specific routing protocol) where infected nodes exist. Our modelling builds upon the analytical framework studied in [18], [19], [20].
  • We characterize the path-based epidemic spreading via an infection characterization matrix that encodes both the information on traffic distribution and routing paths and find the critical epidemic threshold that determines the prevalence of the epidemic to be the inverse of the spectral radius of this matrix. We further derive conditional bounds of this threshold and provide the “control space” available in promoting or containing the spread.
  • Our modelling approach provides a methodological basis for the study of various different epidemic models (e.g., SIR, SEIS, SAIS, etc.) since the modelling approach is sufficiently general.
  • Along with the insights gained from our work, our path-based epidemic analytical framework forms a set of tools for network stakeholders such as network operators to properly dimension or control their infrastructures to promote or suppress spreading of certain information or data objects depending on their needs and specific cases.
The rest of the paper is organized as follows. In Section
II, we first review the basics of epidemic theory including the
latest developments and some key relevant results. Then in
Section III, we formalize the actual path-based spreading me-
chanics (Section III-A) and develop our analytical framework.
We model the infectious agent in Section III-B in two ways;
(1) by taking into account the routing protocol but without
the knowledge of traffic distribution and (2) by assuming
prior knowledge of the traffic via a traffic matrix. Our path-
based spreading analytical framework is described next in
Section III-C. Based on this framework, we investigate the
epidemic threshold corresponding to the path-based spreading
dynamics in Section IV. In Section V, a comparison between
the contact- and path-based spreading dynamics is made. We
then study the effects of network topology, routing protocol
and traffic distribution on the epidemic spreading in Section VI
and derive the bounds of the epidemic threshold, which
one can use to determine the extent to which the epidemic
can be controlled. We consider three use cases based on real
networks and traffic in Section VII, showing the behavior of
contact- and path-based epidemic spreading in these networks.
A hypothetical epidemic that infects nodes via traceroute
packets is studied using data collected in [21]. Finally, we
conclude our work in Section VIII. Table I lists the notations
used in this paper.
TABLE I
NOTATIONS
Symbol                                            Description
A                                                 Adjacency matrix representing the network topology
N                                                 Number of nodes in the network
L                                                 Number of links in the network
β                                                 Infection probability
δ                                                 Curing rate
$d_{\max}$, $\bar{d}$, $\langle d^2 \rangle$      Maximum node degree, mean degree, second moment of network degree distribution
τ                                                 Effective spreading rate
$\tau_c$                                          Critical epidemic threshold
$\lambda_n$                                       Traffic generation rate of node n
$\mu^{x}_{\max}$                                  Spectral radius or largest eigenvalue of matrix x
$b_{alg}$                                         Algorithmic betweenness
R                                                 Routing matrix
B                                                 Matrix describing node involvement in forwarding packets
C                                                 Infection characterization matrix
Γ                                                 Traffic matrix
$i_n(t)$                                          Probability of node n in the infected state
$s_n(t)$                                          Probability of node n in the healthy state
ρ                                                 Fraction of infected nodes in the network
Q                                                 Infinitesimal generator of the continuous Markov chain
II. BACKGROUND, BASICS AND RELATED WORK
Recently, epidemic theory has been applied to computer
networks in areas such as computer virus/malware propagation
and immunization (e.g., [7]), information dissemination (e.g.,
[8], [9]), protocol design (e.g., [22]) and cascading network
failures/faults and relevant protection strategies (e.g., [23]).
The classical epidemic analytical framework involves the
following two main aspects:
1) States: by compartmentalization, epidemic models
break down a “disease” into distinct states (or stages)
and each individual in the network is considered to be
in one of the states at any given time. Two of the most
common models are the SIS and SIR models [1], [24],
[25] where the possible states are the following:
  • Susceptible (S): Clean and healthy individuals who are not infected but prone to infection.
  • Infected (I): Infected individuals who are at the same time infectious.
  • Removed (R): Immune individuals who are neither susceptible to infection nor infectious.
There exist a number of variants in the literature, such as
SEIS/SEIR with an additional state where an individual
is infected but not yet infectious [26], SAIS where an
individual may be alerted and thus have less chance
of getting infected [27], etc.
2) Infection mechanism: this describes how a disease is
passed from one individual to another. Essentially, this
refers to the transition of states. This transition is often
related to an effective spreading rate, conventionally
defined as $\tau = \beta/\delta$ where β is the infection probability
(sometimes known as the transmission rate) and δ is
the curing rate. However, for our case, this rate is
additionally affected by the traffic in the system. Early
work (e.g., [1], [24]) mostly considers homogeneous
mixing based on the law of mass action, where individuals
have equal probability of being in contact with an
infected individual, while more recent work has started
to consider heterogeneous cases.
There exist already several key works on contact-based
epidemic modelling for computer networks. In [28], the authors
developed a homogeneous infection model for computer
viruses in the Internet. In their work, they advanced the
literature by considering the additional effect of directed links
in a fixed network and discovered a critical threshold such
that the epidemic will die off when the effective spreading
rate is below the reciprocal of the mean degree, $1/\bar{d}$. In [29],
[30], the authors observed from data that Internet viruses are
more persistent than predicted by the theoretical results for
a homogeneous network and refined this critical threshold,
still as a function of node degrees, as $\bar{d}/\langle d^2 \rangle$, where $\langle d^2 \rangle$
is the second moment of the network degree distribution.
More recently, instead of relating the threshold directly to the
network node degree, the authors in [18], [31], [32] found
that the threshold is governed by the spectral radius of the
adjacency matrix, A, representing the network topology. It is
stated in these works that the critical epidemic threshold, $\tau_c$,
equals $1/\mu^{A}_{\max}$ where $\mu^{A}_{\max}$ is the largest eigenvalue of A.
This threshold is further derived for generalized networks with
heterogeneous infection rates in [33].
Since it is well-known that $\bar{d} \le \mu^{A}_{\max} \le d_{\max}$, we can also
state that the bounds for the contact-based epidemic threshold
are:
$$\frac{1}{d_{\max}} \le \tau_c \le \frac{1}{\bar{d}} \tag{1}$$
where $d_{\max}$ is the maximum degree. For instance, $\tau_c = 1/d_{\max}$
when the graph is $d_{\max}$-regular. In our work, however,
we will show that the threshold for path-based epidemic
spreading is no longer directly bounded by the node degree
or degree distribution of the network nodes. Furthermore, we
are interested in finding the bounds of $\tau_c$. Unlike contact-
based epidemics which usually have a fixed system (i.e., fixed
network topology with constant infection/curing probabilities),
with a path-based epidemic model, we have the possibility to
“tune” the system based on $\tau_c$. For instance, we can either
encourage or control the epidemic spreading through design
of different routing protocols or through traffic engineering
techniques that change the traffic pattern in the network.
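As an illustrative sketch of the bounds in Eq. 1, the Python snippet below computes the contact-based threshold $1/\mu^{A}_{\max}$ for a small graph and compares it with $1/d_{\max}$ and $1/\bar{d}$. The 3-node line graph and the function name `contact_threshold_bounds` are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def contact_threshold_bounds(A):
    """Return (1/d_max, tau_c, 1/d_mean) for a symmetric 0/1 adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    # eigvalsh is valid here because A is symmetric (undirected graph).
    mu_max = np.linalg.eigvalsh(A).max()
    return 1.0 / degrees.max(), 1.0 / mu_max, 1.0 / degrees.mean()

# Hypothetical example: the 3-node line graph 0 - 1 - 2.
A_line = [[0, 1, 0],
          [1, 0, 1],
          [0, 1, 0]]
lower, tau_c, upper = contact_threshold_bounds(A_line)
print(lower, tau_c, upper)
```

For this graph the largest eigenvalue is the square root of 2, so the threshold falls strictly between the two degree-based bounds.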
In one way or another, the known epidemic models in
the literature employ some approximations or assumptions
(e.g., network size is assumed to be sufficiently large such
that asymptotic regime behavior is reached) to ensure com-
putational feasibility since the complexity to obtain an exact
solution of an epidemic spread grows exponentially with
the network size. In [34], the authors propose a pair-wise
approximation SIS model that provides higher model accuracy
but results in the need to consider $\binom{N}{2}$ number of pairs.
The nature of the exact solution has been studied in [18]
by using a $2^N$-state continuous-time Markov chain for the
SIS model. By observing each node separately, the authors
further introduced an N-intertwined model that reduces the
complexity of exact solution from exponential $O(State^N)$ to
polynomial O(N) where State is the number of possible
states and N is the number of network nodes. This work
forms the starting point of our work as we retain its reduced
polynomial complexity feature. The approach has also been
applied to the contact-based SIR epidemic model in [35].
In the literature, the incorporation of traffic dynamics into
epidemic modelling was investigated in the form of a meta-
population system in [36], [37] where the role of the infec-
tious agent was considered. The authors departed from the
previous epidemic studies that assumed infection will take
place whenever a link (or contact) between two nodes exists.
In [38], the authors compared pathogen spreading between
the shortest paths of a fully and partially observable network.
However, in that work, the authors still considered a contact-
based infection mechanism with no specific destination nodes
(stochastic process). In [39], the role of the data packet as
infectious agent was modelled considering random source-
destination pairs. The authors derived the critical threshold
of such traffic-driven epidemic, given in Eq. 2, analogous to
the previous work on contact-based studies that relates the
threshold to the network degree
$$\tau_c = \frac{\langle b_{alg} \rangle}{\langle b_{alg}^2 \rangle} \frac{1}{N} \tag{2}$$
where $\langle b_{alg} \rangle$ and $\langle b_{alg}^2 \rangle$ are the mean and second moment of
the algorithmic betweenness respectively [40] (see footnote 2). The algorith-
mic betweenness of a node is defined as the number of packets
passing through that node when each node in the network
sends one packet to every other node in the network. Using re-
sults from [39], the works in [42] and [43] investigated how to
deter traffic-driven epidemic spreading by removing network
links and altering the routing protocol, respectively. By exploiting
the concept of algorithmic betweenness, these works made an
implicit assumption that each network node has homogeneous
interactions with all other nodes which unfortunately is often
not the case in communication networks. Our work here does
not rely on this assumption.
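The definition of algorithmic betweenness and Eq. 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: routing follows one hop-count shortest path per node pair (found by breadth-first search, ties broken by node index), the graph is connected, and a packet is counted at every node it reaches except its source. The function names and the 3-node line graph are hypothetical.

```python
from collections import deque

def bfs_path(adj, src, dst):
    """One hop-count shortest path from src to dst (graph assumed connected)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in range(len(adj)):
            if adj[u][v] and v not in parent:
                parent[v] = u
                queue.append(v)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def algorithmic_betweenness(adj):
    """Packets seen by each node when every node sends one packet to every other node."""
    n = len(adj)
    b = [0] * n
    for src in range(n):
        for dst in range(n):
            if src != dst:
                # Assumption: count relays and the destination, not the source.
                for node in bfs_path(adj, src, dst)[1:]:
                    b[node] += 1
    return b

def traffic_driven_threshold(b):
    """Eq. 2: (<b> / <b^2>) * (1 / N)."""
    n = len(b)
    mean_b = sum(b) / n
    mean_b2 = sum(x * x for x in b) / n
    return mean_b / mean_b2 / n

adj_line = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
b_alg = algorithmic_betweenness(adj_line)
tau_traffic = traffic_driven_threshold(b_alg)
print(b_alg, tau_traffic)
```

A different counting convention (for example, relays only) would change the numbers but not the structure of the calculation.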
The critical epidemic threshold has now been considered as
an important and fundamental quantity in describing epidemic
dynamics. In our work, we derive the critical threshold that
determines the survivability of a path-based epidemic. We find
that this threshold relates to the spectral radius of an infection
characterization matrix (detailed in Section III) which takes
into account the effects of routing and traffic.
III. PATH-BASED SPREADING MODEL DEVELOPMENT
A. General Path-based Spreading Process
Consider a 3-node line graph as an example. We use the
Markov chain diagrams of SI, SIS and SIR models in Fig. 2
to illustrate the key departure of path-based spreading process
from the conventional contact-based one. While the number
of states remains the same, the transitions do not. Based on
the 3-node line graph, since there is always only one valid
route between any two nodes, Fig. 2 is representative for
any routing protocol. A transition involving state changes to
multiple nodes is possible (e.g., state SSI to state III) while
the conventional contact-based infection forbids this. Note that
a direct transition from state SIS to state III is not possible
for a line graph because the flow of packets and thus infection
is directional. An infected node in the middle can only infect
nodes either to its left or right at any time. However, if we
consider a ring topology of the same size, then this transition
from SIS to III is possible (illustrated with the dash arrows
in Fig. 2) if the routing protocol chooses the longer path to
deliver the packet. Therefore, it is already obvious that while
the topology still plays a role in influencing the spread of the
infection, it is the routing protocol that finally governs the
actual infection dynamics.
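The transition argument for the 3-node line graph can be checked with a short Python sketch. It assumes the packet mechanics listed later in this section (a packet becomes infectious at its infected source or at the first infected node it traverses, and may infect every susceptible node it visits afterwards) and enumerates only the outcome in which every possible infection succeeds. Since any partial outcome infects a subset of those nodes, a state that is missing from the result cannot be reached by a single packet. All function names are hypothetical.

```python
from itertools import permutations

def line_path(src, dst):
    """Nodes visited, in order, on the only route between src and dst on a line graph."""
    step = 1 if dst > src else -1
    return list(range(src, dst + step, step))

def max_transition(state, src, dst):
    """State after one packet src -> dst if every possible infection succeeds."""
    new_state = list(state)
    infectious = False
    for node in line_path(src, dst):
        if state[node] == "I":
            infectious = True       # packet picks up the infection here
        elif infectious:
            new_state[node] = "I"   # susceptible node visited by an infectious packet
    return "".join(new_state)

def reachable_in_one_packet(state):
    """All distinct states reachable from `state` with a single packet."""
    n = len(state)
    outcomes = {max_transition(state, s, d) for s, d in permutations(range(n), 2)}
    return outcomes - {state}

from_ssi = reachable_in_one_packet("SSI")
from_sis = reachable_in_one_packet("SIS")
print(sorted(from_ssi), sorted(from_sis))
```

Under these assumptions, III appears among the states reachable from SSI but not among those reachable from SIS, matching the discussion above.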
Without loss of generality, for the rest of this paper we
focus on the SIS model where an individual in the network
can only be healthy (i.e., susceptible) or infected. We model
the spreading process that is based on the paths of information
flows. For simplicity, we use data packets as the universal
infectious agent although the “infection” can be transmitted by
different agents depending on the specific application context
(e.g., content chunk in the case of in-network caching, software
patch in the case of computer virus immunization, tweet in the
context of gossip spreading in online social networks, etc.).
The exact mechanics of the infection process are as follows:
  • A packet is infectious if it originates from an infected node. Otherwise, it is a clean packet.
  • Packets traverse from one node to another via a path with a certain traffic arrival rate, λ. We assume a Poisson packet arrival process with no routing delay.
  • A clean packet is infected and thus becomes infectious when it traverses an infected node (i.e., an infected node is infectious).
  • A susceptible node can only be infected by an infectious packet. An infectious packet infects a node with probability β. A path-based spreading process may infect multiple nodes in one transmission. An infectious packet traversing a path of length l hops has probability $\beta^l$ of infecting l nodes in one transition. Alternatively, an infectious packet traversing such a path has probability $1 - (1 - \beta)^l$ of infecting at least one node along that path.
  • An infected node becomes susceptible again with a curing rate, δ. This is assumed to be a Poisson process independent of the traffic and infection rates. For the rest of the paper, we also assume δ = 1.
² If the shortest paths are used, then it coincides with topological betweenness [41].
Fig. 2. The Markov chain state diagram of path-based epidemic spreading for a line graph with N = 3 for SI model with $2^3 = 8$ states (top-left), SIS model with $2^3 = 8$ states (top-right) and SIR model with $3^3 = 27$ states (bottom). The grey states indicate absorbing ones.
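The two path probabilities in the infection mechanics above can be evaluated directly. The following Python sketch assumes that each of the l nodes on the path is infected independently with probability β; the function names and the sample values are hypothetical.

```python
def prob_infect_all(beta, hops):
    """Probability that an infectious packet infects all `hops` nodes on its path."""
    return beta ** hops

def prob_infect_at_least_one(beta, hops):
    """Probability that at least one of the `hops` nodes on the path is infected."""
    return 1.0 - (1.0 - beta) ** hops

# Hypothetical values: beta = 0.5 on a 3-hop path.
p_all = prob_infect_all(0.5, 3)
p_any = prob_infect_at_least_one(0.5, 3)
print(p_all, p_any)
```

Longer paths make a full-path infection less likely but make at least one infection more likely, which is one way to see why the choice of routes matters for the spreading dynamics.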
B. Modelling the Infectious Agent
The state of a node (i.e., either healthy or infected) is
dependent on the number of infected packets that traverse
it or terminate there. This, in turn, is proportionate to the
volume of traffic it carries (i.e., the infection requires packets,
as infectious agents, to spread).
Consider an undirected network, G(V, E) with $V = \{v_1, \ldots, v_N\}$
nodes and $E = \{e_1, \ldots, e_L\}$ links where N = |V| and
L = |E|. G can be represented by A, the N × N symmetric
adjacency matrix, with $a_{n,m} = 1$ if there exists a link between
node n and m and 0 otherwise. Furthermore, we describe the
effect of the routing protocol via an N × N(N − 1) routing
matrix³, R. For our purpose, we depart from the conventional
routing matrix that maps traffic to links (e.g., [44]) and instead,
encode the traffic to nodes it traverses or is destined to.
Accordingly, we define the routing matrix as follows:
$$r_{n,k} = \begin{cases} 1 & \text{if traffic on path } k \text{ traverses across or is destined to } n, \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
where $k$ indexes routes (or paths) between all source-
destination pairs in the network (i.e., node pair $1 \to 2$, $1 \to 3$,
$\ldots$, $N \to (N-1)$). Note that matrix $R$ is formed by
concatenating $N$ blocks of $N \times (N-1)$ matrices as follows:

$$R = [R^1 \,|\, R^2 \,|\, \ldots \,|\, R^N] \tag{4}$$

where $R^n$ describes the node involvement in delivering the
traffic originating from node $n$. Assuming traffic is not routed
to itself, all elements in the $n$-th row of $R^n$ equal zero:

$$\forall k \in V : r^n_{n,k} = 0. \tag{5}$$
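To make the structure of the routing matrix in Eqs. 3 to 5 concrete, here is a minimal Python sketch that builds it for a small graph. Several choices are assumptions made only for illustration, since the text does not fix a routing algorithm here: single-path, hop-count shortest-path routing found by breadth-first search (ties broken by lowest node index), a connected graph, zero-based node labels, and columns ordered source by source with destinations in ascending order. The function names are hypothetical.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Return one hop-count shortest path src -> dst as a list of nodes (BFS).

    Assumes the graph is connected, so dst is always reachable.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in range(len(adj)):
            if adj[u][v] and v not in parent:
                parent[v] = u
                queue.append(v)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def routing_matrix(adj):
    """Build the N x N(N-1) node-based routing matrix of Eq. 3."""
    n = len(adj)
    # Column k corresponds to one (source, destination) pair, source-major.
    pairs = [(s, d) for s in range(n) for d in range(n) if s != d]
    R = [[0] * len(pairs) for _ in range(n)]
    for k, (s, d) in enumerate(pairs):
        # Mark every node the path traverses or ends at; skip the source.
        for node in shortest_path(adj, s, d)[1:]:
            R[node][k] = 1
    return R


# Three-node line graph 0 - 1 - 2 (illustrative topology).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
R = routing_matrix(A)
```

In this toy example the middle node appears in four of the six paths, whereas each end node appears in only two, and the rows of each source block that belong to the source itself are all zero, as Eq. 5 requires.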
Given $A$ and $R$, we construct an $N \times N$ matrix, $B$, describing
the probability of a node being involved in delivering a packet
originating from any other node in the network. In the context
of this work, the destination node of the infected packets is
also subject to infection, thus it must be considered. Matrix
$B$ can then be constructed as in Eq. 6 below

$$b_{n,m} = \frac{\sum_{k=1+(m-1)(N-1)}^{m(N-1)} r_{n,k}}{N-1} \tag{6}$$

where $b_{n,m}$ denotes the probability that a packet originating
from node $m$ traverses or is destined to node $n$. Conceptually,
$b_{n,m}$ resembles the notion of conditional betweenness
centrality in [45], where the total node involvement in forwarding
packets originating from a source node, $n$, is computed,
instead of over all possible node pairs in the network$^4$.
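Eq. 6 amounts to averaging each node's row of the routing matrix over the block of columns that belongs to one source. The minimal Python sketch below shows this; the routing matrix is hard-coded for a hypothetical three-node line graph under shortest-path routing, with zero-based indices and source-major column order, all of which are assumptions of the example rather than values from the paper.

```python
def involvement_matrix(R):
    """Compute B from R following Eq. 6 (zero-based indices).

    b[n][m] is the fraction of the N-1 paths starting at source m
    that traverse or terminate at node n.
    """
    n = len(R)
    B = [[0.0] * n for _ in range(n)]
    for node in range(n):
        for m in range(n):
            # Columns m(N-1) .. (m+1)(N-1)-1 hold the paths of source m.
            block = R[node][m * (n - 1):(m + 1) * (n - 1)]
            B[node][m] = sum(block) / (n - 1)
    return B


# Routing matrix of the line graph 0 - 1 - 2 (illustrative values).
R = [[0, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 0]]
B = involvement_matrix(R)
```

Here the middle node lies on every path that starts at either end node, so its entries for those two sources equal 1, while the diagonal of $B$ is zero because the source is excluded.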
$^3$ We assume traffic is not destined to the source.
$^4$ In [45], the authors conditioned the betweenness metric using the
destination node as opposed to the source node.

The elements in $B$ can further be weighted with the node-
specific traffic generation rate. If $\Lambda$ is an $N \times 1$ vector whose
entries, $\lambda_n$, denote the traffic generation rate of the nodes in
$G$, then the total traffic node $n$ is involved in can be computed
as follows:

$$C = \mathrm{diag}(\Lambda) \times B \tag{7}$$

where $\mathrm{diag}(\Lambda)$ is the diagonal matrix with elements
$\lambda_1, \lambda_2, \ldots, \lambda_N$ as its entries on the principal diagonal. This
formulation, however, makes the implicit assumption that traffic
distribution is uniform in the network (i.e., all nodes send equal
volume of traffic to all other nodes).
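The weighting in Eq. 7 can be transcribed literally in a few lines of Python. This is only a sketch: the matrix $B$ is the one obtained for a hypothetical three-node line graph, and the rates are made-up values, chosen equal for all nodes for simplicity. Left-multiplying by $\mathrm{diag}(\Lambda)$ scales row $n$ of $B$ by $\lambda_n$.

```python
def weight_by_rates(rates, B):
    """Literal transcription of Eq. 7: C = diag(rates) x B.

    Left-multiplication by a diagonal matrix scales row n of B by rates[n].
    """
    return [[rates[n] * b for b in row] for n, row in enumerate(B)]


# Illustrative inputs: B for the line graph 0 - 1 - 2, equal rates of 2.0.
B = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 1.0],
     [0.5, 0.5, 0.0]]
rates = [2.0, 2.0, 2.0]
C = weight_by_rates(rates, B)
```

With equal rates the result is simply $B$ multiplied by the common rate.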
Often in practice, the traffic in a network is known or can
be estimated/predicted (e.g., [46]). This is usually represented
via a traffic matrix. With this additional information, we can
reformulate Eq. 7 above and model the traffic dynamics more
precisely. Consider a stationary non-negative $N \times N$ traffic
matrix$^5$, $\Gamma$, where its entries, $\Gamma_{n,m}$, denote the traffic volume
from node $m$ to $n$. Since we do not consider self-traffic, the
trace of the traffic matrix is $\mathrm{tr}(\Gamma) = \sum_{n=1}^{N} \Gamma_{n,n} = 0$. The
equivalent of $\Lambda$ can be computed by taking the column
sums of $\Gamma$.
Further, we define a reduced $\Gamma'$ where these zero elements
along the main diagonal are removed, resulting in an $(N-1) \times N$
matrix. Similar to the $R$ matrix, $\Gamma'$ can be decomposed to
the following form:

$$\Gamma' = [\Gamma'_{(\cdot,1)} \,|\, \Gamma'_{(\cdot,2)} \,|\, \ldots \,|\, \Gamma'_{(\cdot,N)}] \tag{8}$$

where $\Gamma'_{(\cdot,n)}$ is the $n$-th column of $\Gamma'$, indicating the traffic
volume originating from node $n$. Using the additional source-
destination traffic information, we can therefore, alternatively,
construct $C$ as follows:

$$C = [R^1 \times \Gamma'_{(\cdot,1)} \,|\, R^2 \times \Gamma'_{(\cdot,2)} \,|\, \ldots \,|\, R^N \times \Gamma'_{(\cdot,N)}]. \tag{9}$$
We call the $N \times N$ matrix $C$ the infection characterization
matrix (discussed later in Section IV). Conceptually, it
signifies the overall level of involvement of the nodes in the
graph in receiving and delivering infectious agents.
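The column-by-column construction of Eq. 9 can be sketched in Python as follows. The routing matrix and the traffic matrix are made-up values for a hypothetical three-node line graph with zero-based indices. The sketch also assumes that the entries of each reduced traffic column are ordered by destination index, so that they line up with the columns of the corresponding source block of $R$; the function name is illustrative.

```python
def infection_characterization(R, Gamma):
    """Build C column by column following Eq. 9 (zero-based indices).

    Gamma[d][m] is the traffic volume from source m to destination d,
    with a zero diagonal (no self-traffic).
    """
    n = len(Gamma)
    C = [[0.0] * n for _ in range(n)]
    for m in range(n):
        # Column m of the reduced traffic matrix: drop the diagonal entry.
        reduced_col = [Gamma[d][m] for d in range(n) if d != m]
        for node in range(n):
            # Row `node` of the source block R^m.
            block_row = R[node][m * (n - 1):(m + 1) * (n - 1)]
            C[node][m] = sum(r * g for r, g in zip(block_row, reduced_col))
    return C


# Illustrative inputs for the line graph 0 - 1 - 2.
R = [[0, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 0]]
Gamma = [[0, 2, 1],
         [4, 0, 3],
         [6, 5, 0]]
C = infection_characterization(R, Gamma)

# Per-source generation rates: the column sums of Gamma.
rates = [sum(Gamma[d][m] for d in range(3)) for m in range(3)]
```

In this example node 1 carries all 10 units generated by node 0, since both of node 0's paths pass through or end at it, whereas node 2 sees only the 6 units addressed to it.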
C. SIS Path-based Spreading Model
Let $X_n(t)$ be the state of node $n$ at time $t$. For the SIS
model, $X_n(t)$ can only be either “susceptible” or “infected”.
We further denote the probability of node $n$ being in the
infected state at time $t$ by $i_n(t) = \Pr[X_n(t) = 1]$, with
“1” indicating the infected state and “0” the susceptible one.
Hence, the probability of a node being in the healthy state is
$s_n(t) = \Pr[X_n(t) = 0] = 1 - i_n(t)$. By applying Markov
theory, the infinitesimal generator $Q_n(t)$ of this two-state
continuous-time Markov chain can be written as below:
$$Q_n(t) = \begin{bmatrix} -q_{1;n} & q_{1;n} \\ q_{2;n} & -q_{2;n} \end{bmatrix} \tag{10}$$

where the transitions involving the curing process are independent
of the states of other nodes and thus, $q_{2;n} = \delta$ (see
Section V for discussion).
On the other hand, $q_{1;n}$ is a random variable dependent on
the activities taking place in other nodes within the network.
To proceed with the Markov analysis, the randomness of $q_{1;n}$
must be removed. One way to achieve this is to condition
$q_{1;n}$ on all possible combinations of states for all nodes,
$X_n$, $1 \le n \le N$ (there are $2^N$ such combinations), resulting
in the exact Markov chain solution of exponential complexity.
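To illustrate how the generator in Eq. 10 drives a single node, the Python sketch below integrates the two-state chain under the simplifying assumption that the infection rate is a known constant. This assumption is made purely for illustration, because in the model $q_{1;n}$ is random and depends on the other nodes. With state probabilities $[s_n, i_n]$, the forward equation gives $\mathrm{d}i_n/\mathrm{d}t = q_{1;n}(1 - i_n) - q_{2;n} i_n$. The function names, step size, and rate values are hypothetical.

```python
def stationary_infected_probability(q1, q2):
    """Stationary infected probability of the two-state chain: q1 / (q1 + q2)."""
    return q1 / (q1 + q2)


def infected_probability(q1, q2, i0=0.0, t_end=20.0, dt=0.001):
    """Integrate di/dt = q1 * (1 - i) - q2 * i with forward Euler steps.

    q1: susceptible -> infected rate (held constant here, an assumption).
    q2: infected -> susceptible rate (the curing rate).
    """
    i = i0
    for _ in range(int(round(t_end / dt))):
        i += dt * (q1 * (1.0 - i) - q2 * i)
    return i


# Illustrative rates: q1 = 0.5, and q2 = 1.0 to mirror the curing rate of 1.
i_final = infected_probability(0.5, 1.0)
i_star = stationary_infected_probability(0.5, 1.0)
```

Starting from a healthy node, the integrated probability settles at the stationary value $q_{1;n}/(q_{1;n} + q_{2;n})$, which is one third for these rates.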
$^5$ Conventionally, the traffic matrix is often defined as the transpose of $\Gamma$
(e.g., [44]).
