
A Random Linear Network Coding Approach to Multicast

TL;DR: This work presents a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks, and shows that this approach can take advantage of redundant network capacity for improved success probability and robustness.
Abstract: We present a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks. Network nodes independently and randomly select linear mappings from inputs onto output links over some field. We show that this achieves capacity with probability exponentially approaching 1 with the code length. We also demonstrate that random linear coding performs compression when necessary in a network, generalizing error exponents for linear Slepian-Wolf coding in a natural way. Benefits of this approach are decentralized operation and robustness to network changes or link failures. We show that this approach can take advantage of redundant network capacity for improved success probability and robustness. We illustrate some potential advantages of random linear network coding over routing in two examples of practical scenarios: distributed network operation and networks with dynamically varying connections. Our derivation of these results also yields a new bound on required field size for centralized network coding on general multicast networks.

Summary

Introduction

  • This paper is an initial exploration of random linear network coding, posing more questions than it answers.
  • Resource consumption can naturally be traded off against capacity and robustness, and across multiple communicating sessions; subsequent work on distributed resource optimization, e.g., [13], [21], has used random linear network coding as a component of the solution.

A. Overview

  • In Section II, the authors describe the network model and algebraic coding approach they use in their analyses, and introduce some notation and existing results.
  • Section III gives some insights arising from consideration of bipartite matching and network flows.
  • Success/error probability bounds for random linear network coding are given for independent and linearly correlated sources in Section IV and for arbitrarily correlated sources in Section V.
  • The authors also give examples of practical scenarios in which randomized network coding can be advantageous compared to routing, in Section VI.
  • The authors present their conclusions and some directions for further work in Section VII.

A. Basic Model

  • Nodes o(l) and d(l) are called the origin and destination, respectively, of link l.
  • The authors consider the multicast case where every receiver demands all of the source processes.
  • For the case of linearly correlated sources, the authors assume that the sources can be modeled as given linear combinations of underlying independent source processes, each with an entropy rate of one bit per unit time, as described further in Section II-B.
  • For the case of general networks with cycles and link delays, the authors consider networks without buffering, and make the simplifying assumption that each link has the same delay.

IV. RANDOM LINEAR NETWORK CODING FOR INDEPENDENT OR LINEARLY CORRELATED SOURCES

  • The authors consider random linear network codes in which some or all of the network code coefficients for linearly correlated sources are chosen independently and uniformly over F_q, where the field size q is greater than the number of receivers d.
  • The code length u is the logarithm of the field size q = 2^u.
  • The bound of Theorem 2 is very general, applying across all networks with the same number of receivers and the same number of links with associated random code coefficients, without considering specific network structure.
  • If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 - d/q)^η, where η is the maximum number of links with associated random coefficients in any set of links constituting a flow solution for any receiver.
  • Theorem 5 considers a multicast connection problem on an acyclic network with independent or linearly correlated sources of joint entropy rate r, and links which fail (are deleted from the network) with some probability p.

V. RANDOM LINEAR NETWORK CODING FOR ARBITRARILY CORRELATED SOURCES

  • So far the authors have been considering independent or linearly correlated sources.
  • Analogously to Slepian and Wolf [28], the authors consider the problem of distributed encoding and joint decoding of two sources whose output values in each unit time period are drawn independently and identically distributed (i.i.d.) from the same joint distribution.
  • The authors denote by L the maximum source–receiver path length.
  • Theorem 6 bounds the error probability of the random linear network code by an expression involving dummy random variables with the same joint distribution as the sources.
  • The table gives bounds as well as some actual probability values where exact calculations are tractable.
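The distributed encoding and joint decoding described above can be illustrated with a toy sketch (a single point-to-point Slepian–Wolf instance, not the paper's general network setting): one encoder sends its binary block in full, the other sends only a shorter random linear syndrome over F_2, and the decoder searches the syndrome's coset for the candidate most probable under the correlation model. All block lengths and rates here are made up for illustration.

```python
import itertools
import random

random.seed(1)

def matvec(A, x):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(a & xi for a, xi in zip(row, x)) % 2 for row in A)

n, m, p = 10, 8, 0.1          # block length, syndrome bits, correlation
x1 = tuple(random.randint(0, 1) for _ in range(n))
noise = tuple(1 if random.random() < p else 0 for _ in range(n))
x2 = tuple(a ^ b for a, b in zip(x1, noise))   # x2 = x1 plus sparse noise

# Encoder 2 compresses: it sends only the random linear syndrome A x2
# (m < n bits); encoder 1 sends x1 in full (a Slepian-Wolf corner point).
A = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]
s = matvec(A, x2)

# Joint decoding: among all candidates consistent with the syndrome,
# pick the one closest to x1, i.e., most probable under the noise model.
best = min((c for c in itertools.product((0, 1), repeat=n)
            if matvec(A, c) == s),
           key=lambda c: sum(a ^ b for a, b in zip(c, x1)))
print(sum(a ^ b for a, b in zip(best, x2)))  # 0 when decoding succeeds
```

The exhaustive coset search stands in for the minimum-entropy decoding analyzed in the paper and is only viable at toy block lengths.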

VI. BENEFITS OF RANDOMIZED CODING OVER ROUTING

  • Network coding, as a superset of routing, has been shown to offer significant capacity gains for networks with special structure [26].
  • For many other networks, network coding does not give higher capacity than centralized optimal routing, but can offer other advantages when centralized optimal routing is difficult.
  • The authors consider two types of network scenarios in which distributed random linear coding can be particularly useful.

A. Distributed Settings

  • In networks with large numbers of nodes and/or changing topologies, it may be expensive or infeasible to reliably maintain routing state at network nodes.
  • The source node sends one process in both directions on one axis and the other process in both directions along the other axis, as illustrated in Fig.
  • A node receiving information on two links sends one of the incoming processes on one of its two outgoing links with equal probability, and the other process on the remaining link.
  • Proposition 1 lower-bounds the probability that a receiver located at a given grid position relative to the source can decode both source processes under the randomized flooding scheme RF.
  • This simple scheme, unlike the randomized flooding scheme RF, leaves out the optimization that each node receiving two linearly independent processes should always send out two linearly independent processes.

B. Dynamically Varying Connections

  • Another scenario in which random linear network coding can be advantageous is for multisource multicast with dynamically varying connections.
  • The parameter values for the tests were chosen such that the resulting random graphs would in general be connected and able to support some of the desired connections, while being small enough for the simulations to run efficiently.
  • In each time slot, a source was either on, i.e., transmitting source information, or off, i.e., not transmitting source information.
  • Each of these types of overhead depends on the coding field size.
  • To this end, the authors use a small field size that allows random linear coding to generally match the performance of the Steiner heuristic, and to surpass it in networks whose topology makes Steiner tree routing difficult.

VII. CONCLUSION

  • The authors have presented a distributed random linear network coding approach which asymptotically achieves capacity, as given by the max-flow min-cut bound of [1], in multisource multicast networks.
  • These examples suggest that the decentralized nature and robustness of random linear network coding can offer significant advantages in settings that hinder optimal centralized network control.
  • Further work includes extensions to nonuniform code distributions, possibly chosen adaptively or with some rudimentary coordination, to optimize different performance goals.
  • The randomized and distributed nature of the approach also leads us naturally to consider applications in network security.
  • It would also be interesting to consider protocol issues for different communication scenarios, and to compare specific coding and routing protocols over a range of performance metrics.


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 10, OCTOBER 2006 4413
A Random Linear Network Coding Approach to
Multicast
Tracey Ho, Member, IEEE, Muriel Médard, Senior Member, IEEE, Ralf Koetter, Senior Member, IEEE,
David R. Karger, Associate Member, IEEE, Michelle Effros, Senior Member, IEEE, Jun Shi, and Ben Leong
Abstract—We present a distributed random linear network
coding approach for transmission and compression of informa-
tion in general multisource multicast networks. Network nodes
independently and randomly select linear mappings from inputs
onto output links over some field. We show that this achieves ca-
pacity with probability exponentially approaching 1 with the code
length. We also demonstrate that random linear coding performs
compression when necessary in a network, generalizing error ex-
ponents for linear Slepian–Wolf coding in a natural way. Benefits
of this approach are decentralized operation and robustness to
network changes or link failures. We show that this approach
can take advantage of redundant network capacity for improved
success probability and robustness. We illustrate some potential
advantages of random linear network coding over routing in two
examples of practical scenarios: distributed network operation
and networks with dynamically varying connections. Our deriva-
tion of these results also yields a new bound on required field size
for centralized network coding on general multicast networks.
Index Terms—Distributed compression, distributed networking,
multicast, network coding, random linear coding.
I. INTRODUCTION
THE capacity of multicast networks with network coding
was given in [1]. We present an efficient distributed ran-
domized approach that asymptotically achieves this capacity.
We consider a general multicast framework—multisource mul-
ticast, possibly with correlated sources, on general networks.
Manuscript received February 26, 2004; revised June 1, 2006. This work
was supported in part by the National Science Foundation under Grants CCF-
0325324, CCR-0325673, and CCR-0220039, by Hewlett-Packard under Con-
tract 008542-008, and by Caltech’s Lee Center for Advanced Networking.
T. Ho was with the Laboratory for Information and Decision Systems (LIDS),
Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA.
She is now with the California Institute of Technology (Caltech), Pasadena, CA
91125 USA (e-mail: tho@caltech.edu).
M. Médard is with Laboratory for Information and Decision Systems, the
Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA
(e-mail: medard@mit.edu).
R. Koetter is with the Coordinated Science Laboratory, University of Illinois
at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: koetter@csl.uiuc.edu).
D. R. Karger is with the Computer Science and Artificial Intelligence Labo-
ratory (CSAIL), the Massachusetts Institute of Technology (MIT), Cambridge,
MA 02139 USA (e-mail: karger@csail.mit.edu).
M. Effros is with the Department of Electrical Engineering, California Insti-
tute of Technology, Pasadena, CA 91125 USA (e-mail: effros@caltech.edu).
J. Shi was with the University of California, Los Angeles, CA, USA. He is
now with Intel Corporation, Santa Clara, CA 95054 USA (e-mail: junshi@ee.
ucla.edu).
B. Leong was with the Computer Science and Artificial Intelligence Labo-
ratory (CSAIL), the Massachusetts Institute of Technology (MIT), Cambridge,
MA 02139 USA. He is now with the National University of Singapore, Singa-
pore 119260, Republic of Singapore (e-mail: benleong@comp.nus.edu.sg).
Communicated by A. Ashikhmin, Associate Editor for Coding Theory.
Digital Object Identifier 10.1109/TIT.2006.881746
Fig. 1. An example of distributed random linear network coding. X_1 and X_2 are the source processes being multicast to the receivers, and the coefficients are randomly chosen elements of a finite field. The label on each link represents the process being transmitted on the link.
This family of problems includes traditional single-source mul-
ticast for content delivery and the incast or reachback problem
for sensor networks, in which several, possibly correlated,
sources transmit to a single receiver. We use a randomized
strategy: all nodes other than the receiver nodes perform
random linear mappings from inputs onto outputs over some
field. These mappings are selected independently at each node.
An illustration is given in Fig. 1. The receivers need only know
the overall linear combination of source processes in each of
their incoming transmissions. This information can be sent with
each transmission block or packet as a vector of coefficients
corresponding to each of the source processes, and updated at
each coding node by applying the same linear mappings to the
coefficient vectors as to the information signals. The relative
overhead of transmitting these coefficients decreases with
increasing length of blocks over which the codes and network
remain constant. For instance, if the network and network code
are fixed, all that is needed is for the sources to send, once, at
the start of operation, a canonical basis through the network.
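The coefficient-vector mechanism just described can be sketched in a few lines. This is a toy model with made-up packet fields, using the prime field F_257 in place of the paper's F_{2^u}; the coding node redraws its random coefficients until the receiver's 2x2 coefficient matrix is invertible:

```python
import random

P = 257          # prime field F_P stands in for the paper's F_{2^u}
random.seed(0)

def combine(packets, coeffs):
    """Linearly combine packets over F_P, applying the same mapping to
    the payloads and to the global coefficient vectors they carry."""
    vec = [sum(c * p["vec"][i] for c, p in zip(coeffs, packets)) % P
           for i in range(len(packets[0]["vec"]))]
    data = sum(c * p["data"] for c, p in zip(coeffs, packets)) % P
    return {"vec": vec, "data": data}

# Two source symbols, each tagged with a canonical-basis coefficient vector.
x1, x2 = 42, 99
s1 = {"vec": [1, 0], "data": x1}
s2 = {"vec": [0, 1], "data": x2}

# A coding node draws independent random coefficients for each output
# link; retry until the overall 2x2 coefficient matrix is invertible.
while True:
    out1 = combine([s1, s2], [random.randrange(P) for _ in range(2)])
    out2 = combine([s1, s2], [random.randrange(P) for _ in range(2)])
    (a, b), (c, d) = out1["vec"], out2["vec"]
    det = (a * d - b * c) % P
    if det:
        break

# The receiver inverts the 2x2 coefficient matrix to recover the sources.
inv = pow(det, -1, P)
r1 = ((d * out1["data"] - b * out2["data"]) * inv) % P
r2 = ((a * out2["data"] - c * out1["data"]) * inv) % P
print(r1 == x1 and r2 == x2)   # prints True
```

Because each packet carries its global coefficient vector, the receiver never needs to know the topology or the intermediate nodes' random choices.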
Our primary results show, first, that such random linear
coding achieves multicast capacity with probability exponen-
tially approaching 1 with the length of the code. Second, in the
context of a distributed source coding problem, we demonstrate
that random linear coding also performs compression when
necessary in a network, generalizing known error exponents for
linear Slepian–Wolf coding [4] in a natural way.
This approach not only recovers the capacity and achievable
rates, but also offers a number of advantages. While capacity
can be achieved by other deterministic or random approaches,
they require, in general, network codes that are planned by or
known to a central authority. Random design of network codes
was first considered in [1]; our contribution is in showing how
random linear network codes can be constructed and efficiently
0018-9448/$20.00 © 2006 IEEE

communicated to receivers in a distributed manner. For the case
of distributed operation of a network whose conditions may be
varying over time, our work hints at a beguiling possibility: that
a network may be operated in a decentralized manner and still
achieve the information rates of the optimized solution. Our
distributed network coding approach has led to and enabled
subsequent developments in distributed network optimization,
e.g., [20], [13]. The distributed nature of our approach also ties
in well with considerations of robustness to changing network
conditions. We show that our approach can take advantage of
redundant network capacity for improved success probability
and robustness. Moreover, issues of stability, such as those
arising from propagation of routing information, are obviated
by the fact that each node selects its code independently from
the others.
Our results, more specifically, give a lower bound on the
probability of error-free transmission for independent or lin-
early correlated sources, which, owing to the particular form
of transfer matrix determinant polynomials, is tighter than the
Schwartz–Zippel bound (e.g., [23]) for general polynomials
of the same total degree. This bound, which is exponentially
dependent on the code length, holds for any feasible set of
multicast connections over any network topology (including
networks with cycles and link delays). The result is derived
using a formulation based on the Edmonds matrix of bipartite
matching, which leads also to an upper bound on field size
required for deterministic centralized network coding over
general networks. We further give, for acyclic networks, tighter
bounds based on more specific network structure, and show
the effects of redundancy and link reliability on success proba-
bility. For arbitrarily correlated sources, we give error bounds
for minimum entropy and maximum
a posteriori probability
decoding. In the special case of a Slepian–Wolf source network
consisting of a link from each source to the receiver, our error
exponents reduce to the corresponding results in [4] for linear
Slepian–Wolf coding. The latter scenario may thus be consid-
ered a degenerate case of network coding.
We illustrate some possible applications with two examples
of practical scenarios, distributed settings and networks with
dynamically varying connections, in which random linear
network coding shows particular promise of advantages over
routing.
This paper is an initial exploration of random linear network
coding, posing more questions than it answers. We do not cover
aspects such as resource and energy allocation, but focus on op-
timally exploiting a given set of resources. Resource consump-
tion can naturally be traded off against capacity and robustness,
and across multiple communicating sessions; subsequent work
on distributed resource optimization, e.g., [13], [21], has used
random linear network coding as a component of the solution.
There are also many issues surrounding the adaptation of pro-
tocols, which generally assume routing, to random coding ap-
proaches. We do not address these here, but rather seek to estab-
lish that the potential benefits of random linear network coding
justify future consideration of protocol compatibility with or
adaptation to network codes.
The basic random linear network coding approach involves
no coordination among nodes. Implementations for various ap-
plications may not be completely protocol-free, but the roles
and requirements for protocols may be substantially redefined
in this new environment. For instance, if we allow for retrials to
find successful codes, we in effect trade code length for some
rudimentary coordination.
Portions of this work have appeared in [9], which introduced
distributed random linear network coding; [8], which presented
the Edmonds matrix formulation and a new bound on required
field size for centralized network coding; [12], which gener-
alized previous results to arbitrary networks and gave tighter
bounds for acyclic networks; [11], on network coding for ar-
bitrarily correlated sources; and [10], which considered random
linear network coding for online network operation in dynami-
cally varying environments.
A. Overview
A brief overview of related work is given in Section I-B. In
Section II, we describe the network model and algebraic coding
approach we use in our analyses, and introduce some notation
and existing results. Section III gives some insights arising from
consideration of bipartite matching and network flows. Suc-
cess/error probability bounds for random linear network coding
are given for independent and linearly correlated sources in Sec-
tion IV and for arbitrarily correlated sources in Section V. We
also give examples of practical scenarios in which randomized
network coding can be advantageous compared to routing, in
Section VI. We present our conclusions and some directions
for further work in Section VII. Proofs and ancillary results are
given in the Appendix.
B. Related Work
Ahlswede et al. [1] showed that with network coding, as
symbol size approaches infinity, a source can multicast infor-
mation at a rate approaching the smallest minimum cut between
the source and any receiver. Li et al. [19] showed that linear
coding with finite symbol size is sufficient for multicast. Koetter
and Médard [17] presented an algebraic framework for network
coding that extended previous results to arbitrary networks and
robust networking, and proved the achievability with time-in-
variant solutions of the min-cut max-flow bound for networks
with delay and cycles. Reference [17] also gave an algebraic
characterization of the feasibility of a multicast problem and
the validity of a network coding solution in terms of transfer
matrices, for which we gave in [8] equivalent formulations
obtained by considering bipartite matching and network flows.
We used these formulations in obtaining a tighter upper bound
on the required field size than the previous bound of [17], and
in our analysis of distributed randomized network coding, in-
troduced in [9]. Concurrent independent work by Sanders et al.
[26] and Jaggi et al. [14] considered single-source multicast on
acyclic delay-free graphs, showing a similar bound on field size
by different means, and giving centralized deterministic and
randomized polynomial-time algorithms for finding network
coding solutions over a subgraph consisting of flow solutions
to each receiver. Subsequent work by Fragouli and Soljanin [7]
gave a tighter bound for the case of two sources and for some
configurations with more than two sources. Lower bounds
on coding field size were presented by Rasala Lehman and
Lehman [18] and Feder et al. [6]. [6] also gave graph-specific
upper bounds based on the number of clashes between flows
from source to terminals.

Dougherty et al. [5] presented results on linear solutions for
binary solvable multicast networks, and on nonfinite field alpha-
bets. The need for vector coding solutions in some nonmulticast
problems was considered by Rasala Lehman and Lehman [18],
Médard et al. [22], and Riis [25]. Various practical protocols
for and experimental demonstrations of random linear network
coding [3] and nonrandomized network coding [29], [24] have
also been presented.
II. MODEL AND PRELIMINARIES
A. Basic Model
Our basic network coding model is based on [1], [17]. A network is represented as a directed graph G = (N, L), where N is the set of network nodes and L is the set of links, such that information can be sent noiselessly across each link. Each link l ∈ L is associated with a nonnegative real number c_l representing its transmission capacity in bits per unit time. The origin and destination nodes of a link l ∈ L, i.e., the nodes at which l starts and ends, are denoted o(l) and d(l), respectively. We assume o(l) ≠ d(l). The information transmitted on a link l is obtained as a coding function of information previously received at o(l).
There are r discrete memoryless information source processes X_1, ..., X_r which are random binary sequences. We denote by R the Slepian–Wolf region of the sources. Source process X_i is generated at node a(i), and multicast to all nodes in the set b(i), where a and b are arbitrary mappings. In this paper, we consider the (multisource) multicast case where b(i) is the same set D of receiver nodes for all i. The nodes a(1), ..., a(r) are called source nodes and the nodes in D are called receiver nodes, or receivers. For simplicity, we assume subsequently that the sets of source and receiver nodes are disjoint. The mapping a, the set D, and the Slepian–Wolf region R specify a set of multicast connection requirements. The connection requirements are satisfied if each receiver is able to reproduce, from its received information, the complete source information. A graph G = (N, L), a set of link capacities {c_l}, and a set of multicast connection requirements specify a multicast connection problem.
We make a number of simplifying assumptions. Our anal-
ysis for the case of independent source processes assumes that
each source process X_i has an entropy rate of one bit per unit
time; sources of larger rate are modeled as multiple sources at
the same node. For the case of linearly correlated sources, we
assume that the sources can be modeled as given linear combi-
nations of underlying independent source processes, each with
an entropy rate of one bit per unit time, as described further in
Section II-B. For the case of arbitrarily correlated sources, we
consider sources with integer bit rates and arbitrary joint prob-
ability distributions.
For the case of independent or linearly correlated sources,
each link l ∈ L is assumed to have a capacity of one bit per
unit time; links with larger capacities are modeled as parallel
links. For the case of arbitrarily correlated sources, the link rates
are assumed to be integers.
Reference [1] shows that coding enables the multicast infor-
mation rate from a single source to attain the minimum of the
individual receivers' max-flow bounds,¹ and shows how to con-
vert multicast problems with multiple independent sources to
single-source problems. Reference [19] shows that linear coding is sufficient to achieve the same individual max-flow rates; in fact, it suffices to do network coding using only scalar algebraic operations in a finite field F_{2^u}, for some sufficiently large u, on length-u vectors of bits that are viewed as elements of F_{2^u} [17].
The case of linearly correlated sources is similar.
For arbitrarily correlated sources, we consider operations in F_2 on vectors of bits. This vector coding model can, for given
vector lengths, be brought into the scalar algebraic framework
of [17] by conceptually expanding each source into multiple
sources and each link into multiple links, such that each new
source and link corresponds to one bit of the corresponding in-
formation vectors. We describe this scalar framework in Sec-
tion II-B, and use it in our analysis of arbitrarily correlated
sources in Section V. Note, however, that the linear decoding
strategies of [17] do not apply for the case of arbitrarily corre-
lated sources.
We consider both the case of acyclic networks where link
delays are not considered, as well as the case of general net-
works with cycles and link delays. The former case, which we
call delay-free, includes networks whose links are assumed to
have zero delay, as well as networks with link delays that are
operated in a burst [19], pipelined [26], or batched [3] fashion,
where information is buffered or delayed at intermediate nodes
so as to be combined with other incoming information from the
same batch. A cyclic graph may also be converted to an expanded acyclic graph, communication on which can be emulated over a number of time steps on the original cyclic graph [1]. For the latter case,
we consider general networks without buffering, and make the
simplifying assumption that each link has the same delay.
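The cyclic-to-acyclic conversion mentioned above can be pictured as a time-expanded layered graph. The sketch below is one common construction under unit link delays, not necessarily the exact expansion used in [1]:

```python
def time_expand(nodes, links, steps):
    """Unroll a (possibly cyclic) unit-delay graph over `steps` time
    layers: node v at time t becomes (v, t); a link (u, v) becomes
    (u, t) -> (v, t + 1); an extra (v, t) -> (v, t + 1) edge models
    memory (buffering) at node v."""
    layered = []
    for t in range(steps - 1):
        for u, v in links:
            layered.append(((u, t), (v, t + 1)))
        for v in nodes:
            layered.append(((v, t), (v, t + 1)))   # buffering at v
    return layered

# A two-node cycle a -> b -> a becomes acyclic after unrolling.
g = time_expand(["a", "b"], [("a", "b"), ("b", "a")], steps=3)
# Every edge goes from layer t to layer t + 1, so the result is acyclic.
print(all(t2 == t1 + 1 for (_, t1), (_, t2) in g))   # prints True
```

Communication on the original cyclic graph over `steps` time slots corresponds to one shot of communication on the layered graph.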
We use some additional definitions in this paper. Link l is an incident outgoing link of node v if v = o(l), and an incident incoming link of v if v = d(l). We call an incident incoming link of a receiver node a terminal link, and denote by T_β the set of terminal links of a receiver β. A path is a subgraph of the network consisting of a sequence of links l_1, l_2, ..., l_k such that d(l_i) = o(l_{i+1}) for 1 ≤ i < k, and is denoted (l_1, l_2, ..., l_k). A flow solution for a receiver β is a set of links forming r link-disjoint paths each connecting a different source to β.
¹That is, the maximum commodity flow from the source to individual receivers.
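A flow solution as defined above exists for a receiver exactly when the max-flow from the sources to that receiver is at least r, which connects the definition to the max-flow bound in the footnote. A minimal sketch using Edmonds–Karp on unit-capacity links (the butterfly topology below is a standard illustrative example, not one of the paper's figures):

```python
from collections import defaultdict, deque

def max_flow(edges, s, t):
    """BFS-based Ford-Fulkerson (Edmonds-Karp) on unit-capacity links."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1
        adj[u].add(v)
        adj[v].add(u)          # residual direction
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t                   # augment one unit along the path found
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Butterfly network: source s multicasts 2 processes to receivers y and z.
butterfly = [("s", "a"), ("s", "b"), ("a", "y"), ("b", "z"),
             ("a", "c"), ("b", "c"), ("c", "d"), ("d", "y"), ("d", "z")]
print(min(max_flow(butterfly, "s", r) for r in ("y", "z")))   # prints 2
```

Each receiver has max-flow 2, so a flow solution with r = 2 exists for both, and coding on the shared bottleneck link achieves the multicast rate 2.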

B. Algebraic Network Coding
In the scalar algebraic coding framework of [17], the source information processes, the receiver output processes, and the information processes transmitted on each link, are sequences of length-u blocks or vectors of bits, which are treated as elements of a finite field F_q, q = 2^u. The information process Y(l) transmitted on a link l is formed as a linear combination, in F_q, of link l's inputs, i.e., source processes X_i for which a(i) = o(l) and random processes Y(j) for which d(j) = o(l), if any. For the delay-free case, this is represented by the equation

Y(l) = Σ_{i : a(i) = o(l)} a_{l,i} X_i + Σ_{j : d(j) = o(l)} f_{l,j} Y(j).

The i-th output process Z_{β,i} at receiver node β is a linear combination of the information processes on its terminal links, represented as

Z_{β,i} = Σ_{l : d(l) = β} b_{β,i,l} Y(l).
For multicast on a network with link delays, memory is needed at the receiver (or source) nodes, but memoryless operation suffices at all other nodes [17]. We consider unit delay links, modeling links with longer delay as links in series. The corresponding linear coding equations are the time-indexed analogs of the delay-free equations, where the coefficients and processes take the values of the corresponding variables at time t, and μ represents the memory required. These equations, as with the random processes in the network, can be represented algebraically in terms of a delay variable D (1).
The coefficients can be collected into matrices A = (a_{l,i}) and F = (f_{l,j}), with entries in F_q in the acyclic delay-free case and polynomials in D in the general case with delays, and matrices B_β = (b_{β,i,l}) for each receiver β, whose structure is constrained by the network. A pair (A, F) or tuple (A, F, B) can be called a linear network code.
We also consider a class of linearly correlated sources modeled as given linear combinations of underlying independent processes, each with an entropy and bit rate of one bit per unit time. To simplify the notation in our subsequent development, we work with these underlying independent processes in a similar manner as for the case of independent sources: the l-th column of the A matrix is a linear function of given column vectors that specify the mappings from the underlying independent processes to the source processes at o(l).² A receiver that decodes these underlying independent processes is able to reconstruct the linearly correlated source processes.
For acyclic graphs, we assume an ancestral indexing of links in L, i.e., if d(l) = o(j) for any links l, j, then l has a lower index than j. Such an indexing always exists for acyclic networks. It then follows that matrix F is upper triangular with zeros on the diagonal.
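Because F is strictly upper triangular under an ancestral indexing, it is nilpotent, so (I - F)^{-1} = I + F + F^2 + ... is a finite sum. A small sketch over GF(2), with an illustrative three-link F (link 1 feeds links 2 and 3, link 2 feeds link 3):

```python
def matmul(A, B, q):
    """Matrix product with entries reduced modulo q."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % q
             for j in range(n)] for i in range(n)]

def series_inverse(F, q):
    """(I - F)^{-1} = I + F + F^2 + ...; the sum terminates because F
    is strictly upper triangular (ancestral order), hence nilpotent."""
    n = len(F)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    G, term = I, I
    for _ in range(n):           # F^n = 0 for an n x n nilpotent matrix
        term = matmul(term, F, q)
        G = [[(g + t) % q for g, t in zip(gr, tr)]
             for gr, tr in zip(G, term)]
    return G

F = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
G = series_inverse(F, 2)
# In characteristic 2, I - F equals I + F.
IminusF = [[(int(i == j) + F[i][j]) % 2 for j in range(3)] for i in range(3)]
print(matmul(IminusF, G, 2))     # prints the 3x3 identity matrix
```

The check (I - F)G = I confirms the series really is the inverse.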
Let G = (I - F)^{-1}.³ The mapping from source processes X_1, ..., X_r to output processes Z_{β,1}, ..., Z_{β,r} at a receiver β is given by the transfer matrix A G B_β^T [17]. For a given multicast connection problem, if some network code (A, F, B) in a field F_q (or F_q(D)) satisfies the condition that A G B_β^T has full rank r for each receiver β, then it satisfies the connection requirements and is a solution to the multicast connection problem in the same field. A multicast connection problem for which there exists a solution in some field F_q or F_q(D) is called feasible, and the corresponding connection requirements are said to be feasible for the network.
²We can also consider the case where the given column vectors have entries in F_2, by restricting network coding to occur in F_q, q = 2.
³For the acyclic delay-free case, the sequence (I - F)^{-1} = I + F + F^2 + ... converges since F is nilpotent for an acyclic network. For the case with delays, (I - F)^{-1} exists since the determinant of I - F is nonzero in its field of definition, as seen by letting D = 0 [17].

In subsequent sections, where we consider choosing the value of (A, F, B) by distributed random coding, the following definitions are useful: if for a receiver β there exists some value of B_β such that the transfer matrix A(I - F)^{-1} B_β^T has full rank r, then (A, F) is a valid network code for β; a network code is valid for a multicast connection problem if it is valid for all receivers.
The l-th column of the matrix A(I - F)^{-1} specifies the mapping from source processes to the random process on link l. We write C = A(I - F)^{-1} and denote by C_T the submatrix consisting of the columns of C corresponding to a set of links T.
For a receiver β to decode, it needs to know the mapping C_{T_β} from the source processes to the random processes on its terminal links. The entries of C_{T_β} are scalar elements of F_q in the acyclic delay-free case, and polynomials in delay variable D in the case with link delays. In the latter case, the number of terms of these polynomials and the memory required at the receivers depend on the number of links involved in cycles, which act like memory registers, in the network.
We use the notational convention that matrices are named with bold upper case letters and vectors are named with bold lower case letters.
III. INSIGHTS FROM BIPARTITE MATCHING AND NETWORK FLOWS
As described in the previous section, for a multicast connection problem with independent or linearly correlated sources, the transfer matrix condition of [17] for the problem to be feasible (or for a particular linear network code defined by matrices (A, F, B) to be valid for the connection problem) is that for each receiver β, the transfer matrix A(I - F)^{-1} B_β^T has nonzero determinant. The following result shows the equivalence of this transfer matrix condition and the Edmonds matrix formulation for checking if a bipartite graph has a perfect matching (e.g., [23]). The problem of determining whether a bipartite graph has a perfect matching is a classical reduction of the problem of checking the feasibility of an s–t flow [15].⁴ This latter problem can be viewed as a degenerate case of network coding, restricted to the binary field and without any coding; it is interesting to find that the two formulations are equivalent for the more general case of linear network coding in higher order fields.
Lemma 1:
(a) For an acyclic delay-free network, the determinant of the transfer matrix M_β = A(I - F)^{-1}B_β^T for receiver β is equal to
⁴The problem of checking the feasibility of an s-t flow of size r on graph G = (V, E) can be reduced to a bipartite matching problem by constructing the following bipartite graph: one set of the bipartite graph has r nodes u_1, ..., u_r, and a node v_l corresponding to each link l ∈ E; the other set of the bipartite graph has r nodes w_1, ..., w_r, and a node v'_l corresponding to each link l ∈ E. The bipartite graph has links joining each node u_i to each node v'_l such that o(l) = s, a link joining node v_l to the corresponding node v'_l for all l ∈ E, links joining node v_l to v'_j for each pair (l, j) ∈ E × E such that d(l) = o(j), and links joining each node w_i to each node v_l such that d(l) = t. The s-t flow is feasible if and only if the bipartite graph has a perfect matching.
± det [[A, 0], [I - F, B_β^T]], where the block matrix [[A, 0], [I - F, B_β^T]] is the corresponding Edmonds matrix.
(b) For an arbitrary network with unit delay links, the transfer matrix M_β for receiver β is nonsingular if and only if the corresponding Edmonds matrix [[A, 0], [I - F, B_β^T]] is nonsingular.
Proof: See Appendix A.
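The reduction in footnote 4 can be sketched in code. Everything here is an illustrative assumption: a toy graph with two link-disjoint s-t paths, and a simple augmenting-path (Kuhn) matching routine rather than any method from the paper.

```python
# Toy graph (assumed example): s -> a, s -> b, a -> t, b -> t.
# Check feasibility of an s-t flow of size r = 2 via perfect matching.
links = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]  # (o(l), d(l))
r, E = 2, range(len(links))

# Left side: u_0..u_{r-1} and v_l; right side: w_0..w_{r-1} and v'_l ("vp").
left = [("u", i) for i in range(r)] + [("v", l) for l in E]
adj = {node: [] for node in left}
for i in range(r):
    for l in E:
        if links[l][0] == "s":
            adj[("u", i)].append(("vp", l))      # u_i -- v'_l with o(l) = s
for l in E:
    adj[("v", l)].append(("vp", l))              # v_l -- v'_l (link l unused)
    for j in E:
        if links[l][1] == links[j][0]:
            adj[("v", l)].append(("vp", j))      # v_l -- v'_j with d(l) = o(j)
    if links[l][1] == "t":
        for i in range(r):
            adj[("v", l)].append(("w", i))       # w_i -- v_l with d(l) = t

match = {}                                       # right node -> left node

def augment(u, seen):
    # Kuhn's augmenting-path step for left node u.
    for v in adj[u]:
        if v not in seen:
            seen.add(v)
            if v not in match or augment(match[v], seen):
                match[v] = u
                return True
    return False

size = sum(augment(u, set()) for u in left)
print("feasible" if size == len(left) else "infeasible")
```

For this graph the matching is perfect (paths s-a-t and s-b-t), so a flow of size 2 is feasible.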
The usefulness of this result is in making apparent various characteristics of the transfer matrix determinant polynomial that are obscured in the original transfer matrix by the matrix products and inverse. For instance, the maximum exponent of a variable, the total degree of the polynomial, and its form for linearly correlated sources are easily deduced, leading to Theorems 1 and 2.
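The equivalence in Lemma 1(a) can be spot-checked numerically. In this hedged sketch the matrices A, F, B and the exact-arithmetic determinant routine are illustrative assumptions, not the paper's construction; the check is that det M_β agrees, up to sign, with the determinant of the block Edmonds matrix [[A, 0], [I - F, B^T]].

```python
from fractions import Fraction

def det(M):
    # Determinant by Gaussian elimination over the rationals.
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c]), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d

# Assumed toy instance: r = 2 sources, 3 links, acyclic gains.
A = [[1, 0, 0],
     [0, 2, 0]]
F = [[0, 0, 3],
     [0, 0, 1],
     [0, 0, 0]]
B = [[0, 1, 0],
     [0, 0, 1]]   # receiver's terminal links: 1 and 2

# Transfer matrix M = A (I - F)^{-1} B^T, using (I - F)^{-1} = I + F (F^2 = 0 here).
IpF = [[int(i == j) + F[i][j] for j in range(3)] for i in range(3)]
G = [[sum(A[i][k] * IpF[k][j] for k in range(3)) for j in range(3)] for i in range(2)]
M = [[sum(G[i][k] * B[j][k] for k in range(3)) for j in range(2)] for i in range(2)]

# Edmonds matrix [[A, 0], [I - F, B^T]].
ImF = [[int(i == j) - F[i][j] for j in range(3)] for i in range(3)]
Edm = [A[i] + [0, 0] for i in range(2)] + \
      [ImF[i] + [B[0][i], B[1][i]] for i in range(3)]

assert abs(det(M)) == abs(det(Edm))   # Lemma 1(a), up to sign
print(det(M), det(Edm))
```

Notice that the Edmonds matrix involves no matrix inverse, which is exactly why properties such as the degree of the determinant polynomial become visible.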
For the acyclic delay-free case, Lemma 2 below is another alternative formulation of the same transfer matrix condition which illuminates similar properties of the transfer matrix determinant as Lemma 1. Furthermore, by considering network coding as a superposition of flow solutions, Lemma 2 allows us to tighten, in Theorem 3, the bound of Theorem 2 for random network coding on given acyclic networks in terms of the number of links in a flow solution for an individual receiver.
Lemma 2: A multicast connection problem with r sources is feasible (or a particular network code is valid for the problem) if and only if each receiver β has a set T of r terminal links for which det G_T ≠ 0, where G_T is the submatrix of G = A(I - F)^{-1} consisting of the columns corresponding to the links in T. The determinant det G_T is given by a signed sum, over all flow solutions from the sources to links in T, of the products of gains on the paths of each solution, each such solution being a set of r link-disjoint paths each connecting a different source to a different link in T.
Proof: See Appendix A.
Lemma 1 leads to the following upper bound on required field size for a feasible multicast problem, which tightens the upper bound of [17], a bound that grows with the number of processes being transmitted in the network.
Theorem 1: For a feasible multicast connection problem with independent or linearly correlated sources and d receivers, in both the acyclic delay-free case and the general case with delays, there exists a valid linear network code in a finite field F_q of size q > d.
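Ahead of the distributed random coding analysis in subsequent sections, here is a hedged Monte Carlo sketch on an assumed two-source toy network (not from the paper): coefficients are drawn uniformly from F_q, and the empirical validity rate is compared against a (1 - d/q)^η-style lower bound, where η counts the randomized coefficients and d the receivers.

```python
import random

# Assumed toy network: two sources on links 1 and 2, a coding link 3 carrying
# f1*x1 + f2*x2, and one receiver observing links 1 and 3.  The code is valid
# iff det [[1, f1], [0, f2]] = f2 != 0 in F_q.
def trial(q):
    f1 = random.randrange(q)   # randomized coefficient (counted in eta, cancels in det)
    f2 = random.randrange(q)
    return f2 != 0             # determinant of the 2x2 transfer matrix

random.seed(1)
q, eta, d, N = 7, 2, 1, 20000
success = sum(trial(q) for _ in range(N)) / N
bound = (1 - d / q) ** eta     # (1 - d/q)^eta style lower bound
print(success >= bound)        # empirical rate ~ 1 - 1/q, above the bound
```

Increasing q drives both the bound and the empirical rate toward 1, matching the intuition that larger fields make random codes more likely to be valid.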
Citations
Book
16 Jan 2012
TL;DR: This article provides a comprehensive treatment of network information theory and its applications, offering the first unified coverage of both classical and recent results, including successive cancellation and superposition coding, MIMO wireless communication, network coding, and cooperative relaying.
Abstract: This comprehensive treatment of network information theory and its applications provides the first unified coverage of both classical and recent results. With an approach that balances the introduction of new models and new coding techniques, readers are guided through Shannon's point-to-point information theory, single-hop networks, multihop networks, and extensions to distributed computing, secrecy, wireless communication, and networking. Elementary mathematical tools and techniques are used throughout, requiring only basic knowledge of probability, whilst unified proofs of coding theorems are based on a few simple lemmas, making the text accessible to newcomers. Key topics covered include successive cancellation and superposition coding, MIMO wireless communication, network coding, and cooperative relaying. Also covered are feedback and interactive communication, capacity approximations and scaling laws, and asynchronous and random access channels. This book is ideal for use in the classroom, for self-study, and as a reference for researchers and engineers in industry and academia.

2,442 citations

Journal ArticleDOI
TL;DR: The results show that using COPE at the forwarding layer, without modifying routing and higher layers, increases network throughput, and the gains vary from a few percent to several folds depending on the traffic pattern, congestion level, and transport protocol.
Abstract: This paper proposes COPE, a new architecture for wireless mesh networks. In addition to forwarding packets, routers mix (i.e., code) packets from different sources to increase the information content of each transmission. We show that intelligently mixing packets increases network throughput. Our design is rooted in the theory of network coding. Prior work on network coding is mainly theoretical and focuses on multicast traffic. This paper aims to bridge theory with practice; it addresses the common case of unicast traffic, dynamic and potentially bursty flows, and practical issues facing the integration of network coding in the current network stack. We evaluate our design on a 20-node wireless network, and discuss the results of the first testbed deployment of wireless network coding. The results show that using COPE at the forwarding layer, without modifying routing and higher layers, increases network throughput. The gains vary from a few percent to several folds depending on the traffic pattern, congestion level, and transport protocol.

2,190 citations

Journal ArticleDOI
TL;DR: It is shown that there is a fundamental tradeoff between storage and repair bandwidth which is theoretically characterize using flow arguments on an appropriately constructed graph and regenerating codes are introduced that can achieve any point in this optimal tradeoff.
Abstract: Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a single node failure is for a new node to reconstruct the whole encoded data object to generate just one encoded block. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to communicate functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.

1,919 citations


Cites background from "A Random Linear Network Coding Appr..."

  • ...The studies by Ho et al. [ 22 ] and Sanders et al. [23] further showed that random linear network coding over a sufficiently large finite field can (asymptotically) achieve the multicast capacity....

    [...]

  • ...Further, simple random linear combinations will suffice with high probability as the field size over which coding is performed grows, as shown by Ho. et al. [ 22 ]....

    [...]

Proceedings ArticleDOI
30 Nov 2006
TL;DR: This paper presents their recent experiences with a highly optimized and high-performance C++ implementation of randomized network coding at the application layer, and presents their observations based on an extensive series of experiments.
Abstract: With network coding, intermediate nodes between the source and the receivers of an end-to-end communication session are not only capable of relaying and replicating data messages, but also of coding incoming messages to produce coded outgoing ones. Recent studies have shown that network coding is beneficial for peer-to-peer content distribution, since it eliminates the need for content reconciliation, and is highly resilient to peer failures. In this paper, we present our recent experiences with a highly optimized and high-performance C++ implementation of randomized network coding at the application layer. We present our observations based on an extensive series of experiments, draw conclusions from a wide range of scenarios, and are more cautious and less optimistic as compared to previous studies.

1,525 citations

Posted Content
TL;DR: In this paper, the authors introduce a general technique to analyze storage architectures that combine any form of coding and replication, as well as presenting two new schemes for maintaining redundancy using erasure codes.
Abstract: Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance bandwidth than simple replication to provide the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate a new fragment in a distributed way while transferring as little data as possible across the network. In this paper, we introduce a general technique to analyze storage architectures that combine any form of coding and replication, as well as presenting two new schemes for maintaining redundancy using erasure codes. First, we show how to optimally generate MDS fragments directly from existing fragments in the system. Second, we introduce a new scheme called Regenerating Codes which use slightly larger fragments than MDS but have lower overall bandwidth use. We also show through simulation that in realistic environments, Regenerating Codes can reduce maintenance bandwidth use by 25 percent or more compared with the best previous design--a hybrid of replication and erasure codes--while simplifying system architecture.

1,389 citations

References
Journal ArticleDOI
TL;DR: This work reveals that it is in general not optimal to regard the information to be multicast as a "fluid" which can simply be routed or replicated, and by employing coding at the nodes, which the work refers to as network coding, bandwidth can in general be saved.
Abstract: We introduce a new class of problems called network information flow which is inspired by computer network applications. Consider a point-to-point communication network on which a number of information sources are to be multicast to certain sets of destinations. We assume that the information sources are mutually independent. The problem is to characterize the admissible coding rate region. This model subsumes all previously studied models along the same line. We study the problem with one information source, and we have obtained a simple characterization of the admissible coding rate region. Our result can be regarded as the max-flow min-cut theorem for network information flow. Contrary to one's intuition, our work reveals that it is in general not optimal to regard the information to be multicast as a "fluid" which can simply be routed or replicated. Rather, by employing coding at the nodes, which we refer to as network coding, bandwidth can in general be saved. This finding may have significant impact on future design of switching systems.

8,533 citations


"A Random Linear Network Coding Appr..." refers background or methods in this paper

  • ...THE capacity of multicast networks with network coding was given in [1]....

    [...]

  • ...[1] showed that with network coding, as symbol size approaches infinity, a source can multicast information at a rate approaching the smallest minimum cut between the source and any receiver....

    [...]

  • ...Reference [1] shows that coding enables the multicast infor-...

    [...]

  • ...given by the max-flow min-cut bound of [1], in multisource...

    [...]

  • ...A cyclic graph with nodes and rate may also be converted to an expanded acyclic graph with nodes and rate at least , communication on which can be emulated over time steps on the original cyclic graph [1]....

    [...]

Book
01 Jan 1995
TL;DR: This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.
Abstract: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. This book introduces the basic concepts in the design and analysis of randomized algorithms. The first part of the text presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications. Algorithmic examples are also given to illustrate the use of each tool in a concrete setting. In the second part of the book, each chapter focuses on an important area to which randomized algorithms can be applied, providing a comprehensive and representative selection of the algorithms that might be used in each of these areas. Although written primarily as a text for advanced undergraduates and graduate students, this book should also prove invaluable as a reference for professionals and researchers.

4,412 citations

Journal ArticleDOI
David Slepian, Jack K. Wolf
TL;DR: The minimum number of bits per character R_X and R_Y needed to encode these sequences so that they can be faithfully reproduced under a variety of assumptions regarding the encoders and decoders is determined.
Abstract: Correlated information sequences ..., X_{-1}, X_0, X_1, ... and ..., Y_{-1}, Y_0, Y_1, ... are generated by repeated independent drawings of a pair of discrete random variables X, Y from a given bivariate distribution P_{XY}(x, y). We determine the minimum number of bits per character R_X and R_Y needed to encode these sequences so that they can be faithfully reproduced under a variety of assumptions regarding the encoders and decoders. The results, some of which are not at all obvious, are presented as an admissible rate region R in the R_X-R_Y plane. They generalize a similar and well-known result for a single information sequence, namely R_X ≥ H(X) for faithful reproduction.

4,165 citations


"A Random Linear Network Coding Appr..." refers background in this paper

  • ...Second, in the context of a distributed source coding problem, we demonstrate that random linear coding also performs compression when necessary in a network, generalizing known error exponents for linear Slepian–Wolf coding [4] in a natural way....

    [...]

  • ...The error exponents for general networks reduce to those obtained in [4] for the Slepian–Wolf network....

    [...]

  • ...Analogously to Slepian and Wolf [28], we consider the problem of distributed encoding and joint decoding of two sources whose output values in each unit time period are drawn independent and identically distributed (i.i.d.) from the same joint distribution ....

    [...]

  • ...In the special case of a network consisting of one direct link from each source to a common receiver, this reduces to the original Slepian–Wolf problem....

    [...]

Journal ArticleDOI
TL;DR: This work forms this multicast problem and proves that linear coding suffices to achieve the optimum, which is the max-flow from the source to each receiving node.
Abstract: Consider a communication network in which certain source nodes multicast information to other nodes on the network in the multihop fashion where every node can pass on any of its received data to others. We are interested in how fast each node can receive the complete information, or equivalently, what the information rate arriving at each node is. Allowing a node to encode its received data before passing it on, the question involves optimization of the multicast mechanisms at the nodes. Among the simplest coding schemes is linear coding, which regards a block of data as a vector over a certain base field and allows a node to apply a linear transformation to a vector before passing it on. We formulate this multicast problem and prove that linear coding suffices to achieve the optimum, which is the max-flow from the source to each receiving node.

3,660 citations

Journal ArticleDOI
TL;DR: For the multicast setup it is proved that there exist coding strategies that provide maximally robust networks and that do not require adaptation of the network interior to the failure pattern in question.
Abstract: We take a new look at the issue of network capacity. It is shown that network coding is an essential ingredient in achieving the capacity of a network. Building on recent work by Li et al.(see Proc. 2001 IEEE Int. Symp. Information Theory, p.102), who examined the network capacity of multicast networks, we extend the network coding framework to arbitrary networks and robust networking. For networks which are restricted to using linear network codes, we find necessary and sufficient conditions for the feasibility of any given set of connections over a given network. We also consider the problem of network recovery for nonergodic link failures. For the multicast setup we prove that there exist coding strategies that provide maximally robust networks and that do not require adaptation of the network interior to the failure pattern in question. The results are derived for both delay-free networks and networks with delays.

2,628 citations


"A Random Linear Network Coding Appr..." refers background or methods in this paper

  • ...We used these formulations in obtaining a tighter upper bound on the required field size than the previous bound of [17], and in our analysis of distributed randomized network coding, introduced in [9]....

    [...]

  • ...a receiver is given by the transfer matrix [17]....

    [...]

  • ...Note, however, that the linear decoding strategies of [17] do not apply for the case of arbitrarily correlated sources....

    [...]

  • ...In the scalar algebraic coding framework of [17], the source information processes, the receiver output processes, and the in-...

    [...]

  • ...The need for vector coding solutions in some nonmulticast problems was considered by Rasala Lehman and Lehman [18], Médard et al. [22], and Riis [25]....

    [...]

Frequently Asked Questions (12)
Q1. What contributions have the authors mentioned in the paper "A random linear network coding approach to multicast" ?

The authors present a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks. The authors show that this achieves capacity with probability exponentially approaching 1 with the code length. The authors also demonstrate that random linear coding performs compression when necessary in a network, generalizing error exponents for linear Slepian–Wolf coding in a natural way. Benefits of this approach are decentralized operation and robustness to network changes or link failures. The authors show that this approach can take advantage of redundant network capacity for improved success probability and robustness. The authors illustrate some potential advantages of random linear network coding over routing in two examples of practical scenarios: distributed network operation and networks with dynamically varying connections. 

Further work includes extensions to nonuniform code distributions, possibly chosen adaptively or with some rudimentary coordination, to optimize different performance goals. 

Lower bounds on coding field size were presented by Rasala Lehman and Lehman [18] and Feder et al. [6]. [6] also gave graph-specific upper bounds based on the number of “clashes” between flows from source to terminals. 

If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 - d/q)^η, where η is the maximum number of links with associated random coefficients in any set of links constituting a flow solution for any receiver.

Nodes that cannot determine the appropriate code coefficients from local information choose the coefficients independently and uniformly from F_q.

The authors have given a general bound on the success probability of such codes for arbitrary networks, showing that error probability decreases exponentially with code length. 

If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 - d/q)^η, where η is the number of links with associated random coefficients.

For the case of arbitrarily correlated sources, the authors consider sources with integer bit rates and arbitrary joint probability distributions. 

The next bound is useful in cases where analysis of connection feasibility is easier than direct analysis of random linear coding. 

To this end, the authors use a small field size that allows random linear coding to generally match the performance of the Steiner heuristic, and to surpass it in networks whose topology makes Steiner tree routing difficult. 

It is intuitive that having more redundant capacity in the network, for instance, should increase the probability that a random linear code will be valid.

These examples suggest that the decentralized nature and robustness of random linear network coding can offer significant advantages in settings that hinder optimal centralized network control.