
A Random Linear Network Coding Approach to Multicast

TL;DR: This work presents a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks, and shows that this approach can take advantage of redundant network capacity for improved success probability and robustness.
Abstract: We present a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks. Network nodes independently and randomly select linear mappings from inputs onto output links over some field. We show that this achieves capacity with probability exponentially approaching 1 with the code length. We also demonstrate that random linear coding performs compression when necessary in a network, generalizing error exponents for linear Slepian-Wolf coding in a natural way. Benefits of this approach are decentralized operation and robustness to network changes or link failures. We show that this approach can take advantage of redundant network capacity for improved success probability and robustness. We illustrate some potential advantages of random linear network coding over routing in two examples of practical scenarios: distributed network operation and networks with dynamically varying connections. Our derivation of these results also yields a new bound on required field size for centralized network coding on general multicast networks.

Summary

Introduction

  • This paper is an initial exploration of random linear network coding, posing more questions than it answers.
  • Resource consumption can naturally be traded off against capacity and robustness, and across multiple communicating sessions; subsequent work on distributed resource optimization, e.g., [13], [21], has used random linear network coding as a component of the solution.

A. Overview

  • In Section II, the authors describe the network model and algebraic coding approach they use in their analyses, and introduce some notation and existing results.
  • Section III gives some insights arising from consideration of bipartite matching and network flows.
  • Success/error probability bounds for random linear network coding are given for independent and linearly correlated sources in Section IV and for arbitrarily correlated sources in Section V.
  • The authors also give examples of practical scenarios in which randomized network coding can be advantageous compared to routing, in Section VI.
  • The authors present their conclusions and some directions for further work in Section VII.

A. Basic Model

  • Nodes o(l) and d(l) are called the origin and destination, respectively, of link l.
  • The authors consider the multicast case where every receiver demands all of the source processes.
  • For the case of linearly correlated sources, the authors assume that the sources can be modeled as given linear combinations of underlying independent source processes, each with an entropy rate of one bit per unit time, as described further in Section II-B.
  • For the case of general networks with cycles and link delays, the authors consider networks without buffering, and make the simplifying assumption that each link has the same delay.

IV. RANDOM LINEAR NETWORK CODING FOR INDEPENDENT OR LINEARLY CORRELATED SOURCES

  • The authors consider random linear network codes in which some or all of the network code coefficients for linearly correlated sources are chosen independently and uniformly over F_q, where the field size q is greater than the number of receivers d.
  • The code length u is the logarithm of the field size q = 2^u.
  • The bound of Theorem 2 is very general, applying across all networks with the same number of receivers and the same number of links with associated random code coefficients, without considering specific network structure.
  • If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 - d/q)^η, where η is the maximum number of links with associated random coefficients in any set of links constituting a flow solution for any receiver.
  • Theorem 5 considers a multicast connection problem on an acyclic network with independent or linearly correlated sources of joint entropy rate r, and links which fail (are deleted from the network) with some probability p.

V. RANDOM LINEAR NETWORK CODING FOR ARBITRARILY CORRELATED SOURCES

  • So far the authors have been considering independent or linearly correlated sources.
  • Analogously to Slepian and Wolf [28], the authors consider the problem of distributed encoding and joint decoding of two sources whose output values in each unit time period are drawn independently and identically distributed (i.i.d.) from the same joint distribution.
  • The authors denote by L the maximum source–receiver path length.
  • Theorem 6 bounds the error probability of the random linear network code by an expression involving dummy random variables with the same joint distribution as the sources.
  • The table gives bounds as well as some actual probability values where exact calculations are tractable.
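The distributed encoding and joint decoding described above can be illustrated with a toy sketch (a single point-to-point Slepian–Wolf instance, not the paper's general network setting): one encoder sends its binary block in full, the other sends only a shorter random linear syndrome over F_2, and the decoder searches the syndrome's coset for the candidate most probable under the correlation model. All block lengths and rates here are made up for illustration.

```python
import itertools
import random

random.seed(1)

def matvec(A, x):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(a & xi for a, xi in zip(row, x)) % 2 for row in A)

n, m, p = 10, 8, 0.1          # block length, syndrome bits, correlation
x1 = tuple(random.randint(0, 1) for _ in range(n))
noise = tuple(1 if random.random() < p else 0 for _ in range(n))
x2 = tuple(a ^ b for a, b in zip(x1, noise))   # x2 = x1 plus sparse noise

# Encoder 2 compresses: it sends only the random linear syndrome A x2
# (m < n bits); encoder 1 sends x1 in full (a Slepian-Wolf corner point).
A = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]
s = matvec(A, x2)

# Joint decoding: among all candidates consistent with the syndrome,
# pick the one closest to x1, i.e., most probable under the noise model.
best = min((c for c in itertools.product((0, 1), repeat=n)
            if matvec(A, c) == s),
           key=lambda c: sum(a ^ b for a, b in zip(c, x1)))
print(sum(a ^ b for a, b in zip(best, x2)))  # 0 when decoding succeeds
```

The exhaustive coset search stands in for the minimum-entropy decoding analyzed in the paper and is only viable at toy block lengths.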

VI. BENEFITS OF RANDOMIZED CODING OVER ROUTING

  • Network coding, as a superset of routing, has been shown to offer significant capacity gains for networks with special structure [26].
  • For many other networks, network coding does not give higher capacity than centralized optimal routing, but can offer other advantages when centralized optimal routing is difficult.
  • The authors consider two types of network scenarios in which distributed random linear coding can be particularly useful.

A. Distributed Settings

  • In networks with large numbers of nodes and/or changing topologies, it may be expensive or infeasible to reliably maintain routing state at network nodes.
  • The source node sends one process in both directions on one axis and the other process in both directions along the other axis, as illustrated in Fig.
  • A node receiving information on two links sends one of the incoming processes on one of its two outgoing links with equal probability, and the other process on the remaining link.
  • Proposition 1 lower-bounds the probability that a receiver located at a given grid position relative to the source can decode both source processes under the randomized flooding scheme RF.
  • This simple scheme, unlike the randomized flooding scheme RF, leaves out the optimization that each node receiving two linearly independent processes should always send out two linearly independent processes.

B. Dynamically Varying Connections

  • Another scenario in which random linear network coding can be advantageous is for multisource multicast with dynamically varying connections.
  • The parameter values for the tests were chosen such that the resulting random graphs would in general be connected and able to support some of the desired connections, while being small enough for the simulations to run efficiently.
  • In each time slot, a source was either on, i.e., transmitting source information, or off, i.e., not transmitting source information.
  • Each of these types of overhead depends on the coding field size.
  • To this end, the authors use a small field size that allows random linear coding to generally match the performance of the Steiner heuristic, and to surpass it in networks whose topology makes Steiner tree routing difficult.

VII. CONCLUSION

  • The authors have presented a distributed random linear network coding approach which asymptotically achieves capacity, as given by the max-flow min-cut bound of [1], in multisource multicast networks.
  • These examples suggest that the decentralized nature and robustness of random linear network coding can offer significant advantages in settings that hinder optimal centralized network control.
  • Further work includes extensions to nonuniform code distributions, possibly chosen adaptively or with some rudimentary coordination, to optimize different performance goals.
  • The randomized and distributed nature of the approach also leads us naturally to consider applications in network security.
  • It would also be interesting to consider protocol issues for different communication scenarios, and to compare specific coding and routing protocols over a range of performance metrics.


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 10, OCTOBER 2006 4413
A Random Linear Network Coding Approach to
Multicast
Tracey Ho, Member, IEEE, Muriel Médard, Senior Member, IEEE, Ralf Koetter, Senior Member, IEEE,
David R. Karger, Associate Member, IEEE, Michelle Effros, Senior Member, IEEE, Jun Shi, and Ben Leong
Abstract—We present a distributed random linear network
coding approach for transmission and compression of informa-
tion in general multisource multicast networks. Network nodes
independently and randomly select linear mappings from inputs
onto output links over some field. We show that this achieves ca-
pacity with probability exponentially approaching 1 with the code
length. We also demonstrate that random linear coding performs
compression when necessary in a network, generalizing error ex-
ponents for linear Slepian–Wolf coding in a natural way. Benefits
of this approach are decentralized operation and robustness to
network changes or link failures. We show that this approach
can take advantage of redundant network capacity for improved
success probability and robustness. We illustrate some potential
advantages of random linear network coding over routing in two
examples of practical scenarios: distributed network operation
and networks with dynamically varying connections. Our deriva-
tion of these results also yields a new bound on required field size
for centralized network coding on general multicast networks.
Index Terms—Distributed compression, distributed networking,
multicast, network coding, random linear coding.
I. INTRODUCTION
THE capacity of multicast networks with network coding
was given in [1]. We present an efficient distributed ran-
domized approach that asymptotically achieves this capacity.
We consider a general multicast framework—multisource mul-
ticast, possibly with correlated sources, on general networks.
Manuscript received February 26, 2004; revised June 1, 2006. This work
was supported in part by the National Science Foundation under Grants CCF-
0325324, CCR-0325673, and CCR-0220039, by Hewlett-Packard under Con-
tract 008542-008, and by Caltech’s Lee Center for Advanced Networking.
T. Ho was with the Laboratory for Information and Decision Systems (LIDS),
Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA.
She is now with the California Institute of Technology (Caltech), Pasadena, CA
91125 USA (e-mail: tho@caltech.edu).
M. Médard is with Laboratory for Information and Decision Systems, the
Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA
(e-mail: medard@mit.edu).
R. Koetter is with the Coordinated Science Laboratory, University of Illinois
at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: koetter@csl.uiuc.edu).
D. R. Karger is with the Computer Science and Artificial Intelligence Labo-
ratory (CSAIL), the Massachusetts Institute of Technology (MIT), Cambridge,
MA 02139 USA (e-mail: karger@csail.mit.edu).
M. Effros is with the Department of Electrical Engineering, California Insti-
tute of Technology, Pasadena, CA 91125 USA (e-mail: effros@caltech.edu).
J. Shi was with the University of California, Los Angeles, CA, USA. He is
now with Intel Corporation, Santa Clara, CA 95054 USA (e-mail: junshi@ee.
ucla.edu).
B. Leong was with the Computer Science and Artificial Intelligence Labo-
ratory (CSAIL), the Massachusetts Institute of Technology (MIT), Cambridge,
MA 02139 USA. He is now with the National University of Singapore, Singa-
pore 119260, Republic of Singapore (e-mail: benleong@comp.nus.edu.sg).
Communicated by A. Ashikhmin, Associate Editor for Coding Theory.
Digital Object Identifier 10.1109/TIT.2006.881746
Fig. 1. An example of distributed random linear network coding. X_1 and X_2 are the source processes being multicast to the receivers, and the coefficients are randomly chosen elements of a finite field. The label on each link represents the process being transmitted on the link.
This family of problems includes traditional single-source mul-
ticast for content delivery and the incast or reachback problem
for sensor networks, in which several, possibly correlated,
sources transmit to a single receiver. We use a randomized
strategy: all nodes other than the receiver nodes perform
random linear mappings from inputs onto outputs over some
field. These mappings are selected independently at each node.
An illustration is given in Fig. 1. The receivers need only know
the overall linear combination of source processes in each of
their incoming transmissions. This information can be sent with
each transmission block or packet as a vector of coefficients
corresponding to each of the source processes, and updated at
each coding node by applying the same linear mappings to the
coefficient vectors as to the information signals. The relative
overhead of transmitting these coefficients decreases with
increasing length of blocks over which the codes and network
remain constant. For instance, if the network and network code
are fixed, all that is needed is for the sources to send, once, at
the start of operation, a canonical basis through the network.
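The coefficient-vector mechanism just described can be sketched in a few lines. This is a toy model with made-up packet fields, using the prime field F_257 in place of the paper's F_{2^u}; the coding node redraws its random coefficients until the receiver's 2x2 coefficient matrix is invertible:

```python
import random

P = 257          # prime field F_P stands in for the paper's F_{2^u}
random.seed(0)

def combine(packets, coeffs):
    """Linearly combine packets over F_P, applying the same mapping to
    the payloads and to the global coefficient vectors they carry."""
    vec = [sum(c * p["vec"][i] for c, p in zip(coeffs, packets)) % P
           for i in range(len(packets[0]["vec"]))]
    data = sum(c * p["data"] for c, p in zip(coeffs, packets)) % P
    return {"vec": vec, "data": data}

# Two source symbols, each tagged with a canonical-basis coefficient vector.
x1, x2 = 42, 99
s1 = {"vec": [1, 0], "data": x1}
s2 = {"vec": [0, 1], "data": x2}

# A coding node draws independent random coefficients for each output
# link; retry until the overall 2x2 coefficient matrix is invertible.
while True:
    out1 = combine([s1, s2], [random.randrange(P) for _ in range(2)])
    out2 = combine([s1, s2], [random.randrange(P) for _ in range(2)])
    (a, b), (c, d) = out1["vec"], out2["vec"]
    det = (a * d - b * c) % P
    if det:
        break

# The receiver inverts the 2x2 coefficient matrix to recover the sources.
inv = pow(det, -1, P)
r1 = ((d * out1["data"] - b * out2["data"]) * inv) % P
r2 = ((a * out2["data"] - c * out1["data"]) * inv) % P
print(r1 == x1 and r2 == x2)   # prints True
```

Because each packet carries its global coefficient vector, the receiver never needs to know the topology or the intermediate nodes' random choices.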
Our primary results show, first, that such random linear
coding achieves multicast capacity with probability exponen-
tially approaching 1 with the length of the code. Second, in the
context of a distributed source coding problem, we demonstrate
that random linear coding also performs compression when
necessary in a network, generalizing known error exponents for
linear Slepian–Wolf coding [4] in a natural way.
This approach not only recovers the capacity and achievable
rates, but also offers a number of advantages. While capacity
can be achieved by other deterministic or random approaches,
they require, in general, network codes that are planned by or
known to a central authority. Random design of network codes
was first considered in [1]; our contribution is in showing how
random linear network codes can be constructed and efficiently
0018-9448/$20.00 © 2006 IEEE

communicated to receivers in a distributed manner. For the case
of distributed operation of a network whose conditions may be
varying over time, our work hints at a beguiling possibility: that
a network may be operated in a decentralized manner and still
achieve the information rates of the optimized solution. Our
distributed network coding approach has led to and enabled
subsequent developments in distributed network optimization,
e.g., [20], [13]. The distributed nature of our approach also ties
in well with considerations of robustness to changing network
conditions. We show that our approach can take advantage of
redundant network capacity for improved success probability
and robustness. Moreover, issues of stability, such as those
arising from propagation of routing information, are obviated
by the fact that each node selects its code independently from
the others.
Our results, more specifically, give a lower bound on the
probability of error-free transmission for independent or lin-
early correlated sources, which, owing to the particular form
of transfer matrix determinant polynomials, is tighter than the
Schwartz–Zippel bound (e.g., [23]) for general polynomials
of the same total degree. This bound, which is exponentially
dependent on the code length, holds for any feasible set of
multicast connections over any network topology (including
networks with cycles and link delays). The result is derived
using a formulation based on the Edmonds matrix of bipartite
matching, which leads also to an upper bound on field size
required for deterministic centralized network coding over
general networks. We further give, for acyclic networks, tighter
bounds based on more specific network structure, and show
the effects of redundancy and link reliability on success proba-
bility. For arbitrarily correlated sources, we give error bounds
for minimum entropy and maximum
a posteriori probability
decoding. In the special case of a Slepian–Wolf source network
consisting of a link from each source to the receiver, our error
exponents reduce to the corresponding results in [4] for linear
Slepian–Wolf coding. The latter scenario may thus be consid-
ered a degenerate case of network coding.
We illustrate some possible applications with two examples
of practical scenarios, distributed settings and networks with
dynamically varying connections, in which random linear
network coding shows particular promise of advantages over
routing.
This paper is an initial exploration of random linear network
coding, posing more questions than it answers. We do not cover
aspects such as resource and energy allocation, but focus on op-
timally exploiting a given set of resources. Resource consump-
tion can naturally be traded off against capacity and robustness,
and across multiple communicating sessions; subsequent work
on distributed resource optimization, e.g., [13], [21], has used
random linear network coding as a component of the solution.
There are also many issues surrounding the adaptation of pro-
tocols, which generally assume routing, to random coding ap-
proaches. We do not address these here, but rather seek to estab-
lish that the potential benefits of random linear network coding
justify future consideration of protocol compatibility with or
adaptation to network codes.
The basic random linear network coding approach involves
no coordination among nodes. Implementations for various ap-
plications may not be completely protocol-free, but the roles
and requirements for protocols may be substantially redefined
in this new environment. For instance, if we allow for retrials to
find successful codes, we in effect trade code length for some
rudimentary coordination.
Portions of this work have appeared in [9], which introduced
distributed random linear network coding; [8], which presented
the Edmonds matrix formulation and a new bound on required
field size for centralized network coding; [12], which gener-
alized previous results to arbitrary networks and gave tighter
bounds for acyclic networks; [11], on network coding for ar-
bitrarily correlated sources; and [10], which considered random
linear network coding for online network operation in dynami-
cally varying environments.
A. Overview
A brief overview of related work is given in Section I-B. In
Section II, we describe the network model and algebraic coding
approach we use in our analyses, and introduce some notation
and existing results. Section III gives some insights arising from
consideration of bipartite matching and network flows. Suc-
cess/error probability bounds for random linear network coding
are given for independent and linearly correlated sources in Sec-
tion IV and for arbitrarily correlated sources in Section V. We
also give examples of practical scenarios in which randomized
network coding can be advantageous compared to routing, in
Section VI. We present our conclusions and some directions
for further work in Section VII. Proofs and ancillary results are
given in the Appendix.
B. Related Work
Ahlswede et al. [1] showed that with network coding, as
symbol size approaches infinity, a source can multicast infor-
mation at a rate approaching the smallest minimum cut between
the source and any receiver. Li et al. [19] showed that linear
coding with finite symbol size is sufficient for multicast. Koetter
and Médard [17] presented an algebraic framework for network
coding that extended previous results to arbitrary networks and
robust networking, and proved the achievability with time-in-
variant solutions of the min-cut max-flow bound for networks
with delay and cycles. Reference [17] also gave an algebraic
characterization of the feasibility of a multicast problem and
the validity of a network coding solution in terms of transfer
matrices, for which we gave in [8] equivalent formulations
obtained by considering bipartite matching and network flows.
We used these formulations in obtaining a tighter upper bound
on the required field size than the previous bound of [17], and
in our analysis of distributed randomized network coding, in-
troduced in [9]. Concurrent independent work by Sanders et al.
[26] and Jaggi et al. [14] considered single-source multicast on
acyclic delay-free graphs, showing a similar bound on field size
by different means, and giving centralized deterministic and
randomized polynomial-time algorithms for finding network
coding solutions over a subgraph consisting of flow solutions
to each receiver. Subsequent work by Fragouli and Soljanin [7]
gave a tighter bound for the case of two sources and for some
configurations with more than two sources. Lower bounds
on coding field size were presented by Rasala Lehman and
Lehman [18] and Feder et al. [6]. [6] also gave graph-specific
upper bounds based on the number of clashes between flows
from source to terminals.

Dougherty et al. [5] presented results on linear solutions for
binary solvable multicast networks, and on nonfinite field alpha-
bets. The need for vector coding solutions in some nonmulticast
problems was considered by Rasala Lehman and Lehman [18],
Médard et al. [22], and Riis [25]. Various practical protocols
for and experimental demonstrations of random linear network
coding [3] and nonrandomized network coding [29], [24] have
also been presented.
II. MODEL AND PRELIMINARIES
A. Basic Model
Our basic network coding model is based on [1], [17]. A network is represented as a directed graph G = (N, L), where N is the set of network nodes and L is the set of links, such that information can be sent noiselessly across each link. Each link l ∈ L is associated with a nonnegative real number c_l representing its transmission capacity in bits per unit time. The origin and destination nodes of a link l ∈ L, i.e., the nodes at which l starts and ends, are denoted o(l) and d(l), respectively. We assume o(l) ≠ d(l). The information transmitted on a link l is obtained as a coding function of information previously received at o(l).
There are r discrete memoryless information source processes X_1, ..., X_r which are random binary sequences. We denote by R the Slepian–Wolf region of the sources. Source process X_i is generated at node a(i), and multicast to all nodes in the set b(i), where a and b are arbitrary mappings. In this paper, we consider the (multisource) multicast case where b(i) is the same set D of receiver nodes for all i. The nodes a(1), ..., a(r) are called source nodes and the nodes in D are called receiver nodes, or receivers. For simplicity, we assume subsequently that the sets of source and receiver nodes are disjoint. The mapping a, the set D, and the Slepian–Wolf region R specify a set of multicast connection requirements. The connection requirements are satisfied if each receiver is able to reproduce, from its received information, the complete source information. A graph G = (N, L), a set of link capacities {c_l}, and a set of multicast connection requirements specify a multicast connection problem.
We make a number of simplifying assumptions. Our anal-
ysis for the case of independent source processes assumes that
each source process X_i has an entropy rate of one bit per unit
time; sources of larger rate are modeled as multiple sources at
the same node. For the case of linearly correlated sources, we
assume that the sources can be modeled as given linear combi-
nations of underlying independent source processes, each with
an entropy rate of one bit per unit time, as described further in
Section II-B. For the case of arbitrarily correlated sources, we
consider sources with integer bit rates and arbitrary joint prob-
ability distributions.
For the case of independent or linearly correlated sources,
each link l ∈ L is assumed to have a capacity of one bit per
unit time; links with larger capacities are modeled as parallel
links. For the case of arbitrarily correlated sources, the link rates
are assumed to be integers.
Reference [1] shows that coding enables the multicast infor-
mation rate from a single source to attain the minimum of the
individual receivers' max-flow bounds,¹ and shows how to con-
vert multicast problems with multiple independent sources to
single-source problems. Reference [19] shows that linear coding is sufficient to achieve the same individual max-flow rates; in fact, it suffices to do network coding using only scalar algebraic operations in a finite field F_{2^u}, for some sufficiently large u, on length-u vectors of bits that are viewed as elements of F_{2^u} [17].
The case of linearly correlated sources is similar.
For arbitrarily correlated sources, we consider operations in F_2 on vectors of bits. This vector coding model can, for given
vector lengths, be brought into the scalar algebraic framework
of [17] by conceptually expanding each source into multiple
sources and each link into multiple links, such that each new
source and link corresponds to one bit of the corresponding in-
formation vectors. We describe this scalar framework in Sec-
tion II-B, and use it in our analysis of arbitrarily correlated
sources in Section V. Note, however, that the linear decoding
strategies of [17] do not apply for the case of arbitrarily corre-
lated sources.
We consider both the case of acyclic networks where link
delays are not considered, as well as the case of general net-
works with cycles and link delays. The former case, which we
call delay-free, includes networks whose links are assumed to
have zero delay, as well as networks with link delays that are
operated in a burst [19], pipelined [26], or batched [3] fashion,
where information is buffered or delayed at intermediate nodes
so as to be combined with other incoming information from the
same batch. A cyclic graph may also be converted to an expanded acyclic graph, communication on which can be emulated over a number of time steps on the original cyclic graph [1]. For the latter case,
we consider general networks without buffering, and make the
simplifying assumption that each link has the same delay.
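The cyclic-to-acyclic conversion mentioned above can be pictured as a time-expanded layered graph. The sketch below is one common construction under unit link delays, not necessarily the exact expansion used in [1]:

```python
def time_expand(nodes, links, steps):
    """Unroll a (possibly cyclic) unit-delay graph over `steps` time
    layers: node v at time t becomes (v, t); a link (u, v) becomes
    (u, t) -> (v, t + 1); an extra (v, t) -> (v, t + 1) edge models
    memory (buffering) at node v."""
    layered = []
    for t in range(steps - 1):
        for u, v in links:
            layered.append(((u, t), (v, t + 1)))
        for v in nodes:
            layered.append(((v, t), (v, t + 1)))   # buffering at v
    return layered

# A two-node cycle a -> b -> a becomes acyclic after unrolling.
g = time_expand(["a", "b"], [("a", "b"), ("b", "a")], steps=3)
# Every edge goes from layer t to layer t + 1, so the result is acyclic.
print(all(t2 == t1 + 1 for (_, t1), (_, t2) in g))   # prints True
```

Communication on the original cyclic graph over `steps` time slots corresponds to one shot of communication on the layered graph.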
We use some additional definitions in this paper. Link l is an incident outgoing link of node v if v = o(l), and an incident incoming link of v if v = d(l). We call an incident incoming link of a receiver node a terminal link, and denote by T_β the set of terminal links of a receiver β. A path is a subgraph of the network consisting of a sequence of links l_1, l_2, ..., l_k such that d(l_i) = o(l_{i+1}) for 1 ≤ i < k, and is denoted (l_1, l_2, ..., l_k). A flow solution for a receiver β is a set of links forming r link-disjoint paths each connecting a different source to β.
¹That is, the maximum commodity flow from the source to individual receivers.
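A flow solution as defined above exists for a receiver exactly when the max-flow from the sources to that receiver is at least r, which connects the definition to the max-flow bound in the footnote. A minimal sketch using Edmonds–Karp on unit-capacity links (the butterfly topology below is a standard illustrative example, not one of the paper's figures):

```python
from collections import defaultdict, deque

def max_flow(edges, s, t):
    """BFS-based Ford-Fulkerson (Edmonds-Karp) on unit-capacity links."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1
        adj[u].add(v)
        adj[v].add(u)          # residual direction
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t                   # augment one unit along the path found
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Butterfly network: source s multicasts 2 processes to receivers y and z.
butterfly = [("s", "a"), ("s", "b"), ("a", "y"), ("b", "z"),
             ("a", "c"), ("b", "c"), ("c", "d"), ("d", "y"), ("d", "z")]
print(min(max_flow(butterfly, "s", r) for r in ("y", "z")))   # prints 2
```

Each receiver has max-flow 2, so a flow solution with r = 2 exists for both, and coding on the shared bottleneck link achieves the multicast rate 2.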

B. Algebraic Network Coding
In the scalar algebraic coding framework of [17], the source information processes, the receiver output processes, and the information processes transmitted on each link, are sequences of length-u blocks or vectors of bits, which are treated as elements of a finite field F_q, q = 2^u. The information process Y(l) transmitted on a link l is formed as a linear combination, in F_q, of link l's inputs, i.e., source processes X_i for which a(i) = o(l) and random processes Y(j) for which d(j) = o(l), if any. For the delay-free case, this is represented by the equation

Y(l) = Σ_{i : a(i) = o(l)} a_{l,i} X_i + Σ_{j : d(j) = o(l)} f_{l,j} Y(j).

The i-th output process Z_{β,i} at receiver node β is a linear combination of the information processes on its terminal links, represented as

Z_{β,i} = Σ_{l : d(l) = β} b_{β,i,l} Y(l).
For multicast on a network with link delays, memory is needed at the receiver (or source) nodes, but memoryless operation suffices at all other nodes [17]. We consider unit delay links, modeling links with longer delay as links in series. The corresponding linear coding equations are the time-indexed analogs of the delay-free equations, where the coefficients and processes take the values of the corresponding variables at time t, and μ represents the memory required. These equations, as with the random processes in the network, can be represented algebraically in terms of a delay variable D (1).
The coefficients can be collected into matrices A = (a_{l,i}) and F = (f_{l,j}), with entries in F_q in the acyclic delay-free case and polynomials in D in the general case with delays, and matrices B_β = (b_{β,i,l}) for each receiver β, whose structure is constrained by the network. A pair (A, F) or tuple (A, F, B) can be called a linear network code.
We also consider a class of linearly correlated sources modeled as given linear combinations of underlying independent processes, each with an entropy and bit rate of one bit per unit time. To simplify the notation in our subsequent development, we work with these underlying independent processes in a similar manner as for the case of independent sources: the l-th column of the A matrix is a linear function of given column vectors that specify the mappings from the underlying independent processes to the source processes at o(l).² A receiver that decodes these underlying independent processes is able to reconstruct the linearly correlated source processes.
For acyclic graphs, we assume an ancestral indexing of links in L, i.e., if d(l) = o(j) for any links l, j, then l has a lower index than j. Such an indexing always exists for acyclic networks. It then follows that matrix F is upper triangular with zeros on the diagonal.
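Because F is strictly upper triangular under an ancestral indexing, it is nilpotent, so (I - F)^{-1} = I + F + F^2 + ... is a finite sum. A small sketch over GF(2), with an illustrative three-link F (link 1 feeds links 2 and 3, link 2 feeds link 3):

```python
def matmul(A, B, q):
    """Matrix product with entries reduced modulo q."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % q
             for j in range(n)] for i in range(n)]

def series_inverse(F, q):
    """(I - F)^{-1} = I + F + F^2 + ...; the sum terminates because F
    is strictly upper triangular (ancestral order), hence nilpotent."""
    n = len(F)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    G, term = I, I
    for _ in range(n):           # F^n = 0 for an n x n nilpotent matrix
        term = matmul(term, F, q)
        G = [[(g + t) % q for g, t in zip(gr, tr)]
             for gr, tr in zip(G, term)]
    return G

F = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
G = series_inverse(F, 2)
# In characteristic 2, I - F equals I + F.
IminusF = [[(int(i == j) + F[i][j]) % 2 for j in range(3)] for i in range(3)]
print(matmul(IminusF, G, 2))     # prints the 3x3 identity matrix
```

The check (I - F)G = I confirms the series really is the inverse.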
Let G = (I - F)^{-1}.³ The mapping from source processes X_1, ..., X_r to output processes Z_{β,1}, ..., Z_{β,r} at a receiver β is given by the transfer matrix A G B_β^T [17]. For a given multicast connection problem, if some network code (A, F, B) in a field F_q (or F_q(D)) satisfies the condition that A G B_β^T has full rank r for each receiver β, then it satisfies the connection requirements and is a solution to the multicast connection problem in the same field. A multicast connection problem for which there exists a solution in some field F_q or F_q(D) is called feasible, and the corresponding connection requirements are said to be feasible for the network.
²We can also consider the case where the given column vectors have entries in F_2, by restricting network coding to occur in F_q, q = 2.
³For the acyclic delay-free case, the sequence (I - F)^{-1} = I + F + F^2 + ... converges since F is nilpotent for an acyclic network. For the case with delays, (I - F)^{-1} exists since the determinant of I - F is nonzero in its field of definition, as seen by letting D = 0 [17].

In subsequent sections, where we consider choosing the value of (A, F, B) by distributed random coding, the following definitions are useful: if for a receiver β there exists some value of B_β such that the transfer matrix A(I - F)^{-1} B_β^T has full rank r, then (A, F) is a valid network code for β; a network code is valid for a multicast connection problem if it is valid for all receivers.
The l-th column of the matrix A(I - F)^{-1} specifies the mapping from source processes to the random process on link l. We write C = A(I - F)^{-1} and denote by C_T the submatrix consisting of the columns of C corresponding to a set of links T.
For a receiver β to decode, it needs to know the mapping C_{T_β} from the source processes to the random processes on its terminal links. The entries of C_{T_β} are scalar elements of F_q in the acyclic delay-free case, and polynomials in delay variable D in the case with link delays. In the latter case, the number of terms of these polynomials and the memory required at the receivers depend on the number of links involved in cycles, which act like memory registers, in the network.
We use the notational convention that matrices are named with bold upper case letters and vectors are named with bold lower case letters.
III. INSIGHTS FROM BIPARTITE MATCHING AND NETWORK FLOWS
As described in the previous section, for a multicast connection problem with independent or linearly correlated sources, the transfer matrix condition of [17] for the problem to be feasible (or for a particular linear network code defined by matrices (A, F, B) to be valid for the connection problem) is that for each receiver β, the transfer matrix A(I - F)^{-1} B_β^T has nonzero determinant. The following result shows the equivalence of this transfer matrix condition and the Edmonds matrix formulation for checking if a bipartite graph has a perfect matching (e.g., [23]). The problem of determining whether a bipartite graph has a perfect matching is a classical reduction of the problem of checking the feasibility of an s–t flow [15].⁴ This latter problem can be viewed as a degenerate case of network coding, restricted to the binary field and without any coding; it is interesting to find that the two formulations are equivalent for the more general case of linear network coding in higher order fields.
Lemma 1:
(a) For an acyclic delay-free network, the determinant of the transfer matrix M_β = A(I - F)^{-1}B_β^T for receiver β is equal to
⁴The problem of checking the feasibility of an s-t flow of size r on graph G = (V, E) can be reduced to a bipartite matching problem by constructing the following bipartite graph: one set of the bipartite graph has r nodes u_1, ..., u_r, and a node v_l corresponding to each link l ∈ E; the other set of the bipartite graph has r nodes w_1, ..., w_r, and a node v'_l corresponding to each link l ∈ E. The bipartite graph has links joining each node u_i to each node v'_l such that o(l) = s, a link joining node v_l to the corresponding node v'_l for all l ∈ E, links joining node v_l to v'_j for each pair (l, j) ∈ E × E such that d(l) = o(j), and links joining each node w_i to each node v_l such that d(l) = t. The s-t flow is feasible if and only if the bipartite graph has a perfect matching.
± det [[A, 0], [I - F, B_β^T]], where the block matrix [[A, 0], [I - F, B_β^T]] is the corresponding Edmonds matrix.
(b) For an arbitrary network with unit delay links, the transfer matrix M_β for receiver β is nonsingular if and only if the corresponding Edmonds matrix [[A, 0], [I - F, B_β^T]] is nonsingular.
Proof: See Appendix A.
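The reduction in footnote 4 can be sketched in code. Everything here is an illustrative assumption: a toy graph with two link-disjoint s-t paths, and a simple augmenting-path (Kuhn) matching routine rather than any method from the paper.

```python
# Toy graph (assumed example): s -> a, s -> b, a -> t, b -> t.
# Check feasibility of an s-t flow of size r = 2 via perfect matching.
links = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]  # (o(l), d(l))
r, E = 2, range(len(links))

# Left side: u_0..u_{r-1} and v_l; right side: w_0..w_{r-1} and v'_l ("vp").
left = [("u", i) for i in range(r)] + [("v", l) for l in E]
adj = {node: [] for node in left}
for i in range(r):
    for l in E:
        if links[l][0] == "s":
            adj[("u", i)].append(("vp", l))      # u_i -- v'_l with o(l) = s
for l in E:
    adj[("v", l)].append(("vp", l))              # v_l -- v'_l (link l unused)
    for j in E:
        if links[l][1] == links[j][0]:
            adj[("v", l)].append(("vp", j))      # v_l -- v'_j with d(l) = o(j)
    if links[l][1] == "t":
        for i in range(r):
            adj[("v", l)].append(("w", i))       # w_i -- v_l with d(l) = t

match = {}                                       # right node -> left node

def augment(u, seen):
    # Kuhn's augmenting-path step for left node u.
    for v in adj[u]:
        if v not in seen:
            seen.add(v)
            if v not in match or augment(match[v], seen):
                match[v] = u
                return True
    return False

size = sum(augment(u, set()) for u in left)
print("feasible" if size == len(left) else "infeasible")
```

For this graph the matching is perfect (paths s-a-t and s-b-t), so a flow of size 2 is feasible.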
The usefulness of this result is in making apparent various characteristics of the transfer matrix determinant polynomial that are obscured in the original transfer matrix by the matrix products and inverse. For instance, the maximum exponent of a variable, the total degree of the polynomial, and its form for linearly correlated sources are easily deduced, leading to Theorems 1 and 2.
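The equivalence in Lemma 1(a) can be spot-checked numerically. In this hedged sketch the matrices A, F, B and the exact-arithmetic determinant routine are illustrative assumptions, not the paper's construction; the check is that det M_β agrees, up to sign, with the determinant of the block Edmonds matrix [[A, 0], [I - F, B^T]].

```python
from fractions import Fraction

def det(M):
    # Determinant by Gaussian elimination over the rationals.
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c]), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d

# Assumed toy instance: r = 2 sources, 3 links, acyclic gains.
A = [[1, 0, 0],
     [0, 2, 0]]
F = [[0, 0, 3],
     [0, 0, 1],
     [0, 0, 0]]
B = [[0, 1, 0],
     [0, 0, 1]]   # receiver's terminal links: 1 and 2

# Transfer matrix M = A (I - F)^{-1} B^T, using (I - F)^{-1} = I + F (F^2 = 0 here).
IpF = [[int(i == j) + F[i][j] for j in range(3)] for i in range(3)]
G = [[sum(A[i][k] * IpF[k][j] for k in range(3)) for j in range(3)] for i in range(2)]
M = [[sum(G[i][k] * B[j][k] for k in range(3)) for j in range(2)] for i in range(2)]

# Edmonds matrix [[A, 0], [I - F, B^T]].
ImF = [[int(i == j) - F[i][j] for j in range(3)] for i in range(3)]
Edm = [A[i] + [0, 0] for i in range(2)] + \
      [ImF[i] + [B[0][i], B[1][i]] for i in range(3)]

assert abs(det(M)) == abs(det(Edm))   # Lemma 1(a), up to sign
print(det(M), det(Edm))
```

Notice that the Edmonds matrix involves no matrix inverse, which is exactly why properties such as the degree of the determinant polynomial become visible.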
For the acyclic delay-free case, Lemma 2 below is another alternative formulation of the same transfer matrix condition which illuminates similar properties of the transfer matrix determinant as Lemma 1. Furthermore, by considering network coding as a superposition of flow solutions, Lemma 2 allows us to tighten, in Theorem 3, the bound of Theorem 2 for random network coding on given acyclic networks in terms of the number of links in a flow solution for an individual receiver.
Lemma 2: A multicast connection problem with r sources is feasible (or a particular network code is valid for the problem) if and only if each receiver β has a set T of r terminal links for which det G_T ≠ 0, where G_T is the submatrix of G = A(I - F)^{-1} consisting of the columns corresponding to the links in T. The determinant det G_T is given by a signed sum, over all flow solutions from the sources to links in T, of the products of gains on the paths of each solution, each such solution being a set of r link-disjoint paths each connecting a different source to a different link in T.
Proof: See Appendix A.
Lemma 1 leads to the following upper bound on required field size for a feasible multicast problem, which tightens the upper bound of [17], a bound that grows with the number of processes being transmitted in the network.
Theorem 1: For a feasible multicast connection problem with independent or linearly correlated sources and d receivers, in both the acyclic delay-free case and the general case with delays, there exists a valid linear network code in a finite field F_q of size q > d.
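Ahead of the distributed random coding analysis in subsequent sections, here is a hedged Monte Carlo sketch on an assumed two-source toy network (not from the paper): coefficients are drawn uniformly from F_q, and the empirical validity rate is compared against a (1 - d/q)^η-style lower bound, where η counts the randomized coefficients and d the receivers.

```python
import random

# Assumed toy network: two sources on links 1 and 2, a coding link 3 carrying
# f1*x1 + f2*x2, and one receiver observing links 1 and 3.  The code is valid
# iff det [[1, f1], [0, f2]] = f2 != 0 in F_q.
def trial(q):
    f1 = random.randrange(q)   # randomized coefficient (counted in eta, cancels in det)
    f2 = random.randrange(q)
    return f2 != 0             # determinant of the 2x2 transfer matrix

random.seed(1)
q, eta, d, N = 7, 2, 1, 20000
success = sum(trial(q) for _ in range(N)) / N
bound = (1 - d / q) ** eta     # (1 - d/q)^eta style lower bound
print(success >= bound)        # empirical rate ~ 1 - 1/q, above the bound
```

Increasing q drives both the bound and the empirical rate toward 1, matching the intuition that larger fields make random codes more likely to be valid.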
Citations
Book
16 Jan 2012
TL;DR: This article provides a comprehensive treatment of network information theory and its applications, offering the first unified coverage of both classical and recent results, including successive cancellation and superposition coding, MIMO wireless communication, network coding, and cooperative relaying.
Abstract: This comprehensive treatment of network information theory and its applications provides the first unified coverage of both classical and recent results. With an approach that balances the introduction of new models and new coding techniques, readers are guided through Shannon's point-to-point information theory, single-hop networks, multihop networks, and extensions to distributed computing, secrecy, wireless communication, and networking. Elementary mathematical tools and techniques are used throughout, requiring only basic knowledge of probability, whilst unified proofs of coding theorems are based on a few simple lemmas, making the text accessible to newcomers. Key topics covered include successive cancellation and superposition coding, MIMO wireless communication, network coding, and cooperative relaying. Also covered are feedback and interactive communication, capacity approximations and scaling laws, and asynchronous and random access channels. This book is ideal for use in the classroom, for self-study, and as a reference for researchers and engineers in industry and academia.

2,442 citations

Journal ArticleDOI
TL;DR: The results show that using COPE at the forwarding layer, without modifying routing and higher layers, increases network throughput, and the gains vary from a few percent to several folds depending on the traffic pattern, congestion level, and transport protocol.
Abstract: This paper proposes COPE, a new architecture for wireless mesh networks. In addition to forwarding packets, routers mix (i.e., code) packets from different sources to increase the information content of each transmission. We show that intelligently mixing packets increases network throughput. Our design is rooted in the theory of network coding. Prior work on network coding is mainly theoretical and focuses on multicast traffic. This paper aims to bridge theory with practice; it addresses the common case of unicast traffic, dynamic and potentially bursty flows, and practical issues facing the integration of network coding in the current network stack. We evaluate our design on a 20-node wireless network, and discuss the results of the first testbed deployment of wireless network coding. The results show that using COPE at the forwarding layer, without modifying routing and higher layers, increases network throughput. The gains vary from a few percent to several folds depending on the traffic pattern, congestion level, and transport protocol.

2,190 citations

Journal ArticleDOI
TL;DR: It is shown that there is a fundamental tradeoff between storage and repair bandwidth which is theoretically characterize using flow arguments on an appropriately constructed graph and regenerating codes are introduced that can achieve any point in this optimal tradeoff.
Abstract: Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a single node failure is for a new node to reconstruct the whole encoded data object to generate just one encoded block. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to communicate functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.

1,919 citations


Cites background from "A Random Linear Network Coding Appr..."

  • ...The studies by Ho et al. [ 22 ] and Sanders et al. [23] further showed that random linear network coding over a sufficiently large finite field can (asymptotically) achieve the multicast capacity....

    [...]

  • ...Further, simple random linear combinations will suffice with high probability as the field size over which coding is performed grows, as shown by Ho. et al. [ 22 ]....

    [...]

Proceedings ArticleDOI
30 Nov 2006
TL;DR: This paper presents their recent experiences with a highly optimized and high-performance C++ implementation of randomized network coding at the application layer, and presents their observations based on an extensive series of experiments.
Abstract: With network coding, intermediate nodes between the source and the receivers of an end-to-end communication session are not only capable of relaying and replicating data messages, but also of coding incoming messages to produce coded outgoing ones. Recent studies have shown that network coding is beneficial for peer-to-peer content distribution, since it eliminates the need for content reconciliation, and is highly resilient to peer failures. In this paper, we present our recent experiences with a highly optimized and high-performance C++ implementation of randomized network coding at the application layer. We present our observations based on an extensive series of experiments, draw conclusions from a wide range of scenarios, and are more cautious and less optimistic as compared to previous studies.

1,525 citations

Posted Content
TL;DR: In this paper, the authors introduce a general technique to analyze storage architectures that combine any form of coding and replication, as well as presenting two new schemes for maintaining redundancy using erasure codes.
Abstract: Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance bandwidth than simple replication to provide the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate a new fragment in a distributed way while transferring as little data as possible across the network. In this paper, we introduce a general technique to analyze storage architectures that combine any form of coding and replication, as well as presenting two new schemes for maintaining redundancy using erasure codes. First, we show how to optimally generate MDS fragments directly from existing fragments in the system. Second, we introduce a new scheme called Regenerating Codes which use slightly larger fragments than MDS but have lower overall bandwidth use. We also show through simulation that in realistic environments, Regenerating Codes can reduce maintenance bandwidth use by 25 percent or more compared with the best previous design--a hybrid of replication and erasure codes--while simplifying system architecture.

1,389 citations

References
Journal ArticleDOI
TL;DR: This work reveals that it is in general not optimal to regard the information to be multicast as a "fluid" which can simply be routed or replicated, and by employing coding at the nodes, which the work refers to as network coding, bandwidth can in general be saved.
Abstract: We introduce a new class of problems called network information flow which is inspired by computer network applications. Consider a point-to-point communication network on which a number of information sources are to be multicast to certain sets of destinations. We assume that the information sources are mutually independent. The problem is to characterize the admissible coding rate region. This model subsumes all previously studied models along the same line. We study the problem with one information source, and we have obtained a simple characterization of the admissible coding rate region. Our result can be regarded as the max-flow min-cut theorem for network information flow. Contrary to one's intuition, our work reveals that it is in general not optimal to regard the information to be multicast as a "fluid" which can simply be routed or replicated. Rather, by employing coding at the nodes, which we refer to as network coding, bandwidth can in general be saved. This finding may have significant impact on future design of switching systems.

8,533 citations


"A Random Linear Network Coding Appr..." refers background or methods in this paper

  • ...THE capacity of multicast networks with network coding was given in [1]....

    [...]

  • ...[1] showed that with network coding, as symbol size approaches infinity, a source can multicast information at a rate approaching the smallest minimum cut between the source and any receiver....

    [...]

  • ...Reference [1] shows that coding enables the multicast infor-...

    [...]

  • ...given by the max-flow min-cut bound of [1], in multisource...

    [...]

  • ...A cyclic graph with nodes and rate may also be converted to an expanded acyclic graph with nodes and rate at least , communication on which can be emulated over time steps on the original cyclic graph [1]....

    [...]

Book
01 Jan 1995
TL;DR: This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.
Abstract: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. This book introduces the basic concepts in the design and analysis of randomized algorithms. The first part of the text presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications. Algorithmic examples are also given to illustrate the use of each tool in a concrete setting. In the second part of the book, each chapter focuses on an important area to which randomized algorithms can be applied, providing a comprehensive and representative selection of the algorithms that might be used in each of these areas. Although written primarily as a text for advanced undergraduates and graduate students, this book should also prove invaluable as a reference for professionals and researchers.

4,412 citations

Journal ArticleDOI
David Slepian, Jack K. Wolf
TL;DR: The minimum number of bits per character R_X and R_Y needed to encode these sequences so that they can be faithfully reproduced under a variety of assumptions regarding the encoders and decoders is determined.
Abstract: Correlated information sequences ..., X_{-1}, X_0, X_1, ... and ..., Y_{-1}, Y_0, Y_1, ... are generated by repeated independent drawings of a pair of discrete random variables X, Y from a given bivariate distribution P_{XY}(x, y). We determine the minimum number of bits per character R_X and R_Y needed to encode these sequences so that they can be faithfully reproduced under a variety of assumptions regarding the encoders and decoders. The results, some of which are not at all obvious, are presented as an admissible rate region R in the R_X-R_Y plane. They generalize a similar and well-known result for a single information sequence, namely R_X ≥ H(X) for faithful reproduction.

4,165 citations


"A Random Linear Network Coding Appr..." refers background in this paper

  • ...Second, in the context of a distributed source coding problem, we demonstrate that random linear coding also performs compression when necessary in a network, generalizing known error exponents for linear Slepian–Wolf coding [4] in a natural way....

    [...]

  • ...The error exponents for general networks reduce to those obtained in [4] for the Slepian–Wolf network....

    [...]

  • ...Analogously to Slepian and Wolf [28], we consider the problem of distributed encoding and joint decoding of two sources whose output values in each unit time period are drawn independent and identically distributed (i.i.d.) from the same joint distribution ....

    [...]

  • ...In the special case of a network consisting of one direct link from each source to a common receiver, this reduces to the original Slepian–Wolf problem....

    [...]

Journal ArticleDOI
TL;DR: This work forms this multicast problem and proves that linear coding suffices to achieve the optimum, which is the max-flow from the source to each receiving node.
Abstract: Consider a communication network in which certain source nodes multicast information to other nodes on the network in the multihop fashion where every node can pass on any of its received data to others. We are interested in how fast each node can receive the complete information, or equivalently, what the information rate arriving at each node is. Allowing a node to encode its received data before passing it on, the question involves optimization of the multicast mechanisms at the nodes. Among the simplest coding schemes is linear coding, which regards a block of data as a vector over a certain base field and allows a node to apply a linear transformation to a vector before passing it on. We formulate this multicast problem and prove that linear coding suffices to achieve the optimum, which is the max-flow from the source to each receiving node.

3,660 citations

Journal ArticleDOI
TL;DR: For the multicast setup it is proved that there exist coding strategies that provide maximally robust networks and that do not require adaptation of the network interior to the failure pattern in question.
Abstract: We take a new look at the issue of network capacity. It is shown that network coding is an essential ingredient in achieving the capacity of a network. Building on recent work by Li et al.(see Proc. 2001 IEEE Int. Symp. Information Theory, p.102), who examined the network capacity of multicast networks, we extend the network coding framework to arbitrary networks and robust networking. For networks which are restricted to using linear network codes, we find necessary and sufficient conditions for the feasibility of any given set of connections over a given network. We also consider the problem of network recovery for nonergodic link failures. For the multicast setup we prove that there exist coding strategies that provide maximally robust networks and that do not require adaptation of the network interior to the failure pattern in question. The results are derived for both delay-free networks and networks with delays.

2,628 citations


"A Random Linear Network Coding Appr..." refers background or methods in this paper

  • ...We used these formulations in obtaining a tighter upper bound on the required field size than the previous bound of [17], and in our analysis of distributed randomized network coding, introduced in [9]....

    [...]

  • ...a receiver is given by the transfer matrix [17]....

    [...]

  • ...Note, however, that the linear decoding strategies of [17] do not apply for the case of arbitrarily correlated sources....

    [...]

  • ...In the scalar algebraic coding framework of [17], the source information processes, the receiver output processes, and the in-...

    [...]

  • ...The need for vector coding solutions in some nonmulticast problems was considered by Rasala Lehman and Lehman [18], Médard et al. [22], and Riis [25]....

    [...]

Frequently Asked Questions (12)
Q1. What contributions have the authors mentioned in the paper "A random linear network coding approach to multicast" ?

The authors present a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks. The authors show that this achieves capacity with probability exponentially approaching 1 with the code length. The authors also demonstrate that random linear coding performs compression when necessary in a network, generalizing error exponents for linear Slepian–Wolf coding in a natural way. Benefits of this approach are decentralized operation and robustness to network changes or link failures. The authors show that this approach can take advantage of redundant network capacity for improved success probability and robustness. The authors illustrate some potential advantages of random linear network coding over routing in two examples of practical scenarios: distributed network operation and networks with dynamically varying connections. 

Further work includes extensions to nonuniform code distributions, possibly chosen adaptively or with some rudimentary coordination, to optimize different performance goals. 

Lower bounds on coding field size were presented by Rasala Lehman and Lehman [18] and Feder et al. [6]. [6] also gave graph-specific upper bounds based on the number of “clashes” between flows from source to terminals. 

If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 - d/q)^η, where η is the maximum number of links with associated random coefficients in any set of links constituting a flow solution for any receiver.

Nodes that cannot determine the appropriate code coefficients from local information choose the coefficients independently and uniformly from F_q.

The authors have given a general bound on the success probability of such codes for arbitrary networks, showing that error probability decreases exponentially with code length. 

If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 - d/q)^η, where η is the number of links with associated random coefficients.

For the case of arbitrarily correlated sources, the authors consider sources with integer bit rates and arbitrary joint probability distributions. 

The next bound is useful in cases where analysis of connection feasibility is easier than direct analysis of random linear coding. 

To this end, the authors use a small field size that allows random linear coding to generally match the performance of the Steiner heuristic, and to surpass it in networks whose topology makes Steiner tree routing difficult. 

It is intuitive that having more redundant capacity in the network, for instance, should increase the probability that a random linear code will be valid.

These examples suggest that the decentralized nature and robustness of random linear network coding can offer significant advantages in settings that hinder optimal centralized network control.