Tree-structured data regeneration with network coding in distributed storage systems

Abstract
Distributed storage systems, built on peer-to-peer networks, can provide large-scale data storage and high data reliability by redundant schemes, such as replica, erasure codes and linear network coding. Redundant data may get lost due to the instability of distributed systems, such as permanent node departures, hardware failures, and accidental deletions. In order to maintain data availability, it is necessary to regenerate new redundant data in another node, referred to as a newcomer. Regeneration is expected to be finished as soon as possible, because the regeneration time can influence the data reliability and availability of distributed storage systems. It has been acknowledged that linear network coding can regenerate redundant data with less network traffic than replica and erasure codes. However, previous regeneration schemes are all star-structured regeneration schemes, in which data are transferred directly from existing storage nodes, referred to as providers, to the newcomer, so the regeneration time is always limited by the path with the narrowest bandwidth between newcomer and provider, due to bandwidth heterogeneity. In this paper, we exploit the bandwidth between providers and propose a tree-structured regeneration scheme using linear network coding. In our scheme, data can be transferred from providers to the newcomer through a regeneration tree, defined as a spanning tree covering the newcomer and all the providers. In a regeneration tree, a provider can receive data from other providers, then encode the received data with the data this provider stores, and finally send the encoded data to another provider or to the newcomer. We prove that a maximum spanning tree is an optimal regeneration tree and analyze its performance. In a trace-based simulation, the results show the tree-structured scheme can reduce the regeneration time by 75%–82% and improve data availability by 73%–124%.

Jun Li, Shuang Yang, Xin Wang, Xiangyang Xue
School of Computer Science
Fudan University, China
{0572222, 06300720227, xinw, xyxue}@fudan.edu.cn
Baochun Li
Department of Electrical and Computer Engineering
University of Toronto, Canada
bli@eecg.toronto.edu
Index Terms—Distributed Storage System, Linear Network
Coding, Maximum Spanning Tree.
I. INTRODUCTION
Distributed storage systems store data in a large number of storage nodes, either in the context of data centers in cloud computing systems, or in the context of peer-assisted online storage systems, e.g., [1]. Due to the inherent lack of reliability caused by node departures and hardware failures, data may become temporarily or permanently unavailable in such systems. Concerns about Quality of Service (QoS) in storage systems hinge upon two aspects: the reliability and availability of data. Data are reliable when the data saved in the distributed storage system are sufficient to recover the original data. Data are available when there are enough active nodes in the distributed storage system so that the original data can be recovered at once. In order to provide high data reliability and availability, distributed storage systems usually use redundant data. The forms of redundant data include replica, erasure codes and linear network coding.
Redundant data can provide higher availability because more active storage nodes remain for data recovery when some nodes are temporarily unavailable. However, when data are lost permanently in the distributed storage system, the number of storage nodes will decrease gradually. Therefore it is necessary to regenerate new redundant data to maintain data availability. Regeneration is the process in which a node in the distributed storage system, referred to as a newcomer, receives data from active storage nodes, referred to as providers, and finally becomes a new storage node, so that the lost redundant data are regenerated.
To ensure data reliability and availability, we expect the regeneration time to be as short as possible. The less time regeneration costs, the more redundant data can be preserved in a distributed storage system that suffers data loss. The newcomer or a provider may also leave the system during the regeneration process, so a shorter regeneration time results in a higher probability that the regeneration is finished before any node (newcomer or provider) leaves the system. The simplest way to reduce the regeneration time is to reduce the network traffic in the regeneration. Dimakis et al. [2] showed that linear network coding can incur less regeneration traffic, and the corresponding encoding scheme is given in [3].
To our knowledge, previous regeneration schemes mainly focused on how to generate redundant data to reduce the regeneration traffic, but the bandwidth capacity between nodes has not been taken into account. In this paper, we propose a tree-structured regeneration scheme based on linear network coding from the perspective of bandwidth capacity. Conventional regeneration is a star-structured scheme, i.e. the newcomer downloads data directly from providers. Thus the regeneration time is limited by the path between the newcomer and the provider with the narrowest bandwidth, if the network of the storage system suffers from bandwidth heterogeneity. In our tree-structured scheme, we define a regeneration tree as a spanning tree covering the newcomer and all the providers. In the regeneration tree, the child node sends data to its parent node, and the parent node encodes the received data with the data it stores and then sends the encoded data to its parent node. If the transmission is pipelined, the bandwidth bottleneck is the edge with the narrowest bandwidth in the tree. We prove that a maximum spanning tree is an optimal regeneration tree.
In this paper, we present the tree-structured regeneration scheme and analyze its performance mathematically. We first show how the tree-structured scheme regenerates redundant data at the newcomer. Then we prove that a maximum spanning tree is an optimal regeneration tree. By analysis based on probability theory and order statistics, we show that our scheme can reduce the regeneration time by improving the transmission rate, and can improve the adaptability to bandwidth heterogeneity, while not increasing the regeneration traffic. We evaluate our scheme by a trace-based simulation. The simulation results show that our scheme can reduce the regeneration time by 75%-82% and improve data availability by 73%-124%.
The remainder of the paper is organized as follows. In Section II we introduce the related work. We introduce some basic concepts of distributed storage systems using linear network coding and present the network model in Section III. In Section IV, we present the tree-structured regeneration scheme and analyze its performance. We show the simulation results in Section V. Finally, Section VI concludes this paper.
II. RELATED WORK
Many papers have discussed how to improve data reliability from the perspective of redundant data. The forms of redundant data include replica, erasure codes and linear network coding. Some distributed storage systems use replica, such as BitVault [4]. In OceanStore [1], however, the original data are encoded at the source node by erasure codes. Lin et al. [5] investigated and compared some decentralized replication algorithms for improving file availability in P2P networks. Compared with replica, erasure codes provide higher data availability, because in storage systems using $(n, k)$ erasure codes, any $k$ of the $n$ storage nodes are sufficient to recover the original data. However, erasure codes incur more storage space at the source node than replica when disseminating the encoded data [6]. Moreover, Rodrigues et al. [7] pointed out that in some cases, the benefits of erasure codes might not be worth their disadvantages.
Ahlswede et al. [8] introduced the idea of network coding, in which the intermediate nodes can encode the data they have received and send out the encoded data. It has been proved that network coding can utilize the network resource optimally. Yang et al. [9] presented a file sharing scheme based on network coding, which used the combination network as the network topology. Taking $(n, k)$ linear network coding for example, the data are divided into $k$ blocks, $F_i$, $i = 1, 2, \ldots, k$, and $n$ encoded blocks, $B_1, B_2, \ldots, B_n$, $n > k$, are generated as linear combinations of $F_1, F_2, \ldots, F_k$ on the Galois field $\mathbb{F}_q$, where $q$ is the size of the Galois field:

$$B_j = \sum_{i=1}^{k} \alpha_{ji} F_i, \quad \alpha_{ji} \in \mathbb{F}_q,$$

where $(\alpha_{j1}, \alpha_{j2}, \ldots, \alpha_{jk})^T$ is a coefficient vector, $j = 1, 2, \ldots, n$. When a node wishes to access the original data, it has to receive $m$ encoded blocks, $m \ge k$. Then decoding becomes solving a linear system with $k$ unknowns and $m$ equations. The $m$ encoded blocks can be decoded if and only if the linear system is solvable, i.e. $k$ of the $m$ coefficient vectors are linearly independent. Random linear coding [10] is a form of linear network coding, which encodes data at the intermediate node linearly using randomly generated coefficient vectors. If all the coefficient vectors are randomly generated, more than $k$ encoded blocks may be required to decode. However, when $q$ is large enough, any $k$ encoded blocks are sufficient to decode with high probability [11]. Accendanski et al. [6] compared the performance of different forms of redundant data, including replica, erasure codes and random linear coding. They showed that random linear coding provided data availability no worse than erasure codes, but saved storage cost at the source node when disseminating data into the network.
For different forms of redundant data, the regeneration mechanisms are different. For replica, the newcomer only needs to download one replica from one active storage node. Chun et al. [12] proposed the Carbonite replication algorithm to schedule the regeneration of new replicas. For erasure codes and linear network coding, every bit of new data is encoded from the data stored in the providers, so regeneration incurs more network traffic than with replica. The simplest way is to recover the original data from providers and encode the original data into a new block. Duminuco et al. [13] proposed a new class of erasure codes, aiming to achieve a tradeoff between regeneration traffic and data reliability. Dimakis et al. showed that linear network coding can incur less network traffic in the regeneration than erasure codes [2]. They proposed Regeneration Codes, a new form of linear network coding, which achieved the optimal tradeoff between storage cost and network traffic. Wu et al. [3] gave further analysis of the relation between storage cost and network traffic, and presented a construction method of Regeneration Codes.
Previous works mainly considered the form of redundant data and tried to reduce the regeneration traffic, but did not take the bandwidth capacity between two nodes into account. Lee et al. [14] proposed a bandwidth-aware routing scheme in overlay networks, which measured the bandwidth capacity between hosts in the overlay networks and selected the best paths so as to bypass the problematic paths in the networks. In this paper, we consider bandwidth heterogeneity and propose a tree-structured regeneration scheme to reduce the regeneration time and hence to improve data availability. The primary part of our work can be found in [15].
III. PRELIMINARIES
A. Node and Redundant Data
A distributed storage system provides its service based on a distributed network containing a large number of nodes, which may play different roles in the system. A source node is a node which sends data to other nodes, and a storage node is a node which stores data for source nodes. In some distributed storage systems, one node may function as a source node and as a storage node at the same time. When a source node wishes to save data into the storage system, it generates redundant data and sends them to one or more storage nodes. For replica, the storage node stores one replica of the original file. For erasure codes, the original file is divided into a number of blocks. Redundant blocks are generated at the source node by erasure codes, such as Reed-Solomon codes and fountain codes. Each storage node stores one redundant block.
In a distributed storage system using linear network coding, each block saved in the distributed storage system is generated by linear network coding. We take $(n, k)$ linear network coding for example. The source node divides the original data into $k$ blocks, $F_1, F_2, \ldots, F_k$, and encodes them into $n$ encoded blocks, $B_1, B_2, \ldots, B_n$, which are all linear combinations of $F_1, F_2, \ldots, F_k$. The coefficient vector of $B_i$ is $(a_{i1}, a_{i2}, \ldots, a_{ik})^T$, $a_{ij} \in \mathbb{F}_q$, $j = 1, 2, \ldots, k$, $i = 1, 2, \ldots, n$, where $q$ is the size of the Galois field $\mathbb{F}_q$. Thus we can get

$$\begin{pmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nk} \end{pmatrix} \cdot \begin{pmatrix} F_1 \\ \vdots \\ F_k \end{pmatrix} = \begin{pmatrix} B_1 \\ \vdots \\ B_n \end{pmatrix}. \tag{1}$$

The coefficient vectors form an encoding matrix $C$,

$$C = \begin{pmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nk} \end{pmatrix}. \tag{2}$$
A download node is a node which wishes to access data saved in the distributed storage system. For replica, the download node needs to download data from only one storage node. For $(n, k)$ erasure codes or $(n, k)$ linear network coding, the download node can recover data as soon as it has received $k$ redundant blocks or $k$ linearly independent encoded blocks respectively. For linear network coding, we assume the $k$ encoded blocks are $B'_1, B'_2, \ldots, B'_k$, $\{B'_1, B'_2, \ldots, B'_k\} \subset \{B_1, B_2, \ldots, B_n\}$. Let $C'$ be the encoding matrix formed by the coefficient vectors of $B'_1, B'_2, \ldots, B'_k$. Then decoding becomes a linear transformation as follows:

$$\begin{pmatrix} F_1 \\ \vdots \\ F_k \end{pmatrix} = C'^{-1} \begin{pmatrix} B'_1 \\ \vdots \\ B'_k \end{pmatrix}. \tag{3}$$

If the coefficient vectors are randomly generated, i.e. the system uses random linear coding, the encoding matrix $C'$ is non-singular with high probability when $q$ is large enough [11]. Conventionally $q = 2^8$, so the encoded block can be generated byte by byte, and it is guaranteed with very high probability that any $k$ encoded blocks are sufficient to decode.
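As an illustration of Eq. (1) and Eq. (3) with $q = 2^8$, the following is a minimal sketch of byte-by-byte encoding and decoding. The paper does not specify a reduction polynomial for the field, so the polynomial `0x11B`, the function names and the sample data are all assumptions made for this example.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8); the reduction polynomial is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse of a nonzero byte, computed as a^254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def encode(coeffs, blocks):
    """One encoded block: sum_i coeffs[i] * blocks[i], byte by byte (a row of Eq. (1))."""
    out = bytearray(len(blocks[0]))
    for c, block in zip(coeffs, blocks):
        for pos, byte in enumerate(block):
            out[pos] ^= gf_mul(c, byte)  # addition in GF(2^8) is XOR
    return bytes(out)


def decode(matrix, encoded):
    """Recover F from C' and B' as in Eq. (3), using Gauss-Jordan elimination."""
    k = len(matrix)
    rows = [list(matrix[i]) + list(encoded[i]) for i in range(k)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("coefficient vectors are linearly dependent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [v ^ gf_mul(factor, p) for v, p in zip(rows[r], rows[col])]
    return [bytes(row[k:]) for row in rows]


# Hypothetical data: k = 2 original blocks and a non-singular 2 x 2 matrix C'.
original = [b"ab", b"cd"]
c_prime = [[1, 2], [3, 4]]
b_prime = [encode(row, original) for row in c_prime]
recovered = decode(c_prime, b_prime)
```

With randomly drawn coefficients, `decode` would raise an error in the rare case that the chosen vectors are linearly dependent, which corresponds to a singular $C'$.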
B. Regeneration
In distributed storage systems, redundant data can improve data availability and provide data reliability. However, redundancy cannot guarantee data reliability and availability forever. Data saved in a storage node may get lost due to accidental deletions, hardware failures, or permanent node departures. Therefore, if data loss is detected in the storage system by a data loss detection mechanism, such as the Carbonite algorithm proposed in [12], the distributed storage system will generate new redundant data and save them into another node, referred to as a newcomer.
During the regeneration, the newcomer must receive data from one or more existing storage nodes to become a new storage node. We define providers as storage nodes providing data for the newcomer in the regeneration. For replica, the newcomer needs only one provider. For erasure codes, the newcomer must recover the original data from providers and then encode the original data into a new redundant block. For $(n, k)$ linear network coding, the newcomer also needs to receive data from at least $k$ providers. However, the newcomer can directly generate a new encoded block. We assume there are $k$ providers. The $k$ encoded blocks they store are $B'_1, B'_2, \ldots, B'_k$. The encoding matrix formed by the coefficient vectors of $B'_1, B'_2, \ldots, B'_k$ is $C'$. Similar to decoding, generating a new encoded block is also a linear transformation, if $C'$ is non-singular. Let the coefficient vector of the new encoded block be $(\sigma_1, \sigma_2, \ldots, \sigma_k)^T$, $\sigma_j \in \mathbb{F}_q$, $j = 1, 2, \ldots, k$. We assume that

$$(\sigma_1 \cdots \sigma_k) \begin{pmatrix} F_1 \\ \vdots \\ F_k \end{pmatrix} = (r_1 \cdots r_k) \begin{pmatrix} B'_1 \\ \vdots \\ B'_k \end{pmatrix}, \tag{4}$$

where $r_j \in \mathbb{F}_q$, $j = 1, 2, \ldots, k$. According to Eq. (3),

$$(r_1 \cdots r_k) = (\sigma_1 \cdots \sigma_k) \, C'^{-1}. \tag{5}$$

If $C'$ is non-singular, $(r_1 \cdots r_k)^T$ is a random vector if and only if $(\sigma_1 \cdots \sigma_k)^T$ is a random vector. For random linear coding, $(r_1 \cdots r_k)^T$ can be randomly generated rather than computed according to Eq. (5). Therefore the newcomer can encode $B'_1, B'_2, \ldots, B'_k$ directly into a new encoded block by the coefficient vector $(r_1 \cdots r_k)^T$.
Dimakis et al. [2] and Wu et al. [3] analyzed the lower bound of network traffic in the regeneration for distributed storage systems using linear network coding. If the size of the original data is $M$ bytes, and each storage node stores $\frac{M}{k}$ bytes, the minimal regeneration traffic is $\frac{d}{k(d-k+1)} M$ bytes if $d$ providers are required; otherwise the new encoded block will not be equivalent to other encoded blocks in decodability. It is clear that the regeneration scheme shown in Eq. (4) and Eq. (5) achieves the optimal regeneration traffic when the number of providers is $k$.
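The last claim can be checked by substituting $d = k$ into the bound and comparing it with the traffic of the scheme in Eq. (4) and Eq. (5), in which each of the $k$ providers contributes one block of $\frac{M}{k}$ bytes. This is a short worked step added for clarity, not part of the original derivation:

```latex
\[
\left.\frac{d}{k(d-k+1)}\,M\right|_{d=k} = \frac{k}{k \cdot 1}\,M = M ,
\qquad
\underbrace{k \cdot \frac{M}{k}}_{\text{traffic of Eq. (4)--(5)}} = M .
\]
```

Both quantities equal $M$ bytes, so the scheme meets the lower bound for $d = k$.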
C. Network Model
In this paper, we focus on how to regenerate new redundant data quickly using linear network coding. The distributed storage system uses $(n, k)$ linear network coding, $n > k$. Thus the newcomer requires at least $k$ providers in the regeneration. When the regeneration requires more providers, it becomes more difficult to find enough providers to start the regeneration, and the regeneration is more likely to be interrupted by node departures. Therefore in this paper, we only discuss the case in which the regeneration scheme requires $k$ providers. A regeneration scheme can transfer data from $k$ providers to the newcomer, regenerate new redundant data and save them at the newcomer. Different from conventional schemes, we also consider bandwidth heterogeneity in the network.
Assume that one file has been saved in the distributed storage system. The size of the original file is $M$ bytes. Each storage node stores an encoded block of $\frac{M}{k}$ bytes. Our network model focuses on the regeneration in the system. In one regeneration, we assume $k$ active storage nodes are required as providers. The node set is $V = \{V_0, V_1, \ldots, V_k\}$, where $V_0$ is the newcomer, and $V_1, \ldots, V_k$ are providers. $V_0$ receives data from the $k$ providers. Node departures are ignored, i.e. the newcomer and providers are assumed to be stable during the regeneration process. The edge set is $E = \{(V_i, V_j) \mid i, j = 0, 1, \ldots, k, \ i < j\}$. $\omega(V_i, V_j)$ denotes the bandwidth capacity between $V_i$ and $V_j$. Thus the weighted undirected complete graph $G_k = (V, E, \omega)$ denotes the network model of the regeneration, where $k$ is the number of providers, i.e. $k = |V| - 1$.
[Figure 1: two panels, each showing nodes 0, 1 and 2 joined by edges labeled 30KB/s, 40KB/s and 50KB/s.]
Fig. 1. Comparison between the star-structured and the tree-structured regeneration scheme in an example of the network model containing 3 nodes.
Fig. 1 shows an example of the network model described above. $(n, 2)$ linear network coding is employed in this model, $n > 2$. When a regeneration starts, the newcomer receives data from 2 storage nodes. Conventionally, the newcomer receives data from each provider directly. Fig. 1(a) illustrates the conventional regeneration scheme. In this regeneration scheme, the topology of the newcomer and providers is like a star, so this scheme is referred to as a star-structured regeneration scheme in this paper. In Fig. 1(a), the newcomer $V_0$ receives encoded blocks directly from $V_1$ and $V_2$ and then encodes them again to obtain an encoded block with a new coefficient vector. In the star-structured regeneration scheme, the regeneration time depends on the minimal edge connecting to the newcomer $V_0$. In Fig. 1, $\omega(V_0, V_1)$, $\omega(V_0, V_2)$ and $\omega(V_1, V_2)$ are 30KB/s, 50KB/s and 40KB/s respectively, so the bandwidth bottleneck is $(V_0, V_1)$ and the available bandwidth capacity, i.e. the actual transmission rate during the regeneration process, is 30KB/s.
In this paper, we propose a tree-structured regeneration scheme, which constructs a spanning tree in the network model $G_k$. Our regeneration scheme does not incur more regeneration traffic than the star-structured scheme, but it can improve the available bandwidth capacity and thus reduce the regeneration time. Each node in the model, whether newcomer or provider, can receive data from other nodes. To avoid increasing the regeneration traffic and thus aggravating the bandwidth bottleneck, we assume each node can receive data from multiple nodes, but can send data to only one node. Encoding operations can be executed on the newcomer and the providers, and the encoding delay is ignored, since the transmission delay is usually much more critical. Fig. 1(b) is an example of the tree-structured regeneration scheme. $V_1$ sends its data to $V_2$. $V_2$ encodes the received data with the data it stores and sends the encoded data to $V_0$. As we will show in Section IV, the bandwidth bottleneck is $(V_1, V_2)$, and the available bandwidth capacity is $\omega(V_1, V_2) = 40$KB/s. We can see that the tree-structured regeneration scheme can regenerate new redundant data faster.
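The comparison in Fig. 1 can be reproduced with a few lines of code. The sketch below uses the bandwidths given in the text and the rule that the slowest edge in use determines the transmission rate; the variable and function names are chosen for this example only.

```python
# Bandwidths from Fig. 1 in KB/s, keyed by unordered node pair.
bandwidth = {
    frozenset({0, 1}): 30,
    frozenset({0, 2}): 50,
    frozenset({1, 2}): 40,
}


def bottleneck(edges):
    """Available bandwidth of a scheme: the narrowest edge it uses."""
    return min(bandwidth[frozenset(edge)] for edge in edges)


star_edges = [(1, 0), (2, 0)]  # Fig. 1(a): V1 -> V0 and V2 -> V0
tree_edges = [(1, 2), (2, 0)]  # Fig. 1(b): V1 -> V2 -> V0

star_rate = bottleneck(star_edges)
tree_rate = bottleneck(tree_edges)
print(star_rate, tree_rate)
```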
IV. TREE-STRUCTURED REGENERATION SCHEME
In this section, we present our tree-structured regeneration scheme, based on the network model above. First, we show how the tree-structured scheme can regenerate new redundant data at the newcomer and prove that a maximum spanning tree is an optimal regeneration tree. Then we give the encoding scheme for linear network coding, especially for random linear coding. We analyze the available bandwidth capacity of the tree-structured and star-structured regeneration schemes based on probability theory and order statistics. Finally, we compare the available bandwidth capacity of the two schemes.
A. Regeneration Tree
Lemma 1: Any spanning tree $T$ in $G_k = (V, E, \omega)$, whose root is $V_0$, corresponds to one and only one regeneration scheme in which $V_0$ is the newcomer.
Proof: Given a spanning tree $T$, we can build a regeneration scheme as follows. Each node in $T$ receives data from its children if it is not a leaf node, encodes the received data with the data it stores, and sends the encoded data to its parent node if it is not the newcomer. In this case, the newcomer can get the data of the providers or their linear combinations, and then become a new storage node.
Given a regeneration scheme of $G_k = (V, E, \omega)$, we can build a graph $T = (V, E')$, where $(V_i, V_j) \in E'$ when data are transferred on $(V_i, V_j)$, $i, j = 0, 1, 2, \ldots, k$, $i < j$. Each edge in $T$ can be mapped to one and only one provider, which sends out data on this edge, since one node can send data to only one node and the newcomer does not send data to other nodes. Because there are $k$ providers, $|E'| \le k$. On the other hand, because the newcomer can receive encoded blocks or their linear combinations from all providers, $T$ is a connected graph. So $|E'| \ge k$. Because $|E'| = k$ and $T$ is a connected graph, $T$ is a spanning tree of $G_k$.
Notice that $T$ and $G_k$ are both undirected graphs, but the transmission is always directed. From the proof of Lemma 1 we can see that for each edge in $T$, the transmission direction is from the child node to the parent node. In this sense, all edges in $T$ can be regarded as “directed”. An incoming edge of a node is an edge whose other endpoint is a child of this node, and the outgoing edge of a node is the edge whose other endpoint is its parent node.
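To make the child-to-parent direction concrete, the following sketch orients an undirected spanning tree toward the newcomer with a breadth-first search. The function name and the edge-list representation are assumptions for illustration; the paper does not prescribe a data structure.

```python
from collections import deque


def orient_toward_root(edges, root=0):
    """Return {child: parent} so that every tree edge points toward the root."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    parent = {}
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                parent[nxt] = node  # nxt sends its data to node
                queue.append(nxt)
    return parent


# Tree of Fig. 1(b): undirected edges (V0, V2) and (V1, V2), newcomer V0.
parent = orient_toward_root([(0, 2), (1, 2)])
```

Each provider appears exactly once as a key, which matches the assumption that a node sends data to only one node.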
Lemma 1 shows that we can use a spanning tree to represent a regeneration scheme of $G_k$. However, it does not show how to encode the data at each provider. In Section IV-B, we will discuss this question.
Definition 1: A regeneration tree is a spanning tree in $G_k$.
Lemma 2: For each edge in the regeneration tree $T$, the amount of transferred data on it is $\frac{M}{k}$ bytes, where $M$ is the size of the original file.
Proof: According to the proof of Lemma 1, each edge in $T$ corresponds to one and only one provider, so we give the proof from the perspective of the providers.
For each leaf node in the regeneration tree, the size of the data it sends is $\frac{M}{k}$ bytes, because the size of the data it stores is $\frac{M}{k}$ bytes.
Assume that for a non-leaf node other than the newcomer, the traffic on each incoming edge is $\frac{M}{k}$ bytes. The amount of data it stores is also $\frac{M}{k}$ bytes. So after linear encoding, the traffic on its outgoing edge is $\frac{M}{k}$ bytes.
Since the outgoing edges of all the providers have the same traffic on them, we can say that the traffic on each edge is uniform and is equal to $\frac{M}{k}$ bytes.
From Lemma 2, we can see the regeneration traffic of a tree-structured regeneration scheme is $M$ bytes, so the optimal regeneration traffic shown in Section III-B has been achieved.
Lemma 3: For each regeneration tree $T$ in $G_k$, the regeneration time depends on the edge with the minimal weight.
Proof: We know that the weight of each edge in $T$, $\omega(V_i, V_j)$, $i, j = 0, 1, 2, \ldots, k$, $i < j$, denotes the bandwidth capacity between $V_i$ and $V_j$. According to Lemma 2, the traffic on each edge in $T$ is uniform. If a node sends data only after it has received all the data from its children, it will waste a substantial amount of time. The optimal transmission method is to use the principle of pipelining. The node encodes and sends data to its parent node immediately after it has received one byte/packet from all of its children. So the bandwidth bottleneck is the minimal edge in the regeneration tree.
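The benefit of pipelining can be seen in a simplified timing model. The sketch below is an assumption-laden illustration: it ignores per-packet latency and encoding delay, treats each edge as a constant-rate link, and uses a hypothetical block size of 1000 KB. The function names are invented for this example.

```python
def store_and_forward_time(parent, rate, size, root=0):
    """Each provider waits for all of its children to finish before it starts sending.

    parent: {provider: parent node}
    rate:   {provider: bandwidth of the provider's outgoing edge}
    size:   bytes (or KB) carried on every edge, i.e. M/k
    """
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    def finish(node):
        ready = max((finish(c) for c in children.get(node, [])), default=0.0)
        if node == root:
            return ready
        return ready + size / rate[node]

    return finish(root)


def pipelined_time(rate, size):
    """With pipelining, the slowest edge alone determines the transfer time."""
    return size / min(rate.values())


# Tree of Fig. 1(b): V1 -> V2 at 40 KB/s, V2 -> V0 at 50 KB/s.
parent = {1: 2, 2: 0}
rate = {1: 40, 2: 50}
block_size = 1000  # KB, hypothetical value of M/k

slow = store_and_forward_time(parent, rate, block_size)
fast = pipelined_time(rate, block_size)
```

In this model the store-and-forward transfer takes the sum of the two hop times, while the pipelined transfer takes only the time of the slower hop.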
From the proof of Lemma 3, we give the definition of the available bandwidth capacity of a regeneration tree.
Definition 2: The available bandwidth capacity of a regeneration tree $T$ in $G_k$ is the weight of the minimum edge in $T$.
Lemma 4: [16] In a weighted undirected graph, a minimum (maximum) spanning tree is a bottleneck spanning tree, i.e. a spanning tree whose largest (smallest) edge weight is the minimum (maximum) over all spanning trees in this graph.
Theorem 1: A maximum spanning tree in $G_k$ is a regeneration tree with the maximal available bandwidth capacity.
Proof: The proof is clear according to Lemma 3 and Lemma 4.
Theorem 1 shows how to find an optimal regeneration tree in $G_k$. We can see that the star-structured regeneration scheme is a special form of the tree-structured regeneration scheme, and sometimes it is optimal. In every case, the tree-structured regeneration scheme is no worse than the star-structured scheme.
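One way to obtain a maximum spanning tree is Kruskal's algorithm applied to the edges in order of decreasing bandwidth. The paper does not state which algorithm the authors used, so the sketch below is only one standard option, and its function name and input format are assumptions.

```python
def maximum_spanning_tree(num_nodes, weights):
    """Kruskal's algorithm on edges sorted by decreasing bandwidth.

    weights maps (i, j) node pairs to bandwidth; nodes are 0 .. num_nodes - 1.
    Returns the chosen edges as (i, j, bandwidth) tuples.
    """
    root_of = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while root_of[x] != x:
            root_of[x] = root_of[root_of[x]]
            x = root_of[x]
        return x

    tree = []
    ordered = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    for (i, j), w in ordered:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            root_of[ri] = rj
            tree.append((i, j, w))
    return tree


# Network of Fig. 1, bandwidths in KB/s.
fig1 = {(0, 1): 30, (0, 2): 50, (1, 2): 40}
tree = maximum_spanning_tree(3, fig1)
available_bandwidth = min(w for _, _, w in tree)
```

For the network of Fig. 1 this selects the edges $(V_0, V_2)$ and $(V_1, V_2)$, which is the tree of Fig. 1(b).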
Since the bandwidth capacity is time-sensitive, its measurement should be triggered before each regeneration, after which the regeneration tree can be constructed. However, because the regeneration tree spans only the newcomer and the providers, the bandwidth measurements are made between these nodes rather than between all nodes in the network, and thus are limited.
Theorem 2: Let $B(T)$ be the available bandwidth capacity of a regeneration tree $T$ in $G_k$. Then the regeneration time is $\frac{M}{kB(T)}$, where $M$ is the size of the original file.
Proof: According to Lemma 2, the amount of traffic on each edge in $T$ is $\frac{M}{k}$ bytes. From Lemma 3, we know the regeneration time depends on the minimal weighted edge in $T$. According to the definition of $B(T)$, the regeneration time is $\frac{M/k}{B(T)} = \frac{M}{kB(T)}$.
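As a worked instance of Theorem 2, take the network of Fig. 1 with $k = 2$ and $M$ measured in KB. The star of Fig. 1(a) has $B(T) = 30$KB/s and the tree of Fig. 1(b) has $B(T) = 40$KB/s, so, for this small example only:

```latex
\begin{align*}
t_{\text{star}} &= \frac{M}{k\,B(T_{\text{star}})} = \frac{M}{2 \cdot 30} = \frac{M}{60}\ \text{s}, \\
t_{\text{tree}} &= \frac{M}{k\,B(T_{\text{tree}})} = \frac{M}{2 \cdot 40} = \frac{M}{80}\ \text{s}, \\
\frac{t_{\text{tree}}}{t_{\text{star}}} &= \frac{30}{40} = 0.75 .
\end{align*}
```

The tree-structured scheme needs 75% of the star-structured regeneration time in this three-node example, independent of the file size.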
B. Encoding Scheme
In $G_k$, assume that the encoded block stored in $V_i$ is $B'_i$, $i = 1, 2, \ldots, k$. According to Eq. (4) and Eq. (5), if $(\sigma_1, \sigma_2, \ldots, \sigma_k)^T$ is a random vector, $(r_1, r_2, \ldots, r_k)^T$ is also a random vector. Thus $(r_1, r_2, \ldots, r_k)^T$ can be generated randomly on the encoding nodes in a distributed fashion. $V_i$ is responsible for generating $r_i$ randomly, $i = 1, 2, \ldots, k$. In one regeneration tree, if $V_i$ does not receive data from other nodes, it sends $r_i B'_i$. If $V_i$ receives data from $V_{i_1}, V_{i_2}, \ldots, V_{i_{in(V_i)}}$, where $in(V_i)$ is the indegree of $V_i$, assuming the data received from $V_{i_j}$ is $B''_j$, it sends $r_i B'_i + \sum_{j=1}^{in(V_i)} B''_j$. Therefore, the newcomer can get $\sum_{i=1}^{k} r_i B'_i$, which is equal to $\sum_{i=1}^{k} \sigma_i F_i$.
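The distributed encoding rule can be simulated directly. In the sketch below each provider scales its stored block by its own coefficient and adds whatever it received from its children, and the newcomer adds up what arrives. The field is GF($2^8$) with the reduction polynomial `0x11B`, which is an assumption, as are the function names, the fixed coefficients and the two-byte sample blocks.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8); the reduction polynomial is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def scale(c, block):
    """Multiply every byte of a block by the coefficient c."""
    return bytes(gf_mul(c, byte) for byte in block)


def add(x, y):
    """Add two blocks; addition in GF(2^8) is bytewise XOR."""
    return bytes(p ^ q for p, q in zip(x, y))


def regenerate(parent, stored, r, root=0):
    """Return the block assembled at the newcomer.

    parent: {provider: parent node}, stored: {provider: B'_i}, r: {provider: r_i}
    """
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    def outgoing(node):
        # r_i * B'_i plus the sum of everything received from the children.
        block = scale(r[node], stored[node])
        for child in children.get(node, []):
            block = add(block, outgoing(child))
        return block

    size = len(next(iter(stored.values())))
    result = bytes(size)
    for child in children.get(root, []):
        result = add(result, outgoing(child))
    return result


# Hypothetical blocks and coefficients for the two providers of Fig. 1.
stored = {1: b"\x01\x02", 2: b"\x03\x04"}
r = {1: 5, 2: 7}
via_tree = regenerate({1: 2, 2: 0}, stored, r)  # Fig. 1(b)
via_star = regenerate({1: 0, 2: 0}, stored, r)  # Fig. 1(a)
direct = add(scale(5, stored[1]), scale(7, stored[2]))
```

Both topologies deliver the same block $r_1 B'_1 + r_2 B'_2$; only the path that the data takes differs. In practice each $r_i$ would be drawn at random by $V_i$ rather than fixed.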
C. Available Bandwidth Capacity
We analyze the available bandwidth capacity of the tree-structured and star-structured regeneration schemes by order statistics. First, we introduce a basic theorem of order statistics in Lemma 5.
Lemma 5: [17] Assume $X_1, X_2, \ldots, X_n$ are $n$ independent random variables, for each of which the cumulative distribution function is $F(x)$ and the probability density function is $f(x)$. Let $f_{(r:n)}(x)$ denote the probability density function of the $r$th variable $X_{(r:n)}$, $X_{(1:n)} \ge X_{(2:n)} \ge \cdots \ge X_{(n:n)}$. If $X_i$ has a continuous distribution,

$$f_{(r:n)}(x) = \frac{n! \, F^{n-r}(x) \, [1 - F(x)]^{r-1} f(x)}{(n-r)! \, (r-1)!}. \tag{6}$$
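Eq. (6) can be sanity-checked numerically for a simple case. The sketch below assumes the uniform distribution on $[0, 1]$, for which $F(x) = x$ and $f(x) = 1$; this choice, the function names and the values of $n$ and $r$ are illustrative and are not taken from the paper. For this distribution the $r$th largest of $n$ samples has mean $\frac{n-r+1}{n+1}$.

```python
from math import factorial


def order_stat_pdf(x, r, n):
    """Eq. (6) for Uniform(0, 1): density of the r-th largest of n samples."""
    coeff = factorial(n) / (factorial(n - r) * factorial(r - 1))
    return coeff * x ** (n - r) * (1 - x) ** (r - 1)


def integrate(func, steps=100_000):
    """Midpoint rule on the interval [0, 1]."""
    h = 1.0 / steps
    return sum(func((i + 0.5) * h) for i in range(steps)) * h


n, r = 5, 2
total = integrate(lambda x: order_stat_pdf(x, r, n))      # should be close to 1
mean = integrate(lambda x: x * order_stat_pdf(x, r, n))   # should be close to 4/6
```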
Let $E = \{e_1, e_2, \ldots, e_{\frac{k(k+1)}{2}}\}$ in $G_k = (V, E, \omega)$, where $\omega(e_1) > \omega(e_2) > \cdots > \omega(e_{\frac{k(k+1)}{2}})$. The bandwidth capacities of the edges are assumed to be pairwise distinct, which holds with high probability in real-world networks.
Definition 3: $MST(G_k) = r$ if and only if the minimal edge in the maximum spanning tree of $G_k = (V, E, \omega)$ is the $r$th maximal edge of $E$.
Property 1: $k \le MST(G_k) \le M_{k+1} - k + 1$, where $M_k = \frac{k(k-1)}{2}$.
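The bounds in Property 1 can be checked empirically on random complete graphs. In the sketch below the rank $MST(G_k)$ is found with Kruskal's algorithm, where the last edge added to the maximum spanning tree is its minimal edge. The random bandwidth range, the seed and the function names are assumptions for this experiment.

```python
import itertools
import random


def mst_rank(k, weight):
    """MST(G_k): rank (1 = largest) of the narrowest edge in a maximum spanning tree."""
    root_of = list(range(k + 1))

    def find(x):
        while root_of[x] != x:
            root_of[x] = root_of[root_of[x]]
            x = root_of[x]
        return x

    ordered = sorted(weight, key=weight.get, reverse=True)
    added = 0
    for rank, (i, j) in enumerate(ordered, start=1):
        ri, rj = find(i), find(j)
        if ri != rj:
            root_of[ri] = rj
            added += 1
            if added == k:  # the k-th tree edge is the minimal one
                return rank
    raise ValueError("graph is not connected")


def random_complete_graph(k, rng):
    """Complete graph on k + 1 nodes with pairwise distinct random bandwidths."""
    edges = list(itertools.combinations(range(k + 1), 2))
    bandwidths = rng.sample(range(1, 10_000), len(edges))
    return dict(zip(edges, bandwidths))


rng = random.Random(0)
k = 5
ranks = [mst_rank(k, random_complete_graph(k, rng)) for _ in range(1000)]
upper = k * (k + 1) // 2 - k + 1  # M_{k+1} - k + 1
```

Every value in `ranks` should lie between `k` and `upper`. For the network of Fig. 1, where $k = 2$, both bounds equal 2.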
Proof: Let $E_1 = \{e_1, e_2, \ldots, e_k\}$. If $(V, E_1)$ is connected, it must be a maximum spanning tree of $G_k$. Since no spanning tree in $G_k$ has a minimal edge larger than that of $(V, E_1)$, we can say $r \ge k$.
Because $G_k$ is a complete graph, it is $k$-edge-connected. Thus it is always connected after removing $k - 1$ edges. Let

Frequently Asked Questions (17)
Q1. What are the contributions in "Tree-structured data regeneration with network coding in distributed storage systems"?

Previous regeneration schemes are all star-structured: data are transferred directly from existing storage nodes, referred to as providers, to the newcomer, so the regeneration time is always limited by the path with the narrowest bandwidth between the newcomer and a provider, due to bandwidth heterogeneity. In this paper, the authors exploit the bandwidth between providers and propose a tree-structured regeneration scheme using linear network coding. In their scheme, data can be transferred from providers to the newcomer through a regeneration tree, defined as a spanning tree covering the newcomer and all the providers. In a regeneration tree, a provider can receive data from other providers, then encode the received data with the data this provider stores, and finally send the encoded data to another provider or to the newcomer. The authors prove that a maximum spanning tree is an optimal regeneration tree and analyze its performance.

For each leaf node in the regeneration tree, the size of the data it sends is $M/k$ bytes, because the size of the data it stores is $M/k$ bytes.

In the trace file, a node is considered to be up at time $t$ if and only if at least half of the pings in the batch immediately prior to $t$ reach the node successfully.

An incoming edge of a node is an edge whose other endpoint is a child of this node, and the outgoing edge of a node is the edge whose other endpoint is its parent node.

The authors notice that when the bandwidth heterogeneity, i.e. the variance of the bandwidth distribution, increases, the expected available bandwidth capacity of the tree-structured regeneration scheme increases, whereas that of the star-structured regeneration scheme decreases.

The authors measure the performance of regeneration schemes from three aspects: (i) regeneration time: how much time is spent from the start of a regeneration to the end; (ii) probability of the successful regeneration: the probability that a regeneration finishes successfully, not interrupted by the node departures; (iii) data availability: the probability that a file is available. 

Since the outgoing edges of all the providers have the same traffic on them, the authors can say that the traffic on each edge is uniform and is equal to $M/k$ bytes.

For $(n, k)$-erasure codes or $(n, k)$-linear network coding, the download node can recover data as soon as it has received $k$ redundant blocks or $k$ linearly independent encoded blocks, respectively.

The Y-axis shows $E(G_k)$, the available bandwidth capacity of the corresponding regeneration scheme in $G_k$. Because $E_{star}(G_k) = \frac{b-a}{k+1} + a$, the expected value of the available bandwidth capacity of the star-structured regeneration scheme decreases as $k$ increases and converges to $a$, the lower bound of the uniform distribution $U[a, b]$.
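The closed form quoted for the star-structured scheme can be checked directly. The short derivation below is a sketch under the assumption (taken from the description of the star-structured scheme, where the narrowest newcomer-provider path limits regeneration) that $E_{star}(G_k)$ is the expected minimum of $k$ independent $U[a, b]$ bandwidths; the names $X_j$, $X_{\min}$ and the substitution variable $u$ are introduced here for illustration only.

```latex
\begin{align*}
X_{\min} &= \min(X_1, \dots, X_k), \qquad X_j \sim U[a, b] \text{ i.i.d.}, \\
\Pr\!\left[\frac{X_{\min} - a}{b - a} > u\right] &= (1 - u)^k, \qquad 0 \le u \le 1, \\
E\!\left[\frac{X_{\min} - a}{b - a}\right] &= \int_0^1 (1 - u)^k \, du = \frac{1}{k + 1}, \\
E_{star}(G_k) &= a + \frac{b - a}{k + 1} \;\xrightarrow[k \to \infty]{}\; a .
\end{align*}
```

For example, with $a = 10$, $b = 100$ and $k = 9$, the expected bottleneck is $10 + 90/10 = 19$.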

In a weighted undirected graph, a minimum (maximum) spanning tree is a bottleneck spanning tree, i.e. a spanning tree whose largest (smallest) edge weight is the minimum (maximum) over all spanning trees in this graph.
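The bottleneck property can be checked by brute force on a small graph. The sketch below builds a maximum spanning tree with Kruskal's algorithm and compares its smallest edge against the best bottleneck found by enumerating every spanning tree. All function names and the random integer weights are assumptions made for illustration, and the enumeration is only practical for a handful of vertices.

```python
import itertools
import random

def is_forest(n, edges):
    """True if `edges` contain no cycle (union-find over n vertices)."""
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True

def maximum_spanning_tree(n, weights):
    """Kruskal's algorithm with edges taken in descending weight order."""
    tree = []
    for edge in sorted(weights, key=weights.get, reverse=True):
        if is_forest(n, tree + [edge]):
            tree.append(edge)
    return tree

def best_bottleneck(n, weights):
    """Largest minimum-edge weight over all spanning trees (brute force)."""
    best = None
    for subset in itertools.combinations(weights, n - 1):
        # n - 1 acyclic edges on n vertices form a spanning tree
        if is_forest(n, subset):
            bottleneck = min(weights[e] for e in subset)
            best = bottleneck if best is None else max(best, bottleneck)
    return best

def random_weights(n, seed):
    """Complete graph on n vertices with pairwise distinct integer weights."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n), 2))
    return dict(zip(pairs, rng.sample(range(1, 1000), len(pairs))))

if __name__ == "__main__":
    w = random_weights(5, seed=0)
    tree = maximum_spanning_tree(5, w)
    print(min(w[e] for e in tree) == best_bottleneck(5, w))
```

In a regeneration tree, the smallest edge weight is the bandwidth that limits the transfer, which is why the maximum spanning tree is the natural candidate.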

Moreover, when $k \ge 10$, the data availability of the tree-structured regeneration scheme is always more than 90%, while the availability of the star-structured scheme is less than 60%.

According to Lemma 5, the probability density function of $\omega(e_i)$ is

$$f_{(i:M_{k+1})}(x) = \frac{M_{k+1}!\, F^{M_{k+1}-i}(x)\,(1 - F(x))^{i-1} f(x)}{(i-1)!\,(M_{k+1}-i)!}. \tag{7}$$

Let $E_{(i:M_{k+1})}$ be the expected value of $\omega(e_i)$,

$$E_{(i:M_{k+1})} = \int_0^{+\infty} x\, f_{(i:M_{k+1})}(x)\, dx. \tag{8}$$

Let $p(k+1, i)$ be the probability that $\mathrm{MST}(G_k) = i$.
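To make (8) concrete, the sketch below evaluates the integral numerically for edge weights drawn from $U[a, b]$ and compares the result with the standard closed form for the mean of the $i$-th largest of $n$ uniform samples, $a + (b - a)\frac{n - i + 1}{n + 1}$. The uniform distribution, the function names and the midpoint rule are choices made here for illustration; the closed form is a textbook property of uniform order statistics, not a formula quoted from the paper.

```python
import math

def order_stat_mean_uniform(i, n, a, b, steps=20000):
    """Midpoint-rule evaluation of Eq. (8) when edge weights are U[a, b].

    The density vanishes outside [a, b], so the integral over [0, +inf)
    reduces to an integral over [a, b] (assuming a >= 0).
    """
    coeff = math.factorial(n) // (math.factorial(i - 1) * math.factorial(n - i))
    h = (b - a) / steps
    total = 0.0
    for s in range(steps):
        x = a + (s + 0.5) * h
        cdf = (x - a) / (b - a)
        density = coeff * cdf ** (n - i) * (1.0 - cdf) ** (i - 1) / (b - a)
        total += x * density * h
    return total

def order_stat_mean_closed_form(i, n, a, b):
    """Mean of the i-th largest of n i.i.d. U[a, b] samples."""
    return a + (b - a) * (n - i + 1) / (n + 1)

if __name__ == "__main__":
    k = 4
    n = (k + 1) * k // 2  # M_{k+1}, the number of edges in G_k
    for i in range(1, n + 1):
        numeric = order_stat_mean_uniform(i, n, 10.0, 100.0)
        exact = order_stat_mean_closed_form(i, n, 10.0, 100.0)
        print(f"i={i}: numeric {numeric:.4f}, closed form {exact:.4f}")
```

The expected weight falls linearly with the rank $i$, so a scheme whose bottleneck edge has a small rank enjoys a higher expected bandwidth.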

The authors show how the tree-structured scheme can regenerate new redundant data at the newcomer and prove that a maximum spanning tree is an optimal regeneration tree.

From Lemma 2, the authors can see the regeneration traffic of a tree-structured regeneration scheme is $M$ bytes, so the optimal regeneration traffic shown in Section III-B has been achieved.
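The traffic accounting can be illustrated with a small rooted tree. The sketch below assumes the regeneration tree is given as a hypothetical `parent` map from each provider to its parent, with the newcomer as the root, and that the number of providers equals $k$. Each provider sends $M/k$ bytes on its outgoing edge, so the total over the $k$ edges is $M$ bytes. The node names and the file size are made up for the example.

```python
from collections import defaultdict

def edge_roles(parent):
    """Split tree edges into incoming (from children) and outgoing (to parent).

    `parent` maps each provider to its parent node; the newcomer is the root
    and therefore has no entry and no outgoing edge.
    """
    incoming = defaultdict(list)
    outgoing = {}
    for node, par in parent.items():
        outgoing[node] = (node, par)
        incoming[par].append((node, par))
    return incoming, outgoing

def regeneration_traffic(parent, file_size):
    """Per-edge and total traffic when every provider sends M / k bytes."""
    k = len(parent)  # assumption: one tree edge per provider
    block = file_size / k
    per_edge = {(node, par): block for node, par in parent.items()}
    return per_edge, sum(per_edge.values())

if __name__ == "__main__":
    # Hypothetical tree: p2 and p3 relay through p1; p1 and p4 reach the newcomer.
    tree = {"p1": "newcomer", "p2": "p1", "p3": "p1", "p4": "newcomer"}
    incoming, outgoing = edge_roles(tree)
    per_edge, total = regeneration_traffic(tree, file_size=1024.0)
    print(dict(incoming), outgoing, per_edge, total)
```

A relaying provider such as `p1` receives data on two incoming edges but still sends only $M/k$ bytes on its single outgoing edge, since it encodes what it receives together with its own stored data.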

Their mathematical analysis shows that the tree-structured regeneration scheme can improve the available bandwidth capacity and the adaptability to the bandwidth heterogeneity, compared with the conventional star-structured regeneration scheme. 

Assume that the weights of the edges in $E$ follow a distribution with probability density function $f(x)$ and cumulative distribution function $F(x)$.

In Fig. 5, the authors show that the tree-structured regeneration scheme can save regeneration time by at least 75% when $k \ge 4$, and by up to 82% when $k = 20$.