Journal ArticleDOI

Explicit Constructions of High-Rate MDS Array Codes With Optimal Repair Bandwidth

01 Apr 2017 - IEEE Transactions on Information Theory (IEEE) - Vol. 63, Iss. 4, pp. 2001-2014
TL;DR: This paper studies high-rate MDS array codes with the optimal repair property (also known as minimum storage regenerating codes, or MSR codes), and presents two explicit constructions of such codes with this property.
Abstract: Maximum distance separable (MDS) codes are optimal error-correcting codes in the sense that they provide the maximum failure tolerance for a given number of parity nodes. Suppose that an MDS code with $k$ information nodes and $r=n-k$ parity nodes is used to encode data in a distributed storage system. It is known that if $h$ out of the $n$ nodes are inaccessible and $d$ surviving (helper) nodes are used to recover the lost data, then we need to download at least $h/(d+h-k)$ fraction of the data stored in each of the helper nodes (Dimakis et al., 2010 and Cadambe et al., 2013). If this lower bound is achieved for the repair of any $h$ erased nodes from any $d$ helper nodes, we say that the MDS code has the $(h,d)$-optimal repair property. We study high-rate MDS array codes with the optimal repair property (also known as minimum storage regenerating codes, or MSR codes). Explicit constructions of such codes in the literature are only available for the cases where there are at most three parity nodes, and these existing constructions can only optimally repair a single node failure by accessing all the surviving nodes. In this paper, given any $r$ and $n$, we present two explicit constructions of MDS array codes with the $(h,d)$-optimal repair property for all $h\le r$ and $k\le d\le n-h$ simultaneously. Codes in the first family can be constructed over any base field $F$ as long as $|F|\ge sn$, where $s=\text{lcm}(1,2,\dots,r)$. The encoding, decoding, repair of failed nodes, and update procedures of these codes all have low complexity. Codes in the second family have the optimal access property and can be constructed over any base field $F$ as long as $|F|\ge n+1$. Moreover, both code families have the optimal error resilience capability when repairing failed nodes. We also construct several other related families of MDS codes with the optimal repair property.
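The quantities named in this abstract can be evaluated directly. The following is a minimal Python sketch, not code from the paper: the function names and the example parameters (n = 14, k = 10) are assumptions chosen for illustration. It computes the download lower bound h/(d+h-k) and the smallest field sizes permitted by the two stated conditions, |F| >= sn and |F| >= n+1.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def min_download_fraction(n, k, h, d):
    """Lower bound h / (d + h - k) on the fraction of each helper node's
    data that must be downloaded to repair h failed nodes from d helpers."""
    r = n - k
    # Parameter range quoted in the abstract: h <= r and k <= d <= n - h.
    if not (1 <= h <= r and k <= d <= n - h):
        raise ValueError("need 1 <= h <= r and k <= d <= n - h")
    return Fraction(h, d + h - k)


def first_family_min_field_size(n, k):
    """Smallest |F| allowed by |F| >= s * n with s = lcm(1, 2, ..., r)."""
    r = n - k
    s = reduce(lambda a, b: a * b // gcd(a, b), range(1, r + 1), 1)
    return s * n


def second_family_min_field_size(n):
    """Smallest |F| allowed by |F| >= n + 1."""
    return n + 1


if __name__ == "__main__":
    n, k = 14, 10  # hypothetical example with r = 4 parity nodes
    print(min_download_fraction(n, k, h=1, d=13))  # 1/4
    print(min_download_fraction(n, k, h=2, d=12))  # 1/2
    print(first_family_min_field_size(n, k))       # 168
    print(second_family_min_field_size(n))         # 15
```

These functions return only the bounds on field size. An actual finite field must also have prime-power order, so the smallest usable field may be larger than the value returned.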
Citations
Journal ArticleDOI
TL;DR: In this article, an explicit construction of optimal-access MDS codes with sub-packetization $l=r^{\lceil n/r\rceil}$ is presented, which differs from the optimal value by at most a factor of $r^{2}$.
Abstract: An $(n,k,l)$ maximum distance separable (MDS) array code of length $n$, dimension $k=n-r$, and sub-packetization $l$ is formed of $l\times n$ matrices over a finite field $F$, with every column of the matrix stored on a separate node in the distributed storage system and viewed as a coordinate of the codeword. Repair of a failed node (recovery of one erased column) can be performed by accessing a set of $d\le n-1$ surviving (helper) nodes. The code is said to have the optimal access property if the amount of data accessed at each of the helper nodes meets a lower bound on this quantity. For optimal-access MDS codes with $d=n-1$, the sub-packetization $l$ satisfies the bound $l\ge r^{(k-1)/r}$. In our previous work (IEEE Trans. Inf. Theory, vol. 63, no. 4, 2017), for any $n$ and $r$, we presented an explicit construction of optimal-access MDS codes with sub-packetization $l=r^{n-1}$. In this paper, we take up the question of reducing the sub-packetization value $l$ to make it approach the lower bound. We construct an explicit family of optimal-access codes with $l=r^{\lceil n/r\rceil}$, which differs from the optimal value by at most a factor of $r^{2}$. These codes can be constructed over any finite field $F$ as long as $|F|\ge r\lceil n/r\rceil$, and afford low-complexity encoding and decoding procedures. We also define a version of the repair problem that bridges the context of regenerating codes and codes with locality constraints (LRC codes), which we call group repair with optimal access. In this variation, we assume that the set of $n=sm$ nodes is partitioned into $m$ repair groups of size $s$, and require that the amount of accessed data for repair is the smallest possible whenever the $d=s+k-1$ helper nodes include all the other $s-1$ nodes from the same group as the failed node. For this problem, we construct a family of codes with the group optimal access property. These codes can be constructed over any field $F$ of size $|F|\ge n$, and also afford low-complexity encoding and decoding procedures.
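To make the sub-packetization figures in this abstract concrete, here is a small Python sketch under stated assumptions: the function name, the returned dictionary keys, and the example parameters are all hypothetical. Because every value is a power of r, it compares exponents, which keeps the arithmetic exact: n-1 for the earlier construction, ceil(n/r) for the new one, and (k-1)/r for the lower bound.

```python
from fractions import Fraction


def subpacketization_exponents(n, k):
    """Exponents of r = n - k in the sub-packetization values quoted above."""
    r = n - k
    if r < 1 or k < 1:
        raise ValueError("need 1 <= k < n")
    new_exp = -(-n // r)  # integer ceiling of n / r
    return {
        "r": r,
        "previous": n - 1,                     # l = r^(n-1)
        "new": new_exp,                        # l = r^ceil(n/r)
        "lower_bound": Fraction(k - 1, r),     # l >= r^((k-1)/r)
        "min_field_size": r * new_exp,         # |F| >= r * ceil(n/r)
    }


if __name__ == "__main__":
    e = subpacketization_exponents(14, 10)  # hypothetical example, r = 4
    print(e["r"] ** e["previous"])          # 67108864
    print(e["r"] ** e["new"])               # 256
    print(e["new"] - e["lower_bound"])      # 7/4
    print(e["min_field_size"])              # 16
```

The exponent gap of 7/4 in this example is below 2, consistent with the statement that the new value is within a factor of $r^{2}$ of the lower bound.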

185 citations

Journal ArticleDOI
TL;DR: This paper presents a finite-$\alpha$ construction of systematic-repair $[n,k,d]$ MSR codes for all possible values of the parameters $n,k,d$, together with a generalized construction of $[n,k]$ MSR codes that achieves the optimal repair bandwidth for all values of $d$ simultaneously.
Abstract: Regenerating codes for distributed storage have attracted much research interest in the past decade. Such codes trade the bandwidth needed to repair a failed node with the overall amount of data stored in the network. Minimum storage regenerating (MSR) codes are an important class of optimal regenerating codes that minimize (first) the amount of data stored per node and (then) the repair bandwidth. Specifically, an $[n,k,d]$-$(\alpha)$ MSR code $\mathbb{C}$ over $\mathbb{F}_{q}$ stores a file $\mathcal{F}$ consisting of $\alpha k$ symbols over $\mathbb{F}_{q}$ among $n$ nodes, each storing $\alpha$ symbols, in such a way that: 1) the file $\mathcal{F}$ can be recovered by downloading the content of any $k$ of the $n$ nodes and 2) the content of any failed node can be reconstructed by accessing any $d$ of the remaining $n-1$ nodes and downloading $\alpha/(d-k+1)$ symbols from each of these nodes. In practice, the file $\mathcal{F}$ is typically available in uncoded form on some $k$ of the $n$ nodes, known as systematic nodes, and the defining node-repair condition above can be relaxed to requiring the optimal repair bandwidth for systematic nodes only. Such codes are called systematic-repair MSR codes. Unfortunately, finite-$\alpha$ constructions of $[n,k,d]$ MSR codes are known only for certain special cases: either low rate, namely $k/n \leqslant 0.5$, or high repair connectivity, namely $d = n-1$. Our main result in this paper is a finite-$\alpha$ construction of systematic-repair $[n,k,d]$ MSR codes for all possible values of parameters $n,k,d$. We also introduce a generalized construction for $[n,k]$ MSR codes to achieve the optimal repair bandwidth for all values of $d$ simultaneously.
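The MSR repair condition in this abstract reduces to simple arithmetic. The sketch below is illustrative only: the function name and example values are assumptions, and the "naive" figure reflects the conventional approach of downloading k full nodes and re-encoding, which is used here purely as a point of comparison.

```python
from fractions import Fraction


def msr_repair_cost(k, d, alpha):
    """Return (symbols per helper, total repair download, naive download)
    for an [n, k, d]-(alpha) MSR code, all counted in field symbols."""
    if d < k:
        raise ValueError("need d >= k")
    per_helper = Fraction(alpha, d - k + 1)  # alpha / (d - k + 1) from each helper
    total = d * per_helper                   # d helpers are contacted
    naive = k * alpha                        # download k whole nodes instead
    return per_helper, total, naive


if __name__ == "__main__":
    # Hypothetical example: k = 10, d = 13 helpers, alpha = 4 symbols per node.
    per_helper, total, naive = msr_repair_cost(k=10, d=13, alpha=4)
    print(per_helper, total, naive)  # 1 13 40
```

In this example, repair downloads 13 symbols instead of 40, although it contacts 13 nodes rather than 10.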

107 citations


Cites background from "Explicit Constructions of High-Rate..."

  • ...We also refer the reader to [22]–[24], where MSR constructions with...

    [...]

  • ...Most recently, Ye and Barg [22], [23] show that [n, k, d] MSR codes can be explicitly constructed over a small finite field and with a near optimal sub-packetization α. Sasidharan et al. [24] also construct explicit [n, k, d = n − 1] MSR codes with these properties....

    [...]


Journal ArticleDOI
TL;DR: This survey provides an overview of the efforts in this direction by introducing two new classes of erasure codes, namely regenerating codes and locally recoverable codes as well as by coming up with novel ways to repair the ubiquitous Reed-Solomon code.
Abstract: In a distributed storage system, code symbols are dispersed across space in nodes or storage units as opposed to time. In settings such as that of a large data center, an important consideration is the efficient repair of a failed node. Efficient repair calls for erasure codes that in the face of node failure, are efficient in terms of minimizing the amount of repair data transferred over the network, the amount of data accessed at a helper node as well as the number of helper nodes contacted. Coding theory has evolved to handle these challenges by introducing two new classes of erasure codes, namely regenerating codes and locally recoverable codes as well as by coming up with novel ways to repair the ubiquitous Reed-Solomon code. This survey provides an overview of the efforts in this direction that have taken place over the past decade.

81 citations

Journal ArticleDOI
TL;DR: A generic transformation is proposed that can transform any nonbinary MDS code with the optimal repair bandwidth or the optimal rebuilding access for the systematic nodes only, into a new MDS code which possesses the corresponding repair optimality for all nodes.
Abstract: We propose a generic transformation that can convert any nonbinary $(n=k{+}r,k)$ maximum distance separable (MDS) code into another $(n,k)$ MDS code over the same field such that: 1) some arbitrarily chosen $r$ nodes have the optimal repair bandwidth and the optimal rebuilding access; 2) for the remaining $k$ nodes, the normalized repair bandwidth and the normalized rebuilding access (over the file size) are preserved; and 3) the sub-packetization level is increased only by a factor of $r$ . Two immediate applications of this generic transformation are then presented. The first application is that we can transform any nonbinary MDS code with the optimal repair bandwidth or the optimal rebuilding access for the systematic nodes only, into a new MDS code which possesses the corresponding repair optimality for all nodes. The second application is that by applying the transformation multiple times, any nonbinary $(n,k)$ scalar MDS code can be converted into an $(n,k)$ MDS code with the optimal repair bandwidth and the optimal rebuilding access for all nodes, or only a subset of nodes, whose sub-packetization level is also optimal.

65 citations


Cites background or methods or result from "Explicit Constructions of High-Rate..."

  • ...As a result, the optimal repair bandwidth and the optimal rebuilding access were subsequently established [6], [7]....

    [...]

  • ...One key new ingredient in [7] and [22]–[24], in contrast to most previous efforts, is that these constructions are given in terms of parity-check matrix, and as a consequence they do not distinguish between the systematic nodes and the parity nodes at all....

    [...]

  • ...Independent and parallel to our work, Ye and Barg [7], [22] proposed several explicit constructions of high-rate MDS codes that can optimally repair all nodes....

    [...]

  • ...A comparison between the piggyback codes in [27] and [29] and the resultant MDS codes obtained from the first application in Section IV is provided in Table I, a comparison between the MDS codes proposed by Ye and Barg and the codes obtained from the first application in Section IV is provided in Table II, and a comparison between the MDS codes proposed in [22] and [23] and the codes obtained from the second application in Section IV is provided in Table III....

    [...]

  • ...A COMPARISON OF SOME PARAMETERS BETWEEN THE (n, k) MDS CODES IN [7], [22] AND THE EXPLICIT (n, k) MDS CODES OBTAINED...

    [...]

Proceedings ArticleDOI
01 Jun 2017
TL;DR: It is shown that any non-binary MDS code with optimal repair bandwidth, or optimal rebuilding access, for only systematic nodes can be converted into an MDS code with the corresponding repair optimality for all nodes.
Abstract: We propose a generic transformation on maximum distance separable (MDS) codes, which can convert any non-binary (k+r, k) MDS code into another (k+r, k) MDS code with the following properties: 1) An arbitrarily chosen r nodes will have the optimal repair bandwidth and the optimal rebuilding access, 2) the repair bandwidth and rebuilding access efficiencies of all other nodes are maintained as in the code before the transformation, 3) it uses the same finite field as the code before the transformation, and 4) the sub-packetization is increased only by a factor of r. As two immediate applications of this powerful transformation, we show that 1) any non-binary MDS code with optimal repair bandwidth, or optimal rebuilding access, for only systematic nodes can be converted into an MDS code with the corresponding repair optimality for all nodes; and 2) any non-binary scalar MDS code can be converted to an MDS code with optimal repair bandwidth and rebuilding access for all nodes, or to an MDS code with optimal rebuilding access for all systematic nodes and moreover with the optimal sub-packetization, by applying the transformation multiple times.

64 citations

References
Book
01 Jan 1977
TL;DR: This book presents an introduction to BCH Codes and Finite Fields, and methods for Combining Codes, and discusses self-dual Codes and Invariant Theory, as well as nonlinear Codes, Hadamard Matrices, Designs and the Golay Code.
Abstract: Linear Codes. Nonlinear Codes, Hadamard Matrices, Designs and the Golay Code. An Introduction to BCH Codes and Finite Fields. Finite Fields. Dual Codes and Their Weight Distribution. Codes, Designs and Perfect Codes. Cyclic Codes. Cyclic Codes: Idempotents and Mattson-Solomon Polynomials. BCH Codes. Reed-Solomon and Justesen Codes. MDS Codes. Alternant, Goppa and Other Generalized BCH Codes. Reed-Muller Codes. First-Order Reed-Muller Codes. Second-Order Reed-Muller, Kerdock and Preparata Codes. Quadratic-Residue Codes. Bounds on the Size of a Code. Methods for Combining Codes. Self-dual Codes and Invariant Theory. The Golay Codes. Association Schemes. Appendix A. Tables of the Best Codes Known. Appendix B. Finite Geometries. Bibliography. Index.

10,083 citations

Journal ArticleDOI
TL;DR: Facebook usage was found to interact with measures of psychological well-being, suggesting that it might provide greater benefits for users experiencing low self-esteem and low life satisfaction.
Abstract: This study examines the relationship between use of Facebook, a popular online social network site, and the formation and maintenance of social capital. In addition to assessing bonding and bridging social capital, we explore a dimension of social capital that assesses one’s ability to stay connected with members of a previously inhabited community, which we call maintained social capital. Regression analyses conducted on results from a survey of undergraduate students (N = 286) suggest a strong association between use of Facebook and the three types of social capital, with the strongest relationship being to bridging social capital. In addition, Facebook usage was found to interact with measures of psychological well-being, suggesting that it might provide greater benefits for users experiencing low self-esteem and low life satisfaction.

9,001 citations


"Explicit Constructions of High-Rate..." refers methods in this paper

  • ...[2] N. B. Ellison, C. Steinfield, and C. Lampe, “The benefits of Facebook friends: Social capital and college students' use of online social network sites,” Journal of Computer-Mediated Communication, vol. 12, no. 4, pp. 1143–1168, 2007....

    [...]

  • ...Distributed storage systems, such as those run by Google [1] and Facebook [2], are widely used for data storage, with applications ranging from social networks to file and video sharing....

    [...]

Journal ArticleDOI
19 Oct 2003
TL;DR: This paper presents file system interface extensions designed to support distributed applications, discusses many aspects of the design, and reports measurements from both micro-benchmarks and real world use.
Abstract: We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This has led us to reexamine traditional choices and explore radically different design points. The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients. In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real world use.

5,429 citations

Journal ArticleDOI
TL;DR: It is shown that there is a fundamental tradeoff between storage and repair bandwidth, which is theoretically characterized using flow arguments on an appropriately constructed graph, and regenerating codes are introduced that can achieve any point on this optimal tradeoff.
Abstract: Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a single node failure is for a new node to reconstruct the whole encoded data object to generate just one encoded block. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to communicate functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
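The abstract above does not state the tradeoff in closed form. As a hedged illustration, the sketch below evaluates its two extreme points using the formulas commonly quoted in the regenerating-codes literature, which are an assumption here and are not taken from the text above. The minimum-storage (MSR) point is alpha = M/k and gamma = Md/(k(d-k+1)). The minimum-bandwidth (MBR) point is alpha = gamma = 2Md/(k(2d-k+1)). Here M is the file size, alpha the storage per node, and gamma the total repair download. The function name is hypothetical.

```python
from fractions import Fraction


def tradeoff_extremes(M, k, d):
    """Return ((alpha, gamma) at the MSR point, (alpha, gamma) at the MBR point).

    Formulas are the standard ones from the regenerating-codes literature
    (an assumption of this sketch, not stated in the abstract above).
    """
    if not (1 <= k <= d):
        raise ValueError("need 1 <= k <= d")
    msr_alpha = Fraction(M, k)
    msr_gamma = Fraction(M * d, k * (d - k + 1))
    mbr = Fraction(2 * M * d, k * (2 * d - k + 1))
    return (msr_alpha, msr_gamma), (mbr, mbr)


if __name__ == "__main__":
    msr, mbr = tradeoff_extremes(M=1, k=10, d=13)  # hypothetical parameters
    print(msr)  # (Fraction(1, 10), Fraction(13, 40))
    print(mbr)  # (Fraction(13, 85), Fraction(13, 85))
```

With these parameters the MSR point stores less per node (1/10 versus 13/85 of the file) but needs more repair traffic (13/40 versus 13/85), which is the tradeoff the abstract describes.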

1,919 citations


"Explicit Constructions of High-Rate..." refers background or methods in this paper

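  • ...[1] showed that $N(\mathcal{C},\mathcal{F},\mathcal{R}) \ge \frac{|\mathcal{F}||\mathcal{R}|l}{|\mathcal{F}|+|\mathcal{R}|-k}$ (1) for any $(n,k,l)$ MDS array code $\mathcal{C}$ and any two disjoint subsets $\mathcal{F},\mathcal{R}\subseteq[n]$ such that $|\mathcal{F}|=1$ and $|\mathcal{R}|\ge k$....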

    [...]

  • ...In this case, as shown in [1], [2], the recovery of failed nodes requires to download at least an $h/(d+h-k)$ fraction of the data stored in each of the helper nodes....

    [...]

  • ...It is known that if $h$ out of the $n$ nodes are inaccessible and $d$ surviving (helper) nodes are used to recover the lost data, then we need to download at least $h/(d+h-k)$ fraction of the data stored in each of the helper nodes (Dimakis et al., 2010 and Cadambe et al., 2013)....

    [...]

Journal ArticleDOI
TL;DR: In this article, the authors present optimal, explicit constructions of (a) minimum bandwidth regenerating (MBR) codes for all values of [n, k, d] and (b) minimum storage regenerating (MSR) codes for all [n, k, d ≥ 2k-2], using a new product-matrix framework.
Abstract: Regenerating codes are a class of distributed storage codes that allow for efficient repair of failed nodes, as compared to traditional erasure codes. An [n, k, d] regenerating code permits the data to be recovered by connecting to any k of the n nodes in the network, while requiring that a failed node be repaired by connecting to any d nodes. The amount of data downloaded for repair is typically much smaller than the size of the source data. Previous constructions of exact-regenerating codes have been confined to the case n=d+1. In this paper, we present optimal, explicit constructions of (a) Minimum Bandwidth Regenerating (MBR) codes for all values of [n, k, d] and (b) Minimum Storage Regenerating (MSR) codes for all [n, k, d ≥ 2k-2], using a new product-matrix framework. The product-matrix framework is also shown to significantly simplify system operation. To the best of our knowledge, these are the first constructions of exact-regenerating codes that allow the number n of nodes in the network to be chosen independent of the other parameters. The paper also contains a simpler description, in the product-matrix framework, of a previously constructed MSR code with [n=d+1, k, d ≥ 2k-1].

698 citations