Showing papers by "Chao Tian" published in 2014


Journal ArticleDOI
Chao Tian1
TL;DR: A computer-aided proof approach based on the primal-dual relation is developed. It extends Yeung's linear programming method, which was previously used only on information-theoretic problems with a few random variables because the number of variables in the corresponding LP problem grows exponentially.
Abstract: Exact-repair regenerating codes are considered for the case (n,k,d)=(4,3,3), for which a complete characterization of the rate region is provided. This characterization answers in the affirmative the open question whether there exists a non-vanishing gap between the optimal bandwidth-storage tradeoff of the functional-repair regenerating codes (i.e., the cut-set bound) and that of the exact-repair regenerating codes. To obtain an explicit information theoretic converse, a computer-aided proof (CAP) approach based on primal and dual relation is developed. This CAP approach extends Yeung's linear programming (LP) method, which was previously only used on information theoretic problems with a few random variables due to the exponential growth of the number of variables in the corresponding LP problem. The symmetry in the exact-repair regenerating code problem allows an effective reduction of the number of variables, and together with several other problem-specific reductions, the LP problem is reduced to a manageable scale. For the achievability, only one non-trivial corner point of the rate region needs to be addressed in this case, for which an explicit binary code construction is given.

130 citations
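
For reference, the functional-repair cut-set bound mentioned in the abstract above is usually written as B <= sum_{i=0}^{k-1} min(alpha, (d-i)*beta), where alpha is the per-node storage and beta is the amount downloaded from each helper during repair. The sketch below evaluates that standard bound for (n,k,d)=(4,3,3). The function name and the normalization B = 1 are illustrative assumptions; the code does not reproduce the paper's exact-repair region or its LP-based proof.

```python
from fractions import Fraction

def cutset_file_size(k, d, alpha, beta):
    """Largest file size B permitted by the functional-repair cut-set bound:
    B <= sum_{i=0}^{k-1} min(alpha, (d - i) * beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))

# (n, k, d) = (4, 3, 3); n does not enter the bound itself.
k, d = 3, 3
points = {
    "minimum storage": (Fraction(1, 3), Fraction(1, 3)),
    "interior corner": (Fraction(2, 5), Fraction(1, 5)),
    "minimum repair bandwidth": (Fraction(1, 2), Fraction(1, 6)),
}
for name, (alpha, beta) in points.items():
    # Each of these points supports a file of size exactly 1 under the bound.
    print(name, alpha, beta, cutset_file_size(k, d, alpha, beta))
```

All three (alpha, beta) pairs lie on the functional-repair tradeoff for B = 1. The abstract's result is that exact-repair codes cannot match this tradeoff everywhere.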


Journal ArticleDOI
TL;DR: It is shown that the separation approach is optimal in two general scenarios and is approximately optimal in a third scenario, which generalizes the second scenario by allowing each source to be reconstructed at multiple destinations with different distortions.
Abstract: We consider the source-channel separation architecture for lossy source coding in communication networks. It is shown that the separation approach is optimal in two general scenarios and is approximately optimal in a third scenario. The two scenarios for which separation is optimal complement each other: the first is when the memoryless sources at source nodes are arbitrarily correlated, each of which is to be reconstructed at possibly multiple destinations within certain distortions, but the channels in this network are synchronized, orthogonal, and memoryless point-to-point channels; the second is when the memoryless sources are mutually independent, each of which is to be reconstructed only at one destination within a certain distortion, but the channels are general, including multi-user channels, such as multiple access, broadcast, interference, and relay channels, possibly with feedback. The third scenario, for which we demonstrate approximate optimality of source-channel separation, generalizes the second scenario by allowing each source to be reconstructed at multiple destinations with different distortions. For this case, the loss from optimality using the separation approach can be upper-bounded when a difference distortion measure is taken, and in the special case of quadratic distortion measure, this leads to universal constant bounds.

51 citations


Proceedings ArticleDOI
08 Jul 2014
TL;DR: This work shows that there exists a threshold on the storage overhead below which such an opportunistic approach does not lose any efficiency from the optimal storage-repair-bandwidth tradeoff, and derives the MTTDL improvement for two repair models: one with limited total repair bandwidth and the other with limited individual-node repair bandwidth.
Abstract: The reliability of erasure-coded distributed storage systems, as measured by the mean time to data loss (MTTDL), depends on the repair bandwidth of the code. Repair-efficient codes provide reliability values several orders of magnitude better than conventional erasure codes. Current state-of-the-art codes fix the number of helper nodes (nodes participating in repair) a priori. In practice, however, it is desirable to allow the number of helper nodes to be adaptively determined by the network traffic conditions. In this work, we propose an opportunistic repair framework to address this issue. It is shown that there exists a threshold on the storage overhead, below which such an opportunistic approach does not lose any efficiency from the optimal storage-repair-bandwidth tradeoff; i.e., it is possible to construct a code simultaneously optimal for different numbers of helper nodes. We further examine the benefits of such opportunistic codes, and derive the MTTDL improvement for two repair models: one with limited total repair bandwidth and the other with limited individual-node repair bandwidth. In both settings, we show orders of magnitude improvement in MTTDL. Finally, the proposed framework is examined in a network setting where a significant improvement in MTTDL is observed.

23 citations
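
To make the link between repair speed and MTTDL in the abstract above concrete, here is a minimal sketch of a textbook birth-death Markov model for an (n, k) erasure code. It is not the paper's model: the independent exponential failures, the single-repair-at-a-time assumption, and the names `mttdl`, `fail_rate`, and `repair_rate` are all assumptions made for illustration.

```python
def mttdl(n, k, fail_rate, repair_rate):
    """Mean time to data loss for a birth-death model of an (n, k) code.

    State i is the number of failed nodes; data is lost once more than
    n - k nodes are down. Assumptions (not from the paper): each live node
    fails independently at rate fail_rate, and one node is repaired at a
    time at rate repair_rate.
    """
    total = 0.0
    hop = 0.0  # expected time to move up one state, carried between iterations
    for i in range(n - k + 1):
        lam = (n - i) * fail_rate            # failure rate in state i
        mu = repair_rate if i > 0 else 0.0   # nothing to repair in state 0
        hop = 1.0 / lam + (mu / lam) * hop   # expected time from state i to i + 1
        total += hop
    return total

# Hypothetical numbers: doubling the repair rate (e.g. halving repair traffic
# under a fixed bandwidth budget) for a (14, 10) code.
slow = mttdl(14, 10, fail_rate=1e-4, repair_rate=1.0)
fast = mttdl(14, 10, fail_rate=1e-4, repair_rate=2.0)
print(fast / slow)
```

In this model the repair rate enters multiplicatively at every state below the loss state, which is why a modest reduction in repair traffic can change MTTDL by a large factor.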


Patent
Chao Tian1
24 Jul 2014
TL;DR: Multi-reliability regenerating (MRR) erasure codes are disclosed, which can be used to encode data and to regenerate it in the event of failures of one or more nodes of a distributed storage system.
Abstract: Multi-reliability regenerating (MRR) erasure codes are disclosed. The erasure codes can be used to encode and regenerate data. In particular, the regenerating erasure codes can be used to encode data included in at least one of two or more data messages to satisfy respective reliability requirements for the data. Encoded portions of data from one data message can be mixed with encoded or unencoded portions of data from a second data message and stored at a distributed storage system. This approach can be used to improve efficiency and performance of data storage and recovery in the event of failures of one or more nodes of a distributed storage system.

8 citations


Journal ArticleDOI
TL;DR: In this paper, a new class of exact-repair regenerating codes is constructed by stitching together shorter erasure correction codes, where the stitching pattern can be viewed as block designs, and the proposed codes have the "help-by-transfer" property where the helper nodes simply transfer part of the stored data directly, without performing any computation.
Abstract: A new class of exact-repair regenerating codes is constructed by stitching together shorter erasure correction codes, where the stitching pattern can be viewed as block designs. The proposed codes have the "help-by-transfer" property where the helper nodes simply transfer part of the stored data directly, without performing any computation. This embedded error correction structure makes the decoding process straightforward, and in some cases the complexity is very low. We show that this construction is able to achieve performance better than space-sharing between the minimum storage regenerating codes and the minimum repair-bandwidth regenerating codes, and it is the first class of codes to achieve this performance. In fact, it is shown that the proposed construction can achieve a non-trivial point on the optimal functional-repair tradeoff, and it is asymptotically optimal at high rate, i.e., it asymptotically approaches the minimum storage and the minimum repair-bandwidth simultaneously.

8 citations
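
The "space-sharing" baseline in the abstract above can be sketched as follows. The formulas for the minimum storage regenerating (MSR) and minimum repair-bandwidth (MBR) points are the standard ones from the regenerating-code literature, with alpha the per-node storage and beta the download per helper; the function names and the normalization B = 1 are assumptions for illustration, and the code does not implement the paper's block-design construction.

```python
from fractions import Fraction

def msr_point(k, d, B=1):
    """(alpha, beta) at the minimum-storage point for file size B."""
    return Fraction(B, k), Fraction(B, k * (d - k + 1))

def mbr_point(k, d, B=1):
    """(alpha, beta) at the minimum repair-bandwidth point for file size B."""
    denom = k * (2 * d - k + 1)
    return Fraction(2 * B * d, denom), Fraction(2 * B, denom)

def space_sharing(k, d, t, B=1):
    """Store a fraction t of the file with an MSR code and 1 - t with an MBR
    code; the resulting (alpha, beta) is the convex combination of the two."""
    a0, b0 = msr_point(k, d, B)
    a1, b1 = mbr_point(k, d, B)
    return t * a0 + (1 - t) * a1, t * b0 + (1 - t) * b1

# Example with (k, d) = (3, 3): the endpoints and the midpoint of the line.
print(msr_point(3, 3), mbr_point(3, 3), space_sharing(3, 3, Fraction(1, 2)))
```

A code "better than space-sharing" in the sense of the abstract achieves an (alpha, beta) pair strictly below this straight line between the two endpoints.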


Proceedings ArticleDOI
01 Sep 2014
TL;DR: One key question is whether contents with different reliability requirements need to be “mixed” in the optimal solution, for which it is shown that such a mixing can strictly improve upon the non-mixing solution.
Abstract: The digital contents in large-scale distributed storage systems usually have different reliability requirements (e.g., new customer billing records vs. 10-year-old office document backup), and for this reason, erasure codes with different strengths should be utilized to achieve the best storage efficiency. On the other hand, in such large-scale distributed storage systems, nodes fail on a regular basis, and the contents stored on them need to be regenerated and stored on other healthy nodes, the efficiency of which is an important factor affecting the overall quality of service. In this work, repair-efficient data storage codes are considered in systems with heterogeneous reliability requirements. We formulate the problem of multi-reliability regenerating (MRR) codes and investigate the optimal storage vs. repair-bandwidth tradeoff. One key question is whether contents with different reliability requirements need to be “mixed” in the optimal solution, for which we show that such a mixing can strictly improve upon the non-mixing solution.

4 citations


Proceedings ArticleDOI
Qi Shi1, Lin Song1, Chao Tian2, Jun Chen1, Sorina Dumitrescu1 
11 Aug 2014
TL;DR: A polar coding scheme is proposed for the multiple description coding (MDC) problem and is shown to be able to achieve a certain rate pair on the dominant line of the achievable rate region determined by El Gamal and Cover.
Abstract: A polar coding scheme is proposed for the multiple description coding (MDC) problem and is shown to be able to achieve a certain rate pair on the dominant line of the achievable rate region determined by El Gamal and Cover. This scheme is an adaptation of the one developed by Şaşoğlu et al. for the multiple access channel (MAC) to the MDC setting. The analysis of the proposed scheme contains two new ingredients: 1) a certain MDC-MAC duality and 2) an auxiliary random process that involves both the mutual information and the Bhattacharyya parameter. The decorrelation effect of the polar transform is also investigated.

2 citations
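
As background for the Bhattacharyya parameter and the polar transform named in the abstract above, the sketch below tracks how the Bhattacharyya parameter evolves under repeated polar transforms. It uses the binary erasure channel, where one transform step maps Z to the pair (2Z - Z^2, Z^2) exactly. This is a generic illustration of polarization only; it is not the MDC scheme or the auxiliary random process of the paper, and the function name is an assumption.

```python
def polarize_bec(erasure_prob, levels):
    """Bhattacharyya parameters of the 2**levels synthetic channels obtained
    by applying the polar transform `levels` times to a BEC(erasure_prob)."""
    z = [erasure_prob]
    for _ in range(levels):
        # Each channel splits into a worse one (2x - x^2) and a better one (x^2).
        z = [w for x in z for w in (2 * x - x * x, x * x)]
    return z

z = polarize_bec(0.5, 10)
nearly_perfect = sum(1 for x in z if x < 0.01)
nearly_useless = sum(1 for x in z if x > 0.99)
print(len(z), nearly_perfect, nearly_useless)
```

The average of the parameters stays equal to the original erasure probability at every level, while the individual values drift toward 0 or 1 as the number of levels grows.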


Patent
Chao Tian1
14 Jul 2014
TL;DR: A method and apparatus for transmitting data on a channel in a network are disclosed. The method determines a ratio of the number of channel uses to the number of source samples, divides the channel bandwidth into a plurality of subbands of equal bandwidth in accordance with the ratio, receives a source sample block, determines a channel input for each subband from the source sample block in accordance with a hybrid coding scheme, and transmits the determined channel input for each subband over the network.
Abstract: A method and apparatus for providing transmission of data on a channel in a network. For example, the method determines a ratio of a number of channel uses of the channel to a number of source samples, divides a channel bandwidth into a plurality of subbands of equal bandwidth in accordance with the ratio, receives a source sample block, determines a channel input for each of the plurality of subbands from the source sample block in accordance with a hybrid coding scheme, and transmits, for each of the plurality of subbands, the channel input that is determined over the network.

1 citation


Proceedings ArticleDOI
01 Oct 2014
TL;DR: An open-source C library of repair-efficient erasure codes for distributed data storage systems is presented, covering five classes of such codes from the literature. The basic principles of the data arrangement, the choice of algorithms, and the coding speed are discussed.
Abstract: We developed an open-source C library of repair-efficient erasure codes for distributed data storage systems, which includes five classes of such codes in the literature. Because of the more involved coding structures of these codes, they are more suitable to be viewed as array codes. From this perspective, all these codes can have a uniform encoding and decoding interface. We also discuss the basic principles of the data arrangement, choice of algorithms, and then evaluate the coding speed of these encoding and decoding functions.