A Realistic Distributed Storage System That Minimizes Data Storage and Repair Bandwidth
Bernat Gaston, Jaume Pujol, Mercè Villanueva +2 more
- pp 491-491
TLDR
This article presents and analyzes a new mathematical model for a distributed storage environment in which the storage nodes are placed in two racks, and the communication (bandwidth) cost between nodes in the same rack is much lower than between nodes in different racks.
Abstract:
In a realistic distributed storage environment, such as those used by companies dedicated to storing information over a network, storage nodes are usually placed in racks, metallic supports designed to accommodate electronic equipment. It is known that the communication (bandwidth) cost between nodes in the same rack is much lower than between nodes in different racks. In this paper, a new mathematical model for a distributed storage environment where the storage nodes are placed in two racks is presented and analyzed.
Citations
Journal ArticleDOI
Optimal Repair Layering for Erasure-Coded Data Centers: From Theory to Practice
TL;DR: The authors propose DoubleR, a repair layering framework that minimizes cross-rack repair traffic and improves the repair performance of regenerating codes in both node recovery and degraded read operations.
Proceedings ArticleDOI
Double Regenerating Codes for hierarchical data centers
TL;DR: The existence of a DRC construction is proved, and quantitative comparisons show that DRC significantly reduces the cross-rack repair bandwidth of state-of-the-art minimum storage regenerating codes.
Posted Content
Optimal Repair Layering for Erasure-Coded Data Centers: From Theory to Practice
TL;DR: A practical repair layering framework called DoubleR is proposed and implemented atop the Hadoop Distributed File System (HDFS) and it is shown that DoubleR maintains the theoretical guarantees of DRC and improves the repair performance of regenerating codes in both node recovery and degraded read operations.
Journal ArticleDOI
Explicit Constructions of MSR Codes for Clustered Distributed Storage: The Rack-Aware Storage Model
Zitan Chen, Alexander Barg +1 more
TL;DR: This paper studies erasure coding for distributed storage in which nodes are organized into equally sized groups, called racks, and the nodes within each group can communicate freely without taxing the system bandwidth.
Journal ArticleDOI
Capacity of Clustered Distributed Storage
TL;DR: A new system model reflecting the clustered structure of distributed storage is suggested to investigate the interplay between storage overhead and repair bandwidth as storage node failures occur, and it is shown that the cross-cluster traffic can be minimized to zero.
References
Journal ArticleDOI
Network Coding for Distributed Storage Systems
TL;DR: It is shown that there is a fundamental tradeoff between storage and repair bandwidth, which is characterized theoretically using flow arguments on an appropriately constructed graph, and regenerating codes are introduced that can achieve any point on this optimal tradeoff.
Posted Content
Network Coding for Distributed Storage Systems
TL;DR: In this paper, the authors introduce a general technique to analyze storage architectures that combine any form of coding and replication, and present two new schemes for maintaining redundancy using erasure codes.
Book ChapterDOI
Erasure Coding Vs. Replication: A Quantitative Comparison
TL;DR: It is shown that systems employing erasure codes have a mean time to failure many orders of magnitude higher than replicated systems with similar storage and bandwidth requirements, and that erasure-resilient systems use an order of magnitude less bandwidth and storage to provide similar system durability.
Book ChapterDOI
High availability in DHTs: erasure coding vs. replication
Rodrigo Rodrigues, Barbara Liskov +1 more
TL;DR: This paper compares two popular redundancy schemes: replication and erasure coding, and concludes that in some cases the benefits from coding are limited, and may not be worth its disadvantages.
Journal ArticleDOI
Cost-bandwidth tradeoff in distributed storage systems
TL;DR: This paper investigates the theoretical cost-bandwidth tradeoff and demonstrates that any point on this curve can be achieved through the use of so-called generalized regenerating codes, which are an enhancement of the regenerating codes introduced by Dimakis et al. in [1].