Open Access, Proceedings Article

A Realistic Distributed Storage System That Minimizes Data Storage and Repair Bandwidth

Bernat Gaston, +2 more
p. 491
TLDR
This article presents and analyzes a new mathematical model for a distributed storage environment in which the storage nodes are placed in two racks and the communication (bandwidth) cost between nodes in the same rack is much lower than between nodes in different racks.
Abstract
In a realistic distributed storage environment, like the ones used by companies dedicated to storing information over a network, storage nodes are usually placed in racks, metallic supports designed to accommodate electronic equipment. It is known that the communication (bandwidth) cost between nodes in the same rack is much lower than between nodes in different racks. In this paper, a new mathematical model for a distributed storage environment where the storage nodes are placed in two racks is presented and analyzed.


Citations
Journal Article

Optimal Repair Layering for Erasure-Coded Data Centers: From Theory to Practice

TL;DR: The authors propose DoubleR, a repair layering framework that minimizes cross-rack repair traffic and improves the repair performance of regenerating codes in both node recovery and degraded read operations.
Proceedings Article

Double Regenerating Codes for hierarchical data centers

TL;DR: The existence of a DRC construction is proved, and quantitative comparisons show that DRC significantly reduces the cross-rack repair bandwidth of state-of-the-art minimum storage regenerating codes.
Posted Content

Optimal Repair Layering for Erasure-Coded Data Centers: From Theory to Practice

TL;DR: A practical repair layering framework called DoubleR is proposed and implemented atop the Hadoop Distributed File System (HDFS) and it is shown that DoubleR maintains the theoretical guarantees of DRC and improves the repair performance of regenerating codes in both node recovery and degraded read operations.
Journal Article

Explicit Constructions of MSR Codes for Clustered Distributed Storage: The Rack-Aware Storage Model

TL;DR: In this paper, the problem of erasure coding in distributed storage was studied, where nodes are organized into equally sized groups, called racks, and within each group the nodes can communicate freely without taxing the system bandwidth.
Journal Article

Capacity of Clustered Distributed Storage

TL;DR: A new system model reflecting the clustered structure of distributed storage is suggested to investigate the interplay between storage overhead and repair bandwidth as storage node failures occur, and it is shown that the cross-cluster traffic can be minimized to zero.
References
Journal Article

Network Coding for Distributed Storage Systems

TL;DR: It is shown that there is a fundamental tradeoff between storage and repair bandwidth, which is theoretically characterized using flow arguments on an appropriately constructed graph, and regenerating codes are introduced that can achieve any point on this optimal tradeoff.
Posted Content

Network Coding for Distributed Storage Systems

TL;DR: In this paper, the authors introduce a general technique to analyze storage architectures that combine any form of coding and replication, and present two new schemes for maintaining redundancy using erasure codes.
Book Chapter

Erasure Coding Vs. Replication: A Quantitative Comparison

TL;DR: It is shown that systems employing erasure codes have mean time to failures many orders of magnitude higher than replicated systems with similar storage and bandwidth requirements and erasure-resilient systems use an order of magnitude less bandwidth and storage to provide similar system durability.
Book Chapter

High availability in DHTs: erasure coding vs. replication

TL;DR: This paper compares two popular redundancy schemes: replication and erasure coding, and concludes that in some cases the benefits from coding are limited, and may not be worth its disadvantages.
Journal Article

Cost-bandwidth tradeoff in distributed storage systems

TL;DR: This paper investigates the theoretical cost-bandwidth tradeoff and demonstrates that any point on this curve can be achieved through the use of so-called generalized regenerating codes, which are an enhancement of the regenerating codes introduced by Dimakis et al. in [1].