Open Access
Design and Evaluation of Dynamic Replication Strategies for a High-Performance Data Grid
TLDR
The authors develop a simulation framework that models a grid scenario and enables comparative studies of alternative dynamic replication strategies under three kinds of access patterns. The results show that the best strategy yields significant savings in latency and bandwidth consumption when the access patterns contain a moderate amount of geographical locality.
Abstract:
Physics experiments that generate large amounts of data need to be able to share it with researchers around the world. High performance grids facilitate the distribution of such data to geographically remote places. Dynamic replication can be used as a technique to reduce bandwidth consumption and access latency in accessing these huge amounts of data. We describe a simulation framework that we have developed to model a grid scenario, which enables comparative studies of alternative dynamic replication strategies. We present preliminary results obtained with this simulator, in which we evaluate the performance of six different replication strategies for three different kinds of access patterns. The simulation results show that the best strategy has significant savings in latency and bandwidth consumption if the access patterns contain a moderate amount of geographical locality.
Citations
Proceedings ArticleDOI
Chameleon: a resource scheduler in a data grid environment
Sang-Min Park, Jai-Hoon Kim +1 more
TL;DR: A scheduler called Chameleon is implemented, based on the proposed application scheduling model, that considers both the amount of computational resources and data availability in a Data Grid environment; it shows performance improvements for data-intensive applications.
Patent
System and method for dividing computations
Steven Neiman, Roman Sulzhyk +1 more
TL;DR: A scheduler server is configured to selectively reschedule computation of a job other than a parent job from any one of the plurality of node computing devices to another of the node computing devices.
Proceedings ArticleDOI
The virtual data grid: a new model and architecture for data-intensive collaboration
TL;DR: This paper presents Chimera, a model and architecture for a virtual data grid capable of addressing the requirements of collaborative analysis and transformation of large quantities of data over extended periods of time. It defines a broadly applicable model of a "typed dataset" as the unit of derivation tracking, and simple constructs for describing how datasets are derived from transformations and from other datasets.
Journal ArticleDOI
Data Replication in Data Intensive Scientific Applications with Performance Guarantee
TL;DR: This paper designs a polynomial time centralized replication algorithm that reduces the total data file access delay by at least half of that reduced by the optimal replication solution, and designs a distributed caching algorithm, which can be easily adopted in a distributed environment such as Data Grids.
Journal ArticleDOI
A survey of dynamic replication strategies for improving data availability in data grids
TL;DR: Different issues involved in data replication are identified and different replication techniques are studied to find out which attributes are addressed in a given technique and which are ignored to facilitate the future comparison of dynamic replication techniques.
References
Journal ArticleDOI
The Anatomy of the Grid: Enabling Scalable Virtual Organizations
TL;DR: The authors present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing.
Posted Content
The Anatomy of the Grid - Enabling Scalable Virtual Organizations
TL;DR: This article reviews the "Grid problem," and presents an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing.
Journal ArticleDOI
Summary cache: a scalable wide-area web cache sharing protocol
TL;DR: This paper demonstrates the benefits of cache sharing, measures the overhead of the existing protocols, and proposes a new protocol called "summary cache", which reduces the number of intercache protocol messages, reduces the bandwidth consumption, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP.
Journal ArticleDOI
The data grid
TL;DR: In this paper, the authors introduce design principles for a data management architecture called the data grid, and describe two basic services that are fundamental to the design of a data grid: storage systems and metadata management.
Proceedings ArticleDOI
Summary cache: a scalable wide-area Web cache sharing protocol
TL;DR: This paper proposes a new protocol called "Summary Cache"; each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries, which enables cache sharing among a large number of proxies.