Open Access
Design and Evaluation of Dynamic Replication Strategies for a High-Performance Data Grid
TL;DR
A simulation framework is developed to model a grid scenario, enabling comparative studies of alternative dynamic replication strategies for three different kinds of access patterns. The results show that the best strategy yields significant savings in latency and bandwidth consumption if the access patterns contain a moderate amount of geographical locality.
Abstract:
Physics experiments that generate large amounts of data need to be able to share it with researchers around the world. High performance grids facilitate the distribution of such data to geographically remote places. Dynamic replication can be used as a technique to reduce bandwidth consumption and access latency in accessing these huge amounts of data. We describe a simulation framework that we have developed to model a grid scenario, which enables comparative studies of alternative dynamic replication strategies. We present preliminary results obtained with this simulator, in which we evaluate the performance of six different replication strategies for three different kinds of access patterns. The simulation results show that the best strategy has significant savings in latency and bandwidth consumption if the access patterns contain a moderate amount of geographical locality.
Citations
Journal Article
Mapping Abstract Complex Workflows onto Grid Environments
Ewa Deelman, Jim Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Karan Vahi, Kent Blackburn, Albert Lazzarini, Adam Arbree, Richard Cavanaugh, Scott Koranda
TL;DR: The current ACWG based on AI planning technologies is described and it is outlined how these technologies can play a crucial role in developing complex application workflows in Grid environments.
Proceedings Article
The Virtual Data Grid: A New Model and Architecture for Data-Intensive Collaboration.
TL;DR: A broadly applicable model of a "typed dataset" is defined as the unit of derivation tracking, and simple constructs for describing how datasets are derived from transformations and from other datasets are defined.
Proceedings Article
Scheduling Data-Intensive Workflows onto Storage-Constrained Distributed Resources
A. Ramakrishnan, Gurmeet Singh, Henan Zhao, Ewa Deelman, Rizos Sakellariou, Karan Vahi, Kent Blackburn, D. Meyers, M. Samidi
TL;DR: This paper examines the issue of optimizing disk usage and scheduling large-scale scientific workflows onto distributed resources, where the workflows are data-intensive and require large amounts of data storage while the resources have limited storage, and describes an algorithm that can improve the overall workflow performance.
Book Chapter
Dynamic Data Grid Replication Strategy Based on Internet Hierarchy
TL;DR: A novel dynamic replication strategy, called BHR, reduces data access time by avoiding network congestion in a data grid network. It benefits from ‘network-level locality’, meaning that the required file is located at a site that has broad bandwidth to the site of job execution.
Journal Article
Dynamic replication in a data grid using a Modified BHR Region Based Algorithm
TL;DR: A Modified BHR algorithm is proposed to overcome the limitations of the standard BHR algorithm and is simulated using a data grid simulator, OptorSim, developed by European Data Grid projects.
References
Journal Article
An adaptive data replication algorithm
TL;DR: An algorithm for dynamic replication of an object in distributed systems is presented, and it is shown that the algorithm can be combined with the concurrency control and recovery mechanisms of a distributed database management system.
Proceedings Article
The case for geographical push-caching
James S. Gwertzman, Margo Seltzer
TL;DR: This work presents an architecture that allows a Web server to autonomously replicate HTML pages and proposes geographical push-caching as a way of bringing the server back into the loop.
Journal Article
Adaptive web caching: towards a new global caching architecture
TL;DR: Equipped with the URL routing table and neighbor cache contents, a cache in the revised design can now search the local group, and forward all missing queries quickly and efficiently, thus eliminating both the waiting delay and the overhead associated with multicast queries.
Posted Content
Replica Selection in the Globus Data Grid
TL;DR: A high-level replica selection service is presented that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives. The use of Condor's ClassAds resource description and matchmaking mechanism is described as an efficient tool for representing and matching storage resource capabilities and policies against application requirements.
Proceedings Article
A dynamic object replication and migration protocol for an Internet hosting service
TL;DR: A simulation study using synthetic workloads and the network backbone of UUNET, one of the largest Internet service providers, shows that the proposed protocol is effective in eliminating hot spots and achieves a significant reduction in backbone traffic and server response time at the expense of creating only a small number of extra replicas.