
Design and Evaluation of Dynamic Replication Strategies for a High-Performance Data Grid

01 Jan 2001
TL;DR: A simulation framework developed to model a grid scenario enables comparative studies of alternative dynamic replication strategies; results for six strategies under three different kinds of access patterns show that the best strategy yields significant savings in latency and bandwidth consumption if the access patterns contain a moderate amount of geographical locality.
Abstract: Physics experiments that generate large amounts of data need to be able to share it with researchers around the world. High performance grids facilitate the distribution of such data to geographically remote places. Dynamic replication can be used as a technique to reduce bandwidth consumption and access latency in accessing these huge amounts of data. We describe a simulation framework that we have developed to model a grid scenario, which enables comparative studies of alternative dynamic replication strategies. We present preliminary results obtained with this simulator, in which we evaluate the performance of six different replication strategies for three different kinds of access patterns. The simulation results show that the best strategy has significant savings in latency and bandwidth consumption if the access patterns contain a moderate amount of geographical locality.
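
The tiered replication idea in the abstract lends itself to a small illustration. The following is a minimal sketch, assuming a tree of sites with a tier-0 root that holds every file and a hypothetical threshold-based "cascading" rule that pushes a popular file one tier toward its requesters; the class names, threshold, and hop-count cost model are illustrative and not the paper's simulator.

```python
# Minimal sketch of one dynamic replication strategy in a tiered data grid.
# Site, cascade_replicate, and THRESHOLD are illustrative names; the paper
# evaluates six strategies, and this shows only the flavor of a
# threshold-based "cascading" approach.

from collections import defaultdict

THRESHOLD = 3  # hypothetical popularity threshold before replicating


class Site:
    def __init__(self, name, parent=None, capacity=5):
        self.name = name
        self.parent = parent          # next site up the tier hierarchy
        self.capacity = capacity      # max replicas this site can hold
        self.replicas = set()
        self.hits = defaultdict(int)  # per-file access counts seen here

    def locate(self, f):
        """Walk up the hierarchy until a replica is found; count the hops."""
        site, hops = self, 0
        while f not in site.replicas:
            site, hops = site.parent, hops + 1
        return site, hops


def cascade_replicate(site, f):
    """Serve a request; push a popular file one tier toward the requester."""
    holder, hops = site.locate(f)
    holder.hits[f] += 1
    if hops and holder.hits[f] >= THRESHOLD:
        child = site  # find the holder's child on the path to the requester
        while child.parent is not holder:
            child = child.parent
        if len(child.replicas) < child.capacity:
            child.replicas.add(f)
    return hops  # a proxy for latency/bandwidth cost in the simulation


# Toy run: a root (tier 0) holding every file, one regional site, one leaf.
root = Site("tier0"); root.replicas = {"f1", "f2"}
region = Site("tier1", parent=root)
leaf = Site("tier2", parent=region)

costs = [cascade_replicate(leaf, "f1") for _ in range(6)]
print(costs)  # [2, 2, 2, 1, 1, 1]: hops drop as "f1" cascades down the tiers
```
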
Citations
Journal ArticleDOI
TL;DR: The current ACWG, based on AI planning technologies, is described, along with how these technologies can play a crucial role in developing complex application workflows in Grid environments.
Abstract: In this paper we address the problem of automatically generating job workflows for the Grid. These workflows describe the execution of a complex application built from individual application components. In our work we have developed two workflow generators: the first, the Concrete Workflow Generator (CWG), maps an abstract workflow defined in terms of application-level components to the set of available Grid resources. The second, the Abstract and Concrete Workflow Generator (ACWG), takes a wider perspective: it not only performs the abstract-to-concrete mapping but also enables the construction of the abstract workflow itself based on the available components. This system operates in the application domain and chooses application components based on application metadata attributes. We describe our current ACWG, which is based on AI planning technologies, and outline how these technologies can play a crucial role in developing complex application workflows in Grid environments. Although our work is preliminary, CWG has already been used to map high energy physics applications onto the Grid. In one particular experiment, a set of production runs lasted 7 days and resulted in the generation of 167,500 events by 678 jobs. Additionally, ACWG was used to map gravitational physics workflows with hundreds of nodes onto the available resources, resulting in 975 tasks, 1365 data transfers and 975 output files produced.
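
To make the abstract-to-concrete mapping concrete, here is a hedged sketch in the spirit of CWG: abstract tasks name application components, and each task is assigned to a Grid resource that hosts that component, preferring a resource that already holds the input data. The task and resource structures and the greedy selection rule are assumptions for illustration, not the paper's planner.

```python
# Illustrative abstract-to-concrete workflow mapping. All names and the
# matching heuristic are hypothetical, not the CWG's actual behavior.

abstract_workflow = [
    {"task": "extract", "needs": "raw.dat"},
    {"task": "transform", "needs": "extract.out"},
]

resources = {
    "siteA": {"components": {"extract"}, "files": {"raw.dat"}},
    "siteB": {"components": {"extract", "transform"}, "files": set()},
}


def map_workflow(workflow, resources):
    """Assign each abstract task to a resource hosting its component,
    preferring one that already holds the input (avoids a transfer)."""
    plan = []
    for step in workflow:
        candidates = [s for s, r in resources.items()
                      if step["task"] in r["components"]]
        if not candidates:
            raise RuntimeError(f"no resource can run {step['task']}")
        best = min(candidates,
                   key=lambda s: step["needs"] not in resources[s]["files"])
        transfer = step["needs"] not in resources[best]["files"]
        plan.append((step["task"], best, "stage-in" if transfer else "local"))
    return plan


print(map_workflow(abstract_workflow, resources))
# [('extract', 'siteA', 'local'), ('transform', 'siteB', 'stage-in')]
```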

517 citations


Cites background from "Design and Evaluation of Dynamic Replication Strategies for a High-Performance Data Grid"

  • ...Others [37, 38] look at the data in the Grid as a tiered system and use dynamic replication strategies to improve data access....

Proceedings Article
01 Jan 2003
TL;DR: A broadly applicable model of a "typed dataset" is defined as the unit of derivation tracking, along with simple constructs for describing how datasets are derived from transformations and from other datasets.
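
The derivation-tracking model can be pictured with a small data structure. The sketch below assumes each dataset carries a declared type plus a record of the transformation and input datasets that produced it, so provenance can be walked recursively; all class and field names are hypothetical, not the paper's constructs.

```python
# Hedged sketch of a "typed dataset" that records its own derivation.

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dataset:
    name: str
    dtype: str                            # the dataset's declared type
    derived_by: Optional[str] = None      # transformation that produced it
    inputs: Tuple["Dataset", ...] = ()    # datasets the transformation consumed


def provenance(ds):
    """Yield the chain of derivations leading to this dataset."""
    if ds.derived_by:
        yield ds.name, ds.derived_by, tuple(i.name for i in ds.inputs)
        for parent in ds.inputs:
            yield from provenance(parent)


raw = Dataset("run42.raw", "EventStream")
calib = Dataset("run42.calib", "EventStream", "calibrate", (raw,))
hist = Dataset("run42.hist", "Histogram", "binify", (calib,))

for record in provenance(hist):
    print(record)
# ('run42.hist', 'binify', ('run42.calib',))
# ('run42.calib', 'calibrate', ('run42.raw',))
```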

169 citations


Cites background from "Design and Evaluation of Dynamic Replication Strategies for a High-Performance Data Grid"

  • ...via pre-staging [18, 19]....

  • ...issues relating for example to data movement [18, 19]....

Proceedings ArticleDOI
14 May 2007
TL;DR: This paper examines the issues of optimizing disk usage and of scheduling large-scale, data-intensive scientific workflows onto distributed resources with limited storage, and presents an algorithm that improves overall workflow performance under these constraints.
Abstract: In this paper we examine the issue of optimizing disk usage and of scheduling large-scale scientific workflows onto distributed resources, where the workflows are data-intensive, requiring large amounts of data storage, and where the resources have limited storage capacity. Our approach is two-fold: we minimize the amount of space a workflow requires during execution by removing data files at runtime when they are no longer required, and we schedule the workflows in a way that ensures that the amount of data required and generated by the workflow fits onto the individual resources. For a workflow used by gravitational-wave physicists, we were able to reduce the amount of storage required by the workflow by up to 57%. We also designed an algorithm that can not only find feasible solutions for workflow task assignment to resources in disk-space-constrained environments, but can also improve the overall workflow performance.
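
The runtime-cleanup half of the approach is easy to illustrate. The sketch below assumes a linear task list with known file sizes and deletes each intermediate file once no remaining task consumes it, tracking peak disk usage; the data and the bookkeeping are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch: delete workflow files as soon as no remaining task
# needs them, and track peak disk usage. Sizes and names are made up.

tasks = [  # (name, inputs, outputs) in execution order
    ("t1", [], [("a.dat", 10)]),
    ("t2", [("a.dat", 10)], [("b.dat", 20)]),
    ("t3", [("b.dat", 20)], [("c.dat", 5)]),
]


def run_with_cleanup(tasks):
    consumers = {}  # file -> tasks still needing it
    for name, inputs, _ in tasks:
        for f, _sz in inputs:
            consumers.setdefault(f, set()).add(name)

    disk, used, peak = {}, 0, 0
    for name, inputs, outputs in tasks:
        for f, size in outputs:          # task writes its outputs
            disk[f] = size
            used += size
        peak = max(peak, used)
        for f, _sz in inputs:            # release inputs no longer needed
            consumers[f].discard(name)
            if not consumers[f] and f in disk:
                used -= disk.pop(f)
    return peak


print(run_with_cleanup(tasks))  # 30, versus a peak of 35 without cleanup
```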

142 citations


Cites background from "Design and Evaluation of Dynamic Replication Strategies for a High-Performance Data Grid"

  • ...The most interesting work in the context of this paper, which considers data placement, has been presented in [30, 31]....

Book ChapterDOI
07 Dec 2003
TL;DR: A novel dynamic replication strategy, called BHR, which reduces data access time by avoiding network congestion in a data grid and exploits "network-level locality": the required file is located at a site that has broad bandwidth to the site of job execution.
Abstract: In a data grid, large quantities of data files are produced, and data replication is applied to reduce data access time. Efficiently utilizing grid resources is an important research issue, since the resources available in a grid are limited while the workloads and data files are large. Dynamic replication in a data grid aims to reduce data access time and to utilize network and storage resources efficiently. This paper proposes a novel dynamic replication strategy, called BHR, which reduces data access time by avoiding network congestion in the data grid network. The BHR strategy takes advantage of "network-level locality", meaning the required file is located at a site that has broad bandwidth to the site of job execution. We evaluate the BHR strategy by implementing it in OptorSim, a data grid simulator initially developed by the European DataGrid project. The simulation results show that BHR can outperform other optimization techniques in terms of data access time when a bandwidth hierarchy appears in the Internet. BHR extends current site-level replica optimization work to the network level.
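
The "network-level locality" idea can be sketched as a simple decision rule: a request served from a replica inside the requester's region travels over high bandwidth and is cheap, so a new local replica is worth creating only when the file must be pulled from another region. The region layout, cost constants, and replication rule below are assumptions for illustration, not the BHR algorithm itself.

```python
# Hedged sketch of a network-locality-aware replication decision.

INTRA, INTER = 1, 10  # relative fetch costs within / across regions

sites = {  # site -> (region, set of replicas held)
    "s1": ("asia", {"f1"}),
    "s2": ("asia", set()),
    "s3": ("europe", {"f2"}),
}


def fetch(site, f):
    """Return fetch cost; replicate locally on costly inter-region access."""
    region, local = sites[site]
    if f in local:
        return 0
    same_region = any(f in reps for s, (r, reps) in sites.items()
                      if r == region and s != site)
    if same_region:
        return INTRA  # network-level locality: no new replica needed
    local.add(f)      # pull the file across regions and keep a copy
    return INTER


print(fetch("s2", "f1"))  # 1: replica already in the 'asia' region
print(fetch("s2", "f2"))  # 10: fetched from 'europe', replicated at s2
print(fetch("s2", "f2"))  # 0: now held locally
```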

136 citations

Journal ArticleDOI
TL;DR: A Modified BHR algorithm is proposed to overcome the limitations of the standard BHR algorithm and is simulated using OptorSim, a data grid simulator developed by the European DataGrid project.

102 citations

References
Journal ArticleDOI
01 Aug 2001
TL;DR: The authors present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing.
Abstract: "Grid" computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high performance orientation. In this article, the authors define this new field. First, they review the "Grid problem," which is defined as flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources--what is referred to as virtual organizations. In such settings, unique authentication, authorization, resource access, resource discovery, and other challenges are encountered. It is this class of problem that is addressed by Grid technologies. Next, the authors present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing. The authors describe requirements that they believe any such mechanisms must satisfy and discuss the importance of defining a compact set of intergrid protocols to enable interoperability among different Grid systems. Finally, the authors discuss how Grid technologies relate to other contemporary technologies, including enterprise integration, application service provider, storage service provider, and peer-to-peer computing. They maintain that Grid concepts and technologies complement and have much to contribute to these other approaches.

6,716 citations

Posted Content
TL;DR: This article reviews the "Grid problem," and presents an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing.
Abstract: "Grid" computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. In this article, we define this new field. First, we review the "Grid problem," which we define as flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources-what we refer to as virtual organizations. In such settings, we encounter unique authentication, authorization, resource access, resource discovery, and other challenges. It is this class of problem that is addressed by Grid technologies. Next, we present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing. We describe requirements that we believe any such mechanisms must satisfy, and we discuss the central role played by the intergrid protocols that enable interoperability among different Grid systems. Finally, we discuss how Grid technologies relate to other contemporary technologies, including enterprise integration, application service provider, storage service provider, and peer-to-peer computing. We maintain that Grid concepts and technologies complement and have much to contribute to these other approaches.

3,595 citations

Journal ArticleDOI
TL;DR: This paper demonstrates the benefits of cache sharing, measures the overhead of the existing protocols, and proposes a new protocol called "summary cache", which reduces the number of intercache protocol messages, reduces the bandwidth consumption, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP.
Abstract: The sharing of caches among Web proxies is an important technique to reduce Web traffic and alleviate network bottlenecks. Nevertheless, it is not widely deployed due to the overhead of existing protocols. In this paper we demonstrate the benefits of cache sharing, measure the overhead of the existing protocols, and propose a new protocol called "summary cache". In this new protocol, each proxy keeps a summary of the cache directory of each participating proxy, and checks these summaries for potential hits before sending any queries. Two factors contribute to our protocol's low overhead: the summaries are updated only periodically, and the directory representations are very economical, as low as 8 bits per entry. Using trace-driven simulations and a prototype implementation, we show that, compared to existing protocols such as the Internet Cache Protocol (ICP), summary cache reduces the number of intercache protocol messages by a factor of 25 to 60, reduces the bandwidth consumption by over 50%, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP. Hence summary cache scales to a large number of proxies. (This paper is a revision of Fan et al. 1998; we add more data and analysis in this version.)
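
The summary mechanism is essentially a compact membership test: the Summary Cache work is known for using Bloom filters as these per-peer summaries. The sketch below follows that idea, with a proxy querying a peer only when the peer's summary reports a possible hit; the filter size, hash construction, and API are illustrative choices, not the paper's exact parameters.

```python
# Hedged sketch of a cache-directory summary as a small Bloom filter.

import hashlib

M, K = 256, 3  # bits in the summary, number of hash positions (assumed)


def _positions(url):
    digest = hashlib.sha256(url.encode()).digest()
    return [int.from_bytes(digest[2 * i:2 * i + 2], "big") % M
            for i in range(K)]


class Summary:
    def __init__(self):
        self.bits = bytearray(M // 8)

    def add(self, url):
        for p in _positions(url):
            self.bits[p // 8] |= 1 << (p % 8)

    def maybe_contains(self, url):  # false positives possible, no negatives
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in _positions(url))


peer_summary = Summary()
peer_summary.add("http://example.org/a")

# Check the summary before sending an inter-cache query to the peer:
print(peer_summary.maybe_contains("http://example.org/a"))  # True
print(peer_summary.maybe_contains("http://example.org/b"))  # almost surely False
```

A false positive merely costs one wasted query to a peer; a miss in the summary reliably avoids the query, which is where the message-count savings come from.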

2,174 citations

Journal ArticleDOI
TL;DR: In this paper, the authors introduce design principles for a data management architecture called the data grid, and describe two basic services that are fundamental to the design of a data grid: storage systems and metadata management.
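
The metadata-management service can be pictured as a replica catalog mapping a logical file name to its physical storage locations. The class, method names, and URL scheme below are illustrative assumptions in the spirit of the paper's design principles, not its API.

```python
# Hedged sketch of a replica catalog as a metadata-management service.

class ReplicaCatalog:
    def __init__(self):
        self._locations = {}  # logical name -> list of storage URLs

    def register(self, logical_name, url):
        self._locations.setdefault(logical_name, []).append(url)

    def lookup(self, logical_name):
        return list(self._locations.get(logical_name, []))


catalog = ReplicaCatalog()
catalog.register("lfn://cms/run7/events.dat", "gsiftp://cern.ch/data/events.dat")
catalog.register("lfn://cms/run7/events.dat", "gsiftp://anl.gov/data/events.dat")

print(catalog.lookup("lfn://cms/run7/events.dat"))  # both physical replicas
```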

1,198 citations

Proceedings ArticleDOI
01 Oct 1998
TL;DR: This paper proposes a new protocol called "Summary Cache"; each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries, which enables cache sharing among a large number of proxies.
Abstract: The sharing of caches among Web proxies is an important technique to reduce Web traffic and alleviate network bottlenecks. Nevertheless, it is not widely deployed due to the overhead of existing protocols. In this paper we propose a new protocol called "Summary Cache": each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries. Two factors contribute to the low overhead: the summaries are updated only periodically, and the summary representations are economical, as low as 8 bits per entry. Using trace-driven simulations and a prototype implementation, we show that, compared to the existing Internet Cache Protocol (ICP), Summary Cache reduces the number of inter-cache messages by a factor of 25 to 60, reduces the bandwidth consumption by over 50%, and eliminates between 30% and 95% of the CPU overhead, while at the same time maintaining almost the same hit ratio as ICP. Hence Summary Cache enables cache sharing among a large number of proxies.

446 citations