Journal ArticleDOI

Strategies for replica consistency in data grid – a comprehensive survey

TL;DR: Several asynchronous replica consistency strategies are classified and analyzed along dimensions such as topology, level of abstraction, update propagation, and locality, with the goal of enhancing performance and ensuring fault-tolerant results for users.
Abstract: Data grid provides an efficient solution for data-oriented applications that need to manage and process large data sets located at geographically distributed storage resources. Data grid relies on data replicas to enhance performance and to ensure fault-tolerant results for users. Replicas are created to increase the availability of data and to provide better data access. Replicas have their own advantages, but a number of issues must be resolved; among them, the critical concern is replica consistency. Various replica consistency strategies are available in the literature. These strategies investigate and trade off parameters such as bandwidth consumption, access cost, scalability, execution time, storage consumption, staleness, and freshness of replicas. In this paper, several asynchronous replica consistency strategies are classified and analyzed along dimensions such as topology, level of abstraction, update propagation, and locality. Further strategies are also discussed and analyzed, including adaptive consistency, quorum-based consistency, load balancing, agent-based economically efficient consistency, check-pointing, fault tolerance, and conflict management. The strategies are compared on methodology, replication classification, consistency, grid topology, environment, evaluation parameters, and performance. Copyright © 2016 John Wiley & Sons, Ltd.
Citations
Journal ArticleDOI
TL;DR: The objective of this paper is to systematically review the data replication techniques in these two main groups and to discuss the main features of each group.

54 citations

01 Jan 2018
TL;DR: In this article, the authors describe several of the consistency models that have been proposed, sketch a framework for thinking about consistency models, and propose some axes of variation among them.
Abstract: There are many different replica control techniques, used in different research communities. To understand when one replica management algorithm can be replaced by another, we need to describe more abstractly the consistency model, which captures the set of properties that an algorithm provides, and on which the clients rely (whether the clients are people or other programs). In this chapter we describe a few of the different consistency models that have been proposed, and we sketch a framework for thinking about consistency models. In particular, we show that there are several styles in which consistency models can be expressed, and we also propose some axes of variation among the consistency models.

30 citations

Journal ArticleDOI
TL;DR: This paper provides a quantitative analysis and performance evaluation of target-oriented replication strategies based on their target objectives, to find out which target objectives are most addressed, which are moderately addressed, and which are least addressed.
Abstract: Data replication replicates the same data to multiple locations to achieve zero loss of information in case of failures, without any downtime. Dynamic data replication strategies (which determine replica locations at run time) in clouds should optimize key performance indicators such as response time, reliability, availability, scalability, cost, and performance. To fulfill these objectives, various state-of-the-art dynamic data replication strategies have been proposed based on several criteria and reported in the literature, along with their advantages and disadvantages. This paper provides a quantitative analysis and performance evaluation of target-oriented replication strategies based on their target objectives. We try to find out which target objectives are most addressed, which are moderately addressed, and which are least addressed in target-oriented replication strategies. The paper also includes a detailed discussion of the challenges, issues, and future research directions. This comprehensive analysis and evaluation-based work opens a new door for researchers in the field of cloud computing and will support the further development of cloud-based dynamic data replication strategies toward a technique that addresses all target objectives effectively in one replication strategy.

5 citations


Cites background from "Strategies for replica consistency ..."

  • ...[38] classified and analyzed various asynchronous replica consistencies based on different criteria, such as the level of abstraction, load balancing, update propagation, fault tolerance, topology, location, check-pointing, and many more strategies....


Journal ArticleDOI
18 Nov 2019 - Sensors
TL;DR: Responding to local task-allocation requirements without communicating with remote nodes overcomes the disadvantages of centralized task allocation in large-scale sensor networks, namely significant communication overhead and considerable delay, and offers better scalability.
Abstract: Task assignment is a crucial problem in wireless sensor networks (WSNs) that may affect the completion quality of sensing tasks. From the perspective of global optimization, a transmission-oriented reliable and energy-efficient task allocation (TRETA) is proposed, based on a comprehensive multi-level view of the network and an evaluation model for transmission in WSNs. To deliver better fault tolerance, TRETA adjusts dynamically in event-driven mode. To solve the reliable and efficient distributed task allocation problem in WSNs, two distributed task assignment schemes based on TRETA are proposed. In the first, the sink assigns reliability targets to all cluster heads according to the overall reliability requirements, and each cluster head performs local task allocation subject to its assigned target reliability constraints. Simulation results show a reduction in the communication cost and latency of task allocation compared to centralized task assignment. In the second, the global view is obtained by fetching local views from multiple sink nodes, so that multiple sinks hold a consistent comprehensive view for global optimization. Responding to local task-allocation requirements without communicating with remote nodes overcomes the disadvantages of centralized task allocation in large-scale sensor networks, namely significant communication overhead and considerable delay, and offers better scalability.
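
The two-level scheme this abstract describes lends itself to a small illustration. The following Python sketch is hypothetical, not TRETA's actual mechanism: the sink splits a global reliability requirement evenly across clusters (assuming independent cluster successes), and each cluster head then meets its local target by redundantly assigning the task to its most reliable nodes. The equal-split rule and the greedy heuristic are illustrative assumptions.

```python
# Hypothetical sketch of two-level allocation: a sink derives per-cluster
# reliability targets, then each cluster head meets its target locally.
# The equal-split rule and the greedy heuristic are illustrative only.

def per_cluster_target(global_target: float, n_clusters: int) -> float:
    # If cluster successes are independent, the product of per-cluster
    # reliabilities must reach the global target.
    return global_target ** (1.0 / n_clusters)

def allocate_locally(node_reliabilities, target):
    """Greedily add the most reliable nodes (redundant execution) until
    the probability that at least one succeeds reaches the target."""
    chosen, p_all_fail = [], 1.0
    for r in sorted(node_reliabilities, reverse=True):
        chosen.append(r)
        p_all_fail *= 1.0 - r
        if 1.0 - p_all_fail >= target:
            return chosen
    return None  # this cluster cannot meet its share of the requirement

t = per_cluster_target(0.99, 4)                   # each of 4 clusters needs ~0.9975
print(allocate_locally([0.9, 0.8, 0.7, 0.6], t))  # nodes used in one cluster
```

The point of the sketch is the locality the abstract emphasizes: once the target is assigned, `allocate_locally` needs no communication with the sink or other clusters.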

5 citations

Journal ArticleDOI
TL;DR: This paper introduces a new dynamic quorum protocol called the elementary permutation protocol, which permits the dynamic reconfiguration of a tree-structured coterie as a function of the load of the machines holding the data replicas, in order to limit the overhead due to data consistency protocols.
Abstract: Data replication permits better network bandwidth utilization and minimizes the effect of latency in large-scale systems such as computing grids. However, keeping the data consistent between replicas may become costly if the read/write system has to ensure sequential consistency. In this paper, we limit the overhead due to data consistency protocols by introducing a new dynamic quorum protocol called the elementary permutation protocol. This protocol permits the dynamic reconfiguration of a tree-structured coterie [Agrawal and El Abbadi, 1991] as a function of the load of the machines that hold the data replicas. It applies a tree transformation in order to obtain a new, less loaded coterie. This permutation is based on the load information of a small group of machines possessing the copies. The implementation and the evaluation of our algorithm are based on an existing atomic read/write service [Lynch and Shvartsman, 1997]. We demonstrate that the elementary permutation improves the system's throughput by up to 50% in the best case. The results of our simulation show that tree reconfiguration based on the elementary permutation is more efficient for a relatively small number of copies.
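
To make the tree-structured coterie concrete, here is a minimal Python sketch in the spirit of the Agrawal-El Abbadi tree quorum protocol the abstract builds on: a quorum is normally a root-to-leaf path, and a failed node is replaced by quorums from both of its subtrees. This is a simplified illustration (complete binary tree, invented replica names), not the elementary permutation protocol itself.

```python
class Node:
    def __init__(self, replica, left=None, right=None):
        self.replica, self.left, self.right = replica, left, right

def quorum(node, alive):
    """Return a set of live replicas forming a quorum under `node`,
    or None if no quorum can be formed."""
    if node.replica in alive:
        if node.left is None and node.right is None:
            return {node.replica}                    # path ends at a live leaf
        for child in (node.left, node.right):
            if child is not None:
                sub = quorum(child, alive)
                if sub is not None:
                    return {node.replica} | sub      # extend the root-to-leaf path
        return None
    # Failed node: substitute quorums from BOTH of its subtrees.
    if node.left is None or node.right is None:
        return None                                  # a failed leaf blocks this branch
    left, right = quorum(node.left, alive), quorum(node.right, alive)
    return left | right if left is not None and right is not None else None

# Seven replicas arranged as a complete binary tree.
tree = Node("r1",
            Node("r2", Node("r4"), Node("r5")),
            Node("r3", Node("r6"), Node("r7")))
print(quorum(tree, alive={"r1", "r2", "r4", "r3", "r6", "r7"}))  # a 3-node path, e.g. r1-r2-r4
print(quorum(tree, alive={"r2", "r4", "r3", "r6"}))              # root failed: both subtree paths
```

Any two quorums produced this way intersect, which is what lets the protocol serialize conflicting reads and writes; the paper's contribution is permuting the tree so that lightly loaded machines end up on the cheap root-to-leaf paths.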

5 citations

References
Proceedings ArticleDOI
27 Aug 2001
TL;DR: The concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales is introduced and its scalability, robustness and low-latency properties are demonstrated through simulation.
Abstract: Hash tables - which map "keys" onto "values" - are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales. The CAN is scalable, fault-tolerant and completely self-organizing, and we demonstrate its scalability, robustness and low-latency properties through simulation.
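
As a rough illustration of the hash-table-like functionality described above, here is a hypothetical Python sketch of CAN's core idea: keys hash to points in a d-dimensional coordinate space (d = 2 here), and the node whose zone contains the point owns the key. Real CAN routes greedily through zone neighbors rather than consulting a global zone table, and the zone layout below is invented for the example.

```python
import hashlib

def point(key: str):
    d = hashlib.sha1(key.encode()).digest()
    return (d[0] / 256.0, d[1] / 256.0)   # a point in [0, 1) x [0, 1)

# node -> ((x_lo, x_hi), (y_lo, y_hi)); the zones partition the space
ZONES = {
    "n1": ((0.0, 0.5), (0.0, 1.0)),
    "n2": ((0.5, 1.0), (0.0, 0.5)),
    "n3": ((0.5, 1.0), (0.5, 1.0)),
}

def owner(key: str) -> str:
    x, y = point(key)
    for node, ((x0, x1), (y0, y1)) in ZONES.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node
    raise AssertionError("zones must partition the space")

print(owner("some-file.dat"))  # deterministic for a given key
```

When a node joins, it splits an existing zone in half and takes over the keys whose points fall in its half, which is how CAN stays self-organizing.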

6,703 citations

Journal ArticleDOI
TL;DR: Results from theoretical analysis and simulations show that Chord is scalable: Communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.
Abstract: A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis and simulations show that Chord is scalable: Communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.
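
The key-to-node mapping Chord provides can be sketched with a sorted identifier ring. The hypothetical Python below resolves a key to its successor node using a global sorted list, whereas real Chord reaches the same node in O(log N) hops using per-node finger tables; the node names and 32-bit identifier space are illustrative.

```python
import hashlib
from bisect import bisect_left

M = 2 ** 32  # illustrative identifier-space size; Chord uses m-bit IDs

def ident(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % M

class Ring:
    def __init__(self, nodes):
        self.ids = sorted(ident(n) for n in nodes)
        self.node_of = {ident(n): n for n in nodes}

    def successor(self, key: str) -> str:
        # The node responsible for a key is the first node clockwise from
        # the key's identifier, wrapping around the ring.
        i = bisect_left(self.ids, ident(key)) % len(self.ids)
        return self.node_of[self.ids[i]]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.successor("some-file.dat"))  # the key's home node
```

Storing the key/data pair at `successor(key)` is exactly the "one operation" the abstract describes; everything else in Chord exists to maintain this mapping as nodes join and leave.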

3,518 citations


"Strategies for replica consistency ..." refers background or methods in this paper

  • ...Structured P2P topology, such as a chord-based [37] network, is built according to a certain rule....


  • ...In extreme cases, all the nodes are treated as server nodes [37]....


  • ...Centralized approaches are employed in [36, 37], whereas decentralized approaches are employed in [38]....


Book
01 Aug 1990
TL;DR: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels and concentrates on fundamental theories as well as techniques and algorithms in distributed data management.
Abstract: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels. The material concentrates on fundamental theories as well as techniques and algorithms. The advent of the Internet and the World Wide Web, and, more recently, the emergence of cloud computing and streaming data applications, has forced a renewal of interest in distributed and parallel data management, while, at the same time, requiring a rethinking of some of the traditional techniques. This book covers the breadth and depth of this re-emerging field. The coverage consists of two parts. The first part discusses the fundamental principles of distributed data management and includes distribution design, data integration, distributed query processing and optimization, distributed transaction management, and replication. The second part focuses on more advanced topics and includes discussion of parallel database systems, distributed object management, peer-to-peer data management, web data management, data stream systems, and cloud computing. New in this edition:
  • New chapters covering database replication, database integration, multidatabase query processing, peer-to-peer data management, and web data management.
  • Coverage of emerging topics such as data streams and cloud computing.
  • Extensive revisions and updates based on years of class testing and feedback.
  • Ancillary teaching materials are available.

2,395 citations


"Strategies for replica consistency ..." refers background in this paper

  • ...The pull-based approach is a client-based scenario, which means the client initiates the changes by pulling the data from the server [58]....


  • ...node contains all the files where updates can be performed and communicates with other nodes so as to make two replicas consistent by transferring the changes quickly [58]....


  • ...Servers participate very actively to keep all the replicas identical [58]....

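
The push- and pull-based propagation styles quoted in the excerpts above can be contrasted in a few lines. The following Python sketch is hypothetical (the class names are invented, and real grid middleware involves far more machinery): a primary pushes every update to its replicas, while a client-initiated pull refreshes a replica only on demand.

```python
class Replica:
    def __init__(self):
        self.data, self.version = {}, 0
    def apply(self, data, version):
        self.data, self.version = dict(data), version

class Primary:
    def __init__(self, replicas):
        self.data, self.version, self.replicas = {}, 0, list(replicas)
    def write(self, key, value):
        self.data[key] = value
        self.version += 1
        for r in self.replicas:           # push: the server propagates every update
            r.apply(self.data, self.version)

def pull(replica, primary):
    # pull: the client initiates the refresh only when it wants fresh data
    if replica.version < primary.version:
        replica.apply(primary.data, primary.version)

primary = Primary([Replica(), Replica()])
primary.write("config", "v2")             # both push replicas are updated immediately
lazy = Replica()                          # a pull replica starts stale
pull(lazy, primary)
print(lazy.data, lazy.version)            # {'config': 'v2'} 1
```

Push keeps replicas fresh at the cost of bandwidth on every write; pull saves bandwidth but can serve stale reads between refreshes, which is the trade-off the surveyed strategies tune.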

Proceedings ArticleDOI
04 May 1997
TL;DR: A family of caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots in the network, based on a special kind of hashing called consistent hashing.
Abstract: We describe a family of caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots in the network. Our protocols are particularly designed for use with very large networks such as the Internet, where delays caused by hot spots can be severe, and where it is not feasible for every server to have complete information about the current state of the entire network. The protocols are easy to implement using existing network protocols such as TCP/IP, and require very little overhead. The protocols work with local control, make efficient use of existing resources, and scale gracefully as the network grows. Our caching protocols are based on a special kind of hashing that we call consistent hashing. Roughly speaking, a consistent hash function is one which changes minimally as the range of the function changes. Through the development of good consistent hash functions, we are able to develop caching protocols which do not require users to have a current or even consistent view of the network. We believe that consistent hash functions may eventually prove to be useful in other applications such as distributed name servers and/or quorum systems.
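
The "changes minimally" property can be checked empirically. The hypothetical Python sketch below hashes keys and nodes onto the same ring and measures how many keys change owner when a fourth cache is added; with consistent placement only a minority move, whereas a mod-N scheme would move nearly all of them. The cache names are invented for the example.

```python
import hashlib

def h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def owner(key: str, nodes) -> str:
    # The key's owner is the first node clockwise from the key's hash.
    ring = sorted((h(n), n) for n in nodes)
    k = h(key)
    for node_hash, node in ring:
        if node_hash >= k:
            return node
    return ring[0][1]  # wrap around the ring

keys = [f"key-{i}" for i in range(10_000)]
small = ["cache-1", "cache-2", "cache-3"]
large = small + ["cache-4"]
moved = sum(owner(k, small) != owner(k, large) for k in keys)
print(f"{moved / len(keys):.1%} of keys moved")  # ~1/4 in expectation, not ~100%
```

Production implementations typically hash each node to many virtual points on the ring to tighten the variance around that expected fraction.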

2,179 citations


Additional excerpts

  • ...Grid can have any of the following topologies: scale-free [26, 27], graph [28, 29], hierarchical [30, 31], P2P [32–34], and hybrid [35], as illustrated in Figure 2....


Journal ArticleDOI
TL;DR: This survey covers rollback-recovery techniques that do not require special language constructs and distinguishes between checkpoint-based protocols, which rely solely on checkpointing for system state restoration, and log-based protocols.
Abstract: This survey covers rollback-recovery techniques that do not require special language constructs. In the first part of the survey we classify rollback-recovery protocols into checkpoint-based and log-based. Checkpoint-based protocols rely solely on checkpointing for system state restoration. Checkpointing can be coordinated, uncoordinated, or communication-induced. Log-based protocols combine checkpointing with logging of nondeterministic events, encoded in tuples called determinants. Depending on how determinants are logged, log-based protocols can be pessimistic, optimistic, or causal. Throughout the survey, we highlight the research issues that are at the core of rollback-recovery and present the solutions that currently address them. We also compare the performance of different rollback-recovery protocols with respect to a series of desirable properties and discuss the issues that arise in practical implementations of these protocols.
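
As a toy illustration of the checkpoint-based end of this spectrum, the hypothetical Python sketch below saves a process state periodically and restores it wholesale on failure. Work done after the last checkpoint is lost, which is exactly the gap that log-based protocols close by replaying logged determinants.

```python
import copy

class Process:
    """Minimal checkpoint-based rollback: state is saved periodically and
    restored wholesale on failure. There is no event log, so any work
    after the last checkpoint is lost."""
    def __init__(self):
        self.state = {"counter": 0}
        self.checkpoint = copy.deepcopy(self.state)

    def step(self):
        self.state["counter"] += 1      # stand-in for real computation

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)

    def recover(self):
        self.state = copy.deepcopy(self.checkpoint)

p = Process()
p.step(); p.take_checkpoint()
p.step(); p.step()                      # progress not yet checkpointed
p.recover()                             # simulate a crash and restart
print(p.state["counter"])               # 1 -- rolled back to the last checkpoint
```

Coordinated, uncoordinated, and communication-induced checkpointing differ in when `take_checkpoint` fires across processes, not in this basic save-and-restore mechanism.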

1,772 citations


"Strategies for replica consistency ..." refers background in this paper

  • ...Three checkpointing strategies were described for concurrent processes [95]....
