Open Access, Journal Article

Safe replication through bounded concurrency verification

TLDR
The paper presents a novel programming framework for replicated data types (RDTs), equipped with an automatic bounded verification technique that discovers and fixes weak consistency anomalies, and shows that in practice bounded safety guarantees typically generalize to the unbounded case.
Abstract
High-level data types are often associated with semantic invariants that must be preserved by any correct implementation. Enforcing strong guarantees such as linearizability or serializability can often prevent invariant violations in concurrent settings, but such mechanisms are impractical in geo-distributed replicated environments, the platform of choice for many scalable Web services. To achieve the high availability essential to this domain, these environments admit various forms of weak consistency that do not guarantee that all replicas have a consistent view of an application's state. Consequently, they often admit anomalous behaviors that violate a data type's invariants and that are extremely challenging, even for experts, to understand and debug. In this paper, we propose a novel programming framework for replicated data types (RDTs) equipped with an automatic (bounded) verification technique that discovers and fixes weak consistency anomalies. Our approach, implemented in a tool called Q9, systematically explores the state space of an application executing on top of an eventually consistent data store, under an unrestricted consistency model but with a finite concurrency bound. Q9 uncovers anomalies (i.e., invariant violations) that manifest as finite counterexamples, and automatically generates repairs for such anomalies by selectively strengthening consistency guarantees for specific operations. Using Q9, we have uncovered a range of subtle anomalies in implementations of well-known benchmarks, and have been able to apply the repairs it mandates to effectively eliminate them. Notably, these benchmarks were written following best practices suggested for managing distributed replicated state (e.g., they are composed of provably convergent RDTs (CRDTs), avoid mutable state, etc.). While the safety guarantees offered by our technique are constrained by the concurrency bound, we show that in practice bounded safety guarantees typically generalize to the unbounded case.
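The idea of searching for invariant violations under a finite concurrency bound, and then repairing them by strengthening selected operations, can be illustrated with a small brute-force sketch. This is not Q9's implementation, whose internals the abstract does not describe; it is a toy model under stated assumptions. The bank-account workload, the names `run`, `find_anomaly`, and `repair`, and the encoding of "strong consistency" as "sees every earlier operation" are all hypothetical choices made for illustration.

```python
from itertools import combinations, product

# Hypothetical workload: the bound is simply the length of this list.
OPS = [("deposit", 100), ("withdraw", 70), ("withdraw", 70)]


def subsets(items):
    """Yield every subset of items as a frozenset, smallest subsets first."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def run(ops, visibility):
    """Execute ops in order; visibility[i] holds the indices op i observes.

    Returns the list of effects (balance deltas). A withdraw only takes
    effect if the balance it can see covers the requested amount.
    """
    effects = []
    for i, (kind, amount) in enumerate(ops):
        seen_balance = sum(effects[j] for j in visibility[i])
        if kind == "deposit":
            effects.append(amount)
        else:
            effects.append(-amount if seen_balance >= amount else 0)
    return effects


def find_anomaly(ops, strong=frozenset()):
    """Search all visibility assignments for a negative final balance.

    Operation kinds listed in `strong` (an assumption of this sketch) must
    observe every earlier operation; all others may observe any subset.
    Returns a violating visibility assignment, or None if none exists.
    """
    choices = []
    for i, (kind, _) in enumerate(ops):
        earlier = range(i)
        if kind in strong:
            choices.append([frozenset(earlier)])
        else:
            choices.append(list(subsets(earlier)))
    for visibility in product(*choices):
        if sum(run(ops, visibility)) < 0:  # invariant: balance >= 0
            return visibility
    return None


def repair(ops):
    """Return a smallest set of kinds to strengthen so no anomaly remains."""
    kinds = sorted({kind for kind, _ in ops})
    for strong in subsets(kinds):
        if find_anomaly(ops, strong) is None:
            return strong
    return None


if __name__ == "__main__":
    print("counterexample:", find_anomaly(OPS))
    print("strengthen:", set(repair(OPS)))
```

On this workload the search finds the execution in which both withdrawals observe only the deposit, which drives the balance to -40, and the repair step reports that strengthening `withdraw` alone is enough to rule it out within the bound. Exhaustive enumeration is used here only because the example is tiny; it makes no claim about how a real tool scales or about whether the bounded result carries over to larger workloads.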

