Safe replication through bounded concurrency verification

doi:10.1145/3276534

Open Access · Journal Article

- Vol. 2, pp 164

TLDR

This paper proposes a novel programming framework for replicated data types (RDTs), equipped with an automatic (bounded) verification technique that discovers and fixes weak consistency anomalies, and shows that, in practice, bounded safety guarantees typically generalize to the unbounded case.

Abstract:

High-level data types are often associated with semantic invariants that must be preserved by any correct implementation. While having implementations enforce strong guarantees such as linearizability or serializability can often be used to prevent invariant violations in concurrent settings, such mechanisms are impractical in geo-distributed replicated environments, the platform of choice for many scalable Web services. To achieve the high availability essential to this domain, these environments admit various forms of weak consistency that do not guarantee all replicas have a consistent view of an application's state. Consequently, they often admit anomalous behaviors that violate a data type's invariants and that are extremely challenging, even for experts, to understand and debug. In this paper, we propose a novel programming framework for replicated data types (RDTs) equipped with an automatic (bounded) verification technique that discovers and fixes weak consistency anomalies. Our approach, implemented in a tool called Q9, involves systematically exploring the state space of an application executing on top of an eventually consistent data store, under an unrestricted consistency model but with a finite concurrency bound. Q9 uncovers anomalies (i.e., invariant violations) that manifest as finite counterexamples, and automatically generates repairs for such anomalies by selectively strengthening consistency guarantees for specific operations. Using Q9, we have uncovered a range of subtle anomalies in implementations of well-known benchmarks, and have been able to apply the repairs it mandates to effectively eliminate them. Notably, these benchmarks were written adopting best practices suggested to manage distributed replicated state (e.g., they are composed of provably convergent RDTs (CRDTs), avoid mutable state, etc.). While the safety guarantees offered by our technique are constrained by the concurrency bound, we show that, in practice, bounded safety guarantees typically generalize to the unbounded case.
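The idea in the abstract (explore every behavior an eventually consistent store allows up to a finite bound, report invariant violations as finite counterexamples, then repair by strengthening consistency for specific operations) can be illustrated with a toy sketch. The code below is not Q9 and does not reproduce its implementation; the bank-account data type, the function names (`find_anomaly`, `balance`, `subsets`), and the `strong` parameter are all hypothetical, chosen only for illustration. The length of `ops` plays the role of the bound, and "strong" is modeled simply as "must see every earlier effect of a strong operation".

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a tuple."""
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def balance(effects):
    """Fold a collection of (kind, amount) effects into an account balance."""
    return sum(amt if kind == "deposit" else -amt for kind, amt in effects)


def find_anomaly(ops, strong=frozenset()):
    """Exhaustively search visibility choices for a bounded list of operations.

    ops    -- list of (kind, amount) pairs in issue order (hypothetical model)
    strong -- kinds of operations that must see all earlier strong effects
    Returns a trace leading to a negative balance, or None if none exists.
    """

    def explore(i, effects, trace):
        # Invariant under test: the merged balance never goes negative.
        if balance(effects) < 0:
            return trace
        if i == len(ops):
            return None
        kind, amount = ops[i]
        # Indices (into the applied-effects list) that this operation must see.
        forced = [j for j, (k, _) in enumerate(effects)
                  if kind in strong and k in strong]
        optional = [j for j in range(len(effects)) if j not in forced]
        # Weak consistency: any subset of the remaining effects may be visible.
        for extra in subsets(optional):
            seen = forced + list(extra)
            visible = [effects[j] for j in seen]
            # A withdrawal is guarded by the locally visible balance only.
            accepted = kind == "deposit" or balance(visible) >= amount
            next_effects = effects + [(kind, amount)] if accepted else effects
            step = (kind, amount, sorted(seen), accepted)
            found = explore(i + 1, next_effects, trace + [step])
            if found is not None:
                return found
        return None

    return explore(0, [], [])


ops = [("deposit", 100), ("withdraw", 70), ("withdraw", 70)]
print(find_anomaly(ops))                                  # weak: counterexample trace
print(find_anomaly(ops, strong=frozenset({"withdraw"})))  # repaired: no violation
```

In this sketch the unrestricted run finds a counterexample in which both withdrawals see the deposit but not each other, so each passes its local guard and the merged balance becomes negative. Marking only `withdraw` as strong removes the violation while deposits stay weakly consistent, which mirrors the "selective strengthening" described above in a much simplified form.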

Citations

UniStore: A fault-tolerant marriage of causal and strong consistency (extended version)

CISE3: Verifying Weakly Consistent Applications with Why3

Implementation Correctness for Replicated Data Types, Categorically

A coordination-free, convergent, and safe replicated tree

A categorical account of replicated data types

Related Papers (5)

'Cause I'm strong enough: Reasoning about consistency choices in distributed systems

Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services

Replicated data types: specification, verification, optimality

Don't settle for eventual: scalable causal consistency for wide-area storage with COPS

Conflict-free replicated data types