Banyan: Coordination-Free Distributed Transactions over Mergeable Types

doi:10.1007/978-3-030-64437-6_12

Citations

Proceedings Article•DOI•

Certified mergeable replicated data types

Vimala Soundarapandian, Adharsh Kamath, Kartik Nagar, KC Sivaramakrishnan

28 Mar 2022

TL;DR: PEEPUL is implemented as an F* library that discharges proof obligations to an SMT solver and develops a replication-aware simulation relation to relate RDT specifications to their efficient purely functional implementations.

Abstract: Replicated data types (RDTs) are data structures that permit concurrent modification of multiple, potentially geo-distributed, replicas without coordination between them. RDTs are designed in such a way that conflicting operations are eventually deterministically reconciled ensuring convergence. Constructing correct RDTs remains a difficult endeavour due to the complexity of reasoning about independently evolving states of the replicas. With the focus on the correctness of RDTs (and rightly so), existing approaches to RDTs are less efficient compared to their sequential counterparts in terms of the time and space complexity of local operations. This is unfortunate since RDTs are often used in a local-first setting where the local operations far outweigh remote communication. This paper presents PEEPUL, a pragmatic approach to building and verifying efficient RDTs. To make reasoning about correctness easier, we cast RDTs in the mould of the distributed version control system, and equip it with a three-way merge function for reconciling conflicting versions. Further, we go beyond just verifying convergence, and provide a methodology to verify arbitrarily complex specifications. We develop a replication-aware simulation relation to relate RDT specifications to their efficient purely functional implementations. We implement PEEPUL as an F* library that discharges proof obligations to an SMT solver. The verified efficient RDTs are extracted as OCaml code and used in Irmin, a distributed database built on the principles of Git.
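The three-way merge idea mentioned in the abstract can be illustrated with a minimal sketch. This is not PEEPUL's F* or OCaml code; the function names `merge_counter` and `merge_set` are hypothetical, and Python is used only for illustration. Each merge takes the lowest common ancestor (`lca`) of two diverged versions `a` and `b` and combines the changes each side made relative to that ancestor.

```python
from typing import FrozenSet


def merge_counter(lca: int, a: int, b: int) -> int:
    """Three-way merge for a counter (hypothetical helper).

    Each branch's contribution is its delta from the common ancestor,
    so the merged value is the ancestor plus both deltas.
    """
    return lca + (a - lca) + (b - lca)


def merge_set(lca: FrozenSet[int], a: FrozenSet[int], b: FrozenSet[int]) -> FrozenSet[int]:
    """Three-way merge for a set (hypothetical helper).

    Keep elements that survived on both branches, plus elements
    newly added on either branch. An element removed on one branch
    stays removed.
    """
    return (lca & a & b) | (a - lca) | (b - lca)


# Two replicas diverge from a counter value of 5:
# one adds 2 (value 7), the other adds 1 (value 6).
merged_count = merge_counter(5, 7, 6)

# Two replicas diverge from the set {1, 2, 3}:
# one removes 3 and adds 4, the other removes 1 and adds 5.
merged_set = merge_set(frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({2, 3, 5}))
```

Both functions are symmetric in `a` and `b`, so the two replicas arrive at the same merged state regardless of which side initiates the merge.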

4 citations

References

Journal Article•DOI•

Linearizability: a correctness condition for concurrent objects

Maurice Herlihy¹, Jeannette M. Wing¹•Institutions (1)

Carnegie Mellon University¹

01 Jul 1990-ACM Transactions on Programming Languages and Systems

TL;DR: This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.

Abstract: A concurrent object is a data object shared by concurrent processes. Linearizability is a correctness condition for concurrent objects that exploits the semantics of abstract data types. It permits a high degree of concurrency, yet it permits programmers to specify and reason about concurrent objects using known techniques from the sequential domain. Linearizability provides the illusion that each operation applied by concurrent processes takes effect instantaneously at some point between its invocation and its response, implying that the meaning of a concurrent object's operations can be given by pre- and post-conditions. This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.

3,396 citations

"Banyan: Coordination-Free Distribut..." refers background in this paper

...Strong consistency properties such as Linearizability [20] and Serializability [9] make it easier to design correct applications....

Journal Article•DOI•

Cassandra: a decentralized structured storage system

Avinash Lakshman¹, Prashant Malik¹•Institutions (1)

Facebook¹

14 Apr 2010-Operating Systems Review

TL;DR: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure.

Abstract: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers). At this scale, small and large components fail continuously. The way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies therewith, Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. Cassandra system was designed to run on cheap commodity hardware and handle high write throughput while not sacrificing read efficiency.

2,870 citations

"Banyan: Coordination-Free Distribut..." refers methods in this paper

...Rather than relying on Git remote protocol for dissemination across replicas, we instantiate Banyan on top of Cassandra, an industrial-strength, off-the-shelf distributed store [26]....

Paxos Made Simple

Leslie Lamport

15 Dec 2001

TL;DR: The Paxos algorithm, when presented in plain English, is very simple and straightforward to understand.

Abstract: At the PODC 2001 conference, I got tired of everyone saying how difficult it was to understand the Paxos algorithm, published in [122]. Although people got so hung up in the pseudo-Greek names that they found the paper hard to understand, the algorithm itself is very simple. So, I cornered a couple of people at the conference and explained the algorithm to them orally, with no paper. When I got home, I wrote down the explanation as a short note, which I later revised based on comments from Fred Schneider and Butler Lampson. The current version is 13 pages long, and contains no formula more complicated than n1 > n2.

1,492 citations

"Banyan: Coordination-Free Distribut..." refers methods in this paper

...Cassandra also offers lightweight transactions (distributed compare-and-update) implemented using the Paxos consensus protocol [27]....

Journal Article•DOI•

Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services

Seth Gilbert¹, Nancy Lynch¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Jun 2002-Sigact News

TL;DR: In this paper, it is shown that it is impossible to achieve consistency, availability, and partition tolerance in the asynchronous network model, and then solutions to this dilemma in the partially synchronous model are discussed.

Abstract: When designing distributed web services, there are three properties that are commonly desired: consistency, availability, and partition tolerance. It is impossible to achieve all three. In this note, we prove this conjecture in the asynchronous network model, and then discuss solutions to this dilemma in the partially synchronous model.

1,456 citations

Proceedings Article•DOI•

A critique of ANSI SQL isolation levels

Hal Berenson¹, Phil Bernstein¹, Jim Gray², Jim Melton³, Elizabeth O'Neil, Patrick O'Neil•Institutions (3)

Microsoft¹, University of California, Berkeley², Sybase³

22 May 1995

TL;DR: It is shown that these phenomena and the ANSI SQL definitions fail to properly characterize several popular isolation levels, including the standard locking implementations of the levels covered, and new phenomena that better characterize isolation types are introduced.

Abstract: ANSI SQL-92 [MS, ANSI] defines Isolation Levels in terms of phenomena: Dirty Reads, Non-Repeatable Reads, and Phantoms. This paper shows that these phenomena and the ANSI SQL definitions fail to properly characterize several popular isolation levels, including the standard locking implementations of the levels covered. Ambiguity in the statement of the phenomena is investigated and a more formal statement is arrived at; in addition new phenomena that better characterize isolation types are introduced. Finally, an important multiversion isolation type, called Snapshot Isolation, is defined.

1,086 citations

"Banyan: Coordination-Free Distribut..." refers background in this paper

...Such built-in conflict resolution leads to anomalies such as write-skew [8] which makes it difficult (and often impossible) to develop complex applications with rich behaviours....
...For example, consider parallel snapshot isolation (PSI) [35], which is an extension of snapshot isolation (SI) [8] for geo-replicated systems....
