Book Chapter

Banyan: Coordination-Free Distributed Transactions over Mergeable Types

TL;DR: This chapter describes Banyan, a distributed programming model for developing loosely connected distributed applications. Data types in Banyan are equipped with a three-way merge function à la Git to handle conflicts, and Banyan provides isolated transactions that group individual operations without requiring coordination among replicas.
Abstract: Programming loosely connected distributed applications is a challenging endeavour. Loosely connected distributed applications such as geo-distributed stores and intermittently reachable IoT devices cannot afford to coordinate among all of the replicas to ensure data consistency: the latency costs are prohibitive, and coordination is impossible if availability is to be ensured. Thus, the state of the replicas evolves independently, making it difficult to develop correct applications. Existing solutions to this problem limit the data types that can be used in these applications, and they offer neither a way to compose those types into more complex data types nor transactions.
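To make the three-way merge idea from the TL;DR concrete, here is a minimal sketch for a replicated counter. It is an illustration only, not Banyan's API: the function name `merge_counter` and the example values are assumptions. The merge takes the lowest common ancestor (`lca`) of two diverged versions and applies both replicas' changes relative to it.

```python
def merge_counter(lca: int, a: int, b: int) -> int:
    """Three-way merge for a counter (hypothetical helper, not Banyan's API).

    lca is the last version both replicas agreed on; a and b are the
    versions each replica reached independently. The merged value applies
    both deltas to the common ancestor.
    """
    return lca + (a - lca) + (b - lca)


# Two replicas start from 10. One adds 3, the other subtracts 2.
ancestor = 10
replica_a = ancestor + 3
replica_b = ancestor - 2
merged = merge_counter(ancestor, replica_a, replica_b)
print(merged)
```

Because the merge only needs the two versions and their common ancestor, neither replica has to coordinate with the other while it is making local updates.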
Citations
Proceedings Article
28 Mar 2022
TL;DR: PEEPUL is implemented as an F* library that discharges proof obligations to an SMT solver and develops a replication-aware simulation relation to relate RDT specifications to their efficient purely functional implementations.
Abstract: Replicated data types (RDTs) are data structures that permit concurrent modification of multiple, potentially geo-distributed, replicas without coordination between them. RDTs are designed in such a way that conflicting operations are eventually deterministically reconciled ensuring convergence. Constructing correct RDTs remains a difficult endeavour due to the complexity of reasoning about independently evolving states of the replicas. With the focus on the correctness of RDTs (and rightly so), existing approaches to RDTs are less efficient compared to their sequential counterparts in terms of the time and space complexity of local operations. This is unfortunate since RDTs are often used in a local-first setting where the local operations far outweigh remote communication. This paper presents PEEPUL, a pragmatic approach to building and verifying efficient RDTs. To make reasoning about correctness easier, we cast RDTs in the mould of the distributed version control system, and equip it with a three-way merge function for reconciling conflicting versions. Further, we go beyond just verifying convergence, and provide a methodology to verify arbitrarily complex specifications. We develop a replication-aware simulation relation to relate RDT specifications to their efficient purely functional implementations. We implement PEEPUL as an F* library that discharges proof obligations to an SMT solver. The verified efficient RDTs are extracted as OCaml code and used in Irmin, a distributed database built on the principles of Git.
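As a rough illustration of the Git-style approach the abstract describes, the sketch below gives a three-way merge for a set and checks one property needed for convergence: the merge result does not depend on the order of the two diverged versions. This is not code from PEEPUL (which is an F* library); `merge_set` and the sample values are assumptions made for the example.

```python
from typing import FrozenSet


def merge_set(lca: FrozenSet[int], a: FrozenSet[int], b: FrozenSet[int]) -> FrozenSet[int]:
    """Three-way merge for a set (hypothetical sketch, not PEEPUL's code).

    Keep an ancestor element only if neither replica removed it, and add
    every element that either replica newly inserted.
    """
    kept = lca & a & b      # present in the ancestor and in both versions
    added_by_a = a - lca    # inserted on replica a
    added_by_b = b - lca    # inserted on replica b
    return kept | added_by_a | added_by_b


ancestor = frozenset({1, 2, 3})
version_a = frozenset({1, 2, 4})   # removed 3, added 4
version_b = frozenset({2, 3, 5})   # removed 1, added 5

merged_ab = merge_set(ancestor, version_a, version_b)
merged_ba = merge_set(ancestor, version_b, version_a)
print(sorted(merged_ab))
```

Checking that `merged_ab == merged_ba` on a few inputs is only a sanity test; the paper's point is that such properties are proved for all inputs, with proof obligations discharged by an SMT solver.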

4 citations

References
Journal Article
TL;DR: This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.
Abstract: A concurrent object is a data object shared by concurrent processes. Linearizability is a correctness condition for concurrent objects that exploits the semantics of abstract data types. It permits a high degree of concurrency, yet it permits programmers to specify and reason about concurrent objects using known techniques from the sequential domain. Linearizability provides the illusion that each operation applied by concurrent processes takes effect instantaneously at some point between its invocation and its response, implying that the meaning of a concurrent object's operations can be given by pre- and post-conditions. This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.
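The definition in this abstract can be illustrated with a small brute-force check for a single read/write register. This is a sketch under stated assumptions, not the proof method from the paper: the `Op` record, the timestamps, and the function names are all hypothetical, and trying every permutation is only practical for tiny histories. A history is accepted if some total order of its operations both respects real-time order (an operation that finished before another started must come first) and is a legal sequential register execution.

```python
from itertools import permutations
from typing import NamedTuple, Sequence


class Op(NamedTuple):
    kind: str     # "write" or "read"
    value: int    # value written, or value the read returned
    start: float  # invocation time
    end: float    # response time


def respects_real_time(order: Sequence[Op]) -> bool:
    """No operation may be placed after one that started after it finished."""
    for i, later in enumerate(order):
        for earlier in order[:i]:
            if later.end < earlier.start:
                return False
    return True


def is_legal_register(order: Sequence[Op], initial: int = 0) -> bool:
    """Replay the order sequentially; every read must see the latest write."""
    current = initial
    for op in order:
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return False
    return True


def is_linearizable(history: Sequence[Op], initial: int = 0) -> bool:
    """Brute force: search all total orders of the history."""
    return any(
        respects_real_time(order) and is_legal_register(order, initial)
        for order in permutations(history)
    )


# The read overlaps the write, so it may take effect before or after it.
overlapping = [Op("write", 1, 0.0, 3.0), Op("read", 0, 1.0, 2.0)]
# The read starts after the write has returned but still sees the old value.
stale = [Op("write", 1, 0.0, 1.0), Op("read", 0, 2.0, 3.0)]
print(is_linearizable(overlapping), is_linearizable(stale))
```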

3,396 citations


"Banyan: Coordination-Free Distribut..." refers background in this paper

  • ...Strong consistency properties such as Linearizability [20] and Serializability [9] make it easier to design correct applications....


Journal Article
TL;DR: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure.
Abstract: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers). At this scale, small and large components fail continuously. The way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies therewith, Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. Cassandra system was designed to run on cheap commodity hardware and handle high write throughput while not sacrificing read efficiency.

2,870 citations


"Banyan: Coordination-Free Distribut..." refers methods in this paper

  • ...Rather than relying on Git remote protocol for dissemination across replicas, we instantiate Banyan on top of Cassandra, an industrial-strength, off-the-shelf distributed store [26]....


15 Dec 2001
TL;DR: The Paxos algorithm, when presented in plain English, is very simple and straightforward to understand.
Abstract: At the PODC 2001 conference, I got tired of everyone saying how difficult it was to understand the Paxos algorithm, published in [122]. Although people got so hung up in the pseudo-Greek names that they found the paper hard to understand, the algorithm itself is very simple. So, I cornered a couple of people at the conference and explained the algorithm to them orally, with no paper. When I got home, I wrote down the explanation as a short note, which I later revised based on comments from Fred Schneider and Butler Lampson. The current version is 13 pages long, and contains no formula more complicated than n1 > n2.

1,492 citations


"Banyan: Coordination-Free Distribut..." refers methods in this paper

  • ...Cassandra also offers lightweight transactions (distributed compare-and-update) implemented using the Paxos consensus protocol [27]....


Journal Article
TL;DR: In this paper, it is shown that it is impossible to achieve consistency, availability, and partition tolerance in the asynchronous network model, and then solutions to this dilemma in the partially synchronous model are discussed.
Abstract: When designing distributed web services, there are three properties that are commonly desired: consistency, availability, and partition tolerance. It is impossible to achieve all three. In this note, we prove this conjecture in the asynchronous network model, and then discuss solutions to this dilemma in the partially synchronous model.

1,456 citations

Proceedings Article
22 May 1995
TL;DR: It is shown that these phenomena and the ANSI SQL definitions fail to properly characterize several popular isolation levels, including the standard locking implementations of the levels covered, and new phenomena that better characterize isolation types are introduced.
Abstract: ANSI SQL-92 [MS, ANSI] defines Isolation Levels in terms of phenomena: Dirty Reads, Non-Repeatable Reads, and Phantoms. This paper shows that these phenomena and the ANSI SQL definitions fail to properly characterize several popular isolation levels, including the standard locking implementations of the levels covered. Ambiguity in the statement of the phenomena is investigated and a more formal statement is arrived at; in addition new phenomena that better characterize isolation types are introduced. Finally, an important multiversion isolation type, called Snapshot Isolation, is defined.

1,086 citations


"Banyan: Coordination-Free Distribut..." refers background in this paper

  • ...Such built-in conflict resolution leads to anomalies such as write-skew [8] which makes it difficult (and often impossible) to develop complex applications with rich behaviours....


  • ...For example, consider parallel snapshot isolation (PSI) [35], which is an extension of snapshot isolation (SI) [8] for geo-replicated systems....
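The write-skew anomaly mentioned in the excerpts above can be reproduced with a toy, single-process model of snapshot isolation. This is a sketch under assumptions, not code from any of the cited systems: `SnapshotStore`, `Txn`, and the on-call scenario are hypothetical. Each transaction reads from a private snapshot, and a commit aborts only if a key it wrote was changed by someone else since the snapshot was taken. Two transactions with disjoint write sets therefore both commit, even though together they break an invariant that each one checked.

```python
class SnapshotStore:
    """Toy in-memory store with per-key version counters (hypothetical)."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}

    def begin(self):
        return Txn(self)


class Txn:
    """A transaction that reads from a snapshot taken when it begins."""

    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.data)
        self.start_versions = dict(store.version)
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort only on a write-write conflict with a committed transaction.
        for key in self.writes:
            if self.store.version[key] != self.start_versions[key]:
                return False
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] += 1
        return True


def run_write_skew():
    """Invariant the application wants: at least one person stays on call."""
    store = SnapshotStore({"alice": True, "bob": True})
    t_alice, t_bob = store.begin(), store.begin()
    # Each transaction sees the other person on call and signs itself off.
    if t_alice.read("bob"):
        t_alice.write("alice", False)
    if t_bob.read("alice"):
        t_bob.write("bob", False)
    return t_alice.commit(), t_bob.commit(), store.data


ok_alice, ok_bob, final_state = run_write_skew()
print(ok_alice, ok_bob, final_state)
```

Both commits succeed because the write sets are disjoint, yet the final state has nobody on call. Under serializability one of the two transactions would have observed the other's write and kept its owner on call.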
