Proceedings Article

Declarative programming over eventually consistent data stores

03 Jun 2015-Vol. 50, Iss: 6, pp 413-424
TL;DR: The paper presents QUELEA, a declarative programming model for eventually consistent data stores (ECDS) equipped with a contract language capable of specifying fine-grained application-level consistency properties, and describes an implementation of QUELEA on top of an off-the-shelf ECDS that provides support for coordination-free transactions.
Abstract: User-facing online services utilize geo-distributed data stores to minimize latency and tolerate partial failures, with the intention of providing a fast, always-on experience. However, geo-distribution does not come for free; application developers have to contend with weak consistency behaviors, and the lack of abstractions to composably construct high-level replicated data types, necessitating complex application logic and invariably exposing inconsistencies to the user. Some commercial distributed data stores and several academic proposals provide a lattice of consistency levels, with stronger consistency guarantees incurring increased latency and throughput costs. However, correctly assigning the right consistency level for an operation requires subtle reasoning and is often an error-prone task. In this paper, we present QUELEA, a declarative programming model for eventually consistent data stores (ECDS), equipped with a contract language, capable of specifying fine-grained application-level consistency properties. A contract enforcement system analyses contracts, and automatically generates the appropriate consistency protocol for the method protected by the contract. We describe an implementation of QUELEA on top of an off-the-shelf ECDS that provides support for coordination-free transactions. Several benchmarks, including two large web applications, illustrate the effectiveness of our approach.

Summary

1. Introduction

  • Many real-world web services — such as those built and maintained by Amazon, Facebook, Google, Twitter, etc. — replicate application state and logic across multiple replicas within and across data centers.
  • Indeed, modern web services, which aim to provide an "always-on" experience, overwhelmingly favor availability and partition tolerance over strong consistency.
  • Contracts are used to specify fine-grained application-level consistency properties, and are statically analyzed to assign the most efficient and sound store consistency level to the corresponding operation.
  • The rest of the paper is organized as follows.

2. System Model

  • The authors describe the system model and introduce the primitive relations that their contract language is seeded with.
  • Each object is associated with a set of operations.
  • The sequence of operations invoked by a particular client on the store is called a session.
  • The effects $w^x_1$ and $w^x_2$ are visible to $w^x_4$, written logically as $vis(w^x_1, w^x_4) \wedge vis(w^x_2, w^x_4)$, where vis is the visibility relation between effects.
  • For simplicity, the authors assume all operation names across all objects are distinct.

3. Motivation

  • Consider how the authors might implement a highly available bank account on top of an ECDS, with the integrity constraint that the balance must be non-negative.
  • The authors begin by implementing a bank account replicated data type (RDT) in QUELEA, and then describe the mechanisms to obtain the desired correctness guarantees.

3.1 RDT Specification

  • A key novelty in QUELEA is that it allows the addition of new RDTs to the store, which obviates the need for coercing application logic to utilize store-provided data types.
  • Separating conflict resolution from consistency permits operational reasoning for the former and declarative reasoning for the latter.
  • The implementation of the bank account operations in QUELEA is given in Figure 2.
  • The datatype Acc represents the effect type for the bank account.
  • For each operation, hist is a snapshot of the state of the object at some replica.

3.1.1 Summarization

  • Observe that the definition of getBalance reduces over the entire history of updates to an account.
  • If the authors are to realize an efficient implementation of this bank account RDT, they need a summary of the account history.
  • Intuitively, the current account balance summarizes the state of an account.
  • This notion of observable equivalence can be generalized to other RDTs as well.
  • Since the notion of observable equivalence is specific to each RDT, programmers can provide a summarization function - of type [e] -> [e] - as a part of the RDT specification.
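The summarization idea above can be sketched in a few lines. This is only an illustration in Python, not the paper's Haskell code: the `Effect` record, the string tags, and the function names are assumptions made for the example. The summary replaces the whole history with a single effect that leaves the observed balance unchanged.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Effect:
    kind: str    # "deposit" or "withdraw" (hypothetical encoding)
    amount: int


def get_balance(history: List[Effect]) -> int:
    """Reduce over the whole effect history to compute the balance."""
    total = 0
    for effect in history:
        total += effect.amount if effect.kind == "deposit" else -effect.amount
    return total


def summarize(history: List[Effect]) -> List[Effect]:
    """Return a one-element history that get_balance cannot tell apart from the input."""
    balance = get_balance(history)
    kind = "deposit" if balance >= 0 else "withdraw"
    return [Effect(kind, abs(balance))]


history = [Effect("deposit", 100), Effect("withdraw", 30), Effect("deposit", 5)]
summary = summarize(history)
print(get_balance(history), get_balance(summary), len(summary))
```

The function has the shape [e] -> [e] mentioned above: it maps a list of effects to a shorter list of effects, so every other operation can keep reducing over the result without change.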

3.2 Anomalies under Eventual Consistency

  • The authors' goal is to choose the correct consistency level for each of the bank account operations such that (1) the balance remains non-negative and (2) the getBalance operation never incorrectly returns a negative balance.
  • The withdraw operation witnesses the deposit and succeeds.
  • Subsequently, session 2 performs a withdraw operation, but importantly, due to eventual consistency, only witnesses the deposit from session 1, but not the subsequent withdraw.
  • This anomaly leads the user to incorrectly conclude that the withdraw operation failed to go through.
  • Finding the appropriate fixes is not readily apparent.
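The first anomaly can be reproduced with a small simulation. The sketch below is a Python illustration under assumed names (two replicas modeled as sets of `(id, kind, amount)` tuples, with made-up amounts); it is not taken from the paper. Each withdraw checks only its local snapshot, so both succeed, and the merged state violates the non-negative balance constraint.

```python
def balance(effects):
    """Reduce over a set of (id, kind, amount) effects."""
    return sum(amount if kind == "deposit" else -amount for _, kind, amount in effects)


def withdraw(replica, effect_id, amount):
    """Withdraw against the local snapshot only; emit an effect on success."""
    if balance(replica) < amount:
        return False
    replica.add((effect_id, "withdraw", amount))
    return True


deposit = ("e1", "deposit", 100)
replica_a = {deposit}   # state seen by session 1
replica_b = {deposit}   # state seen by session 2: the deposit has arrived here too

ok_session1 = withdraw(replica_a, "e2", 70)   # sees the deposit, succeeds
ok_session2 = withdraw(replica_b, "e3", 70)   # does not see session 1's withdraw

merged = replica_a | replica_b                # eventual convergence: union of effects
final_balance = balance(merged)
print(ok_session1, ok_session2, final_balance)
```

Both operations were locally valid, which is why the fix has to come from the consistency level chosen for withdraw rather than from the operation's own logic.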

3.3 Contracts

  • QUELEA helps facilitate the mapping of operations to appropriate consistency levels by letting the programmer declare application-level consistency constraints as contracts (Figure 4) that axiomatically specify the set of allowed executions involving this operation.
  • In their running example, it is clear that in order to preserve the critical integrity constraint, the withdraw operation must be strongly consistent.
  • The syntax a : withdraw states that a is an effect emitted by a withdraw operation, i.e., oper(a, withdraw) holds.
  • Any execution on a bank account object that preserves the above contract for a withdraw operation is said to be derived from a correct implementation of withdraw.
  • Let η̂ stand for the effect emitted by the getBalance operation.

3.4 From Contracts to Implementation

  • Notice that the contracts for withdraw and getBalance only express application-level consistency requirements, and make no reference to the semantics of the underlying store.
  • The mapping of application-level consistency requirements to appropriate store-level guarantees is done automatically behind-the-scene.
  • One strategy would be to execute operations speculatively.
  • The overhead of state maintenance and the complexity of user-defined contracts is likely to make this technique infeasible in practice.
  • Contracts are analyzed with the help of a theorem prover, and statically mapped to a particular store-level consistency property that the prover guarantees preserves contract semantics.

4.1 Syntax

  • The syntax of their core contract language is shown in Figure 4.
  • The language is based on first-order logic (FOL), and admits prenex universal quantification over typed and untyped effect variables.
  • Notice that η̂ occurs free in the contract.
  • An untyped effect variable ranges over all operation names.
  • Quantifier-free propositions in their contract language are conjunctions, disjunctions and implications of predicates expressing relations between pairs of effect variables.

4.2 Semantics

  • QUELEA contracts are constraints over axiomatic definitions of program executions.
  • Figure 5 summarizes artifacts relevant to define an axiomatic execution.
  • The authors formalize an axiomatic execution as a tuple (A,vis,so,sameobj), where A, called the effect soup, is the set of all effects generated during the program execution, and vis, so, sameobj ⊆ A × A are visibility, session order, and same object relations, respectively, witnessed over generated effects at run-time.
  • Note that the axiomatic definition of an execution (E) provides interpretations for primitive relations (eg: vis) that occur free in contract formulas, and also fixes the domain of quantification to set of all effects (A) observed during the program execution.
  • As such, E is a potential model for any first-order formula (ψ) expressible in their contract language.
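To make the idea of an execution as a model for a contract concrete, here is a minimal Python sketch. The `Execution` class mirrors the tuple (A, vis, so, sameobj), with `oper` carried alongside as a dictionary. The specific formula checked (every withdraw effect on the same object is either visible to η̂, sees η̂, or is η̂ itself) is an assumed example of a strong-consistency-style contract, not a quotation of the paper's Figure 4.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    effects: set    # A, the effect soup
    vis: set        # pairs (a, b): a is visible to b
    so: set         # pairs (a, b): a precedes b in the same session
    sameobj: set    # pairs (a, b): a and b are effects on the same object
    oper: dict      # effect -> name of the operation that produced it


def withdraw_contract(ex, eta):
    """Assumed contract: all withdraws on eta's object are ordered with eta by vis."""
    return all(
        a == eta or (a, eta) in ex.vis or (eta, a) in ex.vis
        for a in ex.effects
        if ex.oper[a] == "withdraw" and (a, eta) in ex.sameobj
    )


effects = {"d1", "w1", "w2"}
oper = {"d1": "deposit", "w1": "withdraw", "w2": "withdraw"}
sameobj = {(a, b) for a in effects for b in effects}

# w1 and w2 both saw the deposit but not each other: the contract is violated.
bad = Execution(effects, {("d1", "w1"), ("d1", "w2")}, set(), sameobj, oper)
# Here w1 is visible to w2, so the two withdraws are ordered.
good = Execution(effects, {("d1", "w1"), ("d1", "w2"), ("w1", "w2")}, set(), sameobj, oper)

print(withdraw_contract(bad, "w2"), withdraw_contract(good, "w2"))
```

Quantification ranges over the effect soup only, which is what lets a finite execution serve as a model for the formula.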

4.3 Capturing Store Semantics

  • An important aspect of their contract language is its ability to capture store-level consistency guarantees, along with application-level consistency requirements.
  • Similar to [10], the authors can rigorously define a wide variety of store semantics including those that combine any subset of session and causality guarantees, and multiple consistency levels.
  • Eventual consistency: eventually consistent operations can be satisfied as long as the client can reach at least one replica.
  • In the bank account example, deposit is an eventually consistent operation.
  • While an ECDS typically offers basic eventual consistency with all possible anomalies, the authors assume that their store ...
  • Footnote 3: Strictly speaking, R+ is not the transitive closure of R, as transitive closure is not expressible in FOL.

4.4 Contract Classification

  • The authors' goal is to map application-level consistency constraints on operations to appropriate store-level consistency guarantees capable of satisfying these constraints.
  • Towards this end, the authors define a binary weaker than relation for their contract language (Definition 2).
  • It is safe to execute operation op under a store consistency level captured by ψst.
  • The authors define the corresponding consistency level as the consistency class of the contract.
  • Along with three straightforward rules that classify contracts into consistency classes, the classification scheme also presents a rule that judges well-formedness of a contract.

4.5 Generality of Contracts

  • It is important to note that their contract language provides a generic way to capture application-level consistency properties and is not tied to a particular store semantics.
  • In particular, the same application-level contracts can easily be mapped to a different store with a varied consistency lattice.
  • The consistency level of an operation is any combination of the above guarantees, which form a partially ordered consistency lattice shown in Figure 7.
  • An edge from an upper level element to a lower level element corresponds to a weaker-than relation between the corresponding contracts.
  • Classifying a contract under this scheme is a directed search in the lattice, starting from the bottom, and determining the weakest consistency level under which the contract can be satisfied.

4.6 Soundness of Contract Classification

  • The authors now present a meta-theoretic result that certifies the soundness of classification-based contract enforcement.
  • The authors model the system as a tuple (E, Σ), where the axiomatic execution E captures the data store’s current state, and session soup Σ is the set of concurrent client sessions interacting with the store.
  • The relation captures the progress of the execution (from E to E′) due to the successful completion of a client operation op from one of the sessions in Σ, generating a new effect η.
  • The theorem states that if a data store correctly enforces ψsc, ψcc, and ψec contracts in all well-formed executions, then the same store, extended with the classification scheme shown in Figure 6, can enforce all well-formed QUELEA contracts.
  • The proof of the theorem is given below: Proof.

5. Operational Semantics

  • The authors now describe operational semantics of a data store that implements strong, causal and eventual consistency guarantees.
  • For technical reasons, the authors tag each session with a session identifier (s) and the sequence number (i) of the next operation in the session.
  • Session order relation (so) relates effects generated by the same session.
  • Rule [OPER] is an auxiliary reduction of the form $\Theta \vdash (E, \langle s, i, op \rangle) \xhookrightarrow{r} (E', \eta)$. Under the store configuration Θ, the rule captures the progress in execution (from E to E′) due to the application of operation op to replica r resulting in a new effect η.
  • The rule first constructs a context for the application from the local state (Θ(r)) of the replica, by projecting relevant information from effects in Θ(r).

5.1 Soundness of Operational Semantics

  • The authors now prove a meta-theoretic property that establishes the soundness of their operational semantics in enforcing ψec, ψcc, and ψsc consistency guarantees at every reduction step.
  • The following theorem proves that their operational semantics correctly enforce ψec, ψcc, and ψsc guarantees: Theorem 7 (Soundness Modulo Classification).
  • By inversion on H1, the authors get the following hypotheses: Theorem 8 (Causal Consistency Preservation).
  • Therefore, the authors need to prove that new store configuration Θ is causally consistent under E = (A,vis,so,sameobj).

6. Transaction Contracts

  • While contracts on individual operations offer the programmer object-level declarative reasoning, real-world scenarios often involve operations that span multiple objects.
  • In order to address this problem, several recent systems [2, 9, 26] have proposed eventually consistent transactions in order to compose operations on multiple objects.
  • Given that classical transaction models such as serializability and snapshot isolation require inter-replica coordination, these systems espouse coordination-free transactions that remain available under network partitions, but only provide weaker isolation guarantees.
  • Coordination-free transactions have intricate consistency semantics and widely varying runtime overheads.
  • This choice is further complicated by the consistency semantics of individual operations.

6.1 Syntax and Semantics Extensions

  • QUELEA automates the choice of assigning the correct and most efficient transaction isolation level.
  • Similar to contracts on individual operations, the programmer associates contracts with transactions, declaratively expressing consistency specifications.
  • The authors extend the contract language with a new term under quantifier-free propositions - txn S1 S2, where S1 and S2 are sets of effects, and introduce a new primitive equivalence relation sametxn that holds for effects from the same transaction.
  • The authors assume that operations not part of any transaction belong to their own unique transaction.
  • While transactions may have varying isolation guarantees, the authors make the standard assumption that all transactions provide atomicity.

6.2 Transactional Bank Account

  • In order to illustrate the utility of declarative reasoning for transactions, consider an extension of their running example to use two accounts – current (c) and savings (s).
  • The authors' goal is to ensure that totalBalance returns the result obtained from a consistent snapshot of the object states.
  • The two getBalance operations in a totalBalance transaction might be served by different replicas with a distinct set of committed save transactions.
  • It is not immediately apparent how to choose the weakest isolation guarantee that would be sufficient to prevent the anomaly.
  • Instead, QUELEA requires the programmer to simply state the consistency requirement as a contract.

6.3 Coordination-free Transactions

  • In order to illustrate the utility of transaction contract classification, the authors identify three well-understood coordination-free transaction semantics – Read Committed (RC) [7], Monotonic Atomic View (MAV) [2] and Repeatable Read (RR) [7], and illustrate the classification strategy.
  • A transaction with ANSI RC semantics only witnesses committed operations.
  • Once all the updates from a transaction are available, the buffered updates are made visible to subsequent client requests.
  • Importantly, RC does not entail any other guarantees.
  • An implementation of MAV keeps track of the set of transactions St witnessed by the running transaction and, before performing an operation at some replica, ensures that the replica includes all the transactions in St. Hence, MAV is coordination-free.

6.4 Classification

  • Similar to operation-level contracts, with respect to ≤ relation, the coordination-free transaction semantics described here form a total order: ψrc ≤ ψmav ≤ ψrr.
  • The transaction classification is also similar to the operation-level contract classification presented in Figure 6; given a contract ψ on a transaction, the authors start from the weakest transaction contract ψrc, and progressively compare its strength to the known transaction contracts until they find an isolation level under which ψ can be safely discharged.

7. Implementation

  • QUELEA is implemented as a shallow extension of GHC Haskell and runs on top of Cassandra, an off-the-shelf eventually consistent distributed data (or backing) store responsible for all data management issues (i.e., replication, fault tolerance, availability, and convergence).
  • Template Haskell is used to implement static contract classification, and proof obligations are discharged with the help of the Z3 [30] SMT solver.
  • Figure 10 illustrates the overall system architecture.
  • The authors' implementation supports eventual, causal, and strong consistency for data type operations, and RC, MAV, and RR semantics for transactions.
  • This functionality is implemented entirely on top of the standard interface exposed by Cassandra.

7.1 Operation Consistency

  • The shim layer maintains a causally consistent in-memory snapshot of a subset of objects in the backing store, by explicitly tracking dependencies introduced between effects due to visibility, session and same transaction relations.
  • Dependence tracking is similar to the techniques presented in [3] and [20].
  • Similarly, new shim layer nodes can be spawned on demand.
  • Hence, their dependence tracking strategy ensures that QUELEA does not track every effect as the number of writes in the system grows.
  • The shim layer nodes periodically fetch updates from the backing store for eventually consistent operations, and on-demand for causally consistent and strongly consistent operations.

7.2 Transactions

  • QUELEA implements atomic visibility by exploiting shim layer causality guarantees: an effect is included only if all the effects it depends on are also included.
  • The dotted circle represents effects that are not yet inserted into the store.
  • The graph on the left shows the state of the store after executing oper2.
  • This ensures atomicity and satisfies the RC requirement.
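The inclusion rule used by the shim layer (an effect becomes visible only once everything it depends on is visible) can be sketched as a small buffer. This Python class is a simplified illustration with assumed names; it ignores persistence, networking, and the way real dependencies are derived from the visibility, session, and same-transaction relations.

```python
class CausalCache:
    """Buffers effects until all of their declared dependencies are visible."""

    def __init__(self):
        self.visible = {}   # effect id -> payload, safe for clients to read
        self.pending = {}   # effect id -> (payload, dependency ids)

    def receive(self, effect_id, payload, deps=()):
        self.pending[effect_id] = (payload, frozenset(deps))
        self._drain()

    def _drain(self):
        # Keep promoting pending effects until no more become eligible.
        progress = True
        while progress:
            progress = False
            for effect_id, (payload, deps) in list(self.pending.items()):
                if all(dep in self.visible for dep in deps):
                    self.visible[effect_id] = payload
                    del self.pending[effect_id]
                    progress = True


cache = CausalCache()
cache.receive("w2", "withdraw 70", deps=["w1"])   # arrives before its dependency
seen_early = "w2" in cache.visible
cache.receive("w1", "deposit 100")                # dependency arrives, both appear
print(seen_early, sorted(cache.visible))
```

If all effects of a transaction are declared as depending on one another, the same rule makes them appear together, which is one way to read the atomic visibility bullet above.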

7.3 Summarization

  • The authors utilize the summarize function (§ 3.1.1) to summarize the object state both in the shim layer node and the backing store, typically when the number of effects on an object crosses a tunable threshold.
  • Shim layer summarization is straight-forward; a summarization thread takes the local lock on the cached object, and replaces its state with the summarized state.
  • It is essential that concurrent client operations are permitted, but are not allowed to witness the intermediate state of the summarization process.
  • Suppose the original set of effects on an object are o1, o2 and o3.

8. Evaluation

  • The authors present an evaluation study of their implementation, report contract profiles of benchmark programs, and illustrate the performance benefits of fine-grained consistency classification on operations and transactions.
  • The application allows adding new users, adding and replying to tweets, following, unfollowing and blocking users, and fetching a user’s timeline, userline, followers and following.
  • With 512 clients, the QUELEA implementation was within 41% of the latency and 18% of the throughput of EC, whereas SC operations had 162% higher latency and 52% lower throughput than EC operations.
  • The numbers were obtained under a 1DC configuration.
  • The authors use 128 clients and a single QUELEA replica, with all clients operating on the same LWW register to stress test the summarization mechanism.

10. Conclusions

  • This paper presents QUELEA, a shallow Haskell extension for declarative programming over ECDS.
  • The key idea underlying QUELEA’s design is the automatic classification of fine-grained consistency contracts on operations and distributed transactions with respect to the various consistency and isolation levels offered by the store.
  • The authors' contract language is carefully crafted from a decidable subset of first-order logic, enabling the use of automated verification tools to discharge the proof obligations associated with contract classification.
  • The authors realize an instantiation of QUELEA on top of an off-the-shelf distributed store, Cassandra, and illustrate the benefit of fine-grained contract classification by implementing and evaluating several scalable applications.

Declarative Programming over
Eventually Consistent Data Stores
KC Sivaramakrishnan
University of Cambridge, UK
sk826@cl.cam.ac.uk
Gowtham Kaki
Purdue University, USA
gkaki@cs.purdue.edu
Suresh Jagannathan
Purdue University, USA
suresh@cs.purdue.edu
Abstract
User-facing online services utilize geo-distributed data stores to
minimize latency and tolerate partial failures, with the intention of
providing a fast, always-on experience. However, geo-distribution
does not come for free; application developers have to contend
with weak consistency behaviors, and the lack of abstractions to
composably construct high-level replicated data types, necessitating
the need for complex application logic and invariably exposing
inconsistencies to the user. Some commercial distributed data stores
and several academic proposals provide a lattice of consistency
levels, with stronger consistency guarantees incurring increased
latency and throughput costs. However, correctly assigning the right
consistency level for an operation requires subtle reasoning and is
often an error-prone task.
In this paper, we present QUELEA, a declarative programming
model for eventually consistent data stores (ECDS), equipped with
a contract language, capable of specifying fine-grained application-
level consistency properties. A contract enforcement system analyses
contracts, and automatically generates the appropriate consistency
protocol for the method protected by the contract. We describe
an implementation of QUELEA on top of an off-the-shelf ECDS
that provides support for coordination-free transactions. Several
benchmarks including two large web applications, illustrate the
effectiveness of our approach.
Categories and Subject Descriptors
D.1.3 [Concurrent Program-
ming]: Distributed Programming; C.2.4 [Distributed Systems]: Dis-
tributed databases; D.3.2 [Language Classifications]: Applicative
(Functional) Languages; F.3.1 [Logics and Meanings of Programs]:
Specifying and Verifying and Reasoning about Programs
General Terms Languages, Performance
Keywords
Eventual Consistency, Availability, CRDTs, Axiomatic
Contracts, Contract Classification, Distributed Transactions, SMT
solvers, Decidable Logic, Quelea, Cassandra, Haskell
This work was done at Purdue University, USA.
PLDI’15, June 13–17, 2015, Portland, OR, USA.
Copyright © 2015 ACM 978-1-4503-3468-6/15/06...$15.00.
http://dx.doi.org/10.1145/2737924.2737981
1. Introduction
Many real-world web services, such as those built and maintained
by Amazon, Facebook, Google, Twitter, etc., replicate application
state and logic across multiple replicas within and across data
centers. Replication is intended not only to improve application
throughput and reduce user-perceived latency, but also to tolerate
partial failures without compromising overall service availability.
Traditionally, programmers have relied on strong consistency
guarantees such as linearizability [15] or serializability [21] in order
to build correct applications. While strong consistency is an easily
stated property, it masks the reality underlying large-scale
distributed systems with respect to non-uniform latency, availability,
and network partitions [8, 14]. Indeed, modern web services, which
aim to provide an "always-on" experience, overwhelmingly favor
availability and partition tolerance over strong consistency. To this
end, several weak consistency models such as eventual consistency,
causal consistency, session guarantees, and timeline consistency
have been proposed.
Under weak consistency, the developer needs to be aware of
concurrent conflicting updates, and has to pay careful attention
to avoid unwanted inconsistencies (e.g., negative balances in a
bank account, or having an item appear in a shopping cart after
it has been removed [13]). Oftentimes, these inconsistencies leak
from the application and are witnessed by the user. Ultimately,
the developer must decide the consistency level appropriate for a
particular operation; this is understandably an error-prone process
requiring intricate knowledge of both the application as well as the
semantics and implementation of the underlying data store, which
typically have only informal descriptions. Nonetheless, picking the
correct consistency level is critical not only for correctness but
also for scalability of the application. While choosing a weaker
consistency level than required may introduce program errors and
anomalies, choosing a stronger one than necessary can negatively
impact program scalability and performance.
Weak consistency also hinders compositional reasoning about
programs. Although an application might be naturally expressed
in terms of well-understood and expressive data types such as
maps, trees, queues, or graphs, geo-distributed stores typically only
provide a minimal set of data types with in-built conflict resolution
strategies such as last-writer-wins (LWW) registers, counters, and
sets [16, 25]. Furthermore, while traditional database systems
enable composability through transactions, geo-distributed stores
typically lack unrestricted serializable transactional access to the
data. Working in this environment thus requires application state
to be suitably coerced to function using only the capabilities of the
store.
[Figure 1: QUELEA system model. The diagram shows client sessions invoking x.foo and x.bar in session order against an eventually consistent data store whose replicas each hold sets of effects for objects x and y.]

To address these issues, we describe QUELEA, a declarative
programming model and implementation for ECDS. The key novelty
of QUELEA is an expressive contract language to declare and
verify fine-grained application-level consistency properties. The
programmer uses the contract language to axiomatically specify
the set of legal executions allowed over the replicated data type.
Contracts are constructed using primitive consistency relations
such as visibility and session order along with standard logical
and relational operators. A contract enforcement system statically
maps operations over the datatype to a particular consistency level
available on the store, and provably validates the correctness of the
mapping. The paper makes the following contributions:
  • We introduce QUELEA, a shallow extension of Haskell that supports the description and validation of replicated data types found in an ECDS. Contracts are used to specify fine-grained application-level consistency properties, and are statically analyzed to assign the most efficient and sound store consistency level to the corresponding operation.
  • QUELEA supports coordination-free transactions over arbitrary datatypes. We extend our contract language to express fine-grained transaction isolation guarantees, and utilize the contract enforcement system to automatically assign the correct isolation level for a transaction.
  • We provide meta-theory that certifies the soundness of our contract enforcement system, and ensures that an operation is only executed if the required conditions on consistency are met.
  • We describe an implementation of QUELEA as a transparent shim layer over Cassandra [16], a well-known general-purpose data store. Experimental evaluation over a set of real-world applications, including a Twitter-like micro-blogging site and an eBay-like auction site, illustrates the practicality of our approach.
The rest of the paper is organized as follows. The next section
describes the system model. We describe the challenges in program-
ming under eventual consistency, and introduce QUELEA contracts
as a proposed solution to overcome these issues in § 3. § 4 pro-
vides more details on the contract language, and its mapping to
store consistency levels, along with meta-theory for certifying the
correctness of the mapping. § 6 introduces transaction contracts and
their classification. § 7 describes the implementation of QUELEA on
top of Cassandra. § 8 discusses experimental evaluation. § 9 and 10
present related work and conclusions.
2. System Model
In this section, we describe the system model and introduce the
primitive relations that our contract language is seeded with. Figure 1
presents a schematic diagram of our system model. The distributed
store is composed of a collection of replicas, each of which stores a
set of objects ($x, y, \ldots$). We assume that every object is replicated
at every replica in the store. The state of an object at any replica is
the set of all updates (effects) performed on the object. For example,
the state of $x$ at replica 1 is the set composed of effects $w^x_1$ and $w^x_2$.
Each object is associated with a set of operations. The clients
interact with the store by invoking operations on objects. The
sequence of operations invoked by a particular client on the store
is called a session. The data store is typically accessed by a large
number of clients (and hence sessions) concurrently. Importantly,
the clients are oblivious to which replica an operation is applied
to; the data store may choose to route the operation to any replica
in order to minimize latency, balance load, etc. For example, the
operations foo and bar invoked by the same session on the same
object, might end up being applied to different replicas because
replica 1 (to which foo was applied) might be unreachable when the
client invokes bar.
When foo is invoked on an object $x$ with arguments $arg_1$ at
replica 1, it simply reduces over the current set of effects at that
replica on that object ($w^x_1$ and $w^x_2$), produces a result $v_1$ that is
sent back to the client, and emits a single new effect $w^x_4$ that is
appended to the state of $x$ at replica 1. Thus, every operation is
evaluated over a snapshot of the state of the object on which it is
invoked. In this case, the effects $w^x_1$ and $w^x_2$ are visible to $w^x_4$,
written logically as $vis(w^x_1, w^x_4) \wedge vis(w^x_2, w^x_4)$, where $vis$ is
the visibility relation between effects. Visibility is an irreflexive and
asymmetric relation, and only relates effects produced by operations
on the same object. Executing a read-only operation is similar except
that no new effects are produced. The effect added to a particular
replica is asynchronously sent to other replicas, and eventually merged
into all other replicas. Observe that this model does not assume
a particular resolution strategy for concurrent conflicting updates,
and instead preserves every update. Update conflicts are resolved
when an operation reduces over the set of effects on an object at a
particular replica.
Two effects $w^x_4$ and $w^x_5$ that arise from the same session are said
to be in session order (written logically as $so(w^x_4, w^x_5)$). Session
order is an irreflexive, transitive relation. The effects $w^x_4$ and
$w^x_5$ arising from operations applied to the same object $x$ are said to
be under the same object relation, written $sameobj(w^x_4, w^x_5)$.
Finally, we can associate every effect with the operation that generated
the effect with the help of a relation $oper$. In the current example,
$oper(w^x_4, foo)$ and $oper(w^x_5, bar)$ hold. For simplicity, we assume
all operation names across all objects are distinct.
This model admits all the inconsistencies associated with eventual
consistency. The goal of this work is to identify the precise
consistency level for each operation such that application-level
constraints are not violated. In the next section, we will concretely
describe the challenges associated with constructing a consistent
bank account on top of an ECDS. Subsequently, we will illustrate
how our contract and specification language, armed with the primitive
relations vis, so, sameobj and oper, mitigates these challenges.
3. Motivation
Consider how we might implement a highly available bank account
on top of an ECDS, with the integrity constraint that the balance
must be non-negative. We begin by implementing a bank account
replicated data type (RDT) in QUELEA, and then describe the
mechanisms to obtain the desired correctness guarantees.
3.1 RDT Specification
A key novelty in QUELEA is that it allows the addition of new
RDTs to the store, which obviates the need for coercing application
logic to utilize store-provided data types. In addition, QUELEA
treats the convergence semantics (i.e., how conflicting updates
are resolved) of the data type separately from its consistency
properties (i.e., when updates become visible). This separation of
concerns permits operational reasoning for conflict resolution, and
declarative reasoning for consistency. The combination of these
techniques enhances the programmability of the store.

  data Acc = Deposit Int | Withdraw Int | GetBal

  getBalance :: [Acc] -> () -> (Int, Maybe Acc)
  getBalance hist _ =
    let res = sum [x | Deposit x <- hist] -
              sum [x | Withdraw x <- hist]
    in (res, Nothing)

  deposit :: [Acc] -> Int -> ((), Maybe Acc)
  deposit hist amt = ((), Just $ Deposit amt)

  withdraw :: [Acc] -> Int -> (Bool, Maybe Acc)
  withdraw hist v =
    if sel1 (getBalance hist ()) >= v
    then (True, Just $ Withdraw v)
    else (False, Nothing)

Figure 2: Definition of a bank account expressed in Quelea.

Let us assume that the bank account object provides three operations:
deposit, withdraw and getBalance, with the assumption that the withdraw
fails if the account has insufficient balance. Every operation in QUELEA
is of the following type, written in Haskell syntax:

  type Operation e a r = [e] -> a -> (r, Maybe e)

An operation takes a list of effects (the history of updates to that
object), and an input argument, and returns a result along with
an optional effect (read-only operations return Nothing). The
new effect (if emitted) is added to the state of the object at the
current replica, and asynchronously sent to other replicas. The
implementation of the bank account operations in QUELEA is given
in Figure 2.
The datatype Acc represents the effect type for the bank account.
The function sum returns the sum of elements in the list, and sel1
returns the first element of a tuple. For each operation, hist is a
snapshot of the state of the object at some replica. In this sense, every
operation on the RDT is atomic, and thus amenable to sequential
reasoning. Here, getBalance is a read-only operation, deposit always
emits an effect, and withdraw only emits an effect if there is
sufficient balance in the account. We have implemented a large
corpus of RDTs for realistic benchmarks including shopping carts,
auction and micro-blogging sites, etc. in a few tens of lines of code,
expressed in this style.
3.1.1 Summarization
Observe that the definition of getBalance reduces over the entire
history of updates to an account. If we are to realize an efficient
implementation of this bank account RDT, we need a summary of the
account history. Intuitively, the current account balance summarizes
the state of an account. A bank account with the history
[Deposit 10, Withdraw 5] is observably equivalent to a bank account with
a single deposit operation [Deposit 5]; we can replace the earlier
history with the latter and a client of the store would not be able to
tell the difference between the two.
This notion of observable equivalence can be generalized to other
RDTs as well. For example, a last-writer-wins register with multiple
updates is equivalent to a register with only the last write. Similarly,
a set with a collection of add and remove operations is equivalent to
a set with a series of additions of live elements from the original set.
Since the notion of observable equivalence is specific to each RDT,
programmers can provide a summarization function, of type [e] -> [e],
as a part of the RDT specification. The summarization function for the
bank account is:

  summarize hist =
    [Deposit $ sel1 $ getBalance hist ()]

Given a bank account history hist, the summarize function returns
a new history with a single deposit of the current account balance.
Our implementation invokes the summarization function associated
with an RDT to reduce the size of the effect sets maintained by
replicas.
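The observable-equivalence argument can be checked on the running example. The following is a minimal sketch, assuming the bank account definitions of Figure 2; `fst` stands in for `sel1`, and the helper `equivalent` is a hypothetical name introduced only for this illustration, not part of QUELEA.

```haskell
-- Effect type of the bank account RDT, as in Figure 2.
data Acc = Deposit Int | Withdraw Int | GetBal
  deriving (Eq, Show)

-- Balance observed over a history of effects (read-only: emits no effect).
getBalance :: [Acc] -> () -> (Int, Maybe Acc)
getBalance hist _ =
  let res = sum [x | Deposit x <- hist] - sum [x | Withdraw x <- hist]
  in (res, Nothing)

-- Collapse a history into a single deposit of the current balance.
summarize :: [Acc] -> [Acc]
summarize hist = [Deposit (fst (getBalance hist ()))]

-- Hypothetical helper: two histories are treated as observably equivalent
-- when getBalance cannot tell them apart.
equivalent :: [Acc] -> [Acc] -> Bool
equivalent h1 h2 = fst (getBalance h1 ()) == fst (getBalance h2 ())
```

With these definitions, `summarize [Deposit 10, Withdraw 5]` evaluates to `[Deposit 5]`, and `equivalent` holds between a history and its summary because both yield the same balance.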
3.2 Anomalies under Eventual Consistency
Our goal is to choose the correct consistency level for each of
the bank account operations such that (1) the balance remains
non-negative and (2) the getBalance operation never incorrectly returns
a negative balance.
[Figure 3: Anomalies possible under eventual consistency for the
get balance operation. (a) Unsafe withdraw: session 1 performs
deposit(100) and withdraw(80), session 2 performs withdraw(70), and
getBalance returns -50. (b) Negative balance: session 1 performs
deposit(100), session 2 performs withdraw(50), and session 3's getBalance
returns -50. (c) Missing update: a single session performs deposit(100),
withdraw(50), and a getBalance that returns 100.]
Consider the execution shown in Figure 3(a). Assume that all
operations in the figure are on the same bank account object with
the initial balance being zero. Session 1 performs a deposit of 100,
followed by a withdraw of 80 in the same session. The withdraw
operation witnesses the deposit and succeeds¹. Subsequently, session
2 performs a withdraw operation, but importantly, due to eventual
consistency, only witnesses the deposit from session 1, but not the
subsequent withdraw. Hence, this withdraw also incorrectly succeeds,
violating the integrity constraint. A subsequent getBalance
operation, that happens to witness all the previous operations, would
report a negative balance.
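The unsafe withdraw anomaly can be replayed with plain lists standing in for replica states. This is a sketch under stated assumptions: the amounts are those of Figure 3(a), `balance` is a simplified stand-in for `getBalance`, and `replica1`, `replica2` and `merged` are hypothetical names modelling the snapshots each operation sees rather than any QUELEA API.

```haskell
-- Effect type of the bank account RDT, as in Figure 2.
data Acc = Deposit Int | Withdraw Int | GetBal
  deriving (Eq, Show)

-- Balance computed by reducing over a snapshot of effects.
balance :: [Acc] -> Int
balance hist = sum [x | Deposit x <- hist] - sum [x | Withdraw x <- hist]

-- withdraw in the style of Figure 2: success flag plus optional effect.
withdraw :: [Acc] -> Int -> (Bool, Maybe Acc)
withdraw hist v
  | balance hist >= v = (True, Just (Withdraw v))
  | otherwise         = (False, Nothing)

-- Snapshot seen by session 1's withdraw(80): the deposit is visible.
replica1 :: [Acc]
replica1 = [Deposit 100]

-- Snapshot seen by session 2's withdraw(70): only the deposit has arrived,
-- session 1's withdraw has not been merged yet.
replica2 :: [Acc]
replica2 = [Deposit 100]

-- Each withdrawal is evaluated against its own local snapshot.
w1, w2 :: (Bool, Maybe Acc)
w1 = withdraw replica1 80
w2 = withdraw replica2 70

-- State after every emitted effect has been merged at some replica.
merged :: [Acc]
merged = Deposit 100 : [e | (_, Just e) <- [w1, w2]]

-- Balance a later getBalance would report if it witnesses all effects.
finalBalance :: Int
finalBalance = balance merged
```

Both `w1` and `w2` succeed because each snapshot shows a balance of 100, yet `finalBalance` is 100 - 80 - 70 = -50, which is the violation described above.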
It is easy to see that preventing concurrent withdraw operations
eliminates this anomaly. This can be done by insisting that
withdraw be executed as a strongly consistent operation. Despite
this strengthening, the getBalance operation may still incorrectly
report a negative balance to the user. Consider the execution shown
in Figure 3(b), which consists of three concurrent sessions performing a
deposit, a withdraw, and a getBalance operation, respectively,
on the same bank account object. As the vis edge indicates, operation
withdraw(50) in session 2 witnesses the effects of deposit(100)
from session 1, concludes that there is sufficient balance, and
completes successfully. However, the getBalance operation may only
witness this successful withdraw, but not the causally preceding
deposit, and reports a balance of negative 50 to the user.
Under eventual consistency, the users may also be exposed to
other forms of inconsistencies. Figure 3(c) shows an execution
where the getBalance operation in a session does not witness
the effects of an earlier withdraw operation performed in the
same session, possibly because it was served by a replica that has
not yet merged the withdraw effect. This anomaly leads the user
to incorrectly conclude that the withdraw operation failed to go
through.

¹ Although visibility and session order relations relate effects, we have
abused the notation in these examples to relate operations, with the idea
that the relations relate the effect emitted by those operations.

Although it is easy to understand the reasons behind the occurrence
of the aforementioned anomalies, the appropriate fixes are not
readily apparent. Making getBalance a strongly consistent
operation is definitely sufficient to avert anomalies, but is it really
necessary? Given the cost of enforcing strong consistency [25, 28],
it is preferable to avoid imposing such stringent conditions unless
there are no viable alternatives. Exploring the space of these
alternatives requires understanding the subtle differences in the
semantics of the various weak consistency levels.
3.3 Contracts
QUELEA facilitates the mapping of operations to appropriate
consistency levels by letting the programmer declare application-level
consistency constraints as contracts² (Figure 4) that axiomatically
specify the set of allowed executions involving this operation.
In the case of the bank account, any execution that does not exhibit
the anomalies described in the previous section is a well-formed
execution on the bank account object. By specifying the set of legal
executions for each data type in terms of a trace of operation
invocations on that type, QUELEA ensures that all executions over that
type are well-formed.
In our running example, it is clear that in order to preserve
the critical integrity constraint, the withdraw operation must be
strongly consistent. That is, given two withdraw operations $a$ and
$b$, either $a$ is visible to $b$ or vice-versa. We express this
application-level consistency requirement as a contract ($\psi_w$) over
withdraw:

$$\forall (a : withdraw).\ sameobj(a, \hat{\eta}) \Rightarrow a = \hat{\eta} \lor vis(a, \hat{\eta}) \lor vis(\hat{\eta}, a)$$

Here, $\hat{\eta}$ stands for the effect emitted by the withdraw
operation. The syntax $a : withdraw$ states that $a$ is an effect emitted
by a withdraw operation, i.e., $oper(a, withdraw)$ holds. The contract
specifies that if the current operation emits an effect $\hat{\eta}$, then
for any effect $a$ emitted by a withdraw operation on the same object, it
is the case that $a = \hat{\eta}$, or $a$ is visible to $\hat{\eta}$, or
vice versa. Any execution on a bank account object that preserves the
above contract for a withdraw operation is said to be derived from a
correct implementation of withdraw.
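To make the withdraw contract concrete, it can be evaluated over a small finite execution. The sketch below is illustrative only: the `Exec` record, the integer effect identifiers, and the names `psiW`, `unsafe` and `ordered` are assumptions made for this example, and do not reflect how QUELEA represents or enforces contracts.

```haskell
-- Effects are identified by integers in this illustration.
type Eff = Int

-- A finite execution: which operation emitted each effect, plus the
-- visibility and same-object relations as explicit lists of pairs.
data Exec = Exec
  { effects :: [(Eff, String)]
  , vis     :: [(Eff, Eff)]
  , sameObj :: [(Eff, Eff)]
  }

-- The withdraw contract for current effect eta: every withdraw effect a on
-- the same object is eta itself, is visible to eta, or sees eta.
psiW :: Exec -> Eff -> Bool
psiW ex eta = and
  [ a == eta || (a, eta) `elem` vis ex || (eta, a) `elem` vis ex
  | (a, op) <- effects ex
  , op == "withdraw"
  , (a, eta) `elem` sameObj ex
  ]

-- All effects below act on the same bank account.
allPairs :: [Eff] -> [(Eff, Eff)]
allPairs xs = [(x, y) | x <- xs, y <- xs]

-- Figure 3(a): both withdraws see the deposit but not each other.
unsafe :: Exec
unsafe = Exec
  { effects = [(1, "deposit"), (2, "withdraw"), (3, "withdraw")]
  , vis     = [(1, 2), (1, 3)]
  , sameObj = allPairs [1, 2, 3]
  }

-- Same execution with the first withdraw made visible to the second.
ordered :: Exec
ordered = unsafe { vis = (2, 3) : vis unsafe }
```

Here `psiW unsafe 3` is `False`, since the two withdraws are unordered by visibility, while `psiW ordered 3` is `True`.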
To prevent getBalance from ever showing a negative balance,
it is necessary to prevent the scenario depicted in Figure 3(b).
Let $\hat{\eta}$ stand for the effect emitted by the getBalance operation.
If the effect ($b$) of a withdraw operation is visible to $\hat{\eta}$,
and the effect ($a$) of a deposit operation is visible to the effect ($b$)
of the withdraw operation, then it must be the case that $a$ is also
visible to $\hat{\eta}$. A contract ($\psi^1_{gb}$) for getBalance that
precisely captures this application-level consistency requirement can be
written thus:

$$\forall (a : deposit), (b : withdraw).\ (vis(a, b) \land vis(b, \hat{\eta}) \Rightarrow vis(a, \hat{\eta}))$$

To prevent the missing update anomaly described in Figure 3(c), it is
necessary for a getBalance operation on a bank account to witness
the effects of all previous deposit and withdraw operations
performed on the same bank account in the same session. We can
express an additional contract ($\psi^2_{gb}$) for getBalance that
captures this consistency requirement:

$$\forall (c : deposit \lor withdraw).\ ((so \cap sameobj)(c, \hat{\eta}) \Rightarrow vis(c, \hat{\eta}))$$

Our contract language provides operators to compose relations.
The syntax $(R_1 \cap R_2)(a, b)$ is equivalent to
$R_1(a, b) \land R_2(a, b)$.
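² QUELEA exposes the contract construction language as a Haskell library.

$$
\begin{array}{rcl}
\multicolumn{3}{l}{x, y, \hat{\eta} \in \mathtt{EffVar} \qquad Op \in \mathtt{OperName}} \\
\psi \in \mathtt{Contract} & ::= & \forall (x : \tau).\psi \;\mid\; \forall x.\psi \;\mid\; \pi \\
\tau \in \mathtt{EffType} & ::= & Op \;\mid\; \tau \lor \tau \\
\pi \in \mathtt{Prop} & ::= & true \;\mid\; R(x, y) \;\mid\; \pi \lor \pi \\
 & \mid & \pi \land \pi \;\mid\; \pi \Rightarrow \pi \\
R \in \mathtt{Relation} & ::= & vis \;\mid\; so \;\mid\; sameobj \;\mid\; = \\
 & \mid & R \cup R \;\mid\; R \cap R \;\mid\; R^{+}
\end{array}
$$

Figure 4: Contract language.
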
The above contract ($\psi^2_{gb}$) says that if a deposit or a withdraw
operation precedes a getBalance operation in session order, and
is applied on the same object as the getBalance operation, then
it must be the case that the getBalance operation witnesses the
effects of the preceding operations.
The final contract ($\psi_{gb}$) of the getBalance operation is merely
a conjunction of the previous two versions ($\psi^1_{gb}$ and
$\psi^2_{gb}$):

$$
\begin{array}{l}
\forall (a : deposit), (b : withdraw), (c : deposit \lor withdraw). \\
\quad (vis(a, b) \land vis(b, \hat{\eta}) \Rightarrow vis(a, \hat{\eta})) \\
\quad \land\; ((so \cap sameobj)(c, \hat{\eta}) \Rightarrow vis(c, \hat{\eta}))
\end{array}
$$

Intuitively, this specification prohibits both the getBalance anomalies
described in Figures 3(b) and 3(c) from occurring.
Finally, since there are no restrictions on when or how a deposit
operation can execute, its contract is simply true.
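The two getBalance contracts can be evaluated against the executions of Figures 3(b) and 3(c). The sketch below makes several assumptions for illustration: relations are explicit lists of pairs over integer effect identifiers, the read-only getBalance is given an identifier so that it can play the role of the current effect, and the names `Exec`, `psiGb1`, `psiGb2`, `fig3b` and `fig3c` are hypothetical rather than part of the QUELEA library.

```haskell
type Eff = Int
type Rel = [(Eff, Eff)]

-- A finite execution with explicit primitive relations.
data Exec = Exec
  { opers   :: [(Eff, String)]  -- effect id and the operation that produced it
  , vis     :: Rel
  , so      :: Rel
  , sameObj :: Rel
  }

-- Effects whose operation name is one of the given names (typed quantifier).
ofType :: Exec -> [String] -> [Eff]
ofType ex names = [e | (e, op) <- opers ex, op `elem` names]

implies :: Bool -> Bool -> Bool
implies p q = not p || q

-- First contract: a deposit visible to a visible withdraw is itself visible.
psiGb1 :: Exec -> Eff -> Bool
psiGb1 ex eta = and
  [ ((a, b) `elem` vis ex && (b, eta) `elem` vis ex)
      `implies` ((a, eta) `elem` vis ex)
  | a <- ofType ex ["deposit"], b <- ofType ex ["withdraw"] ]

-- Second contract: earlier same-session, same-object updates are visible.
psiGb2 :: Exec -> Eff -> Bool
psiGb2 ex eta = and
  [ ((c, eta) `elem` so ex && (c, eta) `elem` sameObj ex)
      `implies` ((c, eta) `elem` vis ex)
  | c <- ofType ex ["deposit", "withdraw"] ]

-- Final contract: conjunction of the two.
psiGb :: Exec -> Eff -> Bool
psiGb ex eta = psiGb1 ex eta && psiGb2 ex eta

allPairs :: [Eff] -> Rel
allPairs xs = [(x, y) | x <- xs, y <- xs]

ops :: [(Eff, String)]
ops = [(1, "deposit"), (2, "withdraw"), (3, "getBalance")]

-- Figure 3(b): three sessions; getBalance sees the withdraw, not the deposit.
fig3b :: Exec
fig3b = Exec ops [(1, 2), (2, 3)] [] (allPairs [1, 2, 3])

-- Figure 3(c): one session; getBalance sees the deposit, not the withdraw.
fig3c :: Exec
fig3c = Exec ops [(1, 2), (1, 3)] [(1, 2), (2, 3), (1, 3)] (allPairs [1, 2, 3])
```

Under these encodings, `psiGb1 fig3b 3` and `psiGb2 fig3c 3` are both `False`, so `psiGb` rejects each anomalous execution, matching the intent of the combined contract.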
3.4 From Contracts to Implementation
Notice that the contracts for withdraw and getBalance only
express application-level consistency requirements, and make no
reference to the semantics of the underlying store. To write contracts,
a programmer only needs to reason about the semantics of the
application under the QUELEA system model. The mapping of
application-level consistency requirements to appropriate store-level
guarantees is done automatically behind the scenes. How might
one go about ensuring that an execution adheres to a contract?
The challenge is that a contract provides a declarative (axiomatic)
specification of an execution, while what is required is an operational
procedure for enforcing its implicit constraints.
One strategy would be to execute operations speculatively. Here,
operations are tentatively applied as they are received from the
client or other replicas. We can maintain a runtime manifestation
of executions, and check well-formedness conditions at runtime,
rolling back executions if they are ill-formed. However, the overhead
of state maintenance and the complexity of user-defined contracts are
likely to make this technique infeasible in practice.
We devise a static approach instead. Contracts are analyzed with
the help of a theorem prover, and statically mapped to a particular
store-level consistency property that the prover guarantees preserves
contract semantics. We call this procedure contract classification.
Given the variety and complexity of store-level consistency properties,
the idea is that the system implementer parameterizes the
classification procedure by describing the store semantics in the
same contract language as the one used to express the contract on
the operations. In the next section, we describe the contract language
in detail and describe the classification procedure for a particular
store semantics.
4. Contract Language
4.1 Syntax
The syntax of our core contract language is shown in Figure 4. The
language is based on first-order logic (FOL), and admits prenex
universal quantification over typed and untyped effect variables.
We use a special effect variable ($\hat{\eta}$) to denote the effect of
the current operation, i.e., the operation for which a contract is being
written. Notice that $\hat{\eta}$ occurs free in the contract. We fix its
scope when classifying contracts (Section 4.4). The type of an effect is
simply the name of the operation (e.g., withdraw) that induced the effect.
We admit disjunction in types to let an effect variable range over
multiple operation names. The contract
$\forall (a : \tau_1 \lor \tau_2).\ \psi$ is just syntactic sugar for
$\forall a.\ (oper(a, \tau_1) \lor oper(a, \tau_2)) \Rightarrow \psi$.
An untyped effect variable ranges over all operation names.

$$
\begin{array}{rcl}
\multicolumn{3}{l}{\eta \in \mathtt{Effect} \qquad \psi \in \mathtt{Contract} \qquad \overline{\eta} \in \mathtt{Effect\ Set}} \\
A \in \mathtt{EffSoup} & ::= & \overline{\eta} \\
vis, so, sameobj \in \mathtt{Relations} & ::= & A \times A \\
E \in \mathtt{ExecState} & ::= & (A, vis, so, sameobj)
\end{array}
$$

Figure 5: Axiomatic execution.

Quantifier-free propositions in our contract language are conjunctions,
disjunctions and implications of predicates expressing
relations between pairs of effect variables. The syntactic class of
relations is seeded with primitive $vis$, $so$, and $sameobj$ relations,
and also admits derived relations that are expressible as union,
intersection, or transitive closure³ of primitive relations. Commonly
used derived relations are the same object session order
($soo = so \cap sameobj$), happens-before order ($hb = (so \cup vis)^{+}$)
and the same object happens-before order ($hbo = (soo \cup vis)^{+}$).
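The derived relations can be computed directly when an execution is finite. The following sketch assumes relations are represented as lists of pairs over integer effect identifiers; `tclosure` computes the least transitive closure, which is one admissible reading of $R^{+}$ (the contract language itself only requires a transitive superset of $R$). All function names are illustrative.

```haskell
import Data.List (intersect, nub, union)

type Eff = Int
type Rel = [(Eff, Eff)]

-- Least transitive closure of a finite relation: keep adding composed
-- pairs until a pass contributes nothing new.
tclosure :: Rel -> Rel
tclosure r
  | length r' == length r0 = r0
  | otherwise              = tclosure r'
  where
    r0 = nub r
    r' = nub (r0 ++ [(x, z) | (x, y) <- r0, (y', z) <- r0, y == y'])

-- Same object session order: soo = so ∩ sameobj.
soo :: Rel -> Rel -> Rel
soo so sameobj = so `intersect` sameobj

-- Happens-before order: hb = (so ∪ vis)+.
hb :: Rel -> Rel -> Rel
hb so vis = tclosure (so `union` vis)

-- Same object happens-before order: hbo = (soo ∪ vis)+.
hbo :: Rel -> Rel -> Rel -> Rel
hbo so sameobj vis = tclosure (soo so sameobj `union` vis)
```

For instance, with `so = [(1, 2)]` and `vis = [(2, 3)]`, `hb so vis` contains `(1, 3)`. If effects 1 and 2 are on different objects (so `sameobj` does not relate them), `hbo` drops the session-order edge and `(1, 3)` is no longer derivable.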
4.2 Semantics
QUELEA contracts are constraints over axiomatic definitions of
program executions. Figure 5 summarizes artifacts relevant to defining
an axiomatic execution. We formalize an axiomatic execution as a
tuple $(A, vis, so, sameobj)$, where $A$, called the effect soup, is the
set of all effects generated during the program execution, and
$vis, so, sameobj \subseteq A \times A$ are visibility, session order, and
same object relations, respectively, witnessed over generated effects at
run-time.
Note that the axiomatic definition of an execution ($E$) provides
interpretations for primitive relations (e.g., $vis$) that occur free in
contract formulas, and also fixes the domain of quantification to the set
of all effects ($A$) observed during the program execution. As such, $E$
is a potential model for any first-order formula ($\psi$) expressible in
our contract language. If $E$ is indeed a valid model for $\psi$ (written
as $E \models \psi$), we say that the execution $E$ satisfies the contract
$\psi$:
Definition 1. An axiomatic execution $E$ satisfies a contract $\psi$ if
and only if $E \models \psi$.
4.3 Capturing Store Semantics
An important aspect of our contract language is its ability to capture
store-level consistency guarantees, along with application-level
consistency requirements. Similar to [10], we can rigorously define
a wide variety of store semantics including those that combine any
subset of session and causality guarantees, and multiple consistency
levels. However, for our purposes, we identify three particular
consistency levels (eventual, causal, and strong) commonly offered
by many distributed stores with tunable consistency, with increasing
overhead in terms of latency and availability.
Eventual consistency: Eventually consistent operations can be
satisfied as long as the client can reach at least one replica. In
the bank account example, deposit is an eventually consistent
operation. While an ECDS typically offers basic eventual consistency
with all possible anomalies, we assume that our store provides stronger
semantics that remain highly available [2, 19]; the store always exposes
a causal cut of the updates. This semantics can be formally captured in
terms of the following contract definition:

$$\psi_{ec} = \forall a, b.\ hbo(a, b) \land vis(b, \hat{\eta}) \Rightarrow vis(a, \hat{\eta})$$

The above contract mandates that an effect $a$ must be visible to
the current effect $\hat{\eta}$ if some effect $b$ that causally succeeds
$a$ is also visible to $\hat{\eta}$. Thus, if every store replica always
maintains and exposes a causal cut of updates to the client, then such a
store will satisfy this contract. In such a system, an eventually
consistent operation, such as deposit, which requires weaker
guarantees than offered by the store, can be satisfied as long as
some replica is reachable.

Causal consistency: Causally consistent operations are required
to see a causally consistent snapshot of the object state, including
the actions performed on the same session. The latter requirement
implies that if two operations $o_1$ and $o_2$ from the same
session are applied to two different replicas $r_1$ and $r_2$, the second
operation cannot be discharged until the effect of $o_1$ is included
in $r_2$. The getBalance operation requires causal consistency,
as it requires operations from the same session to be visible,
which cannot be guaranteed under eventual consistency. The
corresponding store semantics is captured by the contract $\psi_{cc}$
defined below:

$$\psi_{cc} = \forall a.\ hbo(a, \hat{\eta}) \Rightarrow vis(a, \hat{\eta})$$

Strong consistency: Strongly consistent operations may block
indefinitely under network partitions. An example is the total-order
contract on the withdraw operation. The corresponding store
semantics is captured by the $\psi_{sc}$ contract definition:

$$\psi_{sc} = \forall a.\ sameobj(a, \hat{\eta}) \Rightarrow vis(a, \hat{\eta}) \lor vis(\hat{\eta}, a) \lor a = \hat{\eta}$$

³ Strictly speaking, $R^{+}$ is not the transitive closure of $R$, as
transitive closure is not expressible in FOL. Instead, $R^{+}$ in our
language denotes a superset of the transitive closure of $R$. Formally,
$R^{+}$ is any relation $R'$ such that for all $x$, $y$, and $z$,
a) $R(x, y) \Rightarrow R'(x, y)$, and
b) $R'(x, y) \land R'(y, z) \Rightarrow R'(x, z)$.

$$
\frac{\psi \le \psi_{sc}}{\mathrm{WellFormed}(\psi)}
\qquad
\frac{\psi \le \psi_{ec}}{\mathrm{EventuallyConsistent}(\psi)}
$$
$$
\frac{\psi \not\le \psi_{ec} \quad \psi \le \psi_{cc}}{\mathrm{CausallyConsistent}(\psi)}
\qquad
\frac{\psi \not\le \psi_{cc} \quad \psi \le \psi_{sc}}{\mathrm{StronglyConsistent}(\psi)}
$$

Figure 6: Contract classification.

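The three store-level contracts can also be read as predicates over a finite execution, which makes their relative strength easy to observe on an example. The sketch below is an illustration under assumptions: relations are explicit lists of pairs over integer effect identifiers, the `hbo` relation is supplied precomputed, and the names `Exec`, `psiEc`, `psiCc`, `psiSc` and `missingUpdate` are hypothetical. It evaluates contracts on one execution; it does not perform the prover-based classification described in the paper.

```haskell
type Eff = Int
type Rel = [(Eff, Eff)]

-- A finite execution; hbo is assumed to be precomputed from so, sameobj, vis.
data Exec = Exec
  { effs    :: [Eff]
  , vis     :: Rel
  , hbo     :: Rel
  , sameObj :: Rel
  }

implies :: Bool -> Bool -> Bool
implies p q = not p || q

-- Eventual consistency contract: the store exposes a causal cut.
psiEc :: Exec -> Eff -> Bool
psiEc ex eta = and
  [ ((a, b) `elem` hbo ex && (b, eta) `elem` vis ex)
      `implies` ((a, eta) `elem` vis ex)
  | a <- effs ex, b <- effs ex ]

-- Causal consistency contract: everything that happens before eta is visible.
psiCc :: Exec -> Eff -> Bool
psiCc ex eta = and
  [ ((a, eta) `elem` hbo ex) `implies` ((a, eta) `elem` vis ex)
  | a <- effs ex ]

-- Strong consistency contract: eta is ordered by visibility with every
-- effect on the same object.
psiSc :: Exec -> Eff -> Bool
psiSc ex eta = and
  [ ((a, eta) `elem` sameObj ex)
      `implies` ((a, eta) `elem` vis ex || (eta, a) `elem` vis ex || a == eta)
  | a <- effs ex ]

-- Figure 3(c)-style execution: effects 1, 2, 3 from one session on one
-- object; effect 3 sees effect 1 but not effect 2.
missingUpdate :: Exec
missingUpdate = Exec
  { effs    = [1, 2, 3]
  , vis     = [(1, 2), (1, 3)]
  , hbo     = [(1, 2), (2, 3), (1, 3)]
  , sameObj = [(x, y) | x <- [1, 2, 3], y <- [1, 2, 3]]
  }
```

At effect 3, `psiEc missingUpdate 3` holds while `psiCc missingUpdate 3` and `psiSc missingUpdate 3` do not, which is consistent with the observation that the missing update anomaly is admitted under the eventual level but ruled out by the causal one.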
4.4 Contract Classification
Our goal is to map application-level consistency constraints on
operations to appropriate store-level consistency guarantees capable
of satisfying these constraints. The ability to express both these
kinds of constraints as contracts in our contract language lets us
compare and determine if the contract ($\psi_{op}$) of an operation ($op$)
is weak enough to be satisfied under a store consistency level identified
by the contract $\psi_{st}$. Towards this end, we define a binary weaker
than relation for our contract language as follows:
Definition 2. A contract $\psi_{op}$ is said to be weaker than $\psi_{st}$
(written $\psi_{op} \le \psi_{st}$) if and only if
$\Delta \vdash \forall \hat{\eta}.\ \psi_{st} \Rightarrow \psi_{op}$.
The quantifier in the sequent binds $\hat{\eta}$ that occurs free in
$\psi_{st}$ and $\psi_{op}$. The context ($\Delta$) of the sequent is a
conjunction of assumptions about the nature of primitive relations. A
well-formed axiomatic execution ($E$) is expected to satisfy these
assumptions (i.e., $E \models \Delta$).
Definition 3. An axiomatic execution $E = (A, vis, so, sameobj)$ is
well-formed if the following axioms ($\Delta$) hold:
The happens-before relation is acyclic: $\forall a.\ \lnot hbo(a, a)$.
Visibility only relates actions on the same object:

References
Proceedings ArticleDOI
14 Oct 2007
TL;DR: D Dynamo is presented, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience and makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
Abstract: Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems.This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience. To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.

4,349 citations


"Declarative programming over eventu..." refers background in this paper

  • ..., negative balances in a bank account, or having an item appear in a shopping cart after it has been removed [13])....


Journal ArticleDOI
TL;DR: This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.
Abstract: A concurrent object is a data object shared by concurrent processes. Linearizability is a correctness condition for concurrent objects that exploits the semantics of abstract data types. It permits a high degree of concurrency, yet it permits programmers to specify and reason about concurrent objects using known techniques from the sequential domain. Linearizability provides the illusion that each operation applied by concurrent processes takes effect instantaneously at some point between its invocation and its response, implying that the meaning of a concurrent object's operations can be given by pre- and post-conditions. This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.

3,396 citations


"Declarative programming over eventu..." refers background in this paper

  • ...Traditionally programmers have relied on strong consistency guarantees such as linearizability [15] or serializability [22] in order to build correct applications....

    [...]

Proceedings ArticleDOI
Brian F. Cooper1, Adam Silberstein1, Erwin Tam1, Raghu Ramakrishnan1, Russell Sears1 
10 Jun 2010
TL;DR: This work presents the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems, and defines a core set of benchmarks and reports results for four widely used systems.
Abstract: While the use of MapReduce systems (such as Hadoop) for large scale data analysis has been widely recognized and studied, we have recently seen an explosion in the number of systems developed for cloud data serving. These newer systems address "cloud OLTP" applications, though they typically do not support ACID transactions. Examples of systems proposed for cloud serving use include BigTable, PNUTS, Cassandra, HBase, Azure, CouchDB, SimpleDB, Voldemort, and many others. Further, they are being applied to a diverse range of applications that differ considerably from traditional (e.g., TPC-C like) serving workloads. The number of emerging cloud serving systems and the wide range of proposed applications, coupled with a lack of apples-to-apples performance comparisons, makes it difficult to understand the tradeoffs between systems and the workloads for which they are suited. We present the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report results for four widely used systems: Cassandra, HBase, Yahoo!'s PNUTS, and a simple sharded MySQL implementation. We also hope to foster the development of additional cloud benchmark suites that represent other classes of applications by making our benchmark tool available via open source. In this regard, a key feature of the YCSB framework/tool is that it is extensible--it supports easy definition of new workloads, in addition to making it easy to benchmark new systems.

3,276 citations


"Declarative programming over eventu..." refers methods in this paper

  • ...Our client workload was generated using the YCSB benchmark [12]....

    [...]

Journal ArticleDOI
TL;DR: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure.
Abstract: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers). At this scale, small and large components fail continuously. The way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies therewith, Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. Cassandra system was designed to run on cheap commodity hardware and handle high write throughput while not sacrificing read efficiency.

2,870 citations


"Declarative programming over eventu..." refers background or methods in this paper

  • ...Operation-based RDTs have been widely studied in terms of their algorithmic properties [10, 25], and several systems utilize this model to construct distributed data structures [5, 17, 23]....

    [...]

  • ...For our performance evaluation, we deploy QUELEA applications in clusters, where each cluster is composed of five fully replicated Cassandra replicas within the same datacenter....

    [...]

  • ...The lease mechanism is implemented with the help of Cassandra’s support for conditional updates and expiring columns....

    [...]

  • ...Although an application might be naturally expressed in terms of well-understood and expressive data types such as maps, trees, queues, or graphs, geo-distributed stores typically only provide a minimal set of data types with in-built conflict resolution strategies such as last-writer-wins (LWW) registers, counters, and sets [17, 26]....

    [...]

  • ...Categories and Subject Descriptors: D.1.3 [Concurrent Programming]: Distributed Programming; C.2.4 [Distributed Systems]: Distributed databases; D.3.2 [Language Classifications]: Applicative (Functional) Languages; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs. General Terms: Languages, Performance. Keywords: Eventual Consistency, Availability, CRDTs, Axiomatic Contracts, Contract Classification, Distributed Transactions, SMT solvers, Decidable Logic, Quelea, Cassandra, Haskell. ∗ This work was done at Purdue University, USA....

    [...]

Journal ArticleDOI
TL;DR: In this paper, it is shown that it is impossible to achieve consistency, availability, and partition tolerance in the asynchronous network model, and then solutions to this dilemma in the partially synchronous model are discussed.
Abstract: When designing distributed web services, there are three properties that are commonly desired: consistency, availability, and partition tolerance. It is impossible to achieve all three. In this note, we prove this conjecture in the asynchronous network model, and then discuss solutions to this dilemma in the partially synchronous model.

1,456 citations

Frequently Asked Questions (7)
Q1. What are the contributions in "Declarative programming over eventually consistent data stores"?

In this paper, the authors present QUELEA, a declarative programming model for eventually consistent data stores (ECDS), equipped with a contract language capable of specifying fine-grained application-level consistency properties. The authors describe an implementation of QUELEA on top of an off-the-shelf ECDS that provides support for coordination-free transactions. Several benchmarks, including two large web applications, illustrate the effectiveness of their approach.

Quantifier-free propositions in their contract language are conjunctions, disjunctions and implications of predicates expressing relations between pairs of effect variables. 
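
The shape of such propositions can be sketched as a few combinators over binary relations. This is a minimal illustration, not QUELEA's actual contract language: the relation names `vis` and `so`, the helper names (`rel`, `conj`, `disj`, `implies`), and the dictionary encoding of an execution are all assumptions made for the example.

```python
from typing import Callable, Dict, Set, Tuple

# Assumed encoding: an execution is a set of named binary relations over
# effect ids. The relation names used below are illustrative only.
Execution = Dict[str, Set[Tuple[str, str]]]
Env = Dict[str, str]  # effect variable -> concrete effect id
Prop = Callable[[Execution, Env], bool]


def rel(name: str, a: str, b: str) -> Prop:
    """Atomic predicate: relation `name` relates effect variables a and b."""
    return lambda ex, env: (env[a], env[b]) in ex.get(name, set())


def conj(p: Prop, q: Prop) -> Prop:
    """Conjunction of two propositions."""
    return lambda ex, env: p(ex, env) and q(ex, env)


def disj(p: Prop, q: Prop) -> Prop:
    """Disjunction of two propositions."""
    return lambda ex, env: p(ex, env) or q(ex, env)


def implies(p: Prop, q: Prop) -> Prop:
    """Implication between two propositions."""
    return lambda ex, env: (not p(ex, env)) or q(ex, env)


# Hypothetical proposition over effect variables a, b, c:
#   (so(a, b) and vis(b, c)) => vis(a, c)
prop = implies(
    conj(rel("so", "a", "b"), rel("vis", "b", "c")),
    rel("vis", "a", "c"),
)

execution: Execution = {"so": {("e1", "e2")}, "vis": {("e2", "e3")}}
env: Env = {"a": "e1", "b": "e2", "c": "e3"}

holds_before = prop(execution, env)   # premise holds, vis(e1, e3) is missing
execution["vis"].add(("e1", "e3"))
holds_after = prop(execution, env)    # conclusion now holds as well
```

Evaluating against a concrete execution is only for illustration; it is a different use from the static contract analysis described in the abstract above.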

The authors can maintain a runtime manifestation of executions, and check well-formedness conditions at runtime, rolling back executions if they are ill-formed. 

The authors use 128 clients and a single QUELEA replica, with all clients operating on the same LWW register to stress test the summarization mechanism. 

The effect added to a particular replica is asynchronously sent to other replicas, and eventually merged into all other replicas. 
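
The propagation described above can be sketched with a toy in-memory model. This is a simplified illustration under stated assumptions, not QUELEA's implementation: the `Replica` class, its `outbox`, and the `propagate` function are hypothetical, and the replicated object is assumed to be a counter whose value is the sum of its effects.

```python
from collections import deque
from typing import Deque, List, Set, Tuple

Effect = Tuple[str, int]  # (unique effect id, increment amount)


class Replica:
    """Hypothetical replica holding a set of effects for one counter object."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.effects: Set[Effect] = set()
        self.outbox: Deque[Effect] = deque()  # effects not yet shipped
        self._next_id = 0

    def apply_local(self, amount: int) -> Effect:
        """Record an effect locally and queue it for asynchronous delivery."""
        effect = (f"{self.name}-{self._next_id}", amount)
        self._next_id += 1
        self.effects.add(effect)
        self.outbox.append(effect)
        return effect

    def merge(self, effect: Effect) -> None:
        """Merge a remote effect; set insertion makes redelivery harmless."""
        self.effects.add(effect)

    def value(self) -> int:
        """Counter value: the sum of all effects seen so far."""
        return sum(amount for _, amount in self.effects)


def propagate(replicas: List[Replica]) -> None:
    """Stand-in for the asynchronous channel: drain every outbox to all peers."""
    for src in replicas:
        while src.outbox:
            effect = src.outbox.popleft()
            for dst in replicas:
                if dst is not src:
                    dst.merge(effect)


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
r1.apply_local(5)
r2.apply_local(-2)
before = [r.value() for r in (r1, r2, r3)]  # replicas disagree at this point
propagate([r1, r2, r3])
after = [r.value() for r in (r1, r2, r3)]   # all replicas hold both effects
```

Until `propagate` runs, each replica answers from its own effect set, which is the window in which clients can observe divergent values.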

In a 2DC configuration (not shown here), the average latency of SC operations with 512 clients increased by 9.4× due to the cost of geo-distributed coordination, whereas QUELEA operations were only 2.2× slower, mainly due to the increased cost of withdraw operations. 

If the contract (ψop) of an operation (op) is weaker than a store contract (ψst), then constraints expressed by the former are implied by guarantees provided by the latter.
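
Written as a formula, the statement above reads as follows. The ordering symbol $\le$ for "weaker than" is notation assumed here for illustration:

```latex
\psi_{op} \le \psi_{st} \;\implies\; \left( \psi_{st} \Rightarrow \psi_{op} \right)
```

In other words, whenever the store's guarantee $\psi_{st}$ holds for an execution, the operation's requirement $\psi_{op}$ holds for it as well.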