

Replicated Data Types: Specification, Verification, Optimality
Sebastian Burckhardt (Microsoft Research)    Alexey Gotsman (IMDEA Software Institute)
Hongseok Yang (University of Oxford)    Marek Zawirski (INRIA & UPMC-LIP6)
Abstract
Geographically distributed systems often rely on replicated eventually consistent data stores to achieve availability and performance. To resolve conflicting updates at different replicas, researchers and practitioners have proposed specialized consistency protocols, called replicated data types, that implement objects such as registers, counters, sets or lists. Reasoning about replicated data types has however not been on par with comparable work on abstract data types and concurrent data types, lacking specifications, correctness proofs, and optimality results.
To fill in this gap, we propose a framework for specifying replicated data types using relations over events and verifying their implementations using replication-aware simulations. We apply it to 7 existing implementations of 4 data types with nontrivial conflict-resolution strategies and optimizations (last-writer-wins register, counter, multi-value register and observed-remove set). We also present a novel technique for obtaining lower bounds on the worst-case space overhead of data type implementations and use it to prove optimality of 4 implementations. Finally, we show how to specify consistency of replicated stores with multiple objects axiomatically, in analogy to prior work on weak memory models. Overall, our work provides foundational reasoning tools to support research on replicated eventually consistent stores.
Categories and Subject Descriptors D.2.4 [Software Engineer-
ing]: Software/Program Verification; F.3.1 [Logics and Meanings
of Programs]: Specifying and Verifying and Reasoning about Pro-
grams
Keywords Replication; eventual consistency; weak memory
1. Introduction
To achieve availability and scalability, many networked computing
systems rely on replicated stores, allowing multiple clients to issue
operations on shared data on a number of replicas, which commu-
nicate changes to each other using message passing. For example,
large-scale Internet services rely on geo-replication, which places
data replicas in geographically distinct locations, and applications
for mobile devices store replicas locally to support offline use. One
benefit of such architectures is that the replicas remain locally avail-
able to clients even when network connections fail. Unfortunately,
the famous CAP theorem [19] shows that such high Availability
and tolerance to network Partitions are incompatible with strong
Consistency, i.e., the illusion of a single centralized replica han-
dling all operations. For this reason, modern replicated stores often
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
POPL ’14, January 22–24, 2014, San Diego, CA, USA.
Copyright © 2014 ACM 978-1-4503-2544-8/14/01…$15.00.
http://dx.doi.org/10.1145/2535838.2535848
provide weaker forms of consistency, commonly dubbed eventual
consistency [36]. ‘Eventual’ usually refers to the guarantee that
    if clients stop issuing update requests, then the replicas will eventually reach a consistent state.    (1)
Eventual consistency is a hot research area, and new replicated
stores implementing it appear every year [1, 13, 16, 18, 23, 27,
33, 34, 37]. Unfortunately, their semantics is poorly understood:
the very term eventual consistency is a catch-all buzzword, and
different stores claiming to be eventually consistent actually pro-
vide subtly different guarantees. The property (1), which is a form
of quiescent consistency, is too weak to capture these. Although
it requires the replicas to converge to the same state eventually, it
doesn’t say which one it will be. Furthermore, (1) does not provide
any guarantees in realistic scenarios when updates never stop ar-
riving. The difficulty of reasoning about the behavior of eventually
consistent stores comes from a multitude of choices to be made in
their design, some of which we now explain.
Allowing the replicas to be temporarily inconsistent enables
eventually consistent stores to satisfy clients’ requests from the
local replica immediately, and broadcast the changes to the other
replicas only after the fact, when the network connection permits
this. However, this means that clients can concurrently issue con-
flicting operations on the same data item at different replicas; fur-
thermore, if the replicas are out-of-sync, these operations will be
applied to its copies in different states. For example, two users shar-
ing an online store account can write two different zip codes into
the delivery address; the same users connected to replicas with dif-
ferent views of the shopping cart can also add and concurrently
remove the same product. In such situations the store needs to en-
sure that, after the replicas exchange updates, the changes by dif-
ferent clients will be merged and all conflicts will be resolved in a
meaningful way. Furthermore, to ensure eventual consistency (1),
the conflict resolution has to be uniform across replicas, so that, in
the end, they converge to the same state.
The protocols achieving this are commonly encapsulated within
replicated data types [1, 10, 16, 18, 31, 33, 34] that implement ob-
jects, such as registers, counters, sets or lists, with various conflict-
resolution strategies. The strategies can be as simple as establishing
a total order on all operations using timestamps and letting the last
writer win, but can also be much more subtle. Thus, a data type
can detect the presence of a conflict and let the client deal with it:
e.g., the multi-value register used in Amazon’s Dynamo key-value
store [18] would return both conflicting zip codes in the above ex-
ample. A data type can also resolve the conflict in an application-
specific way. For example, the observed-remove set [7, 32] pro-
cesses concurrent operations trying to add and remove the same
element so that an add always wins, an outcome that may be appro-
priate for a shopping cart.
Replicated data type implementations are often nontrivial, since
they have to maintain not only client-observable object state, but
also metadata needed to detect and resolve conflicts and to han-
dle network failures. This makes reasoning about their behavior
challenging. The situation gets only worse if we consider multiple
replicated objects: in this case, asynchronous propagation of
updates between replicas may lead to counterintuitive behaviors—
anomalies, in database terminology. The following code illustrates
an anomaly happening in real replicated stores [1, 18]:
    Replica r_1:                  Replica r_2:
      x.wr(post)                    y.wr(comment)
      i = y.rd  // comment          j = x.rd  // empty
                                                           (2)
We have two clients reading from and writing to register objects x and y at two different replicas; i and j are client-local variables. The first client makes a post by writing to x at replica r_1 and then comments on the post by writing to y. After every write, replica r_1 might send a message with the update to replica r_2. If the messages carrying the writes of post to x and comment to y arrive at replica r_2 out of the order they were issued in, the second client can see the comment, but not the post. Different replicated stores may allow such an anomaly or not, and this has to be taken into account when reasoning about them.
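To make the reordering concrete, the following Python sketch (ours, not from the paper; the Replica class, message format and method names are purely illustrative) models two replicas holding copies of registers x and y and delivers the two update messages to the second replica in the opposite order from which they were issued, reproducing the anomaly in (2).

```python
class Replica:
    """A toy replica holding copies of named registers."""
    def __init__(self):
        self.regs = {"x": "empty", "y": "empty"}

    def wr(self, obj, value):
        self.regs[obj] = value
        return ("update", obj, value)   # message to broadcast later

    def rd(self, obj):
        return self.regs[obj]

    def apply(self, msg):
        _, obj, value = msg
        self.regs[obj] = value

r1, r2 = Replica(), Replica()

# Client at r1 posts and then comments.
m_post = r1.wr("x", "post")
m_comment = r1.wr("y", "comment")

# The network delivers the comment before the post at r2.
r2.apply(m_comment)

# Client at r2 now observes the anomaly from (2).
i = r2.rd("y")   # "comment"
j = r2.rd("x")   # "empty": the comment is visible but the post is not
print(i, j)

r2.apply(m_post)  # once the delayed message arrives, the replicas converge
```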
In this paper, we propose techniques for reasoning about even-
tually consistent replicated stores in the following three areas.
1. Specification. We propose a comprehensive framework for
specifying the semantics of replicated stores. Its key novel com-
ponent is replicated data type specifications (§3), which provide
the first way of specifying the semantics of replicated objects
with advanced conflict resolution declaratively, like abstract data
types [25]. We achieve this by defining the result of a data type
operation not by a function of states, but of operation contexts: sets of events affecting the result of the operation, together with some relationships between them. We show that our specifications are sufficiently flexible to handle data types representing a variety of conflict-resolution strategies: last-writer-wins register, counter, multi-value register and observed-remove set.
We then specify the semantics of a whole store with multiple
objects, possibly of different types, by consistency axioms (§7),
which constrain the way the store processes incoming requests in
the style of weak shared-memory models [2] and thus define the
anomalies allowed. As an illustration, we define consistency mod-
els used in existing replicated stores, including a weak form of
eventual consistency [1, 18] and different kinds of causal consis-
tency [23, 27, 33, 34]. We find that, when specialized to last-writer-
wins registers, these specifications are very close to fragments of
the C/C++ memory model [5]. Thus, our specification framework
generalizes axiomatic shared-memory models to replicated stores
with nontrivial conflict resolution.
2. Verification. We propose a method for proving the correctness
of replicated data type implementations with respect to our speci-
fications and apply it to seven existing implementations of the four
data types mentioned above, including those with nontrivial opti-
mizations. Reasoning about the implementations is difficult due to
the highly concurrent nature of a replicated store, with multiple
replicas simultaneously updating their object copies and exchang-
ing messages. We address this challenge by proposing replication-
aware simulations (§5). Like classical simulations from data refine-
ment [21], these associate a concrete state of an implementation
with its abstract description—structures on events, in our case. To
combat the complexity of replication, they consider the state of an
object at a single replica or a message in transit separately and as-
sociate it with abstract descriptions of only those events that led to
it. Verifying an implementation then requires only reasoning about
an instance of its code running at a single replica.
Here, however, we have to deal with another challenge: code at
a single replica can access both the state of an object and a message
at the same time, e.g., when updating the former upon receiving the
latter. To reason about such code, we often need to rely on cer-
tain agreement properties correlating the abstract descriptions of
the message and the object state. Establishing these properties re-
quires global reasoning. Fortunately, we find that agreement prop-
erties needed to prove realistic implementations depend only on ba-
sic facts about their messaging behavior and can thus be established
once for broad classes of data types. Then a particular implementa-
tion within such a class can be verified by reasoning purely locally.
By carefully structuring reasoning in this way, we achieve easy
and intuitive proofs of single data type implementations. We then
lift these results to stores with multiple objects of different types by
showing how consistency axioms can be proved given properties of
the transport layer and data type implementations (§7).
3. Optimality. Replicated data type designers strive to optimize
their implementations; knowing that one is optimal can help guide
such efforts in the most promising direction. However, proving
optimality is challenging, as it requires quantifying over all possible implementations satisfying the same specification.
For most data types we studied, the primary optimization target
is the size of the metadata needed to resolve conflicts or handle net-
work failures. To establish optimality of metadata size, we present
a novel method for proving lower bounds on the worst-case meta-
data overhead of replicated data types—the proportion of metadata
relative to the client-observable content. The main idea is to find a
large family of executions of an arbitrary correct implementation
such that, given the results of data type operations from a certain
fixed point in any of the executions, we can recover the previous
execution history. This implies that, across executions, the states at
this point are distinct and thus must have some minimal size.
Using our method, we prove that four of the implementations
we verified have an optimal worst-case metadata overhead among
all implementations satisfying the same specification. Two of these
(counter, last-writer-wins register) are well-known; one (optimized
observed-remove set [6]) is a recently proposed nontrivial opti-
mization; and one (optimized multi-value register) is a small im-
provement of a known implementation [33] that we discovered dur-
ing a failed attempt to prove optimality of the latter. We summarize
all the bounds we proved in Fig. 10.
We hope that the theoretical foundations we develop will help
in exploring the design space of replicated data types and replicated
eventually consistent stores in a systematic way.
2. Replicated Data Types
We now describe our formal model for replicated stores and intro-
duce replicated data type implementations, which implement op-
erations on a single object at a replica and the protocol used by
replicas to exchange updates to this object. Our formalism follows
closely the models used by replicated data type designers [33].
A replicated store is organized as a collection of named objects Obj = {x, y, z, . . .}. Each object is hosted at all replicas r, s ∈ ReplicaID. The sets of objects and replicas may be infinite, to model their dynamic creation. Clients interact with the store by performing operations on objects at a specified replica. Each object x ∈ Obj has a type τ = type(x) ∈ Type, whose type signature (Op_τ, Val_τ) determines the set of supported operations Op_τ (ranged over by o) and the set of their return values Val_τ (ranged over by a, b, c, d). We assume that a special value ⊥ ∈ Val_τ belongs to all sets Val_τ and is used for operations that return no value. For example, we can define a counter data type ctr and an integer register type intreg with operations for reading, incrementing or writing an integer a: Val_ctr = Val_intreg = ℤ ∪ {⊥}, Op_ctr = {rd, inc} and Op_intreg = {rd} ∪ {wr(a) | a ∈ ℤ}.
We also assume sets Message of messages (ranged over by m) and timestamps Timestamp (ranged over by t). For simplicity, we let timestamps be positive integers: Timestamp = ℕ₁.
DEFINITION 1. A replicated data type implementation for a data type τ is a tuple D_τ = (Σ, σ⃗₀, M, do, send, receive), where σ⃗₀ : ReplicaID → Σ, M ⊆ Message and
    do : Op_τ × Σ × Timestamp → Σ × Val_τ;
    send : Σ → Σ × M;    receive : Σ × M → Σ.
Figure 1. Illustrations of a concrete (a) and two abstract executions (b, c).
[Figure: (a) a concrete execution with events 1: x.inc and 2: send at replica r_1; 3: receive, 4: x.inc and 5: send at r_2; and 6: receive, 7: x.rd and 8: receive at r_3, with message deliveries shown as arrows (the late delivery to r_3 arrives only at event 8); (b) an abstract execution over events 1: x.inc, 4: x.inc and 7: x.rd: 1, with vis edges 1 → 4 and 4 → 7; (c) the same events with 7: x.rd: 2 and vis edges 1 → 4, 4 → 7 and 1 → 7.]
We denote a component of D_τ, such as do, by D_τ.do. A tuple D_τ defines the class of implementations of objects with type τ, meant to be instantiated for every such object in the store. Σ is the set of states (ranged over by σ) used to represent the current state of the object, including metadata, at a single replica. The initial state at every replica is given by σ⃗₀.
D_τ provides three methods that the rest of the store implementation can call at a given replica; we assume that these methods execute atomically. We visualize store executions resulting from repeated calls to the methods as in Fig. 1(a), by arranging the calls on several vertical timelines corresponding to replicas at which they occur and denoting the delivery of messages by diagonal arrows. In §4, we formalize them as sequences of transitions called concrete executions and define the store semantics by their sets; the intuition given by Fig. 1(a) should suffice for the following discussion.
A client request to perform an operation o ∈ Op_τ triggers the call do(o, σ, t) (e.g., event 1 in Fig. 1(a)). This takes the current state σ ∈ Σ of the object at the replica where the request is issued and a timestamp t ∈ Timestamp provided by the rest of the store implementation and produces the updated object state and the return value of the operation. The data type implementation can use the timestamp provided, e.g., to implement the last-writer-wins conflict-resolution strategy mentioned in §1, but is free to ignore it.
Nondeterministically, in moments when the network is able to
accept messages, a replica calls send. Given the current state of the
object at the replica, send produces a message in M to broadcast to
all other replicas (event 2 in Fig. 1(a)); sometimes send also alters
the state of the object. Using broadcast rather than point-to-point
communication does not limit generality, since we can always tag
messages with the intended receiver. Another replica that receives
the message generated by send calls receive to merge the enclosed
update into its copy of the object state (event 3 in Fig. 1(a)).
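As a reading aid, the signature from Definition 1 can be transcribed into code. The following Python sketch is ours rather than the paper's; the class and method names mirror do, send and receive, σ⃗₀ is modelled as an initial method taking a replica identifier, and None plays the role of ⊥.

```python
from typing import Generic, Tuple, TypeVar

State = TypeVar("State")   # Σ: replica-local object state, including metadata
Msg = TypeVar("Msg")       # M: messages exchanged between replicas
Op = TypeVar("Op")         # Op_τ: supported operations
Val = TypeVar("Val")       # Val_τ: return values (None stands for ⊥)

class DataType(Generic[State, Msg, Op, Val]):
    """Transcription of D_τ = (Σ, σ⃗₀, M, do, send, receive)."""

    def initial(self, replica_id: str) -> State:
        """σ⃗₀: the initial state at the given replica."""
        raise NotImplementedError

    def do(self, op: Op, state: State, timestamp: int) -> Tuple[State, Val]:
        """Apply a client operation; return the new state and the return value."""
        raise NotImplementedError

    def send(self, state: State) -> Tuple[State, Msg]:
        """Produce a message to broadcast; may also update the state."""
        raise NotImplementedError

    def receive(self, state: State, msg: Msg) -> State:
        """Merge an incoming message into the local state."""
        raise NotImplementedError
```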
We now reproduce three replicated data type implementations due to Shapiro et al. [33]. They fall into two categories: in op-based implementations, each message carries a description of the latest operations that the sender has performed, and in state-based implementations, a description of all operations it knows about.
Op-based counter (ctr). Fig. 2(a) shows an implementation of the ctr data type. A replica stores a pair ⟨a, d⟩, where a is the current value of the counter, and d is the number of increments performed since the last broadcast (we use angle brackets for tuples representing states and messages). The send method returns d and resets it; the receive method adds the content of the message to a. This implementation is correct, as long as each message is delivered exactly once (we show how to prove this in §5). Since inc operations commute, they never conflict: applying them in different orders at different replicas yields the same final state.
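A direct transcription of Fig. 2(a) into Python may help; this sketch is ours and, like the original, assumes that each message is delivered exactly once (None stands for ⊥).

```python
class OpBasedCounter:
    """Fig. 2(a): state is (a, d) = (counter value, increments since last broadcast)."""

    def initial(self, replica_id):
        return (0, 0)

    def do(self, op, state, timestamp):
        a, d = state
        if op == "rd":
            return (a, d), a
        if op == "inc":
            return (a + 1, d + 1), None
        raise ValueError(op)

    def send(self, state):
        a, d = state
        return (a, 0), d        # broadcast the pending increments, then reset d

    def receive(self, state, msg):
        a, d = state
        return (a + msg, d)     # add the sender's increments (exactly-once delivery assumed)
```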
State-based counter (ctr). The implementation in Fig. 2(b) summarizes the currently known history by recording the contribution of every replica to the counter value separately (reminiscent of vector clocks [29]).
Figure 2. Three replicated data type implementations

(a) Op-based counter (ctr)
    Σ = ℕ₀ × ℕ₀    M = ℕ₀    σ⃗₀ = λr. ⟨0, 0⟩
    do(rd, ⟨a, d⟩, t) = (⟨a, d⟩, a)
    do(inc, ⟨a, d⟩, t) = (⟨a + 1, d + 1⟩, ⊥)
    send(⟨a, d⟩) = (⟨a, 0⟩, d)
    receive(⟨a, d⟩, d′) = ⟨a + d′, d⟩

(b) State-based counter (ctr)
    Σ = ReplicaID × (ReplicaID → ℕ₀)    σ⃗₀ = λr. ⟨r, λs. 0⟩    M = ReplicaID → ℕ₀
    do(rd, ⟨r, v⟩, t) = (⟨r, v⟩, ∑{v(s) | s ∈ ReplicaID})
    do(inc, ⟨r, v⟩, t) = (⟨r, v[r ↦ v(r) + 1]⟩, ⊥)
    send(⟨r, v⟩) = (⟨r, v⟩, v)
    receive(⟨r, v⟩, v′) = ⟨r, λs. max{v(s), v′(s)}⟩

(c) State-based last-writer-wins register (intreg)
    Σ = ℤ × (Timestamp ∪ {0})    σ⃗₀ = λr. ⟨0, 0⟩    M = Σ
    do(rd, ⟨a, t⟩, t′) = (⟨a, t⟩, a)
    do(wr(a′), ⟨a, t⟩, t′) = if t < t′ then (⟨a′, t′⟩, ⊥) else (⟨a, t⟩, ⊥)
    send(⟨a, t⟩) = (⟨a, t⟩, ⟨a, t⟩)
    receive(⟨a, t⟩, ⟨a′, t′⟩) = if t < t′ then ⟨a′, t′⟩ else ⟨a, t⟩
A replica stores its identifier r and a vector v such that for each replica s the entry v(s) gives the number of increments made by clients at s that have been received by r. A rd operation returns the sum of all entries in the vector. An inc operation increments the entry for the current replica. We denote by v[i ↦ j] the function that has the same value as v everywhere, except for i, where it has the value j. The send method returns the vector, and the receive method takes the maximum of each entry in the vectors v and v′ given to it. This is correct because an entry for s in either vector reflects a prefix of the sequence of increments done at replica s. Hence, we know that min{v(s), v′(s)} increments by s are taken into account both in v(s) and in v′(s).
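A corresponding Python sketch of Fig. 2(b) (ours; a dictionary plays the role of the vector v, with absent entries read as 0):

```python
class StateBasedCounter:
    """Fig. 2(b): state is (r, v), where v maps each replica to its increment count."""

    def initial(self, replica_id):
        return (replica_id, {})   # missing entries are implicitly 0

    def do(self, op, state, timestamp):
        r, v = state
        if op == "rd":
            return (r, v), sum(v.values())
        if op == "inc":
            v2 = dict(v)
            v2[r] = v2.get(r, 0) + 1
            return (r, v2), None
        raise ValueError(op)

    def send(self, state):
        r, v = state
        return (r, v), dict(v)    # the message is the whole vector

    def receive(self, state, msg):
        r, v = state
        merged = {s: max(v.get(s, 0), msg.get(s, 0)) for s in set(v) | set(msg)}
        return (r, merged)        # entry-wise maximum: commutative and idempotent
```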
State-based last-writer-wins (LWW) register (intreg). Unlike counters, registers have update operations that are not commutative. To resolve conflicts, the implementation in Fig. 2(c) uses the last-writer-wins strategy, creating a total order on writes by associating a unique timestamp with each of them. A state contains the current value, returned by rd, and the timestamp at which it was written (initially, we have 0 instead of a timestamp). A wr(a′) compares its timestamp t′ with the timestamp t of the current value a and sets the value to the one with the highest timestamp. Note that here we have to allow for t′ < t, since we do not make any assumptions about timestamps apart from uniqueness: e.g., the rest of the store implementation can compute them using physical or Lamport clocks [22]. We show how to state assumptions about timestamps in §4. The send method just returns the state, and the receive method chooses the winning value by comparing the timestamps in the current state and the message, like wr.
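A Python sketch of Fig. 2(c) (ours; operations are encoded as "rd" or ("wr", a), timestamps are assumed unique, and None stands for ⊥):

```python
class LwwRegister:
    """Fig. 2(c): state is (a, t) = (current value, timestamp of the write that produced it)."""

    def initial(self, replica_id):
        return (0, 0)             # 0 stands in for "no write yet"

    def do(self, op, state, timestamp):
        a, t = state
        if op == "rd":
            return (a, t), a
        if isinstance(op, tuple) and op[0] == "wr":
            a_new = op[1]
            # keep the value written with the highest timestamp
            return ((a_new, timestamp) if t < timestamp else (a, t)), None
        raise ValueError(op)

    def send(self, state):
        return state, state       # the message is the whole state

    def receive(self, state, msg):
        return msg if state[1] < msg[1] else state
```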
State-based vs. op-based. State-based implementations converge to a consistent state faster than op-based implementations because they are transitively delivering, meaning that they can propagate updates indirectly. For example, when using the counter in Fig. 2(b), in the execution in Fig. 1(a) the read at r_3 (event 7) returns 2, even though the message from r_1 has not arrived yet, because r_3 learns about r_1's update via r_2. State-based implementations are also resilient against transport failures like message loss, reordering, or duplication. Op-based implementations require the replicated store using them to mask such failures (e.g., using message sequence numbers, retransmission buffers, or reorder buffers).
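The resilience claim can be exercised on the entry-wise maximum merge used by the state-based counter: duplicated or reordered deliveries lead to the same state. A small illustrative check (ours; the message contents follow the execution in Fig. 1(a) and the outcome in Fig. 1(c)):

```python
# Entry-wise maximum merge used by the state-based counter (Fig. 2(b)).
def merge(v, w):
    return {s: max(v.get(s, 0), w.get(s, 0)) for s in set(v) | set(w)}

v_r3 = {}                      # vector held at replica r3 before any delivery
m_r1 = {"r1": 1}               # message carrying r1's increment
m_r2 = {"r1": 1, "r2": 1}      # message from r2, which already knows about r1's increment

# Reordered and duplicated deliveries all lead to the same state:
a = merge(merge(v_r3, m_r2), m_r1)
b = merge(merge(merge(v_r3, m_r1), m_r2), m_r2)
assert a == b == {"r1": 1, "r2": 1}
assert sum(a.values()) == 2    # the read at r3 returns 2, as in Fig. 1(c)
```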

The potential weakness of state-based implementations is the
size of states and messages, which motivates our examination of
space optimality in §6. For example, we show that the counter
in Fig. 2(b) is optimal, meaning that no counter implementation
satisfying the same requirements (transitive delivery and resilience
against message loss, reordering, and duplication) can do better.
3. Specifying Replicated Data Types and Stores
Consider the concrete execution in Fig. 1(a). What are valid return
values for the read in event 7? Intuitively, 1 or 2 can be justifiable,
but not 100. We now present a framework for specifying the ex-
pected outcome declaratively, without referring to implementation
details. For example, we give a specification of a replicated counter
that is satisfied by both implementations in Fig. 2(a, b).
In presenting the framework, we rely on the intuitive under-
standing of the way a replicated store executes given in §2. Later we
define the store semantics formally (§4), which lets us state what it
means for a store to satisfy our specifications (§4 and §7).
3.1 Abstract Executions and Specification Structure
We define our specifications on abstract executions, which in-
clude only user-visible events (corresponding to do calls) and
describe the other information about the store processing in an
implementation-independent form. Informally, we consider a con-
crete execution correct if it can be justified by an abstract execution
satisfying the specifications that is “similar” to it and, in particular,
has the same operations and return values.
Abstract executions are inspired by axiomatic definitions of
weak shared-memory models [2]. In particular, we use their pre-
viously proposed reformulation with visibility and arbitration rela-
tions [13], which are similar to the reads-from and coherence rela-
tions from weak shared-memory models. We provide a comparison
with shared-memory models in §7 and with [13] in §8.
DEFINITION 2. An abstract execution is a tuple
A = (E, repl, obj, oper, rval, ro, vis, ar), where
- E ⊆ Event is a set of events from a countable universe Event;
- each event e ∈ E describes a replica repl(e) ∈ ReplicaID performing an operation oper(e) ∈ Op_type(obj(e)) on an object obj(e) ∈ Obj, which returns the value rval(e) ∈ Val_type(obj(e));
- ro ⊆ E × E is a replica order, which is a union of transitive, irreflexive and total orders on events at each replica;
- vis ⊆ E × E is an acyclic visibility relation such that ∀e, f ∈ E. e →vis f ⟹ obj(e) = obj(f);
- ar ⊆ E × E is an arbitration relation, which is a union of transitive, irreflexive and total orders on events on each object.
We also require that ro, vis and ar be well-founded.
In the following, we denote components of A and similar structures as in A.repl. We also use (e, f) ∈ r and e →r f interchangeably.
Informally, e →vis f means that f is aware of e and thus e's effect can influence f's return value. In implementation terms, this may be the case if the update performed by e has been delivered to the replica performing f before f is issued. The exact meaning of "delivered", however, depends on how much information messages carry in the implementation. For example, as we explain in §3.2, the return value of a read from a counter is equal to the number of inc operations visible to it. Then, as we formalize in §4, the abstract execution illustrated in Fig. 1(b) justifies the op-based implementation in Fig. 2(a) reading 1 in the concrete execution in Fig. 1(a). The abstract execution in Fig. 1(c) justifies the state-based implementation in Fig. 2(b) reading 2 due to transitive delivery (§2). There is no abstract execution that would justify reading 100.
The ar relation represents the ordering information provided by the store, e.g., via timestamps. The following abstract execution corresponds to a variant of the anomaly (2).
[Figure: events x.wr(empty) →ro x.wr(post) →ro y.wr(comment) at one replica and y.rd: comment →ro x.rd: empty at another; vis edges make y.wr(comment) visible to the read of y and x.wr(empty) visible to the read of x; an ar edge orders x.wr(empty) before x.wr(post).]
The ar edge means that any replica that sees both writes to x should assume that post overwrites empty.
We give a store specification by two components, constraining
abstract executions:
1. Replicated data type specifications determine return values of
operations in an abstract execution in terms of its vis and ar rela-
tions, and thus define conflict-resolution policies for individual
objects in the store. The specifications are the key novel compo-
nent of our framework, and we discuss them next.
2. Consistency axioms constrain vis and ar and thereby disallow
anomalies and extend the semantics of individual objects to that
of the entire store. We defer their discussion to §7. See Fig. 13 for
their flavor; in particular, COCV prohibits the anomaly above.
Each of these components can be varied separately, and our spec-
ifications will define the semantics of any possible combination.
Given a specification of a store, we can determine whether a set
of events can be observed by its users by checking if there is an
abstract execution with this set of events satisfying the data type
specifications and consistency axioms.
3.2 Replicated Data Type Specifications
In a sequential setting, the semantics of a data type τ can be specified by a function S_τ : Op⁺_τ → Val_τ, which, given a nonempty sequence of operations performed on an object, specifies the return value of the last operation. For a register, read operations return the value of the last preceding write, or zero if there is no prior write. For a counter, read operations return the number of preceding increments. Thus, for any sequence of operations ξ:
    S_intreg(ξ rd) = a, if wr(0) ξ = ξ₁ wr(a) ξ₂ and ξ₂ does not contain wr operations;
    S_ctr(ξ rd) = (the number of inc operations in ξ);
    S_intreg(ξ wr(a)) = S_ctr(ξ inc) = ⊥.
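These sequential specifications translate directly into code. The following Python sketch is ours; an operation sequence is a list whose last element is the operation being specified, and None stands for ⊥.

```python
def S_ctr(ops):
    """Return value of the last operation in ops for a sequential counter."""
    *prefix, last = ops
    if last == "rd":
        return sum(1 for o in prefix if o == "inc")
    return None                              # inc returns ⊥

def S_intreg(ops):
    """Return value of the last operation in ops for a sequential register."""
    *prefix, last = ops
    if last == "rd":
        writes = [o[1] for o in prefix if isinstance(o, tuple) and o[0] == "wr"]
        return writes[-1] if writes else 0   # last preceding write, or zero
    return None                              # wr(a) returns ⊥

assert S_ctr(["inc", "inc", "rd"]) == 2
assert S_intreg([("wr", 5), ("wr", 7), "rd"]) == 7
```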
In a replicated store, the story is more interesting. We specify a data type τ by a function F_τ, generalizing S_τ. Just like S_τ, this determines the return value of an operation based on prior operations performed on the object. However, F_τ takes as a parameter not a sequence, but an operation context, which includes all we need to know about a store execution to determine the return value of a given operation o: the set E of all events that are visible to o, together with the operations performed by the events and visibility and arbitration relations on them.
DEFINITION 3. An operation context for a data type τ is a tuple L = (o, E, oper, vis, ar), where o ∈ Op_τ, E is a finite subset of Event, oper : E → Op_τ, vis ⊆ E × E is acyclic and ar ⊆ E × E is transitive, irreflexive and total.
We can extract the context of an event e ∈ A.E in an abstract execution A by selecting all events visible to it according to A.vis:
    ctxt(A, e) = (A.oper(e), G, (A.oper)|_G, (A.vis)|_G, (A.ar)|_G),
where G = (A.vis)⁻¹(e) and ·|_G is the restriction to events in G. Thus, in the abstract execution in Fig. 1(b), the operation context of the read from x includes only one increment event; in the execution in Fig. 1(c) it includes two.
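The extraction of contexts amounts to restricting the execution to the vis-predecessors of an event. A small Python sketch (ours; an abstract execution is represented only by the fields needed here, with vis and ar as sets of event pairs):

```python
def ctxt(oper, vis, ar, e):
    """Operation context of event e: (operation, visible events, oper|G, vis|G, ar|G).

    oper maps events to operations; vis and ar are sets of (event, event) pairs.
    """
    G = {f for (f, g) in vis if g == e}            # G = vis⁻¹(e)
    oper_G = {f: oper[f] for f in G}
    vis_G = {(f, g) for (f, g) in vis if f in G and g in G}
    ar_G = {(f, g) for (f, g) in ar if f in G and g in G}
    return oper[e], G, oper_G, vis_G, ar_G

# The execution of Fig. 1(b): both increments happened, but only event 4 is visible to the read.
oper = {1: "inc", 4: "inc", 7: "rd"}
vis = {(1, 4), (4, 7)}
ar = set()
print(ctxt(oper, vis, ar, 7))   # the context of the read contains the single event 4
```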
DEFINITION 4. A replicated data type specification for a type τ is a function F_τ that, given an operation context L for τ, specifies a return value F_τ(L) ∈ Val_τ.
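For instance, the counter specification sketched in §3.1, where a read returns the number of visible increments, can be phrased against Definition 4 as follows. This Python sketch is ours and only anticipates the formal definition of F_ctr given later in the paper.

```python
def F_ctr(context):
    """Counter specification: a read returns the number of visible inc events."""
    op, E, oper, vis, ar = context
    if op == "rd":
        return sum(1 for e in E if oper[e] == "inc")
    return None   # inc returns ⊥

# Contexts of the read in Fig. 1(b) and Fig. 1(c): one and two visible increments.
assert F_ctr(("rd", {4}, {4: "inc"}, set(), set())) == 1
assert F_ctr(("rd", {1, 4}, {1: "inc", 4: "inc"}, {(1, 4)}, {(1, 4)})) == 2
```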

References
Lamport, L. Time, clocks, and the ordering of events in a distributed system.
DeCandia, G., et al. Dynamo: Amazon's highly available key-value store.
Gilbert, S., Lynch, N. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services.
Mattern, F. Virtual time and global states of distributed systems.