What Do High-Level Memory Models Mean for Transactions?
Dan Grossman
University of Washington
djg@cs.washington.edu
Jeremy Manson
Purdue University
jmanson@cs.purdue.edu
William Pugh
University of Maryland, College Park
pugh@cs.umd.edu
Abstract
Many people have proposed adding transactions, or atomic blocks,
to type-safe high-level programming languages. However, re-
searchers have not considered the semantics of transactions with
respect to a memory model weaker than sequential consistency. The
details of such semantics are more subtle than many people realize,
and the interaction between compiler transformations and trans-
actions could produce behaviors that many people find surprising.
A language’s memory model, which determines these interactions,
must clearly indicate which behaviors are legal, and which are not.
These design decisions affect both the idioms that are useful for
designing concurrent software and the compiler transformations
that are legal within the language.
Cases where semantics are more subtle than people expect in-
clude the actual meaning of both strong and weak atomicity; correct
idioms for thread-safe lazy initialization; compiler transformations
of transactions that touch only thread-local memory; and whether
there is a well-defined notion for transactions that corresponds to
the notion of correct and incorrect use of synchronization in Java.
Open questions for a high-level memory model that includes trans-
actions involve both issues of isolation and ordering.
1. Introduction
1.1 Background and Motivation
With multiprocessors, multi-core architectures and multithreaded
programming becoming widespread, shared-memory semantics
and synchronization primitives are affecting many more program-
mers, particularly users of high-level type-safe languages such as
Java. Such languages need precise definitions that balance seman-
tic rigor, ease-of-use, and efficient implementation on a variety of
available hardware.
Java’s shared-memory semantics (i.e., its memory model)
[MPA05] is a state-of-the-art example of the importance and dif-
ficulty of such definitions. The easy-to-describe memory model
of sequential consistency is untenable for modern compilers and
runtime environments, even if we assume sequentially consistent
hardware (which we cannot). It is debatable whether sequential
consistency is easy to use, since its availability encourages pro-
grammers to reduce the explicit synchronization in their programs;
even if we assume the unsynchronized code is correct (which it of-
ten is not for reasons of mutual exclusion), the removal of explicit
synchronization can encourage later refactorings which will not be
robust in a multithreaded environment.
Equally untenable is making the meaning of incorrectly syn-
chronized code “completely implementation defined”, which would
sacrifice Java’s safety and security guarantees in the presence of
data races. Programs with concurrency errors can be likened to pro-
grams with buffer overflows: if arbitrary results are allowed, then
bugs become less predictable and programs become more vulnera-
ble to external attack.
The Java Memory Model (developed by two of the authors with
broad community input) is carefully constructed so that properly
synchronized code behaves in a sequentially consistent manner and
other code has enough meaning to preserve safety, but not so much
as to prevent efficient optimization. Any changes to concurrency
in Java (or another language that has well defined multithreaded
semantics) must consider its effect on the memory model.
The building blocks of Java synchronization are mutual exclu-
sion locks, condition variables, and non-blocking synchronization
operations (via volatile/atomic variables). These mechanisms are
pervasive primitives for concurrent programming, and programmers have long found them difficult to use: under-use causes races, and over-use causes poor performance or even deadlock. Researchers and
programming language designers have been constantly on the look-
out for new and better ways of describing concurrency, but none has
been adopted in a mainstream language.
Recently, researchers (including one of the authors) have pro-
posed complementing or replacing locks in programming lan-
guages with a transactional model. Most of these proposals present a relatively clear model, similar to that of a Conditional
Critical Region (CCR) [Hoa02]. Conditional critical regions pro-
vide a way to write multiple statements so that they appear to occur
atomically to the entire system: either all of the CCR is guaran-
teed to have executed, or none of it will have. On this level, the
semantics of these CCRs, or atomic blocks, is very simple.
Much work has been done to date in getting transactional solu-
tions to work efficiently and effectively in a programming language
[HF03, FQ03, CCM+06, HPST06, ATLM+06, RG05]. However,
only cursory attention has been paid to the detailed semantics of
transactional concurrency, and how it interacts with the code trans-
formations and optimizations.
In this paper, we raise a litany of difficult questions about how
the semantics of atomic blocks interacts with the semantics of a
relaxed memory model. Many of these questions arose as we con-
sidered different ways to formalize the semantics of transactions,
but we believe the important and lasting contribution of this pa-
per is the questions it asks and the sample test cases it provides. We believe that
any complete semantics of transactions in a programming language
must be able to address whether the behaviors described in this pa-
per are allowed. To our knowledge, there has been no prior inves-
tigation of these issues (or even acknowledgment of the problem),
but resolving them is crucial for making transactions “ready for the
real world”. Adapting prior work on memory models to incorporate
transactions will also raise many interesting intellectual questions.
1.2 Programmers and Implementors
The interaction between the memory model and transactional
mechanisms affects two key constituencies. Programmers must
know what their programs mean and whether a program is correctly
synchronized. Language implementors must know what synchro-
nization barriers must be provided and what compiler optimizations
are legal.
Simple answers to relevant questions are often initially appeal-
ing but can fail to satisfy either group. For example, consider the
seemingly innocuous concept that “all thread-shared mutable mem-
ory must be accessed only within transactions”. This approach pre-
cludes programmers from using idioms generally accepted as cor-
rect, such as passing a mutable object between threads via a queue
(where synchronization occurs by accessing the queue within trans-
actions, but the passed object can be accessed outside of a transac-
tion).
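To make the precluded idiom concrete, consider the following sketch, written in the style of the figures later in this paper (the Buffer type, the queue variable, and the atomic-block syntax are only illustrative): the queue is accessed only inside transactions, while the handed-off object is accessed outside them.

Thread 1:
  Buffer b = new Buffer();
  b.fill();                  // object created and filled outside any transaction
  atomic { queue.add(b); }   // only the queue is touched inside a transaction

Thread 2:
  Buffer r = null;
  atomic { r = queue.poll(); }
  if (r != null)
    r.use();                 // the handed-off object is used outside any transaction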
Such an approach also leaves undefined what the implementa-
tion can do if the correct sharing policy is violated. Under a weak
model, could a compiler transform
atomic{ if (y == 1) x++;}
to the following?
atomic{ x++; if (y != 1) x--;}
Boehm [Boe05] noted this as an issue for compiler transformations
and multithreading in C and C++. Allowing this transformation
would mean that a concurrent read by another thread could see the
value of x incremented, regardless of whether y was equal to 1. A
language definition must dictate whether such behavior is allowed.
1.3 Key Questions
The specific questions we raise fall into two categories. Questions
of isolation ask what happens when one thread can observe partial
effects of another thread’s transaction (if ever) and what constraints
(if any) that places on the implementation of transactions. More dif-
ficult are questions of ordering, which ask what happens when one
thread can observe effects of another thread out of order. As a syn-
chronization primitive, we expect the correct use of atomic blocks
to reduce such unexpected behavior. (Ideally, we could define such
“correct use” and ensure sequential consistency in such a case.) Un-
fortunately, it is not clear when a pair of transactions should obey
the kinds of ordering relationships that Java synchronized blocks
using the same monitor do. We present several possibilities and examples, showing that none of them is ideal.
2. Isolation
The most essential property of an atomic block is that it appears
to execute all-at-once. Informally, that means it must appear to
other threads as though the thread executing an atomic block does
the entire computation (including all memory reads and writes) at
a single point in time. To understand the implications of this for the memory model, we first review the relevant notion of actions in the Java Memory Model [MPA05], then extend this notion for
both strong and weak atomicity, and then discuss some unexpected
implications.
2.1 Actions in the Java Memory Model
A Java thread executes code in program order by performing a se-
quence of actions. For present purposes, we can view these actions
as reading memory locations, writing memory locations, and per-
forming synchronization primitives (which we can extend to in-
clude starting and committing an atomic block).
Thread 1                 Thread 2
atomic {                 x = 1;
  x = 0;
  if (x == 1)
    y = 1;
}

Can y == 1?

Figure 1. Simple example: Strong atomicity cannot see conflicting writes
In some sense, Java already extends the notion of atomicity
to single variables. Every concurrent language has some built-in
notion of atomicity. For example, at the bit-level, it is impossible to
see a partial write: a bit must either be 1 or 0. In Java, writes of
32-bit values are atomic; if one thread writes a 32-bit value, other
threads must either see all of the write or none of it. Whether some
threads can see the write “before” other threads is a question of
ordering, deferred to the next section.
More formally, reads and writes are atomic actions that are de-
scribed by (1) the thread executing them, (2) the variable accessed,
and (3) a unique identifier. Read actions also include “which write”
is observed. For read action r, we write W(r) to denote the unique
identifier of the write action that produces the observed value. Im-
plicit in these formal definitions is the atomicity and isolation of
accesses to individual variables.
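As a minimal illustration of this bookkeeping (a sketch only, using names of our own choosing rather than the formalism of [MPA05]), actions and W(r) might be encoded as follows:

final class Actions {
    // A write is identified by a unique id, the executing thread, the variable, and the value written.
    record Write(long id, int thread, String variable, int value) {}
    // A read additionally records which write it observed.
    record Read(long id, int thread, String variable, Write observed) {}

    // W(r): the unique identifier of the write whose value the read observes.
    static long W(Read r) { return r.observed().id(); }
}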
2.2 Strong vs. Weak Atomicity
Following Blundell et al. [BLM05], we distinguish strong atomicity
(an atomic block is isolated from all other computation) from
weak atomicity (an atomic block is isolated only from other atomic
blocks).
To see the difference between strong and weak atomicity infor-
mally, consider Figure 1. There is no atomic block protecting the
access of x in Thread 2. Under strong atomicity, Thread 1 cannot
execute y=1, since it must appear that no computation occurs be-
tween its write of 0 to x and its subsequent read. More formally,
strong atomicity provides no flexibility on W(r) within a transaction. [1]
A key software-engineering advantage of atomicity given this
informal definition is that sequential reasoning inside an atomic
block is sound. For example, we can argue that changing Thread 1
to atomic{x=0;} neither adds to nor subtracts from the observable
behaviors of the program.
With weak atomicity, Thread 1 is allowed to write 1 to y.
Whether this could actually occur depends on how atomic blocks
are implemented (particularly how transaction-local logs are used,
if at all). That is, it depends on low-level details that do not belong
in the language definition.
To provide a slightly more formal way of characterizing the be-
havior of strong and weak atomicity, we note that strong atomicity
effectively gives three separate guarantees. Each of these guaran-
tees can be removed; a precise definition of weak atomicity must
specify which are removed.
The first question asks whether multiple accesses to the same
variable in the same transaction can return the value of different
writes, if no write to that variable intervened. For example, con-
sider Figure 2. Under Java’s semantics, if the atomic block were
replaced by a synchronized block, the stated result of r1 == 1,
r2 == 2 would be allowed. Strong atomicity does not allow this,
because such a result prevents sequential reasoning: variables cannot be seen to change part of the way through a transaction, as that would make it impossible to reason about the code in isolation.
[1] Transactions with internal parallelism (i.e., a transaction that spawns threads that share memory) could reintroduce nondeterminism in W(r).

Initially, x = 0
Thread 1 Thread 2 Thread 3
atomic { x = 1; x = 2;
r1 = x;
r2 = x;
}
Can r1 == 1, r2 == 2?
Figure 2. Potential for multiple incoming values in a relaxed mem-
ory model
Initially, x = 0

Thread 1                 Thread 2
atomic {                 r1 = x;
  x = 1;
  x = 2;
}

Can r1 == 1?

Figure 3. Can intermediate values escape an atomic block?
Initially, x = 0

Thread 1                 Thread 2
atomic {                 r1 = y;
  y = 1;                 atomic {
  if (x == 0)              x = 1;
    retry;               }
}

Can r1 == 1?

Figure 4. Is an aborted write visible to another thread?
The second question asks whether it is possible to see a new
value for a variable after it has been written within an atomic block.
This is the question dealt with by Figure 1. Because it also prevents
sequential reasoning, strong atomicity disallows it as well.
Instead of addressing which writes might be seen inside the
transaction, the final question addresses which writes, performed
by the transaction, might be seen by other threads. In short, the
question asks whether an intermediate write, which is later over-
written within the atomic block, can be seen by another thread. This
is the question raised by the example in Figure 3. Strong atomicity
also disallows this, because the atomic block must appear atomic with respect to the other actions that take place in the execution.
A closely related question involves whether an intermediate
write that is later revoked can be seen by another thread. Consider
Figure 4. Unless the atomic block in Thread 2 is executed first, the write
to y in Thread 1 is always aborted. Whether the write to y can be
seen by Thread 2 or not is a very similar issue to that of whether
any intermediate write can be seen by another thread.
In this case, there are two possibilities, each of which depends
on the semantic model for an abort. If the semantics of an abort
are such that the write in the atomic section occurred, but the abort
overwrote it with the original value, then the write can be seen by
another thread (as long as the semantics allows other threads to see
intermediate writes). If, on the other hand, the semantics of an abort
are that any writes the transaction performed never took place, then
other threads must be prevented from seeing any updates that took
place before the abort occurred.
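To make the two possibilities concrete, the following sketch illustrates two common software-transactional-memory strategies (this is only an illustration; nothing here prescribes either implementation). With in-place updates and an undo log, an aborted write sits in shared memory until it is rolled back; with a redo log, writes are buffered privately, so an aborted write can never be observed.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class AbortSketch {
    // Undo logging: write in place and remember the old value; another thread
    // may transiently observe a write that abort() later rolls back.
    static final class UndoLogTx {
        private final Deque<Runnable> undo = new ArrayDeque<>();
        void write(int[] cell, int value) {
            int old = cell[0];
            undo.push(() -> cell[0] = old);
            cell[0] = value;                 // visible before commit
        }
        void abort() { while (!undo.isEmpty()) undo.pop().run(); }
    }

    // Redo logging: buffer writes privately and publish them only on commit;
    // an aborted write is discarded and is never visible to other threads.
    static final class RedoLogTx {
        private final Map<int[], Integer> redo = new HashMap<>();
        void write(int[] cell, int value) { redo.put(cell, value); }
        int read(int[] cell) { return redo.getOrDefault(cell, cell[0]); }
        void commit() { redo.forEach((c, v) -> c[0] = v); }
        void abort() { redo.clear(); }
    }
}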
Initially, x = y = 0

Thread 1                 Thread 2        Thread 3
atomic {                 r1 = x;         atomic {
  x = 1;                 y = r1;           r2 = y;
  x = 2;                                 }
}

Can r2 == 1?

Figure 5. Incorrect code can cause correct transactions to be interleaved
Initially, x = y = 0

Thread 1                 Thread 2
atomic {                 r1 = x;
  x = 1;                 r2 = y;
  y = 1;
}

Can r1 == 1 and r2 == 0?

Figure 6. An ordering issue that appears like an isolation issue
Ultimately, those who design the semantics of transactions will
base this decision heavily on whether intermediate writes are vis-
ible to other threads. We therefore do not consider it a separate
question.
2.3 Implications
Having raised three (and a half) questions to define “how weak is
weak atomicity”, we can consider implications of the answers and
problems with even weaker notions.
First, the second and third questions can combine to have impli-
cations that can be surprising to some. Assume that other threads
can see intermediate writes, and consider Figure 5. Can Thread 2
read the value 1 for x and, in the same execution, Thread 3 read
the value 1 for y? If so, then the “incorrect” Thread 2 has the ef-
fect of interleaving the two “correct” transactions. This is true even
though the two transactions do not access the same memory. Al-
though some find this behavior clear, we have spoken to others who
find it surprising.
Second, an unsafe language could allow a read outside a trans-
action that conflicts with writes inside a transaction to lead to ar-
bitrary behavior. In terms of the third question, we could say the
read could return any value, even values never written. In terms of
Figure 3, this would allow r1 to be 3. (Section 1 gave a more real-
istic example of a compiler transformation that could produce such
an ephemeral value.) Allowing such behavior in a safe language
is unacceptable because it allows incorrectly synchronized code to
break the encapsulation and security guarantees of correctly syn-
chronized code.
Finally, we should note that it is easy to confuse questions of
isolation with questions of ordering, which we discuss later. For
example, in Figure 6, we can have r1=1 and r2=0 if we allow
reordering the statements in Thread 2 (which is very much in line
with the Java Memory Model). Even though the “fix” for this code
(assuming such behavior is undesirable) may be to wrap the code in
Thread 2 in an atomic block, the reason is ordering, not isolation.
3. Ordering
Ordering determines when actions in one thread can be seen to oc-
cur out of order with respect to another. The ordering constraints of
a language are determined by its memory model, and are crucial in
determining when compilers, runtime environments and hardware

Initially, ready = false, data = 0

First example:
Thread 1                 Thread 2
atomic {                 r1 = ready;
  ready = true;          r2 = data;
  data = 1;
}

Can r1 == true, r2 == 0?

Second example:
Thread 1                 Thread 2
atomic {                 r1 = ready;
  ready = true;          if (r1 == true)
  data = 1;                r2 = data;
}

Can r1 == true, r2 == 0?

Third example:
Thread 1                 Thread 2
atomic {                 r1 = g;
  g = o;                 r2 = r1.x;
  o.x = 1;
}

Can r1 == o, r2 == 0?

Figure 7. The reorderings allowed determine which idioms for thread-safe lazy initialization are valid
Thread 1                 Thread 2
atomic {                 r3 = o;
  g = o;                 r4 = r3.x;
  o.x = 1;               r1 = g;
}                        if (r1 != null && r1 == r3)
                           r2 = r1.x;

Can r1 != null, r2 == 0?

Figure 8. Compiler optimizations can remove data dependencies
Figure 8. Compiler optimizations can remove data dependencies
architectures can perform code transformations. In this section, we
describe some of the decisions it is possible to make about order-
ing among atomic blocks, and describe several models that reflect
these decisions.
The implications of ordering constraints for atomic blocks, and
some of the ways in which they can be subtle, can be seen in
Figure 7. In their respective atomic blocks, all three examples set a variable and create a marker indicating that that variable is set. The first two use data for the variable and a boolean variable for the marker; the last uses an object with a field x: if the reference is set, so is the field.
In the first example, it is relatively clear and uncontroversial to say that compilers can reorder the reads in Thread 2. They are not dependent on each other. As a result, r1 in that example can have the value true, while data has the value 0. Note that this does not change the atomicity properties: the atomic block still appears to occur “all at once”. It is the ordering of events that has changed.
The second example is slightly more subtle. It does not seem
as if the read of ready can be reordered with the read of data,
because there is a control dependence between the two. However,
if Thread 2 performed an earlier read of the variable data, then it
could reuse that value for r2. This has the effect of reordering the
two program statements.
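Concretely, assuming a hypothetical earlier read r0 = data somewhere in Thread 2 of the second example, redundant-load elimination could produce the following (an illustration only):

Thread 2, before the transformation:
  r0 = data;      // a hypothetical earlier read
  ...
  r1 = ready;
  if (r1 == true)
    r2 = data;

Thread 2, after redundant-load elimination:
  r0 = data;
  ...
  r1 = ready;
  if (r1 == true)
    r2 = r0;      // the read of data has effectively moved above the read of ready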
The third example is still more subtle. In this case, there is a
data dependence between the two statements in Thread 2, instead
of a control dependence. You might assume that this couldn’t occur
because the compiler couldn’t possibly read r1.x before it knew
what r1 was. But various situations can cause this behavior, such
as a read from a stale cache line or address speculation. In some
situations, even compiler transformations can perform this apparently impossible reordering. Consider Figure 8, which is the same as the
third example, except that additional code has been introduced, as
well as an additional conditional test. Here, the compiler would
likely replace the read of r1.x with a reuse of r4.
The distinctions between the different code fragments in Fig-
ure 7 may seem subtle, but they are important. They determine
the valid idioms for lazy initialization and double-checked locking [SH96, BBC+]. Version 0.903 of the Fortress [ACL+a] memory
model had been specifically crafted to allow certain kinds of lazy
initialization; this type of initialization requires the runtime to disallow the behavior described in the second example of Figure 7, but places no restriction on the behaviors described in the other two examples.
Thread 1                 Thread 2
data = 42;               tmp = false;
atomic {                 atomic {
  ready = true;            tmp = ready;
}                        }
                         if (tmp) {
                           r1 = data;
                         }

Must r1 == 42 if tmp == true?

Figure 9. Example of the common idiom of data handoff
Version 1.0α of the Fortress memory model [ACL+b] removed that guarantee, and allows the behaviors described in all of the examples in Figure 7.
The influence of control and data dependences on allowable re-
orderings in Java is subtle. In certain cases, they can prevent some
questionable program transformations. For a fuller treatment, the reader should consult the full memory model [MPA05]. However, the basic building block that allows programmers to control ordering is happens-before consistency. This property can be thought of as a predicate on an execution: if an execution obeys it, then it is a legal execution.
3.1 Accessing Shared Memory Outside of a Transaction
All programming languages need a specific, well-defined way of
communicating updates between threads. There are several possi-
ble idioms. One interesting idiom, which would be possible in a
transactional setting, is to state that all thread-shared mutable mem-
ory must be accessed only within transactions. Under this model,
the atomic blocks are totally ordered, so that each sees the updates
made by the earlier ones.
However, this idiom is unnecessarily limiting. The code in Fig-
ure 9 is an example of where a broader approach might be use-
ful. The variable ready is used to indicate that the variable data
has been initialized. When ready is set, the first thread never ac-
cesses data again. We say that data is handed off to the second
thread. When the second thread accesses data, there is no reason
for it to incur the overhead of an atomic block if it is aware that no
other thread can update it.
In the Java memory model, happens-before consistency is key to
defining why this kind of handoff works. Synchronization creates
happens-before orderings; if one thread wants to write to a variable
that is later read by another thread, those accesses must be ordered
by a happens-before ordering. In this section, we discuss how we
can use this framework to understand when shared data could be
accessed by multiple threads outside of a transaction.
Happens-before consistency in Java uses a happens-before re-
lation to determine what values can be returned by a given read. It
uses two orderings:

- Program Order, which, for each thread, is a total order over the actions performed in that thread.
- Synchronization Order, which is a total order over all synchronization actions in the execution (for the moment, this includes locks and unlocks).
There is a happens-before relation between any two actions
related by program order. There is also a happens-before relation
between an unlock and subsequent lock in the synchronization
order, as long as the lock and the unlock are on the same variable.
Happens-before order is therefore a partial order over the actions in
an execution. We write a →hb b to indicate that action a comes before action b in this partial order.
Happens-before consistency says that a read r of a variable v
(where a variable is essentially any memory location) is allowed to
observe a write w to v if, in the happens-before partial order of the
execution:
- r does not happen-before w (i.e., it is not the case that r →hb w): a read cannot see a write that happens-after it, and
- there is no intervening write w′ to v ordered by happens-before (i.e., no write w′ to v such that w →hb w′ →hb r): the write w is not overwritten along a happens-before path.
Another way of phrasing this would be to say that happens-
before consistency implies that it is legal for a read to return the
value of a write if that write is not ordered by happens-before with
that read (which implies that the read is in a data race with the
write), or if it is the last write to the variable in the happens-before
order.
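As an illustrative, executable restatement of this rule (a sketch only; the Action type and the hb predicate stand in for the formal definitions above):

import java.util.List;
import java.util.function.BiPredicate;

final class HbConsistency {
    record Action(long id, String variable) {}

    // May read r observe write w to the same variable, given every write to that
    // variable and a predicate hb(a, b) meaning "a happens-before b"?
    static boolean maySee(Action r, Action w, List<Action> writesToVariable,
                          BiPredicate<Action, Action> hb) {
        if (hb.test(r, w)) return false;          // a read cannot see a write that happens-after it
        for (Action w2 : writesToVariable) {
            if (!w2.equals(w) && hb.test(w, w2) && hb.test(w2, r)) {
                return false;                     // w is overwritten along a happens-before path
            }
        }
        return true;                              // racy writes and the latest hb-ordered write remain visible
    }
}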
The principle of happens-before can be generalized from syn-
chronization to transactions. The most obvious extension of happens-
before to transactions involves placing the atomic block, and its
entire contents, in the synchronization order, so that there is a total
order over all actions protected by atomic blocks. For simplicity,
we refer to the total order over these actions as the atomic order.
It then becomes a simple matter to introduce happens-before edges
from the end of each atomic block to the beginning of the next in
the total order. Reads can see writes (or not) according to the rules
for happens-before consistency.
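A sketch of this construction (the AtomicBlock type and the edge-adding callback are names of our own choosing) simply chains the blocks in the atomic order:

import java.util.List;
import java.util.function.BiConsumer;

final class AtomicOrderEdges {
    // An atomic block is abstracted to its begin and commit actions.
    record AtomicBlock(long beginAction, long commitAction) {}

    // Add a happens-before edge from the end of each atomic block to the
    // beginning of the next block in the total atomic order.
    static void addEdges(List<AtomicBlock> atomicOrder, BiConsumer<Long, Long> addHbEdge) {
        for (int i = 0; i + 1 < atomicOrder.size(); i++) {
            addHbEdge.accept(atomicOrder.get(i).commitAction(),
                             atomicOrder.get(i + 1).beginAction());
        }
    }
}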
Returning to Figure 9, if the read of ready in Thread 2 occurs
later in the atomic order than the write to it in Thread 1, then it will
return the value true. There will be a happens-before relationship
between the writing atomic block and the reading one; since the
write to data in Thread 1 happens-before the read of it in Thread
2, the read is forced to return the value 42.
3.2 Conflicting Regions
There are subtleties in this definition. For example, happens-before
consistency allows redundant happens-before edges to be removed.
This means, for instance, that if a lock is only ever obtained by a single
thread, it can be removed. There is no notion of a “lock obtained
by a single thread” for an atomic block. However, an implementor
might desire to remove an atomic block if a compiler analysis
determines that it does not access any shared memory, or is entirely
empty.
Consider Figure 10. In this figure, the atomic sections appear to
do nothing. However, if we assume that there is a happens-before
relationship between them, then the read in Thread 2 must see the
value 1, if it occurs. The implementation must insert the proper
coherence operations and compiler / memory barriers.
If we wish to remove this constraint on implementations, we
must relax our definition of what happens-before edges exist to
include less than the full atomic order. In particular, we must have
no happens-before edge between these two atomic regions. If there is no happens-before relationship between them, then the read of r1.x can return the value 0.
Initially, g = null

Thread 1                 Thread 2
o.x = 1;                 r1 = g;
atomic { }               atomic { }
g = o;                   if (r1 != null) {
                           r2 = r1.x;
                         }

Must r1 != null imply r2 == 1?

Figure 10. Can empty atomic blocks be removed?
Initially, g = null

Thread 1                 Thread 2
o.x = 1;                 r1 = g;
atomic {                 atomic {
  x = 1;                   x = 2;
}                        }
g = o;                   if (r1 != null) {
                           r2 = r1.x;
                         }

Must r1 != null imply r2 == 1?

Figure 11. Can single writes in atomic blocks be no-ops?
There are several possible definitions of the happens-before relationship that could allow this:
- There is a happens-before relationship between any atomic block that touches shared memory and any subsequent atomic block that touches shared memory.
- There is a way of sorting the various parts of memory into regions; there is a happens-before relationship between any two atomic blocks that touch the same region.
- There is a happens-before relationship between any two atomic blocks that touch the same element in shared memory.
In the memory model literature, when we say that two critical
sections “touch” the same piece of memory, we usually mean that
both of them access the same memory, and at least one is a write; this is also called a conflict. The stricter the definition of “same
piece of memory” (whether all of it, some region of it, or a single
memory location), the more optimizations possible, but the fewer
guarantees provided to programmers.
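For concreteness, the conflict test can be sketched as follows (an illustration only; what counts as a location is exactly the granularity question above):

import java.util.Set;

final class ConflictSketch {
    // Two regions conflict if some location is accessed by both and at least one
    // of the two accesses is a write.
    static boolean conflict(Set<String> reads1, Set<String> writes1,
                            Set<String> reads2, Set<String> writes2) {
        return intersects(writes1, writes2)
            || intersects(writes1, reads2)
            || intersects(reads1, writes2);
    }

    private static boolean intersects(Set<String> a, Set<String> b) {
        for (String x : a) if (b.contains(x)) return true;
        return false;
    }
}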
Note that compilers can remove accesses to shared memory.
For example, they can replace the expression 0 * z with the con-
stant value 0, removing the access to shared memory. However, the
happens-before relationships in the original program must be main-
tained; the compiler can get rid of the accesses to shared memory,
but it can’t get rid of the compiler or memory barriers that enforce
the happens-before relationship. For example, if the implementor
uses an atomic machine instruction, such as a compare-and-swap,
to ensure that access to a shared variable is guaranteed to be atomic,
and the compiler eliminates the access to that variable, then it is vi-
tal that some memory synchronization operations be left in to guar-
antee ordering.
3.3 Happens-Before from Writes to Reads
Another point to make is that even when you focus on atomic
blocks that share data, not all idioms require happens-before rela-
tionships between all atomic blocks. As mentioned before, in Fig-
ure 9, the goal is to “hand off” the variable data between threads
