Book

Transactional Memory

12 Jan 2007
TL;DR: This book presents an overview of the state of the art in the design and implementation of transactional memory systems, as of early summer 2006.
Abstract: The advent of multicore processors has renewed interest in the idea of incorporating transactions into the programming model used to write parallel programs. This approach, known as transactional memory, offers an alternative, and hopefully better, way to coordinate concurrent threads. The ACI (atomicity, consistency, isolation) properties of transactions provide a foundation to ensure that concurrent reads and writes of shared data do not produce inconsistent or incorrect results. At a higher level, a computation wrapped in a transaction executes atomically: either it completes successfully and commits its result in its entirety, or it aborts. In addition, isolation ensures the transaction produces the same result as if no other transactions were executing concurrently. Although transactions are not a parallel programming panacea, they shift much of the burden of synchronizing and coordinating parallel computations from the programmer to the compiler, runtime system, and hardware. The challenge for system implementers is to build an efficient transactional memory infrastructure. This book presents an overview of the state of the art in the design and implementation of transactional memory systems, as of early summer 2006.
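
To make the model concrete, here is a minimal sketch of a transactional block in C++. It assumes GCC's experimental -fgnu-tm extension, and the Account type and transfer function are invented for illustration; a lock-based version would require the programmer to pick and correctly order locks, whereas here conflict detection and rollback are the TM system's job.

    // Minimal sketch of the transactional programming model, assuming GCC's
    // experimental TM support. Compile with: g++ -fgnu-tm transfer.cpp
    #include <cstdio>

    struct Account { long balance; };

    void transfer(Account& from, Account& to, long amount) {
        // The block either commits in its entirety or aborts and retries;
        // synchronization is handled by the compiler, runtime, and hardware
        // rather than by programmer-managed locks.
        __transaction_atomic {
            from.balance -= amount;
            to.balance   += amount;
        }
    }

    int main() {
        Account a{100}, b{0};
        transfer(a, b, 40);
        std::printf("a=%ld b=%ld\n", a.balance, b.balance);
    }
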
Citations
Proceedings ArticleDOI
09 Jun 2007
TL;DR: For certain workloads, SigTM can match the performance of a full-featured hardware TM system, while for workloads with large read-sets it can be up to two times slower.
Abstract: We propose signature-accelerated transactional memory (SigTM), a hybrid TM system that reduces the overhead of software transactions. SigTM uses hardware signatures to track the read-set and write-set for pending transactions and perform conflict detection between concurrent threads. All other transactional functionality, including data versioning, is implemented in software. Unlike previously proposed hybrid TM systems, SigTM requires no modifications to the hardware caches, which reduces hardware cost and simplifies support for nested transactions and multithreaded processor cores. SigTM is also the first hybrid TM system to provide strong isolation guarantees between transactional blocks and non-transactional accesses without additional read and write barriers in non-transactional code. Using a set of parallel programs that make frequent use of coarse-grain transactions, we show that SigTM accelerates software transactions by 30% to 280%. For certain workloads, SigTM can match the performance of a full-featured hardware TM system, while for workloads with large read-sets it can be up to two times slower. Overall, we show that SigTM combines the performance characteristics and strong isolation guarantees of hardware TM implementations with the low cost and flexibility of software TM systems.
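
The signature mechanism can be sketched in software, with the table size and hash functions chosen arbitrarily for illustration (real signatures are fixed hardware bit vectors fed by coherence messages): a transaction hashes the addresses it accesses into a Bloom-filter-like bit vector, and remote committed writes are tested against the local read signature. False positives cause spurious aborts, but conflicts are never missed.

    // Toy model of SigTM-style signatures: a Bloom-filter summary of a
    // transaction's read set, tested against addresses written by others.
    #include <bitset>
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    constexpr std::size_t kSigBits = 1024;   // arbitrary illustration size

    struct Signature {
        std::bitset<kSigBits> bits;
        // Two cheap hashes of the address stand in for hardware hash circuits.
        void insert(std::uintptr_t a) {
            bits.set(a % kSigBits);
            bits.set((a ^ (a >> 7)) * 2654435761u % kSigBits);
        }
        bool mayContain(std::uintptr_t a) const {   // no false negatives
            return bits.test(a % kSigBits) &&
                   bits.test((a ^ (a >> 7)) * 2654435761u % kSigBits);
        }
    };

    int main() {
        long x = 0, y = 0;
        Signature readSig;
        readSig.insert(reinterpret_cast<std::uintptr_t>(&x));  // txn reads x

        // Conflict check against another thread's committed write to y:
        if (readSig.mayContain(reinterpret_cast<std::uintptr_t>(&y)))
            std::puts("possible conflict: abort and retry");
        else
            std::puts("no conflict");
    }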

340 citations


Cites background or methods from "Transactional Memory"

  • ...The lack of strong isolation causes this code to behave unpredictably with all STM approaches [18]....

  • ...Instead of arguing whether or not TM should eliminate the data races in lock-based synchronization, we examine the privatization code in Figure 3 [18]....

  • ...An important feature for TM systems is strong isolation, which facilitates predictable code behavior [18]....

  • ...There are several alternative implementations for both approaches [1, 18]....

  • ...As a result, STM systems may produce incorrect or unpredictable results even for simple parallel programs that would work correctly with lock-based synchronization [18, 12, 24]....

Proceedings ArticleDOI
07 Mar 2009
TL;DR: CTrigger focuses on a special type of interleavings that are inherently correlated with atomicity violation bugs, and uses trace analysis to systematically identify (likely) feasible unserializable interleavings with low occurrence probability so that they can be exposed in large programs.
Abstract: Multicore hardware is making concurrent programs pervasive. Unfortunately, concurrent programs are prone to bugs. Among different types of concurrency bugs, atomicity violation bugs are common and important. Existing techniques to detect atomicity violation bugs suffer from one limitation: requiring bugs to manifest during monitored runs, which is an open problem in concurrent program testing. This paper makes two contributions. First, it studies the interleaving characteristics of the common practice in concurrent program testing (i.e., running a program over and over) to understand why atomicity violation bugs are hard to expose. Second, it proposes CTrigger to effectively and efficiently expose atomicity violation bugs in large programs. CTrigger focuses on a special type of interleavings (i.e., unserializable interleavings) that are inherently correlated with atomicity violation bugs, and uses trace analysis to systematically identify (likely) feasible unserializable interleavings with low occurrence probability. CTrigger then uses minimum execution perturbation to exercise low-probability interleavings and expose difficult-to-catch atomicity violations. We evaluate CTrigger with real-world atomicity violation bugs from four server/desktop applications (Apache, MySQL, Mozilla, and PBZIP2) and three SPLASH-2 applications on 8-core machines. CTrigger efficiently exposes the tested bugs within 1 to 235 seconds, two to four orders of magnitude faster than stress testing. Without CTrigger, some of these bugs do not manifest even after 7 full days of stress testing. In addition, even without deterministic replay support, once a bug is exposed, CTrigger can help programmers reliably reproduce it for diagnosis. Our tested bugs are reproduced by CTrigger mostly within 5 seconds, 300 to over 60,000 times faster than stress testing.
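
The bug class CTrigger targets fits in a few lines. The sketch below is an invented example, not code from the paper: the reader's check and use of a shared pointer are assumed atomic but are not, and only a rare interleaving with the writer exposes the bug.

    // A classic atomicity violation: the check and the use in reader() are
    // assumed atomic but are not, so the rare interleaving
    //   reader: check  ->  writer: shared = nullptr  ->  reader: use
    // dereferences null. (Deliberately racy code, for illustration only.)
    #include <cstdio>
    #include <thread>

    char* shared = nullptr;

    void reader() {
        if (shared != nullptr)             // check
            std::printf("%c\n", *shared);  // use: not atomic with the check
    }

    void writer() {
        shared = nullptr;                  // may land between check and use
    }

    int main() {
        char c = 'x';
        shared = &c;
        std::thread t1(reader), t2(writer);
        t1.join();
        t2.join();
    }

Stress testing rarely lands in the narrow window between the check and the use; CTrigger's approach is to identify such windows from traces and then perturb execution to hit them deliberately.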

324 citations

Proceedings ArticleDOI
15 Oct 2014
TL;DR: This paper identifies failure-atomic sections of code based on existing critical sections, describes a log-based implementation that can recover a consistent state after a failure, identifies the need to rapidly flush CPU caches as a core implementation bottleneck, and suggests partial solutions.
Abstract: Non-volatile main memory, such as memristors or phase change memory, can revolutionize the way programs persist data. In-memory objects can themselves be persistent without the need for a separate persistent data storage format. However, the challenge is to ensure that such data remains consistent if a failure occurs during execution. In this paper, we present our system, called Atlas, which adds durability semantics to lock-based code, typically allowing us to automatically maintain a globally consistent state even in the presence of failures. We identify failure-atomic sections of code based on existing critical sections and describe a log-based implementation that can be used to recover a consistent state after a failure. We discuss several subtle semantic issues and implementation tradeoffs. We identify the need to rapidly flush CPU caches as a core implementation bottleneck and suggest partial solutions. Experimental results confirm the practicality of our approach and provide insight into the overheads of such a system.
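
As a loose illustration of the undo-logging idea (not Atlas's actual API: nvm_store and the log layout are invented here, and persistence barriers are reduced to x86 clflush/sfence):

    // Toy undo-logging store for failure-atomic sections: persist the old
    // value before overwriting it, so recovery can roll back a partially
    // executed section after a crash. Assumes x86 cache-flush instructions.
    #include <immintrin.h>   // _mm_clflush, _mm_sfence
    #include <cstdint>
    #include <vector>

    struct UndoEntry { std::uint64_t* addr; std::uint64_t old; };
    static std::vector<UndoEntry> undo_log;   // would itself live in NVM

    void nvm_store(std::uint64_t* addr, std::uint64_t val) {
        undo_log.push_back({addr, *addr});     // 1. log the old value...
        _mm_clflush(&undo_log.back());
        _mm_sfence();                          // ...and persist it first
        *addr = val;                           // 2. then do the store
        _mm_clflush(addr);
        _mm_sfence();                          //    and persist the data
    }

    // At the end of a failure-atomic section (lock release) the log can be
    // truncated; after a crash, recovery replays it backwards.

The flush-and-fence pair on every logged store is precisely the cache-flushing cost the paper identifies as a bottleneck.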

271 citations

Proceedings ArticleDOI
09 Jun 2007
TL;DR: The authors identify a set of performance pathologies that can degrade performance in proposed HTM designs and show that improved conflict resolution can eliminate these pathologies, letting designers build robust HTM systems.
Abstract: Hardware Transactional Memory (HTM) systems reflect choices from three key design dimensions: conflict detection, version management, and conflict resolution. Previously proposed HTMs represent three points in this design space: lazy conflict detection, lazy version management, committer wins (LL); eager conflict detection, lazy version management, requester wins (EL); and eager conflict detection, eager version management, and requester stalls with conservative deadlock avoidance (EE). To isolate the effects of these high-level design decisions, we develop a common framework that abstracts away differences in cache write policies, interconnects, and ISA to compare these three design points. Not surprisingly, the relative performance of these systems depends on the workload. Under light transactional loads they perform similarly, but under heavy loads they differ by up to 80%. None of the systems performs best on all of our benchmarks. We identify seven performance pathologies (interactions between workload and system that degrade performance) as the root cause of many performance differences: FriendlyFire, StarvingWriter, SerializedCommit, FutileStall, StarvingElder, RestartConvoy, and DuelingUpgrades. We discuss when and on which systems these pathologies can occur and show that they actually manifest within TM workloads. The insight provided by these pathologies motivated four enhanced systems that often significantly reduce transactional memory overhead. Importantly, by avoiding transaction pathologies, each enhanced system performs well across our suite of benchmarks.
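
The three design points can be captured as a small configuration type; the sketch below is descriptive only, restating the abstract's taxonomy rather than implementing anything:

    // The HTM design space as characterized in the paper: conflict
    // detection, version management, and conflict resolution.
    enum class ConflictDetection { Eager, Lazy };
    enum class VersionManagement { Eager, Lazy };
    enum class Resolution { CommitterWins, RequesterWins, RequesterStalls };

    struct HtmDesign {
        ConflictDetection detect;
        VersionManagement version;
        Resolution resolve;
    };

    // The three previously proposed points the paper compares:
    constexpr HtmDesign LL{ConflictDetection::Lazy,  VersionManagement::Lazy,
                           Resolution::CommitterWins};
    constexpr HtmDesign EL{ConflictDetection::Eager, VersionManagement::Lazy,
                           Resolution::RequesterWins};
    constexpr HtmDesign EE{ConflictDetection::Eager, VersionManagement::Eager,
                           Resolution::RequesterStalls};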

256 citations

Journal ArticleDOI
TL;DR: The promise of STM may be undermined by its overheads and limited workload applicability.
Abstract: TM (transactional memory) is a concurrency control paradigm that provides atomic and isolated execution for regions of code. TM is considered by many researchers to be one of the most promising solutions...

252 citations

References
Journal ArticleDOI
TL;DR: Analysis of the paradigm problem demonstrates that allowing a small number of test messages to be falsely identified as members of the given set will permit a much smaller hash area to be used without increasing reject time.
Abstract: In this paper trade-offs among certain computational factors in hash coding are analyzed. The paradigm problem considered is that of testing a series of messages one-by-one for membership in a given set of messages. Two new hash-coding methods are examined and compared with a particular conventional hash-coding method. The computational factors considered are the size of the hash area (space), the time required to identify a message as a nonmember of the given set (reject time), and an allowable error frequency. The new methods are intended to reduce the amount of space required to contain the hash-coded information from that associated with conventional methods. The reduction in space is accomplished by exploiting the possibility that a small fraction of errors of commission may be tolerable in some applications, in particular, applications in which a large amount of data is involved and a core-resident hash area is consequently not feasible using conventional methods. In such applications, it is envisaged that overall performance could be improved by using a smaller core-resident hash area in conjunction with the new methods and, when necessary, by using some secondary and perhaps time-consuming test to “catch” the small fraction of errors associated with the new methods. An example is discussed which illustrates possible areas of application for the new methods. Analysis of the paradigm problem demonstrates that allowing a small number of test messages to be falsely identified as members of the given set will permit a much smaller hash area to be used without increasing reject time.
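
The technique analyzed here is what is now called a Bloom filter. A minimal sketch follows; the table size, hash count, and double-hashing scheme are illustrative choices, not the paper's:

    // Minimal Bloom filter: K hashed bits per element in an M-bit table.
    // Lookups may return false positives (errors of commission) but never
    // false negatives, so a miss is a definite, fast reject.
    #include <bitset>
    #include <cstddef>
    #include <cstdio>
    #include <functional>
    #include <string>

    template <std::size_t M, std::size_t K>
    class BloomFilter {
        std::bitset<M> bits_;
        static std::size_t nthHash(const std::string& s, std::size_t i) {
            // Derive K hashes from two via double hashing: h1 + i*h2.
            std::size_t h1 = std::hash<std::string>{}(s);
            std::size_t h2 = h1 * 2654435761u + 0x9e3779b9u;
            return (h1 + i * h2) % M;
        }
    public:
        void insert(const std::string& s) {
            for (std::size_t i = 0; i < K; ++i) bits_.set(nthHash(s, i));
        }
        bool mayContain(const std::string& s) const {
            for (std::size_t i = 0; i < K; ++i)
                if (!bits_.test(nthHash(s, i))) return false;  // reject
            return true;  // member, or a tolerated false positive
        }
    };

    int main() {
        BloomFilter<1024, 3> f;
        f.insert("alpha");
        std::printf("alpha: %d, beta: %d\n",
                    f.mayContain("alpha"), f.mayContain("beta"));
    }

A miss is a fast, definite reject; a hit may be a false positive, to be caught by the slower secondary test the paper envisages.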

7,390 citations


"Transactional Memory" refers background or methods in this paper

  • ...A number of papers discuss transaction handlers that invoke arbitrary pieces of code when a transaction commits or aborts [3, 20, 29, 30]....

  • ...A memory load first checks the write set (using a Bloom filter [29]) to determine if the transaction previously updated the object....

Journal ArticleDOI
Gerard J. Holzmann
01 May 1997
TL;DR: This paper gives an overview of the design and structure of the SPIN verifier, reviews its theoretical foundation, and surveys significant practical applications.
Abstract: SPIN is an efficient verification system for models of distributed software systems. It has been used to detect design errors in applications ranging from high-level descriptions of distributed algorithms to detailed code for controlling telephone exchanges. The paper gives an overview of the design and structure of the verifier, reviews its theoretical foundation, and surveys significant practical applications.

4,159 citations


"Transactional Memory" refers background or methods in this paper

  • ...Scherer and Scott describe a number of contention resolution policies and provide examples to show that no policy is uniformly better than all other policies [26]....

  • ...Another novel aspect of this system is that the STM algorithm is written in Promela, so it can be directly verified with the SPIN model checker [26], an exercise that found several data races....

Journal ArticleDOI
TL;DR: It is argued that a transaction needs to lock a logical rather than a physical subset of the database, and an implementation of predicate locks which satisfies the consistency condition is suggested.
Abstract: In database systems, users access shared data under the assumption that the data satisfies certain consistency constraints. This paper defines the concepts of transaction, consistency and schedule and shows that consistency requires that a transaction cannot request new locks after releasing a lock. Then it is argued that a transaction needs to lock a logical rather than a physical subset of the database. These subsets may be specified by predicates. An implementation of predicate locks which satisfies the consistency condition is suggested.
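
The consistency condition formalized here, now known as two-phase locking, is easy to state in code. A minimal sketch of the discipline, with invented class and method names:

    // Two-phase locking: a growing phase (acquire only) followed by a
    // shrinking phase (release only); no lock is ever requested after one
    // has been released, which is what the paper shows consistency requires.
    #include <mutex>
    #include <vector>

    class TwoPhaseTxn {
        std::vector<std::mutex*> held_;
        bool shrinking_ = false;
    public:
        bool acquire(std::mutex& m) {
            if (shrinking_) return false;  // would violate two-phase locking
            m.lock();                      // growing phase
            held_.push_back(&m);
            return true;
        }
        void commit() {
            shrinking_ = true;             // enter shrinking phase
            for (auto* m : held_) m->unlock();
            held_.clear();
        }
    };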

2,031 citations


"Transactional Memory" refers background in this paper

  • ...This approach has subtleties that make automatic translation a challenge [5, 33]....

  • ...Many of the concepts and implementation principles for STM were anticipated in a paper by Lomet in 1977 [4], which was published soon after the classic paper by Eswaran [5] on two-phase locking and transactions....

  • ...[5] introduced the terms weak atomicity and strong atomicity....

  • ...In a concurrent program, combining context-sensitive analysis with the synchronization analysis necessary to understand the communications between even two threads results in an undecidable problem [5]....

  • ...Knowing the memory locations enabled the STM system to acquire ownership of them with the two-phase locking protocol [5]....

Proceedings ArticleDOI
20 Aug 1995
TL;DR: STM is used to provide a general highly concurrent method for translating sequential object implementations to non-blocking ones based on implementing a k-word compare&swap STM-transaction, a novel software method for supporting flexible transactional programming of synchronization operations.
Abstract: As we learn from the literature, flexibility in choosing synchronization operations greatly simplifies the task of designing highly concurrent programs. Unfortunately, existing hardware is inflexible and is at best on the level of a Load–Linked/Store–Conditional operation on a single word. Building on the hardware based transactional synchronization methodology of Herlihy and Moss, we offer software transactional memory (STM), a novel software method for supporting flexible transactional programming of synchronization operations. STM is non-blocking, and can be implemented on existing machines using only a Load–Linked/Store–Conditional operation. We use STM to provide a general highly concurrent method for translating sequential object implementations to non-blocking ones based on implementing a k-word compare&swap STM-transaction. Empirical evidence collected on simulated multiprocessor architectures shows that our method always outperforms the non-blocking translation methods in the style of Barnes, and outperforms Herlihy’s translation method for sufficiently large numbers of processors. The key to the efficiency of our software-transactional approach is that unlike Barnes style methods, it is not based on a costly “recursive helping” policy.
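
A heavily simplified stand-in for the k-word compare&swap at the core of the construction. The real algorithm is non-blocking, built from ownership acquisition plus cooperative helping; this lock-based sketch only shows the primitive's semantics:

    // k-word compare&swap semantics: atomically compare k distinct words
    // against expected values and, only if all match, install new values.
    #include <algorithm>
    #include <cstddef>
    #include <mutex>
    #include <vector>

    struct Word { std::mutex m; long value; };

    bool kcas(const std::vector<Word*>& words,
              const std::vector<long>& expected,
              const std::vector<long>& desired) {
        // Lock the (distinct) words in a global address order. Ordered
        // acquisition avoids deadlock, but unlike the paper's non-blocking
        // algorithm, a preempted lock holder here can block everyone else.
        std::vector<Word*> order(words);
        std::sort(order.begin(), order.end());
        for (auto* w : order) w->m.lock();

        bool ok = true;
        for (std::size_t i = 0; ok && i < words.size(); ++i)
            ok = (words[i]->value == expected[i]);
        if (ok)
            for (std::size_t i = 0; i < words.size(); ++i)
                words[i]->value = desired[i];

        for (auto* w : order) w->m.unlock();
        return ok;
    }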

1,369 citations


"Transactional Memory" refers background in this paper

  • ...Herlihy and Wing [8] proposed linearizability as a correctness condition for operations on shared concurrent objects....

  • ...Scientific programming languages, such as High Performance Fortran (HPF) [8], directly support data parallel programming with a collection of operators on matrices and ways to combine these operations....

  • ...Shavit and Touitou's 1995 PODC paper [8] coined the term "software transactional memory" and described the first software implementation of transactional memory....

  • ...Most papers assume the correctness criteria from database transactions (serializability [7]) or concurrent data structures (linearizability [8])....

  • ...Scott [8] identifies four practical policies for detecting conflicts (Fig....

Book
Maurice Herlihy
14 Mar 2008
TL;DR: Transactional memory as discussed by the authors is a computational model in which threads synchronize by optimistic, lock-free transactions, and there is a growing community of researchers working on both software and hardware support for this approach.
Abstract: Computer architecture is about to undergo, if not another revolution, then a vigorous shaking-up. The major chip manufacturers have, for the time being, simply given up trying to make processors run faster. Instead, they have recently started shipping "multicore" architectures, in which multiple processors (cores) communicate directly through shared hardware caches, providing increased concurrency instead of increased clock speed. As a result, system designers and software engineers can no longer rely on increasing clock speed to hide software bloat. Instead, they must somehow learn to make effective use of increasing parallelism. This adaptation will not be easy. Conventional synchronization techniques based on locks and conditions are unlikely to be effective in such a demanding environment. Coarse-grained locks, which protect relatively large amounts of data, do not scale, and fine-grained locks introduce substantial software engineering problems. Transactional memory is a computational model in which threads synchronize by optimistic, lock-free transactions. This synchronization model promises to alleviate many (not all) of the problems associated with locking, and there is a growing community of researchers working on both software and hardware support for this approach. This talk will survey the area, with a focus on open research problems.

1,268 citations