Author

Chi Cao Minh

Other affiliations: Microsoft, Cisco Systems, Inc., Business International Corporation

Bio: Chi Cao Minh is an academic researcher from Stanford University. The author has contributed to research on transactional memory and software transactional memory. The author has an h-index of 17 and has co-authored 25 publications receiving 2,911 citations. Previous affiliations of Chi Cao Minh include Microsoft and Cisco Systems, Inc.

Papers

Proceedings Article

STAMP: Stanford Transactional Applications for Multi-Processing

Chi Cao Minh¹, JaeWoong Chung¹, Christos Kozyrakis¹, Kunle Olukotun¹

Stanford University¹

30 Sep 2008

TL;DR: This paper introduces the Stanford Transactional Applications for Multi-Processing (STAMP), a comprehensive benchmark suite for evaluating TM systems, and uses the suite to evaluate six different TM systems, identify their shortcomings, and motivate further research on their performance characteristics.

Abstract: Transactional Memory (TM) is emerging as a promising technology to simplify parallel programming. While several TM systems have been proposed in the research literature, we are still missing the tools and workloads necessary to analyze and compare the proposals. Most TM systems have been evaluated using microbenchmarks, which may not be representative of any real-world behavior, or individual applications, which do not stress a wide range of execution scenarios. We introduce the Stanford Transactional Applications for Multi-Processing (STAMP), a comprehensive benchmark suite for evaluating TM systems. STAMP includes eight applications and thirty variants of input parameters and data sets in order to represent several application domains and cover a wide range of transactional execution cases (frequent or rare use of transactions, large or small transactions, high or low contention, etc.). Moreover, STAMP is portable across many types of TM systems, including hardware, software, and hybrid systems. In this paper, we provide descriptions and a detailed characterization of the applications in STAMP. We also use the suite to evaluate six different TM systems, identify their shortcomings, and motivate further research on their performance characteristics.

934 citations

Proceedings Article

McRT-STM: a high performance software transactional memory system for a multi-core runtime

Bratin Saha¹, Ali-Reza Adl-Tabatabai¹, Richard L. Hudson¹, Chi Cao Minh², Benjamin C. Hertzberg²

Intel¹, Stanford University²

29 Mar 2006

TL;DR: McRT-STM is a software transactional memory (STM) system that is part of McRT, an experimental Multi-Core RunTime. It supports nested transactions with partial aborts, conditional signaling within a transaction, and object-based conflict detection for C/C++ applications.

Abstract: Applications need to become more concurrent to take advantage of the increased computational power provided by chip level multiprocessing. Programmers have traditionally managed this concurrency using locks (mutex based synchronization). Unfortunately, lock based synchronization often leads to deadlocks, makes fine-grained synchronization difficult, hinders composition of atomic primitives, and provides no support for error recovery. Transactions avoid many of these problems, and therefore, promise to ease concurrent programming. We describe a software transactional memory (STM) system that is part of McRT, an experimental Multi-Core RunTime. The McRT-STM implementation uses a number of novel algorithms, and supports advanced features such as nested transactions with partial aborts, conditional signaling within a transaction, and object based conflict detection for C/C++ applications. The McRT-STM exports interfaces that can be used from C/C++ programs directly or as a target for compilers translating higher level linguistic constructs. We present a detailed performance analysis of various STM design tradeoffs such as pessimistic versus optimistic concurrency, undo logging versus write buffering, and cache line based versus object based conflict detection. We also show a MCAS implementation that works on arbitrary values, coexists with the STM, and can be used as a more efficient form of transactional memory. To provide a baseline we compare the performance of the STM with that of fine-grained and coarse-grained locking using a number of concurrent data structures on a 16-processor SMP system. We also show our STM performance on a non-synthetic workload -- the Linux sendmail application.
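To make the "undo logging versus write buffering" tradeoff mentioned in this abstract concrete, here is a minimal single-threaded sketch in Python. It is not McRT-STM's implementation or API: the class names `UndoLogTx` and `WriteBufferTx` are hypothetical, memory is modeled as a plain dictionary, and conflict detection and locking are deliberately left out.

```python
class UndoLogTx:
    """Undo logging: write in place and remember old values for rollback."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = []  # (key, old_value) pairs in write order

    def read(self, key):
        return self.memory[key]  # memory already holds our own writes

    def write(self, key, value):
        self.undo_log.append((key, self.memory[key]))
        self.memory[key] = value

    def commit(self):
        self.undo_log.clear()  # commit is cheap: just drop the log

    def abort(self):
        # Abort is the expensive path: restore old values in reverse order.
        for key, old_value in reversed(self.undo_log):
            self.memory[key] = old_value
        self.undo_log.clear()


class WriteBufferTx:
    """Write buffering: keep new values private until commit."""

    def __init__(self, memory):
        self.memory = memory
        self.buffer = {}  # key -> tentative new value

    def read(self, key):
        # Reads must check the private buffer first to see our own writes.
        if key in self.buffer:
            return self.buffer[key]
        return self.memory[key]

    def write(self, key, value):
        self.buffer[key] = value

    def commit(self):
        self.memory.update(self.buffer)  # commit copies values back
        self.buffer.clear()

    def abort(self):
        self.buffer.clear()  # abort is cheap: discard the buffer
```

The sketch shows where each scheme pays its cost: undo logging makes commits and reads cheap but aborts expensive, while write buffering makes aborts cheap but adds a lookup to every read and a copy-back at commit.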

487 citations

Proceedings Article

An effective hybrid transactional memory system with strong isolation guarantees

Chi Cao Minh¹, Martin Trautmann¹, JaeWoong Chung¹, Austen McDonald¹, Nathan Bronson¹, Jared Casper¹, Christos Kozyrakis¹, Kunle Olukotun¹

Stanford University¹

09 Jun 2007

TL;DR: For certain workloads, SigTM can match the performance of a full-featured hardware TM system, while for workloads with large read-sets it can be up to two times slower.

Abstract: We propose signature-accelerated transactional memory (SigTM), a hybrid TM system that reduces the overhead of software transactions. SigTM uses hardware signatures to track the read-set and write-set for pending transactions and perform conflict detection between concurrent threads. All other transactional functionality, including data versioning, is implemented in software. Unlike previously proposed hybrid TM systems, SigTM requires no modifications to the hardware caches, which reduces hardware cost and simplifies support for nested transactions and multithreaded processor cores. SigTM is also the first hybrid TM system to provide strong isolation guarantees between transactional blocks and non-transactional accesses without additional read and write barriers in non-transactional code. Using a set of parallel programs that make frequent use of coarse-grain transactions, we show that SigTM accelerates software transactions by 30% to 280%. For certain workloads, SigTM can match the performance of a full-featured hardware TM system, while for workloads with large read-sets it can be up to two times slower. Overall, we show that SigTM combines the performance characteristics and strong isolation guarantees of hardware TM implementations with the low cost and flexibility of software TM systems.
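The idea of tracking read-sets and write-sets with signatures can be illustrated with a small software model. The abstract does not say how SigTM encodes its signatures, so this Python sketch assumes a Bloom-filter-style bit vector. The names `Signature` and `TxSignatures`, the 1024-bit size, and the choice of hash function are all hypothetical and are not taken from the paper.

```python
import hashlib


class Signature:
    """Fixed-size, Bloom-filter-style summary of a set of addresses.

    Membership tests may report false positives but never false negatives.
    """

    def __init__(self, num_bits=1024, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored as a Python integer

    def _positions(self, address):
        # Derive num_hashes bit positions from the address.
        for i in range(self.num_hashes):
            data = f"{i}:{address}".encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def insert(self, address):
        for pos in self._positions(address):
            self.bits |= 1 << pos

    def may_contain(self, address):
        return all((self.bits >> pos) & 1 for pos in self._positions(address))

    def clear(self):
        self.bits = 0


class TxSignatures:
    """Read-set and write-set signatures for one pending transaction."""

    def __init__(self):
        self.read_sig = Signature()
        self.write_sig = Signature()

    def on_read(self, address):
        self.read_sig.insert(address)

    def on_write(self, address):
        self.write_sig.insert(address)

    def conflicts_with_remote_write(self, address):
        # A remote write conflicts with anything we have read or written.
        return (self.read_sig.may_contain(address)
                or self.write_sig.may_contain(address))

    def conflicts_with_remote_read(self, address):
        # A remote read conflicts only with our writes.
        return self.write_sig.may_contain(address)
```

Because a signature has a fixed size no matter how many addresses are inserted, a large read-set makes false conflicts more likely. That is one plausible reading of why the abstract reports slower results for workloads with large read-sets, though the abstract itself does not state the cause.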

340 citations

Journal Article

Architectural Semantics for Practical Transactional Memory

Austen McDonald¹, JaeWoong Chung¹, Brian D. Carlstrom¹, Chi Cao Minh¹, Hassan Chafi¹, Christos Kozyrakis¹, Kunle Olukotun¹

Stanford University¹

01 May 2006

TL;DR: This paper introduces three key mechanisms: two-phase commit; support for software handlers on commit, violation, and abort; and full support for open- and closed-nested transactions with independent rollback, which provide a flexible interface to implement programming language and operating system functionality.

Abstract: Transactional Memory (TM) simplifies parallel programming by allowing for parallel execution of atomic tasks. Thus far, TM systems have focused on implementing transactional state buffering and conflict resolution. Missing is a robust hardware/software interface, not limited to simplistic instructions defining transaction boundaries. Without rich semantics, current TM systems cannot support basic features of modern programming languages and operating systems such as transparent library calls, conditional synchronization, system calls, I/O, and runtime exceptions. This paper presents a comprehensive instruction set architecture (ISA) for TM systems. Our proposal introduces three key mechanisms: two-phase commit; support for software handlers on commit, violation, and abort; and full support for open- and closed-nested transactions with independent rollback. These mechanisms provide a flexible interface to implement programming language and operating system functionality. We also show that these mechanisms are practical to implement at the ISA and microarchitecture level for various TM systems. Using an execution-driven simulation, we demonstrate both the functionality (e.g., I/O and conditional scheduling within transactions) and performance potential (2.2× improvement for SPECjbb2000) of the proposed mechanisms. Overall, this paper establishes a rich and efficient interface to foster both hardware and software research on transactional memory.

221 citations

Journal Article

The Atomos transactional programming language

Brian D. Carlstrom¹, Austen McDonald¹, Hassan Chafi¹, JaeWoong Chung¹, Chi Cao Minh¹, Christos Kozyrakis¹, Kunle Olukotun¹

Stanford University¹

11 Jun 2006

TL;DR: The implementation of the Atomos scheduler demonstrates the use of open nesting within the virtual machine and introduces the concept of transactional memory violation handlers that allow programs to recover from data dependency violations without rolling back.

Abstract: Atomos is the first programming language with implicit transactions, strong atomicity, and a scalable multiprocessor implementation. Atomos is derived from Java, but replaces its synchronization and conditional waiting constructs with simpler transactional alternatives. The Atomos watch statement allows programmers to specify fine-grained watch sets used with the Atomos retry conditional waiting statement for efficient transactional conflict-driven wakeup even in transactional memory systems with a limited number of transactional contexts. Atomos supports open-nested transactions, which are necessary for building both scalable application programs and virtual machine implementations. The implementation of the Atomos scheduler demonstrates the use of open nesting within the virtual machine and introduces the concept of transactional memory violation handlers that allow programs to recover from data dependency violations without rolling back. Atomos programming examples are given to demonstrate the usefulness of transactional programming primitives. Atomos and Java are compared through the use of several benchmarks. The results demonstrate both the improvements in parallel programming ease and parallel program performance provided by Atomos.

185 citations

Cited by

Proceedings Article

Evaluating MapReduce for Multi-core and Multiprocessor Systems

C. Ranger¹, R. Raghuraman¹, A. Penmetsa¹, Gary Bradski¹, Christos Kozyrakis¹

Stanford University¹

10 Feb 2007

TL;DR: It is established that, given a careful implementation, MapReduce is a promising model for scalable performance on shared-memory systems with simple parallel code.

Abstract: This paper evaluates the suitability of the MapReduce model for multi-core and multi-processor systems. MapReduce was created by Google for application development on data-centers with thousands of servers. It allows programmers to write functional-style code that is automatically parallelized and scheduled in a distributed system. We describe Phoenix, an implementation of MapReduce for shared-memory systems that includes a programming API and an efficient runtime system. The Phoenix runtime automatically manages thread creation, dynamic task scheduling, data partitioning, and fault tolerance across processor nodes. We study Phoenix with multi-core and symmetric multiprocessor systems and evaluate its performance potential and error recovery features. We also compare MapReduce code to code written in lower-level APIs such as P-threads. Overall, we establish that, given a careful implementation, MapReduce is a promising model for scalable performance on shared-memory systems with simple parallel code.
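The functional-style structure the abstract describes (a map phase over partitioned input followed by a reduce phase) can be sketched on a single shared-memory machine. This Python example is illustrative only and is not the Phoenix API: the function names `map_chunk`, `reduce_counts`, and `word_count` are hypothetical, and a standard-library thread pool stands in for the Phoenix runtime.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map phase: count the words in one partition of the input."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce phase: merge the per-partition counts into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, num_workers=4):
    # Partition the input into roughly equal chunks, one per worker.
    chunk_size = max(1, -(-len(lines) // num_workers))  # ceiling division
    chunks = [lines[i:i + chunk_size]
              for i in range(0, len(lines), chunk_size)]
    # The pool handles thread creation and task scheduling for the map phase.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)
```

The user-written parts are only `map_chunk` and `reduce_counts`; partitioning and scheduling live in `word_count`. Note that in CPython, threads do not run CPU-bound Python code in parallel, so this shows the shape of the programming model rather than its performance.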

1,058 citations

Proceedings Article

STAMP: Stanford Transactional Applications for Multi-Processing

Chi Cao Minh¹, JaeWoong Chung¹, Christos Kozyrakis¹, Kunle Olukotun¹

Stanford University¹

30 Sep 2008

934 citations

Book Chapter

Transactional locking II

Dave Dice¹, Ori Shalev¹, Nir Shavit¹

Sun Microsystems Laboratories¹

18 Sep 2006

TL;DR: TL2 is a software transactional memory (STM) algorithm based on a combination of commit-time locking and a novel global version-clock based validation technique; the authors report that it is ten-fold faster than a single lock.

Abstract: The transactional memory programming paradigm is gaining momentum as the approach of choice for replacing locks in concurrent programming. This paper introduces the transactional locking II (TL2) algorithm, a software transactional memory (STM) algorithm based on a combination of commit-time locking and a novel global version-clock based validation technique. TL2 improves on state-of-the-art STMs in the following ways: (1) unlike all other STMs it fits seamlessly with any system's memory life-cycle, including those using malloc/free, (2) unlike all other lock-based STMs it efficiently avoids periods of unsafe execution, that is, using its novel version-clock validation, user code is guaranteed to operate only on consistent memory states, and (3) in a sequence of high performance benchmarks, while providing these new properties, it delivered overall performance comparable to (and in many cases better than) that of all former STM algorithms, both lock-based and non-blocking. Perhaps more importantly, on various benchmarks, TL2 delivers performance that is competitive with the best hand-crafted fine-grained concurrent structures. Specifically, it is ten-fold faster than a single lock. We believe these characteristics make TL2 a viable candidate for deployment of transactional memory today, long before hardware transactional support is available.
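The combination of commit-time locking and global version-clock validation can be sketched as follows. This is a simplified Python illustration, not the authors' implementation: the names `GlobalClock`, `TVar`, `Transaction`, `Abort`, and `atomically` are hypothetical, the lock and version are kept as separate fields rather than one versioned lock word, and refinements such as bounded spinning or skipping validation are omitted.

```python
import threading


class Abort(Exception):
    """Raised when a transaction observes a possibly inconsistent state."""


class GlobalClock:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def increment(self):
        with self._lock:
            self._value += 1
            return self._value


class TVar:
    """A transactional variable with its own lock and version stamp."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()


class Transaction:
    def __init__(self, clock):
        self.clock = clock
        self.read_version = clock.read()  # sampled once, at start
        self.read_set = []
        self.write_set = {}  # TVar -> tentative value

    def read(self, tvar):
        if tvar in self.write_set:
            return self.write_set[tvar]
        pre_version = tvar.version
        value = tvar.value
        # Abort if the variable is locked, changed while we read it,
        # or was written after this transaction started.
        if (tvar.lock.locked() or tvar.version != pre_version
                or pre_version > self.read_version):
            raise Abort()
        self.read_set.append(tvar)
        return value

    def write(self, tvar, value):
        self.write_set[tvar] = value  # buffered until commit

    def commit(self):
        if not self.write_set:
            return  # read-only: every read was already validated
        locked = []
        try:
            for tvar in self.write_set:  # commit-time locking
                if not tvar.lock.acquire(blocking=False):
                    raise Abort()
                locked.append(tvar)
            write_version = self.clock.increment()
            for tvar in self.read_set:  # re-validate the read-set
                locked_by_other = (tvar.lock.locked()
                                   and tvar not in self.write_set)
                if tvar.version > self.read_version or locked_by_other:
                    raise Abort()
            for tvar, value in self.write_set.items():
                tvar.value = value
                tvar.version = write_version
        finally:
            for tvar in locked:
                tvar.lock.release()


def atomically(clock, body):
    """Run body(tx) until it commits, retrying on Abort."""
    while True:
        tx = Transaction(clock)
        try:
            result = body(tx)
            tx.commit()
            return result
        except Abort:
            continue
```

Comparing each variable's version with the clock value sampled at transaction start is what lets reads be checked as they happen, so user code only sees values consistent with a single point in time. Locks are held only for the short commit phase.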

891 citations

Proceedings Article

NV-Heaps: making persistent objects fast and safe with next-generation, non-volatile memories

Joel Coburn¹, Adrian M. Caulfield¹, Akel Ameen D¹, Laura M. Grupp¹, Rajesh Gupta¹, Ranjit Jhala¹, Steven Swanson¹

University of California, San Diego¹

05 Mar 2011

TL;DR: NV-heaps is a lightweight, high-performance persistent object system that provides transactional semantics while preventing familiar bugs such as dangling pointers, multiple free()s, and locking errors, and that offers a model for persistence that is easy to use and reason about.

Abstract: Persistent, user-defined objects present an attractive abstraction for working with non-volatile program state. However, the slow speed of persistent storage (i.e., disk) has restricted their design and limited their performance. Fast, byte-addressable, non-volatile technologies, such as phase change memory, will remove this constraint and allow programmers to build high-performance, persistent data structures in non-volatile storage that is almost as fast as DRAM. Creating these data structures requires a system that is lightweight enough to expose the performance of the underlying memories but also ensures safety in the presence of application and system failures by avoiding familiar bugs such as dangling pointers, multiple free()s, and locking errors. In addition, the system must prevent new types of hard-to-find pointer safety bugs that only arise with persistent objects. These bugs are especially dangerous since any corruption they cause will be permanent. We have implemented a lightweight, high-performance persistent object system called NV-heaps that provides transactional semantics while preventing these errors and providing a model for persistence that is easy to use and reason about. We implement search trees, hash tables, sparse graphs, and arrays using NV-heaps, BerkeleyDB, and Stasis. Our results show that NV-heap performance scales with thread count and that data structures implemented using NV-heaps out-perform BerkeleyDB and Stasis implementations by 32x and 244x, respectively, by avoiding the operating system and minimizing other software overheads. We also quantify the cost of enforcing the safety guarantees that NV-heaps provide and measure the costs of NV-heap primitive operations.

850 citations

Proceedings Article

Composable memory transactions

Tim Harris¹, Simon Marlow¹, Simon Peyton-Jones¹, Maurice Herlihy¹

Microsoft¹

15 Jun 2005

TL;DR: This paper presents a new concurrency model, based on transactional memory, that offers far richer composition, and describes new modular forms of blocking and choice that have been inaccessible in earlier work.

Abstract: Writing concurrent programs is notoriously difficult, and is of increasing practical importance. A particular source of concern is that even correctly-implemented concurrency abstractions cannot be composed together to form larger abstractions. In this paper we present a new concurrency model, based on transactional memory, that offers far richer composition. All the usual benefits of transactional memory are present (e.g. freedom from deadlock), but in addition we describe new modular forms of blocking and choice that have been inaccessible in earlier work.

815 citations
