scispace - formally typeset
Search or ask a question
Author

Steven Balensiefer

Bio: Steven Balensiefer is an academic researcher from University of Washington. The author has contributed to research in topics: Record locking & Transactional memory. The author has an hindex of 5, co-authored 5 publications receiving 504 citations.

Papers
More filters
Proceedings ArticleDOI
10 Jun 2007
TL;DR: The results on a set of Java programs show that strong atomicity can be implemented efficiently in a high-performance STM system and introduces a dynamic escape analysis that differentiates private and public data at runtime to make barriers cheaper and a static not-accessed-in-transaction analysis that removes many barriers completely.
Abstract: Transactional memory provides a new concurrency control mechanism that avoids many of the pitfalls of lock-based synchronization. High-performance software transactional memory (STM) implementations thus far provide weak atomicity: Accessing shared data both inside and outside a transaction can result in unexpected, implementation-dependent behavior. To guarantee isolation and consistent ordering in such a system, programmers are expected to enclose all shared-memory accesses inside transactions.A system that provides strong atomicity guarantees isolation even in the presence of threads that access shared data outside transactions. A strongly-atomic system also orders transactions with conflicting non-transactional memory operations in a consistent manner.In this paper, we discuss some surprising pitfalls of weak atomicity, and we present an STM system that avoids these problems via strong atomicity. We demonstrate how to implement non-transactional data accesses via efficient read and write barriers, and we present compiler optimizations that further reduce the overheads of these barriers. We introduce a dynamic escape analysis that differentiates private and public data at runtime to make barriers cheaper and a static not-accessed-in-transaction analysis that removes many barriers completely. Our results on a set of Java programs show that strong atomicity can be implemented efficiently in a high-performance STM system.

209 citations

Proceedings ArticleDOI
14 Jun 2008
TL;DR: A new weakly atomic Java STM implementation is described that provides single global lock semantics while permitting concurrent execution, but it is shown that this comes at a significant performance cost.
Abstract: As memory transactions have been proposed as a language-level replacement for locks, there is growing need for well-defined semantics. In contrast to database transactions, transaction memory (TM) semantics are complicated by the fact that programs may access the same memory locations both inside and outside transactions. Strongly atomic semantics, where non transactional accesses are treated as implicit single-operation transactions, remain difficult to provide without specialized hardware support or significant performance overhead. As an alternative, many in the community have informally proposed that a single global lock semantics [18,10], where transaction semantics are mapped to those of regions protected by a single global lock, provide an intuitive and efficiently implementable model for programmers.In this paper, we explore the implementation and performance implications of single global lock semantics in a weakly atomic STM from the perspective of Java, and we discuss why even recent STM implementations fall short of these semantics. We describe a new weakly atomic Java STM implementation that provides single global lock semantics while permitting concurrent execution, but we show that this comes at a significant performance cost. We also propose and implement various alternative semantics that loosen single lock requirements while still providing strong guarantees. We compare our new implementations to previous ones, including a strongly atomic STM.[24]

133 citations

Journal ArticleDOI
01 May 2005
TL;DR: A set of tools that enable the quantitative evaluation of architectures for quantum computers and a simple, regular architecture for ion-trap based quantum computers are presented.
Abstract: The theoretical study of quantum computation has yielded efficient algorithms for some traditionally hard problems. Correspondingly, experimental work on the underlying physical implementation technology has progressed steadily. However, almost no work has yet been done which explores the architecture design space of large scale quantum computing systems. In this paper, we present a set of tools that enable the quantitative evaluation of architectures for quantum computers. The infrastructure we created comprises a complete compilation and simulation system for computers containing thousands of quantum bits. We begin by compiling complete algorithms into a quantum instruction set. This ISA enables the simple manipulation of quantum state. Another tool we developed automatically transforms quantum software into an equivalent, fault-tolerant version required to operate on real quantum devices. Next, our infrastructure transforms the ISA into a set of low-level micro architecture specific control operations. In the future, these operations can be used to directly control a quantum computer. For now, our simulation framework quickly uses them to determine the reliability of the application for the target micro architecture. Finally, we propose a simple, regular architecture for ion-trap based quantum computers. Using our software infrastructure, we evaluate the design trade offs of this micro architecture.

80 citations

Journal ArticleDOI
TL;DR: A new weakly atomic Java STM implementation is described that provides single global lock semantics while permitting concurrent execution, but it is shown that this comes at a significant performance cost.
Abstract: As memory transactions have been proposed as a language-level replacement for locks, there is growing need for well-defined semantics. In contrast to database transactions, transaction memory (TM) semantics are complicated by the fact that programs may access the same memory locations both inside and outside transactions. Strongly atomic semantics, where non-transactional accesses are treated as implicit single-operation transactions, remain difficult to provide without specialized hardware support and/or significant performance overhead. As an alternative, many in the community have informally proposed that a single global lock semantics [16, 9], where transaction semantics are mapped to those of regions protected by a single global lock, provide an intuitive and efficiently implementable model for programmers.In this paper, we explore the implementation and performance implications of single global lock semantics in a weakly atomic STM from the perspective of Java, and we discuss why even recent STM implementations fall short of these semantics. We describe a new weakly atomic Java STM implementation that provides single global lock semantics while permitting concurrent execution, but we show that this comes at a significant performance cost. We also propose and implement various alternative semantics that loosen single lock requirements while still providing strong guarantees. We compare our new implementations to previous ones, including a strongly atomic STM. [22]

61 citations

Proceedings ArticleDOI
25 May 2005
TL;DR: QUALE is a set of tools for the design and analysis of microarchitectures for ion-trap quantum computers that is able to efficiently simulate large systems containing thousands of quantum bits.
Abstract: The theoretical study of quantum computation has yielded efficient algorithms for traditionally very hard problems. Correspondingly, the experimental work on technologies for implementing quantum computers has yielded many of the essential discrete components. Combining these components to produce an efficient and accurate quantum architecture is an open problem and exploring this design space requires an efficient evaluation framework. To date, all such frameworks have been either highly theoretical (ignoring vital issues like spatial constraints, resource contention, and the durative nature of quantum operations), or limited to systems with less than 40 qubits because of reliance on simulating the exact quantum state of qubits in the system. To address these issues the authors present QUALE, a set of tools for the design and analysis of microarchitectures for ion-trap quantum computers. QUALE allows the user to specify a quantum program in an existing quantum language, apply error correction, schedule the resulting computation on a proposed layout, and determine an upper bound on the expected accuracy of the resulting computation. By conducting analysis on a program-architecture pair QUALE takes into account realistic architectural constraints; by targeting simulation to error as opposed to the whole quantum state, QUALE is able to efficiently simulate large systems containing thousands of quantum bits.© (2005) COPYRIGHT SPIE--The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.

28 citations


Cited by
More filters
Proceedings ArticleDOI
27 Feb 2006
TL;DR: This paper presents a new implementation of transactional memory, log-based transactionalMemory (LogTM), that makes commits fast by storing old values to a per-thread log in cacheable virtual memory and storing new values in place.
Abstract: Transactional memory (TM) simplifies parallel programming by guaranteeing that transactions appear to execute atomically and in isolation. Implementing these properties includes providing data version management for the simultaneous storage of both new (visible if the transaction commits) and old (retained if the transaction aborts) values. Most (hardware) TM systems leave old values "in place" (the target memory address) and buffer new values elsewhere until commit. This makes aborts fast, but penalizes (the much more frequent) commits. In this paper, we present a new implementation of transactional memory, log-based transactional memory (LogTM), that makes commits fast by storing old values to a per-thread log in cacheable virtual memory and storing new values in place. LogTM makes two additional contributions. First, LogTM extends a MOESI directory protocol to enable both fast conflict detection on evicted blocks and fast commit (using lazy cleanup). Second, LogTM handles aborts in (library) software with little performance penalty. Evaluations running micro- and SPLASH-2 benchmarks on a 32-way multiprocessor support our decision to optimize for commit by showing that only 1-2% of transactions abort.

724 citations

Journal ArticleDOI
TL;DR: The basic aspects of quantum error correction and fault-tolerance are examined largely through detailed examples, which are more relevant to experimentalists today and in the near future.
Abstract: Quantum error correction (QEC) and fault-tolerant quantum computation represent one of the most vital theoretical aspects of quantum information processing. It was well known from the early developments of this exciting field that the fragility of coherent quantum systems would be a catastrophic obstacle to the development of large-scale quantum computers. The introduction of quantum error correction in 1995 showed that active techniques could be employed to mitigate this fatal problem. However, quantum error correction and fault-tolerant computation is now a much larger field and many new codes, techniques, and methodologies have been developed to implement error correction for large-scale quantum algorithms. In response, we have attempted to summarize the basic aspects of quantum error correction and fault-tolerance, not as a detailed guide, but rather as a basic introduction. The development in this area has been so pronounced that many in the field of quantum information, specifically researchers who are new to quantum information or people focused on the many other important issues in quantum computation, have found it difficult to keep up with the general formalisms and methodologies employed in this area. Rather than introducing these concepts from a rigorous mathematical and computer science framework, we instead examine error correction and fault-tolerance largely through detailed examples, which are more relevant to experimentalists today and in the near future.

625 citations

Proceedings ArticleDOI
09 Jun 2007
TL;DR: For certain workloads, SigTM can match the performance of a full-featured hardware TM system, while for workloads with large read-sets it can be up to two times slower.
Abstract: We propose signature-accelerated transactional memory (SigTM), ahybrid TM system that reduces the overhead of software transactions. SigTM uses hardware signatures to track the read-set and write-set forpending transactions and perform conflict detection between concurrent threads. All other transactional functionality, including dataversioning, is implemented in software. Unlike previously proposed hybrid TM systems, SigTM requires no modifications to the hardware caches, which reduces hardware cost and simplifies support for nested transactions and multithreaded processor cores. SigTM is also the first hybrid TM system to provide strong isolation guarantees between transactional blocks and non-transactional accesses without additional read and write barriers in non-transactional code.Using a set of parallel programs that make frequent use of coarse-grain transactions, we show that SigTM accelerates software transactions by 30% to 280%. For certain workloads, SigTM can match the performance of a full-featured hardware TM system, while for workloads with large read-sets it can be up to two times slower. Overall, we show that SigTM combines the performance characteristics and strong isolation guarantees of hardware TM implementations with the low cost and flexibility of software TM systems.

340 citations

Proceedings ArticleDOI
09 Jan 2010
TL;DR: An ownership-record-free software transactional memory (STM) system that combines extremely low overhead with unusually clean semantics is presented, and the experience suggests that NOrec may be an ideal candidate for such a software system.
Abstract: Drawing inspiration from several previous projects, we present an ownership-record-free software transactional memory (STM) system that combines extremely low overhead with unusually clean semantics. While unlikely to scale to hundreds of active threads, this "NOrec" system offers many appealing features: very low fast-path latency--as low as any system we know of that admits concurrent updates; publication and privatization safety; livelock freedom; a small, constant amount of global metadata, and full compatibility with existing data structure layouts; no false conflicts due to hash collisions; compatibility with both managed and unmanaged languages, and both static and dynamic compilation; and easy acccommodation of closed nesting, inevitable (irrevocable) transactions, and starvation avoidance mechanisms. To the best of our knowledge, no extant STM system combines this set of features.While transactional memory for processors with hundreds of cores is likely to require hardware support, software implementations will be required for backward compatibility with current and near-future processors with 2--64 cores, as well as for fall-back in future machines when hardware resources are exhausted. Our experience suggests that NOrec may be an ideal candidate for such a software system. We also observe that it has considerable appeal for use within the operating system, and in systems that require both closed nesting and publication safety.

327 citations

Book
02 Jun 2010
TL;DR: This book presents an overview of the state of the art in the design and implementation of transactional memory systems, as of early spring 2010.
Abstract: The advent of multicore processors has renewed interest in the idea of incorporating transactions into the programming model used to write parallel programs. This approach, known as transactional memory, offers an alternative, and hopefully better, way to coordinate concurrent threads. The ACI (atomicity, consistency, isolation) properties of transactions provide a foundation to ensure that concurrent reads and writes of shared data do not produce inconsistent or incorrect results. At a higher level, a computation wrapped in a transaction executes atomically - either it completes successfully and commits its result in its entirety or it aborts. In addition, isolation ensures the transaction produces the same result as if no other transactions were executing concurrently. Although transactions are not a parallel programming panacea, they shift much of the burden of synchronizing and coordinating parallel computations from a programmer to a compiler, to a language runtime system, or to hardware. The challenge for the system implementers is to build an efficient transactional memory infrastructure. This book presents an overview of the state of the art in the design and implementation of transactional memory systems, as of early spring 2010. Table of Contents: Introduction / Basic Transactions / Building on Basic Transactions / Software Transactional Memory / Hardware-Supported Transactional Memory / Conclusions

309 citations