Topic

Transactional memory

About: Transactional memory is a research topic. Over the lifetime, 2365 publications have been published within this topic receiving 60818 citations.


Papers
Journal ArticleDOI
01 Nov 2020
TL;DR: This article develops two novel and provably correct LL/SC emulation schemes, implements them in the Synopsys DesignWare ARC nSIM DBT system, and evaluates them against full applications and targeted microbenchmarks.
Abstract: Dynamic binary translation (DBT) requires the implementation of load-link/store-conditional (LL/SC) primitives for guest systems that rely on this form of synchronization. When targeting, e.g., x86 host systems, LL/SC guest instructions are typically emulated using atomic compare-and-swap (CAS) instructions on the host. Whilst this direct mapping is efficient, it is problematic due to subtle differences between LL/SC and CAS semantics. In this article, we demonstrate that this is a real problem, and we provide code examples that fail to execute correctly on QEMU and a commercial DBT system, both of which use the CAS approach to LL/SC emulation. We then develop two novel and provably correct LL/SC emulation schemes: 1) a purely software-based scheme, which uses the DBT system’s page translation cache to correctly select between fast but unsynchronized and slow but fully synchronized memory accesses, and 2) a hardware-accelerated scheme that leverages hardware transactional memory (HTM) provided by the host. We have implemented these two schemes in the Synopsys DesignWare ARC nSIM DBT system, and we evaluate our implementations against full applications and targeted microbenchmarks. We demonstrate that our novel schemes are not only correct but also deliver competitive performance, on par with or better than the widely used but broken CAS scheme.
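To make the LL/SC-versus-CAS semantic gap concrete, below is a minimal, single-threaded sketch (not the paper's implementation; the emu_ll/emu_sc helpers are hypothetical) of the common CAS-based emulation and the ABA case in which it wrongly succeeds:

```c
/* Sketch only: emulating a guest LL/SC pair with a host compare-and-swap,
 * as a DBT commonly does. The hazard: CAS succeeds whenever the *value*
 * is unchanged, whereas a real SC must fail if the location was written
 * at all since the LL (the ABA problem). */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint32_t mem;     /* emulated guest memory word    */
static uint32_t         linked;  /* value observed by the last LL */

/* Emulated load-linked: remember the observed value. */
static uint32_t emu_ll(void) {
    linked = atomic_load(&mem);
    return linked;
}

/* Emulated store-conditional via CAS: succeeds if the current value
 * still equals the linked value, even after intervening writes. */
static bool emu_sc(uint32_t new_val) {
    uint32_t expected = linked;
    return atomic_compare_exchange_strong(&mem, &expected, new_val);
}

int main(void) {
    atomic_store(&mem, 42);
    uint32_t v = emu_ll();       /* LL observes 42 */

    /* Interfering writes (as if by another thread): 42 -> 7 -> 42.
     * A genuine SC must now fail; the CAS-based emulation succeeds. */
    atomic_store(&mem, 7);
    atomic_store(&mem, 42);

    printf("SC %s (a real SC would fail here)\n",
           emu_sc(v + 1) ? "succeeded" : "failed");
    return 0;
}
```

The paper's schemes address this by selecting fully synchronized slow-path accesses via the page translation cache, or by leveraging the host's hardware transactional memory.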

3 citations

Dissertation
01 Jan 2007
TL;DR: UTM, a hardware transactional memory system allowing unbounded virtualizable transactions, is presented, and it is shown how a hybrid of the software and hardware systems can be obtained that combines the benefits of both.
Abstract: Transactions are gaining ground as a programmer-friendly means of expressing concurrency, as microarchitecture trends make it clear that parallel systems are in our future. This thesis presents the design and implementation of four efficient and powerful transaction systems: APEX, an object-oriented software-only system; UTM and LTM, two scalable systems using custom processor extensions; and HYAPEX, a hybrid of the software and hardware systems, obtaining the benefits of both. The software transaction system implements strong atomicity, which ensures that transactions are protected from the influence of nontransactional code. Previous software systems use weaker atomicity guarantees because strong atomicity is presumed to be too expensive. In this thesis strong atomicity is obtained with minimal slowdown for nontransactional code. Compiler analyses can further improve the efficiency of the mechanism, which has been formally verified with the SPIN model checker. The low overhead of APEX allows it to be profitably combined with a hardware transaction system to provide fast execution of short and small transactions, while allowing fallback to software for large or complicated transactions. I present UTM, a hardware transactional memory system allowing unbounded virtualizable transactions, and show how a hybrid system can be obtained.
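As a rough illustration of the strong-atomicity guarantee described above (this is not APEX or UTM code; it uses GCC's experimental transactional memory extension, compiled with -fgnu-tm, purely as an example), nontransactional code behaves as if it were a tiny transaction and therefore cannot observe a transaction's intermediate state:

```c
/* Illustration of strong atomicity, not APEX/UTM code.
 * Invariant maintained by the transaction: x + y == 100.
 * Strong atomicity: the plain, nontransactional read in
 * nontransactional_sum() can never observe the state where x has been
 * updated but y has not. Under weak atomicity it could. */
#include <stdio.h>

static int x = 60, y = 40;       /* invariant: x + y == 100 */

static void transfer(int n) {
    __transaction_atomic {       /* GCC -fgnu-tm atomic transaction */
        x -= n;
        y += n;                  /* invariant restored before commit */
    }
}

/* Runs outside any transaction, e.g. on another thread. */
static int nontransactional_sum(void) {
    return x + y;                /* strong atomicity: always 100 */
}

int main(void) {
    transfer(10);
    printf("sum = %d\n", nontransactional_sum());
    return 0;
}
```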

3 citations

Proceedings ArticleDOI
08 Dec 2010
TL;DR: This paper proposes a scheme for automatic detection of view access in the View-Oriented Parallel Programming (VOPP) model, and demonstrates that the performance of Maotai 3.0 surpasses that of transactional memory models such as TL-2.
Abstract: This paper proposes a scheme for automatic detection of view access in the View-Oriented Parallel Programming (VOPP) model. VOPP is a shared-memory-based, data-centric model that uses “views” to bundle mutual exclusion with data access. With the automatic detection scheme, a view is automatically acquired when first accessed and automatically released at the proper time. This scheme simplifies the VOPP model and prevents programming errors, making the programmability of VOPP similar to that of transactional memory models. In addition, VOPP can eliminate data races without compromising performance. A new VOPP implementation, Maotai 3.0, has been developed that incorporates the above features. Experimental results demonstrate that the performance of Maotai 3.0 surpasses that of transactional memory models such as TL-2.
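To show concretely what bundling mutual exclusion with data access means, here is a minimal sketch; the view_t type and explicit locking below are illustrative assumptions, not Maotai's API, and under the paper's automatic-detection scheme even the explicit acquire and release would disappear from user code:

```c
/* Illustrative sketch of a VOPP-style "view"; not Maotai's API.
 * The lock lives inside the view it protects, so mutual exclusion is
 * bundled with the data. With automatic view-access detection, the
 * runtime would acquire the view on a thread's first access and
 * release it at a proper later point, removing the explicit calls. */
#include <pthread.h>
#include <stdio.h>

typedef struct {
    pthread_mutex_t lock;   /* mutual exclusion bundled with the data */
    long balance;           /* data owned by this view                */
} view_t;

static view_t account = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Explicit style: every access to the view is bracketed. */
static void deposit(view_t *v, long amount) {
    pthread_mutex_lock(&v->lock);    /* acquire the view */
    v->balance += amount;            /* access its data  */
    pthread_mutex_unlock(&v->lock);  /* release the view */
}

int main(void) {
    deposit(&account, 100);
    printf("balance = %ld\n", account.balance);
    return 0;
}
```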

3 citations

Posted Content
TL;DR: In this article, the performance power of software combining in designing persistent algorithms and data structures is studied, and recoverable data structures such as stacks and queues are built on PBcomb, as well as on PWFcomb, a wait-free universal construction.
Abstract: We study the performance power of software combining in designing persistent algorithms and data structures. We present Bcomb, a new blocking, highly efficient combining protocol, and build upon it to get PBcomb, a persistent version that performs a small number of persistence instructions and exhibits low synchronization cost. We build fundamental recoverable data structures, such as stacks and queues, based on PBcomb, as well as on PWFcomb, a wait-free universal construction we present. Our experiments show that PBcomb and PWFcomb outperform by far state-of-the-art recoverable universal constructions and transactional memory systems, many of which ensure weaker consistency properties than our algorithms. Our recoverable queues and stacks based on PBcomb and PWFcomb have much better performance than previous recoverable implementations of stacks and queues. We also build the first recoverable implementation of a concurrent heap and present experiments showing that it performs well when the size of the heap is not very large.
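For background on the technique, the following is a minimal sketch of the general software-combining pattern (in the spirit of flat combining) that protocols such as PBcomb build on; it is not the paper's algorithm, and the persistence instructions a PBcomb-style combiner would issue are only indicated by a comment:

```c
/* Sketch of software combining, not the paper's PBcomb protocol.
 * Each thread announces its operation in a per-thread slot; whichever
 * thread acquires the combiner lock applies all announced operations
 * in one pass over the slots. A persistent variant would flush the
 * resulting state with a small number of persistence instructions. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NTHREADS 4

typedef struct {
    _Atomic int pending;   /* 1 while the announced op awaits a combiner */
    int arg;               /* amount to add to the shared counter        */
} announce_t;

static announce_t slots[NTHREADS];
static pthread_mutex_t combiner_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;      /* the shared data structure */

static void combine_add(int me, int amount) {
    slots[me].arg = amount;
    atomic_store(&slots[me].pending, 1);      /* announce the operation */

    for (;;) {
        if (pthread_mutex_trylock(&combiner_lock) == 0) {
            /* I am the combiner: apply every announced operation. */
            for (int i = 0; i < NTHREADS; i++) {
                if (atomic_load(&slots[i].pending)) {
                    shared_counter += slots[i].arg;
                    atomic_store(&slots[i].pending, 0);
                }
            }
            /* A persistent combiner would flush shared_counter here. */
            pthread_mutex_unlock(&combiner_lock);
            return;
        }
        if (!atomic_load(&slots[me].pending))
            return;    /* another combiner already applied my operation */
    }
}

static void *worker(void *arg) {
    combine_add((int)(long)arg, 1);
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    printf("counter = %ld (expected %d)\n", shared_counter, NTHREADS);
    return 0;
}
```

In a persistent setting, the appeal of combining is that one combiner can persist state once per batch rather than once per operation.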

3 citations

Proceedings Article
Ketil Malde1
01 Jan 2013
TL;DR: This work investigates STM in the context of genome assembly, and demonstrates that a program using STM is able to successfully parallelize the genome scaffolding process with a near linear speedup.
Abstract: Parallel programs are key to exploiting the performance of modern computers, but traditional facilities for synchronizing threads of execution are notoriously difficult to use correctly, especially for problems with a non-trivial structure. Software transactional memory is a different approach to managing the complexity of interacting threads. By eliminating locking, many of the complexities of concurrency are removed, and the resulting programs are composable, which simplifies refactoring and other modifications. Here, we investigate STM in the context of genome assembly, and demonstrate that a program using STM is able to successfully parallelize the genome scaffolding process with a near-linear speedup.
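The composability point can be illustrated with a short sketch; the paper's own code is not shown here, so this example uses GCC's experimental transactional memory extension (compile with gcc -fgnu-tm) purely as a stand-in, and the contig counters are hypothetical:

```c
/* Illustration of STM composability, not the paper's code.
 * Two independently atomic operations compose into one larger atomic
 * operation simply by nesting them inside an outer transaction --
 * there is no lock ordering to reason about. */
#include <stdio.h>

static long contig_a = 10, contig_b = 5;   /* hypothetical shared counters */

__attribute__((transaction_safe))
static void add_a(long n) { __transaction_atomic { contig_a += n; } }

__attribute__((transaction_safe))
static void add_b(long n) { __transaction_atomic { contig_b += n; } }

/* Composed operation: move n units from a to b as one atomic step. */
static void move_a_to_b(long n) {
    __transaction_atomic {
        add_a(-n);
        add_b(n);
    }
}

int main(void) {
    move_a_to_b(3);
    printf("a=%ld b=%ld\n", contig_a, contig_b);
    return 0;
}
```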

3 citations


Network Information
Related Topics (5)
Compiler: 26.3K papers, 578.5K citations (87% related)
Cache: 59.1K papers, 976.6K citations (86% related)
Parallel algorithm: 23.6K papers, 452.6K citations (84% related)
Model checking: 16.9K papers, 451.6K citations (84% related)
Programming paradigm: 18.7K papers, 467.9K citations (83% related)
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  16
2022  40
2021  29
2020  63
2019  70
2018  88