
Transactional memory

About: Transactional memory is a research topic. Over the lifetime, 2365 publications have been published within this topic receiving 60818 citations.


Papers
Proceedings Article
01 Jan 2012
TL;DR: Preliminary evaluation results indicate that the proposed FastLane approach provides promising performance at low thread counts: FastLane almost systematically wins over a classical STM in the 2-4 threads range, and often performs better than sequential execution.
Abstract: Software transactional memory (STM) can lead to scalable implementations of concurrent programs, as the relative performance of an application increases with the number of threads that support it. However, the absolute performance is typically impaired by the overheads of transaction management and instrumented accesses to shared memory. This often leads an STM-based program with a low thread count to perform worse than a sequential, non-instrumented version of the same application. We propose FastLane, a new STM system that bridges the performance gap between sequential execution and classical STM algorithms when running on few cores (see Figure 1). FastLane seeks to reduce instrumentation costs and thus performance degradation in its target operation range. We introduce a family of algorithms that differentiate between two types of threads: one thread (the master) is allowed to commit transactions without aborting, thus with minimal instrumentation and management costs and at nearly sequential speed, while other threads (the helpers) execute speculatively. Helpers typically run slower than STM threads, as they should contribute to the application's progress without impairing the performance of the master (in particular, helpers never cause aborts for the master's transactions), in addition to performing the extra bookkeeping associated with memory accesses. FastLane is implemented within a state-of-the-art STM runtime and compiler. Multiple code paths are generated for execution: sequential on a single core, FastLane (master and helper) for few cores, and STM for many cores. Applications can dynamically select a variant at runtime, depending on the number of cores available for execution. Preliminary evaluation results indicate that our approach provides promising performance at low thread counts: FastLane almost systematically wins over a classical STM in the 2-4 threads range, and often performs better than sequential execution.
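The master/helper split described above can be sketched in a few lines. This is a toy illustration of the idea under assumed semantics, not the paper's actual algorithm: the master commits with no validation and minimal bookkeeping, while helpers snapshot shared state, speculate, and retry whenever the master committed in the meantime. The class and method names are hypothetical.

```python
import threading

class FastLaneSketch:
    """Toy master/helper scheme (illustration only, not the paper's algorithm).

    The master commits unconditionally; helpers validate against a commit
    counter and retry if the master committed during their transaction, so
    helpers can never cause the master to abort."""

    def __init__(self):
        self.commit_counter = 0          # bumped on every master commit
        self.lock = threading.Lock()     # serializes commits to the store
        self.store = {}

    def master_tx(self, updates):
        # Master path: no read-set validation, near-sequential speed.
        with self.lock:
            self.store.update(updates)
            self.commit_counter += 1

    def helper_tx(self, fn):
        # Helper path: speculate on a snapshot, then commit only if the
        # master did not commit in between; otherwise abort and retry.
        while True:
            start = self.commit_counter
            reads = dict(self.store)     # speculative snapshot
            writes = fn(reads)           # compute tentative writes
            with self.lock:
                if self.commit_counter == start:
                    self.store.update(writes)
                    return
            # master committed concurrently: abort, re-execute
```

Note the asymmetry: only master commits bump the counter, so helper transactions never invalidate each other through it, matching the design goal that helpers contribute progress without slowing the master.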
Posted Content
TL;DR: It is shown that, because of the formal properties of RMMTs, HTM is a good fit for adding concurrency to otherwise slow lock-based alternatives, and that HTM performs better than locks when the number of write operations increases, making it a practical structure to use in several write-intensive contexts.
Abstract: Succinct trees, such as wavelet trees and those based on, for instance, range Min-Max trees (RMMTs), are a family of practical data structures that store information close to their information-theoretic space lower bound. These structures are often static, meaning that once they are built, nodes cannot be added, deleted or modified. This read-only property simplifies concurrency. However, newer versions of these data structures allow for a fair degree of dynamism. Hardware Transactional Memory (HTM) has been available in mainstream microprocessors for several years. One remaining limitation of HTM is the size of each transaction; this is why its use, for the moment, is limited to operations that involve few memory addresses that need to be updated atomically, or where the level of concurrency is low. We provide the first available implementation of a concurrent, dynamic RMMT based on HTM, and we compare empirically how well HTM performs against a naive implementation using locks. We show that because of the formal properties of RMMTs, HTM is a good fit for adding concurrency to otherwise slow lock-based alternatives. We also show that HTM performs better than locks when the number of write operations increases, making it a practical structure to use in several write-intensive contexts. This is, as far as we know, the only practical implementation of RMMTs thoroughly tested using HTM.
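HTM-based structures like this one are typically deployed with a retry-then-fallback pattern: attempt the update a few times as a hardware transaction, and fall back to a coarse lock when the transaction keeps aborting (for example, because it exceeds the hardware capacity limit mentioned above). A minimal sketch of that pattern, with a callable standing in for the `_xbegin`/`_xend` hardware primitives, which cannot be expressed in Python:

```python
import threading

FALLBACK_RETRIES = 3              # transactional attempts before giving up
fallback_lock = threading.Lock()  # guaranteed-progress slow path

def update_with_htm(structure, op, try_transaction):
    """Retry-then-fallback pattern typical of HTM code (illustrative).

    try_transaction(structure, op) stands in for a real hardware
    transaction; it returns True if the transaction committed and
    False if it aborted."""
    for _ in range(FALLBACK_RETRIES):
        if try_transaction(structure, op):
            return "htm"              # fast path: committed transactionally
    with fallback_lock:               # slow path: serialize, always succeeds
        op(structure)
    return "lock"
```

The fallback lock is what keeps the scheme live: a transaction that can never fit in hardware (e.g. a large RMMT rebalance) still completes, just without concurrency.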
Book ChapterDOI
28 Jun 2013
TL;DR: The concept of multi-thread speculation is combined with transactional memory into a new model in which all dynamic data is managed by the DRDM, which can manage data and resolve incorrect data accesses among threads efficiently.
Abstract: In this paper, we propose a new Distributed Run-Time Dynamic Data Manager (DRDM) to manage dynamic data among parallel threads and to efficiently handle the incorrect data accesses caused by parallel execution. We also combine the concept of multi-thread speculation with transactional memory into a new model in which all dynamic data is managed by the DRDM. The DRDM detects incorrect data accesses immediately and resolves them to keep data consistent among threads, ensuring that threads do not violate data dependences during execution. We demonstrate that parallel applications running with the DRDM are at least 1.4 times faster than their sequential counterparts, showing that the DRDM can manage data and resolve incorrect data accesses among threads efficiently.
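At its core, the violation such a manager must catch is a data dependence broken by speculation: a speculative thread has already read a location that a logically earlier thread subsequently writes. A minimal, hypothetical check over read and write sets (the paper's actual mechanism is distributed and more elaborate than this):

```python
def must_squash(speculative_reads, earlier_writes):
    """A speculative thread violates a data dependence, and must be
    squashed and re-executed, if any address it has already read is
    later written by a logically earlier thread.

    Both arguments are sets of addresses (any hashable keys here)."""
    return not speculative_reads.isdisjoint(earlier_writes)
```

A run-time manager applies this check on every commit by an earlier thread, re-executing only the speculative threads whose read sets actually overlap the committed writes.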
Proceedings ArticleDOI
01 Sep 2015
TL;DR: This paper proposes Explicit Bit Barriers (EBB), a novel approach for fast synchronization between the mutator and HTM-encapsulated relocation tasks, and compares the efficiency of EBBs with read barriers based on virtual memory that rely on OS-level trap handlers.
Abstract: Multicore architectures offer a convenient way to unlock concurrency between the application (the mutator) and the garbage collector, yet efficient synchronization between the two by means of barriers is critical to exploiting this concurrency. Hardware Transactional Memory (HTM), now commercially available, opens up new ways for synchronization with dramatically lower overhead for the mutator. Unfortunately, HTM-based schemes proposed to date either require specialized hardware support or impose severe overhead through invocation of OS-level trap handlers. This paper proposes Explicit Bit Barriers (EBB), a novel approach for fast synchronization between the mutator and HTM-encapsulated relocation tasks. We compare the efficiency of EBBs with read barriers based on virtual memory that rely on OS-level trap handlers. We show that EBBs are nearly as efficient as those needing specialized hardware, but run on commodity Intel processors with TSX extensions.
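The essence of an explicit bit barrier is that the mutator's fast path is just a bit test: if an object's relocation bit is clear, the object is accessed directly, and only when the bit is set does the mutator take a slow path that coordinates with the relocation task. A toy sketch with hypothetical names; in the real scheme relocation is encapsulated in an HTM transaction, whereas here a forwarding table stands in:

```python
class ExplicitBitBarrier:
    """Toy read barrier: a per-object bit guards a forwarding slow path
    (illustration of the idea, not the paper's implementation)."""

    def __init__(self, objects):
        self.objects = list(objects)
        self.relocating = [False] * len(objects)  # the explicit bits
        self.forwarded = {}                       # filled by the collector

    def read(self, i):
        if self.relocating[i]:
            # Rare slow path: object is in flight; follow the forwarding
            # entry installed by the relocation task, if any.
            return self.forwarded.get(i, self.objects[i])
        # Common fast path: one bit test, then a direct access.
        return self.objects[i]
```

The contrast with virtual-memory read barriers is that the slow path here is an ordinary branch in user code rather than a page fault routed through an OS-level trap handler.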
Proceedings ArticleDOI
04 Oct 2015
TL;DR: Txit greatly reduces the set of possible interleavings by inserting transactions into the implementation of a lock-free data structure, leveraging hardware transactional memory support from Intel Haswell processors to enforce these artificial transactions.
Abstract: Among all classes of parallel programming abstractions, lock-free data structures are considered one of the most scalable and efficient thanks to their fine-grained style of synchronization. However, they are also challenging for developers and tools to verify because of the huge number of possible interleavings that result from fine-grained synchronizations. This paper addresses this fundamental problem between performance and verifiability of lock-free data structure implementations. We present Txit, a system that greatly reduces the set of possible interleavings by inserting transactions into the implementation of a lock-free data structure. We leverage hardware transactional memory support from Intel Haswell processors to enforce these artificial transactions. Evaluation on six popular lock-free data structure libraries shows that Txit makes it easy to verify lock-free data structures while incurring acceptable runtime overhead. Further analysis shows that two inefficiencies in Haswell are the largest contributors to this overhead.
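The effect Txit exploits can be mimicked with a wrapper that runs each operation of a lock-free structure inside one atomic region, so that interleavings are only possible at region boundaries rather than between every fine-grained step. In this sketch a global lock stands in for the Haswell HTM transactions Txit actually uses, and the names are illustrative:

```python
import threading

_tx_lock = threading.Lock()  # stand-in for an HTM transaction

def transactify(op):
    """Wrap an operation so it executes atomically: other wrapped
    operations can no longer interleave with its individual steps,
    shrinking the state space a verifier must explore."""
    def wrapped(*args, **kwargs):
        with _tx_lock:
            return op(*args, **kwargs)
    return wrapped
```

With HTM instead of a lock, independent transactions still run concurrently and only conflicting ones serialize, which is why the runtime overhead stays acceptable.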

Network Information
Related Topics (5)
Compiler: 26.3K papers, 578.5K citations (87% related)
Cache: 59.1K papers, 976.6K citations (86% related)
Parallel algorithm: 23.6K papers, 452.6K citations (84% related)
Model checking: 16.9K papers, 451.6K citations (84% related)
Programming paradigm: 18.7K papers, 467.9K citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  16
2022  40
2021  29
2020  63
2019  70
2018  88