Topic

Transactional memory

About: Transactional memory is a research topic. Over its lifetime, 2,365 publications have been published within this topic, receiving 60,818 citations.


Papers
Proceedings Article
01 Dec 2011
TL;DR: A novel speculative transactional memory architecture, "SPT", is proposed that supports both TLS and TM semantics, including its special hardware, compiler, and execution support, and that further trades off several important factors in multicore architecture design.
Abstract: Combining the benefits of thread-level speculation (TLS) and transactional memory (TM) can effectively enhance the performance of a chip multiprocessor (CMP). This paper proposes a novel speculative transactional memory architecture, "SPT", that supports both TLS and TM semantics, including its special hardware, compiler, and execution support. It further trades off several important factors in the multicore architecture design. The experimental results show that 16 cache lines is the proper speculative buffer capacity and that write-back is the better cache design choice for speculation.
Dissertation
22 Mar 2013
TL;DR: A new STM design, STM2, is proposed, based on an assisted execution model in which time-consuming TM operations are offloaded to auxiliary threads while application threads optimistically perform computation; in addition, subtle transactional data races are discovered in widely used STAMP applications.
Abstract: Chip Multithreading (CMT) processors promise to deliver higher performance by running more than one stream of instructions in parallel. To exploit CMT's capabilities, programmers have to parallelize their applications, which is not a trivial task. Transactional Memory (TM) is one of the parallel programming models that aim to simplify synchronization by raising the level of abstraction between semantic atomicity and the means by which that atomicity is achieved. TM is a promising programming model, but important challenges must still be addressed to make it more practical and efficient in mainstream parallel programming. The first challenge addressed in this dissertation is making the evaluation of TM proposals more solid, with realistic TM benchmarks and the ability to run the same benchmarks on different STM systems. We first introduce RMS-TM, a comprehensive benchmark suite for evaluating HTMs and STMs. RMS-TM consists of seven applications from the Recognition, Mining and Synthesis (RMS) domain that are representative of future workloads. RMS-TM features current TM research issues such as nesting and I/O inside transactions, while also providing a variety of TM characteristics. Most STM systems are implemented as user-level libraries: the programmer is expected to manually instrument not only transaction boundaries but also individual loads and stores within transactions. This library-based approach is increasingly tedious and error-prone, and it also makes reliable performance comparisons difficult. To enable an "apples-to-apples" performance comparison, we then develop a software layer that allows researchers to test the same applications with interchangeable STM back ends. The second challenge addressed is enhancing the performance and scalability of TM applications running on aggressive multi-core/multi-threaded processors. The performance and scalability of current TM designs, in particular STM designs, do not always meet the programmer's expectations, especially at scale. To overcome this limitation, we propose a new STM design, STM2, based on an assisted execution model in which time-consuming TM operations are offloaded to auxiliary threads while application threads optimistically perform computation. Surprisingly, our results show that STM2 provides, on average, speedups between 1.8x and 5.2x over state-of-the-art STM systems. On the other hand, we notice that assisted-execution systems may show low processor utilization. To alleviate this problem and increase the efficiency of STM2, we enrich STM2 with a runtime mechanism that automatically and adaptively detects the computing demands of application and auxiliary threads and dynamically partitions hardware resources between the pair through the hardware thread prioritization mechanism implemented in POWER machines. The third challenge is to define what it means for a TM program to be correctly synchronized. The current definition of transactional data race requires all transactions to be totally ordered "as if" serialized by a global lock, which limits the scalability of TM designs. To remove this constraint, we first propose relaxing the current definition of transactional data race to allow a higher level of concurrency. Based on this definition, we propose the first practical race detection algorithm for C/C++ applications (TRADE) and implement the corresponding race detection tool. Then, we introduce a new definition of transactional data race that is more intuitive, transparent to the underlying TM implementation, and applicable to a broad set of C/C++ TM programs. Based on this new definition, we propose T-Rex, an efficient and scalable race detection tool for C/C++ TM applications. Using TRADE and T-Rex, we have discovered subtle transactional data races in widely used STAMP applications that had not been reported before.
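The assisted-execution idea lends itself to a small illustration. In the sketch below (all names, the versioned-word layout, and the queue-based hand-off are assumptions made for illustration; STM2's actual design, data layout, and commit protocol differ), the application thread logs each transactional read into a shared queue, while an auxiliary thread drains the queue and revalidates the recorded versions, moving validation work off the application thread's critical path.

```cpp
// Hedged sketch of an assisted-execution STM: an auxiliary thread revalidates
// the read set while the application thread keeps computing. Illustration only;
// STM2's real data layout, synchronisation, and commit protocol differ.
#include <atomic>
#include <mutex>
#include <queue>

struct VersionedWord {                  // one transactional memory location
    std::atomic<int> version{0};
    std::atomic<int> value{0};
};

struct ReadEntry { VersionedWord* loc; int seen_version; };

class AssistedTx {
public:
    // Transactional read: record what we saw and let the helper re-check it.
    int read(VersionedWord& w) {
        int ver = w.version.load(std::memory_order_acquire);
        int val = w.value.load(std::memory_order_acquire);
        std::lock_guard<std::mutex> g(m_);
        pending_.push({&w, ver});
        return val;
    }
    // Application thread: make sure every logged read is still consistent.
    bool commit() { drain(); return valid_.load(std::memory_order_acquire); }
    // Auxiliary thread: call repeatedly to take validation off the critical path.
    void helper_step() { drain(); }
private:
    void drain() {
        std::lock_guard<std::mutex> g(m_);
        while (!pending_.empty()) {
            ReadEntry e = pending_.front();
            pending_.pop();
            if (e.loc->version.load(std::memory_order_acquire) != e.seen_version)
                valid_.store(false, std::memory_order_release);  // read set stale
        }
    }
    std::mutex m_;
    std::queue<ReadEntry> pending_;
    std::atomic<bool> valid_{true};
};
```

In a full assisted-execution system the helper would also maintain write logs and participate in contention management; this sketch only offloads read-set validation.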
Posted Content
TL;DR: This paper proposes an algorithm for maintaining a concurrent directed graph that is concurrently updated by threads adding and deleting vertices and edges, under the constraint that the graph always remain acyclic; to the authors' knowledge, it is the first work to propose a concurrent data structure for an adjacency-list representation of graphs.
Abstract: In this paper, we propose an algorithm for maintaining a concurrent directed graph (for shared-memory architectures) that is concurrently being updated by threads adding and deleting vertices and edges. The update methods of the algorithm are deadlock-free, while the contains methods are wait-free. To the best of our knowledge, this is the first work to propose a concurrent data structure for an adjacency-list representation of graphs. We extend the lazy-list implementation of a concurrent set to achieve this. We believe that many applications can benefit from this concurrent graph structure. An important application that inspired us is serialization graph testing (SGT) in databases and transactional memory. Motivated by this application, we pose the constraint on this concurrent graph data structure that the graph should always be acyclic. We ensure this by checking for graph acyclicity whenever we add an edge. To detect cycles efficiently, we propose a wait-free reachability algorithm. We compare the performance of the proposed concurrent data structure with the coarse-grained locking implementation that has traditionally been used to implement SGT, and show that our algorithm achieves, on average, an 8x improvement in throughput over coarse-grained and sequential implementations.
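The add-edge/acyclicity idea can be conveyed with a much-simplified sketch: before inserting edge u→v, check whether u is already reachable from v; if it is, the new edge would close a cycle and is rejected. The sketch below uses one coarse lock and a plain DFS as stand-ins for the paper's deadlock-free lazy-list updates and wait-free reachability.

```cpp
// Simplified sketch: acyclic directed graph guarded by one coarse lock.
// The paper's algorithm uses fine-grained lazy lists and wait-free
// reachability; this only illustrates the add-edge/cycle-check idea.
#include <mutex>
#include <unordered_map>
#include <unordered_set>
#include <vector>

class AcyclicGraph {
public:
    void add_vertex(int v) {
        std::lock_guard<std::mutex> g(m_);
        adj_.try_emplace(v);
    }
    // Returns false if the edge u->v would create a cycle.
    bool add_edge(int u, int v) {
        std::lock_guard<std::mutex> g(m_);
        adj_.try_emplace(u);
        adj_.try_emplace(v);
        if (reachable(v, u)) return false;   // u already reachable from v: cycle
        adj_[u].insert(v);
        return true;
    }
    bool contains_edge(int u, int v) {
        std::lock_guard<std::mutex> g(m_);
        auto it = adj_.find(u);
        return it != adj_.end() && it->second.count(v) != 0;
    }
private:
    bool reachable(int from, int to) {       // iterative DFS
        std::vector<int> stack{from};
        std::unordered_set<int> seen;
        while (!stack.empty()) {
            int x = stack.back();
            stack.pop_back();
            if (x == to) return true;
            if (!seen.insert(x).second) continue;
            for (int y : adj_[x]) stack.push_back(y);
        }
        return false;
    }
    std::mutex m_;
    std::unordered_map<int, std::unordered_set<int>> adj_;
};
```

A coarse-grained version like this is essentially the baseline the authors compare against; their contribution is achieving the same semantics with fine-grained, non-blocking synchronisation.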
Proceedings Article
03 Jan 2023
TL;DR: NBTC is a new methodology for atomic composition of nonblocking operations on concurrent data structures; it makes it easy to transform most nonblocking data structures into transactional counterparts while preserving their nonblocking liveness and high concurrency.
Abstract: We introduce nonblocking transaction composition (NBTC), a new methodology for atomic composition of nonblocking operations on concurrent data structures. Unlike previous software transactional memory (STM) approaches, NBTC leverages the linearizability of existing nonblocking structures, reducing the number of memory accesses that must be executed together, atomically, to only one per operation in most cases (these are typically the linearizing instructions of the constituent operations). Our obstruction-free implementation of NBTC, which we call Medley, makes it easy to transform most nonblocking data structures into transactional counterparts while preserving their nonblocking liveness and high concurrency. In our experiments, Medley outperforms Lock-Free Transactional Transform (LFTT), the fastest prior competing methodology, by 40--170%. The marginal overhead of Medley's transactional composition, relative to separate operations performed in succession, is roughly 2.2×. For persistent memory, we observe that failure atomicity for transactions can be achieved "almost for free" with epoch-based periodic persistence. Toward that end, we integrate Medley with nbMontage, a general system for periodically persistent data structures. The resulting txMontage provides ACID transactions and achieves throughput up to two orders of magnitude higher than that of the OneFile persistent STM system.
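What such composition buys a client can be conveyed with a deliberately naive sketch: two map updates that must take effect together. The version below simulates the atomic commit with a single global mutex over an ordinary std::map (both stand-ins chosen for brevity, not part of the paper); Medley instead tracks each constituent operation's linearizing instruction and commits obstruction-free over genuinely nonblocking structures.

```cpp
// Naive stand-in for transactional composition: two updates made atomic with a
// global mutex. Medley achieves the same client-visible atomicity without
// blocking, by deferring only each operation's linearizing instruction.
#include <map>
#include <mutex>

std::mutex commit_lock;              // stand-in for a real commit protocol
std::map<int, long> accounts;        // stand-in for a nonblocking map

// Move `amount` from one account to another, atomically, or do nothing.
bool transfer(int from, int to, long amount) {
    std::lock_guard<std::mutex> g(commit_lock);
    auto it = accounts.find(from);
    if (it == accounts.end() || it->second < amount)
        return false;                // insufficient funds: whole transaction aborts
    it->second -= amount;
    accounts[to] += amount;          // both writes become visible together
    return true;
}
```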
Posted Content
30 Jul 2022
TL;DR: TMS2-RA is a relaxed operational transactional memory (TM) specification that provides a formal semantics for TM libraries and their clients; it can be implemented by a C11 library, TML-RA, that uses relaxed and release-acquire atomics.
Abstract: Transactional memory (TM) is an intensively studied synchronisation paradigm with many proposed implementations in software and hardware, and combinations thereof. However, TM under relaxed memory, e.g., C11 (the 2011 C/C++ standard) is still poorly understood, lacking rigorous foundations that support verifiable implementations. This paper addresses this gap by developing TMS2-RA, a relaxed operational TM specification. We integrate TMS2-RA with RC11 (the repaired C11 memory model that disallows load-buffering) to provide a formal semantics for TM libraries and their clients. We develop a logic, TARO, for verifying client programs that use TMS2-RA for synchronisation. We also show how TMS2-RA can be implemented by a C11 library, TML-RA, that uses relaxed and release-acquire atomics, yet guarantees the synchronisation properties required by TMS2-RA. We benchmark TML-RA and show that it outperforms its sequentially consistent counterpart in the STAMP benchmarks. Finally, we use a simulation-based verification technique to prove correctness of TML-RA. Our entire development is supported by the Isabelle/HOL proof assistant.
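The name TML-RA suggests a Transactional Mutex Lock (TML) design expressed with release-acquire atomics: a single global even/odd counter serialises writers while readers validate against it. A minimal TML-style sketch in C++ atomics is shown below; the precise RC11 orderings, and the proof that they suffice, are exactly what the paper's Isabelle/HOL development establishes, so treat the orderings here as illustrative assumptions.

```cpp
// Minimal TML-style (Transactional Mutex Lock) sketch with acquire/release
// atomics. Even glb: no active writer; odd glb: a writer holds the lock.
// Orderings here are illustrative; TML-RA's exact placement is in the paper.
#include <atomic>

std::atomic<unsigned> glb{0};
thread_local unsigned loc;           // snapshot of glb taken at transaction begin

void tx_begin() {
    do { loc = glb.load(std::memory_order_acquire); } while (loc & 1u);
}

// Returns false if the transaction must abort and restart from tx_begin().
bool tx_read(const std::atomic<int>& addr, int& out) {
    out = addr.load(std::memory_order_acquire);
    return glb.load(std::memory_order_acquire) == loc;   // snapshot still valid?
}

bool tx_write(std::atomic<int>& addr, int val) {
    if (!(loc & 1u)) {               // first write: try to become the single writer
        unsigned expected = loc;
        if (!glb.compare_exchange_strong(expected, loc + 1,
                                         std::memory_order_acq_rel))
            return false;            // lost the race: abort and restart
        ++loc;                       // loc now odd: we own the write lock
    }
    addr.store(val, std::memory_order_release);
    return true;
}

void tx_commit() {
    if (loc & 1u)
        glb.store(loc + 1, std::memory_order_release);    // even again: release
}
```

In this scheme a read-only transaction never writes glb, so readers commit with nothing beyond their validation loads, which is what makes TML-style designs attractive for read-dominated workloads.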

Network Information
Related Topics (5)
Compiler: 26.3K papers, 578.5K citations, 87% related
Cache: 59.1K papers, 976.6K citations, 86% related
Parallel algorithm: 23.6K papers, 452.6K citations, 84% related
Model checking: 16.9K papers, 451.6K citations, 84% related
Programming paradigm: 18.7K papers, 467.9K citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    16
2022    40
2021    29
2020    63
2019    70
2018    88