Topic

Transactional memory

About: Transactional memory is a research topic. Over its lifetime, 2,365 publications have been published within this topic, receiving 60,818 citations.


Papers
Proceedings ArticleDOI
21 Jan 2009
TL;DR: This paper is the first to formally define the progress semantics of lock-based TMs, which are considered the most effective in practice, and uses this semantics to reduce the problems of reasoning about the correctness and computability power of lock-based TMs to those of simple try-lock objects.
Abstract: Transactional memory (TM) is a promising paradigm for concurrent programming. While the number of TM implementations is growing, little research has been conducted to precisely define TM semantics, especially their progress guarantees. This paper is the first to formally define the progress semantics of lock-based TMs, which are considered the most effective in practice. We use our semantics to reduce the problems of reasoning about the correctness and computability power of lock-based TMs to those of simple try-lock objects. More specifically, we prove that checking the progress of any set of transactions accessing an arbitrarily large set of shared variables can be reduced to verifying a simple property of each individual (logical) try-lock used by those transactions. We use this theorem to determine the correctness of state-of-the-art lock-based TMs and highlight various configuration ambiguities. We also prove that lock-based TMs have consensus number 2. This means that, on the one hand, a lock-based TM cannot be implemented using only read-write memory, but, on the other hand, it does not need very powerful instructions such as the commonly used compare-and-swap. We finally use our semantics to formally capture an inherent trade-off in the performance of lock-based TM implementations. Namely, we show that the space complexity of every lock-based software TM implementation that uses invisible reads is at least exponential in the number of objects accessible to transactions.

73 citations
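A minimal, hedged sketch of the try-lock reduction described in the abstract above (not the paper's formal model): each shared word is guarded by a logical try-lock, so reasoning about the progress of whole transactions reduces to reasoning about each individual try-lock. The type and member names below are illustrative assumptions.

```cpp
// Sketch only: a word-based STM in which every shared word carries a
// per-word "logical try-lock". A transaction that fails to acquire a
// try-lock aborts instead of blocking, mirroring the obstruction style
// of lock-based TMs discussed in the paper. Names are assumptions.
#include <mutex>
#include <vector>

struct TxWord {
    std::mutex lock;     // the per-word logical try-lock
    int        value = 0;
};

struct Transaction {
    std::vector<TxWord*> writeSet;   // try-locks currently held

    // Attempt to acquire the word's try-lock; failure means a conflict,
    // so the transaction aborts rather than waiting.
    bool write(TxWord& w, int v) {
        if (!w.lock.try_lock()) { abortTx(); return false; }
        writeSet.push_back(&w);
        w.value = v;                 // eager in-place update, for brevity
        return true;
    }

    void commit() { release(); }

private:
    void abortTx() { release(); }    // a real STM would also undo its writes
    void release() {
        for (TxWord* w : writeSet) w->lock.unlock();
        writeSet.clear();
    }
};
```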

Proceedings ArticleDOI
15 Sep 2007
TL;DR: This study reports the authors' experience implementing a realistic application using transactional memory (TM), evaluates the exploitable parallelism of a transactional parallel implementation, and explores how it can be adapted to deliver better performance.
Abstract: Transactional memory proposes an alternative synchronization primitive to traditional locks. Its promise is to simplify the software development of multi-threaded applications while at the same time delivering the performance of parallel applications that use (complex and error-prone) fine-grain locking. This study reports our experience implementing a realistic application using transactional memory (TM). The application is Lee's routing algorithm, selected for its abundance of parallelism and the difficulty of expressing that parallelism with locks. Each route between a source and a destination point in a grid can be considered a unit of parallelism. Starting from this simple approach, we evaluate the exploitable parallelism of a transactional parallel implementation and explore how it can be adapted to deliver better performance. The adaptations neither introduce locks nor alter the essence of the implemented algorithm, yet they deliver up to 20 times more parallelism. The adaptations are derived from understanding the application itself and TM. The evaluation simulates an abstracted TM system; the results are therefore independent of any specific software or hardware TM implementation and describe properties of the application.

73 citations
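A rough sketch of the route-as-unit-of-parallelism idea from the abstract above, under stated assumptions: the Point and Grid types are placeholders, a std::mutex stands in for the TM transaction, and the expansion/backtracking phases of Lee's algorithm are elided.

```cpp
// Illustrative sketch, not the paper's code: each source/destination pair is
// routed by an independent task, and only the writes that lay a route into
// the shared grid are performed inside an atomic region.
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Point = std::pair<int, int>;
struct Grid { std::vector<int> cells; int width = 0; };

std::mutex txStandIn;   // placeholder for "begin/commit transaction"

// Lay one route; expansion and backtracking are elided in this sketch.
void routeOne(Grid& grid, Point src, Point dst) {
    std::vector<Point> path = /* expansion + backtrack from src to dst */ {src, dst};
    std::lock_guard<std::mutex> tx(txStandIn);            // transactional region
    for (Point p : path)
        grid.cells[p.second * grid.width + p.first] = 1;  // occupy grid cells
}

void routeAll(Grid& grid, const std::vector<std::pair<Point, Point>>& jobs) {
    std::vector<std::thread> workers;
    for (auto& [src, dst] : jobs)
        workers.emplace_back(routeOne, std::ref(grid), src, dst);
    for (auto& t : workers) t.join();
}
```

Keeping the expensive expansion phase outside the atomic region and committing only the final cell writes is one way such an implementation could expose more parallelism, in the spirit of the adaptations the paper describes.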

Proceedings ArticleDOI
18 Feb 2007
TL;DR: This paper presents ATLAS, the first prototype for CMPs with hardware support for Transactional Memory (TM), a technology aiming to simplify parallel programming, and addresses issues such as overall performance, the challenges of mapping ASIC-style CMP RTL onto FPGAs, software support, and the selection criteria for the base processor.
Abstract: Chip-multiprocessors are quickly gaining momentum in all segments of computing. However, the practical success of CMPs strongly depends on addressing the difficulty of multithreaded application development. To address this challenge, it is necessary to co-develop new CMP architectures with novel programming models. Currently, architecture research relies on software simulators, which are too slow to facilitate interesting experiments with CMP software without using small datasets or significantly reducing the level of detail in the simulated models. An alternative to simulation is to exploit the rich capabilities of modern FPGAs to create FPGA-based platforms for novel CMP research. This paper presents ATLAS, the first prototype for CMPs with hardware support for Transactional Memory (TM), a technology aiming to simplify parallel programming. ATLAS uses the BEE2 multi-FPGA board to provide a system with 8 PowerPC cores that run at 100 MHz, and the system runs Linux. ATLAS provides significant benefits for CMP research, such as a 100x performance improvement over a software simulator and good visibility that helps with software tuning and architectural improvements. In addition to presenting and evaluating ATLAS, we share our observations about building an FPGA-based framework for CMP research. Specifically, we address issues such as overall performance, the challenges of mapping ASIC-style CMP RTL onto FPGAs, software support, the selection criteria for the base processor, and the challenges of using pre-designed IP libraries.

72 citations

Proceedings ArticleDOI
09 Jan 2010
TL;DR: This work proposes, implements and evaluates several novel kernel-level scheduling support mechanisms for TM contention management and introduces kernel-level TM scheduling support into both the Linux and Solaris kernels; it is believed to be the first to investigate kernel-level support for TM contention management.
Abstract: Transactional Memory (TM) is considered one of the most promising paradigms for developing concurrent applications. TM has been shown to scale well on multiple cores when the data access pattern behaves "well," i.e., when few conflicts are induced. In contrast, data patterns with frequent write sharing, with long transactions, or with many threads contending for a smaller number of cores result in numerous conflicts. Until recently, TM implementations had little control over transactional threads, which remained under the supervision of the kernel's transaction-ignorant scheduler. Conflicts are thus traditionally resolved by consulting an STM-level contention manager. Consequently, the contention managers of these "conventional" TM implementations suffer from a lack of precision and often fail to ensure reasonable performance in high-contention workloads. Recently, scheduling-based TM contention management has been proposed for increasing TM efficiency under high contention [2, 5, 19]. However, only user-level schedulers have been considered. In this work, we propose, implement and evaluate several novel kernel-level scheduling support mechanisms for TM contention management. We also investigate different strategies for efficient communication between the kernel and the user-level TM library. To the best of our knowledge, our work is the first to investigate kernel-level support for TM contention management. We have introduced kernel-level TM scheduling support into both the Linux and Solaris kernels. Our experimental evaluation demonstrates that lightweight kernel-level scheduling support significantly reduces the number of aborts while improving transaction throughput on various workloads.

72 citations
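A user-level sketch of the scheduling-based contention-management idea from the abstract above (assumed retry API, not the paper's kernel mechanism): after a few aborts, a transaction yields to the scheduler instead of retrying immediately, which is the decision the paper moves into the Linux and Solaris kernels.

```cpp
// Hedged sketch of scheduling-based contention management in user space.
// runTx is an assumed callback that attempts the transaction body and
// returns false on abort.
#include <chrono>
#include <functional>
#include <thread>

bool runWithSchedulingCM(const std::function<bool()>& runTx,
                         int yieldAfterAborts = 3) {
    int aborts = 0;
    while (!runTx()) {
        if (++aborts < yieldAfterAborts) {
            continue;                         // cheap immediate retry
        }
        // Contention is persistent: back off via the scheduler instead of
        // burning CPU. A kernel-level implementation could instead
        // deschedule this thread until the conflicting transaction commits.
        std::this_thread::yield();
        std::this_thread::sleep_for(std::chrono::microseconds(50 * aborts));
    }
    return true;
}
```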

Proceedings ArticleDOI
01 Jul 2008
TL;DR: This paper proposes leveraging the processing capabilities of multi-core processors to improve the efficiency of stateful components using optimistic parallelization techniques (as provided by transactional memory), and shows how simple conflict predictors can boost the parallelism and reduce the amount of resources used for a given level of parallelism.
Abstract: In event stream applications, events flow through a network of components that perform various types of operations, e.g., filtering, aggregation, transformation. When the operation only depends on the input events, one can trivially parallelize its processing by replicating the associated components. This is not possible, however, with stateful components or when there exist dependencies between the events. Parallel versions of a number of simple stream mining operators have been designed, but, in general, complex and user-defined operators are limited by single thread performance. In this paper, we propose leveraging the processing capabilities of multi-core processors to improve the efficiency of stateful components using optimistic parallelization techniques (as provided by transactional memory). We show that, even though some speculative event executions might need to be disregarded, the overall throughput increases noticeably in the general case and latency can be reduced by pre-processing out-of-order events. Moreover, we show how simple conflict predictors can boost the parallelism even more and reduce the amount of resources used for a given level of parallelism.

70 citations
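An illustrative sketch of the kind of simple conflict predictor mentioned above, under assumed names (Event, stateKey): an incoming event is admitted to speculative parallel execution only if no in-flight event touches the same state key; otherwise it is held back and processed once the conflicting event completes.

```cpp
// Sketch only, not the paper's operator code: a trivial conflict predictor
// based on the state key an event will touch.
#include <mutex>
#include <string>
#include <unordered_set>

struct Event { std::string stateKey; /* payload elided */ };

class ConflictPredictor {
    std::mutex m;
    std::unordered_set<std::string> inFlightKeys;  // keys touched by running events
public:
    // Returns true if the event may be executed speculatively in parallel;
    // false means a conflict is predicted and the event should be deferred.
    bool admit(const Event& e) {
        std::lock_guard<std::mutex> g(m);
        return inFlightKeys.insert(e.stateKey).second;
    }
    void done(const Event& e) {
        std::lock_guard<std::mutex> g(m);
        inFlightKeys.erase(e.stateKey);
    }
};
```

Events whose admit() call fails can be queued and replayed serially after the conflicting event finishes, which avoids wasted speculative executions at the cost of some parallelism.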


Network Information
Related Topics (5)
Compiler: 26.3K papers, 578.5K citations, 87% related
Cache: 59.1K papers, 976.6K citations, 86% related
Parallel algorithm: 23.6K papers, 452.6K citations, 84% related
Model checking: 16.9K papers, 451.6K citations, 84% related
Programming paradigm: 18.7K papers, 467.9K citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  16
2022  40
2021  29
2020  63
2019  70
2018  88