Topic

Transactional memory

About: Transactional memory is a research topic. Over its lifetime, 2,365 publications have been published within this topic, receiving 60,818 citations.


Papers
Proceedings Article
14 Feb 2009
TL;DR: The programming model, design, and implementation of NePalTM, a transactional memory system where atomic blocks can be used for concurrency control at an arbitrary level of nested parallelism, are presented.
Abstract: We present the programming model, design, and implementation of NePalTM, a transactional memory system where atomic blocks can be used for concurrency control at an arbitrary level of nested parallelism.

20 citations

Proceedings Article
J. Rattner
17 Sep 2005
TL;DR: The motivation for multi-core architectures, their unique characteristics, and potential solutions to the fundamental software challenges, including architectural enhancements for transactional memory, fine-grain message passing, and speculative multi-threading are addressed.
Abstract: Summary form only given. It is likely that 2005 will be viewed as the year that parallelism came to the masses, with multiple vendors shipping dual/multi-core platforms into the mainstream consumer and enterprise markets. Assuming that this trend will follow Moore's Law scaling, mainstream systems will contain over 10 processing cores by the end of the decade, yielding unprecedented theoretical peak performance. However, it is unclear whether the software community is sufficiently ready for this transition and will be able to unleash these capabilities, given the significant challenges associated with parallel programming. This keynote addresses the motivation for multi-core architectures, their unique characteristics, and potential solutions to the fundamental software challenges, including architectural enhancements for transactional memory, fine-grain message passing, and speculative multi-threading. Finally, we stress the need for a concerted, accelerated effort, starting at the academic level and encompassing the entire platform software ecosystem, to successfully make the multi-core architectural transition.

20 citations

Patent
Tim Harris
23 Mar 2006
TL;DR: A software transactional memory system is described that uses decomposed software transactional memory instructions, together with runtime optimizations, to achieve efficient performance. High-level optimizations include code movement around procedure calls, addition of operations to provide strong atomicity, removal of unnecessary read-to-update upgrades, and removal of operations for newly allocated objects.
Abstract: A software transactional memory system is described which utilizes decomposed software transactional memory instructions as well as runtime optimizations to achieve efficient performance. The decomposed instructions allow a compiler with knowledge of the instruction semantics to perform optimizations which would be unavailable on traditional software transactional memory systems. Additionally, high-level software transactional memory optimizations are performed, such as code movement around procedure calls, addition of operations to provide strong atomicity, removal of unnecessary read-to-update upgrades, and removal of operations for newly allocated objects. During execution, multi-use header words for objects are extended to provide for per-object housekeeping, as well as fast snapshots which illustrate changes to objects. Additionally, entries to software transactional memory logs are filtered using an associative table during execution, preventing needless writes to the logs. Finally, a garbage collector with knowledge of the software transactional memory system compacts software transactional memory logs during garbage collection.

20 citations

Proceedings Article
30 Jun 2009
TL;DR: All delay-based CMs, which pause a transaction for some finite duration upon conflict, are found to be unsuitable for the evaluated benchmarks with even moderate amounts of contention.
Abstract: In Transactional Memory (TM), contention management is the process of selecting which transaction should be aborted when a data access conflict arises. In this paper, the performance of published contention managers (CMs) is re-investigated using complex benchmarks recently published in the literature. Our results redefine the CM performance hierarchy. Greedy and Priority are found to give the best performance overall. Polka is still competitive, but is by no means the best performer as previously published, and in some cases degrades performance by orders of magnitude. In the worst example, execution of a benchmark completes in 6.5 seconds with Priority, yet fails to complete even after 20 minutes with Polka. Analysis of the benchmark found it aborted only 22% of all transactions, spread consistently over the duration of its execution. More generally, all delay-based CMs, which pause a transaction for some finite duration upon conflict, are found to be unsuitable for the evaluated benchmarks with even moderate amounts of contention. This has significant implications, given that TM is primarily aimed at easing concurrent programming for mainstream software development, where applications are unlikely to be highly optimised to reduce aborts.

20 citations

Journal Article
TL;DR: The proposed clfB-tree, a B-tree structure whose tree node fits in a single cache line, achieves atomicity and consistency on processors without a software interface for transactional memory via in-place update, which requires at most four cache line flushes.
Abstract: Emerging byte-addressable non-volatile memory (NVRAM) is expected to replace block storage devices as an alternative low-latency persistent storage device. If NVRAM is used as a persistent storage device, a cache line instead of a disk page will be the unit of data transfer, consistency, and durability. In this work, we design and develop clfB-tree, a B-tree structure whose tree node fits in a single cache line. We employ the existing write-combining store buffer and restricted transactional memory to provide a failure-atomic cache line write operation. Using the failure-atomic cache line write operations, we atomically update a clfB-tree node via a single cache line flush instruction without major changes in hardware. However, there exist many processors that do not provide a software interface for transactional memory. For those processors, our proposed clfB-tree achieves atomicity and consistency via in-place update, which requires at most four cache line flushes. We evaluate the performance of clfB-tree on an NVRAM emulation board with an ARM Cortex-A9 processor and a workstation with an Intel Xeon E7-4809 v3 processor. Our experimental results show clfB-tree outperforms wB-tree and CDDS B-tree by a large margin in terms of both insertion and search performance.

20 citations


Network Information
Related Topics (5)
Compiler: 26.3K papers, 578.5K citations, 87% related
Cache: 59.1K papers, 976.6K citations, 86% related
Parallel algorithm: 23.6K papers, 452.6K citations, 84% related
Model checking: 16.9K papers, 451.6K citations, 84% related
Programming paradigm: 18.7K papers, 467.9K citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    16
2022    40
2021    29
2020    63
2019    70
2018    88