
Transactional memory

About: Transactional memory is a research topic. Over its lifetime, 2365 publications have been published within this topic, receiving 60818 citations.


Papers
Journal Article
TL;DR: This article presents a new scheme for hardware transactional memory (HTM) support within a cluster-based, many-core embedded system that lacks an underlying cache-coherence protocol, and proposes two alternative data versioning implementations for the HTM support, Full-Mirroring and Distributed Logging.
Abstract: High-end embedded systems, like their general-purpose counterparts, are turning to many-core cluster-based shared-memory architectures that provide a shared memory abstraction subject to non-uniform memory access costs. In order to keep the cores and memory hierarchy simple, many-core embedded systems tend to employ simple, scratchpad-like memories, rather than hardware-managed caches that require some form of cache coherence management. These “coherence-free” systems still require some means to synchronize memory accesses and guarantee memory consistency. Conventional lock-based approaches may be employed to accomplish the synchronization, but may lead to both usability and performance issues. Instead, speculative synchronization, such as hardware transactional memory, may be a more attractive approach. However, hardware speculative techniques traditionally rely on the underlying cache-coherence protocol to synchronize memory accesses among the cores. The lack of a cache-coherence protocol adds new challenges in the design of hardware speculative support. In this article, we present a new scheme for hardware transactional memory (HTM) support within a cluster-based, many-core embedded system that lacks an underlying cache-coherence protocol. We propose two alternative data versioning implementations for the HTM support, Full-Mirroring and Distributed Logging, and we conduct a performance comparison between them. To the best of our knowledge, these are the first designs for speculative synchronization for this type of architecture. Through a set of benchmark experiments using our simulation platform, we show that our designs can achieve significant performance improvements over traditional lock-based schemes.
Book Chapter
05 Nov 2014
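TL;DR: The major challenges faced by two state-of-the-art OFTMs, Dynamic Software Transactional Memory (DSTM) and Adaptive Software Transactional Memory (ASTM), are addressed, and an alternative arbitration strategy is proposed that reduces the abort percentage for both Read-Write and Write-Write conflicts.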
Abstract: Transactional Memory, one of the most viable alternatives to lock-based concurrent systems, has been explored by researchers as a practical way to implement parallel processing. The goal is that threads run in parallel and improve system performance, while the effect of their execution remains linear. In STM, non-blocking synchronization can be implemented following the Wait-Freedom, Lock-Freedom, or Obstruction-Freedom philosophy. Although Obstruction-Free Transactional Memory (OFTM) provides the weakest progress guarantee, this paper concentrates on OFTM because of its design flexibility and algorithmic simplifications. In this paper, the major challenges faced by two state-of-the-art OFTMs, namely Dynamic Software Transactional Memory (DSTM) and Adaptive Software Transactional Memory (ASTM), are addressed, and an alternative arbitration strategy is proposed that reduces the abort percentage for both Read-Write and Write-Write conflicts.
Patent
04 Jun 2020
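TL;DR: Transactional memory management support circuitry with address tracking means supports two types of transaction: a first type that executes speculatively and aborts when a conflicting access to a tracked address is detected, and a second type that marks only read addresses as trackable, which allows the same apparatus to be used for multi-word address watching.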
Abstract: Transactional memory management support circuitry 20 comprises address tracking means and supports two types of transactions. A first type of transaction is started using a first type of start instruction, whereby commitment of the results of instructions is prevented until a transaction end instruction 655 is reached; the instructions are instead executed speculatively. An abort 645 is triggered when a conflict is detected between an address of a memory access from another thread and the addresses tracked for the transaction; an abort ensures that the results of the speculative instructions are not committed to memory. For a second type of transaction, started using a second type of start instruction, an address of a read operation is marked as trackable whilst an address of a write operation is omitted from being marked as trackable. This allows an apparatus that supports transactional memory to also be used for multi-word address watching. The first type of transaction may capture an architectural state 615 to be restored 670 in the case of an abort. The second type of transaction may record an abort cause indication that indicates whether or not the abort was caused by a conflict.
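The difference between the two transaction types in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: the class name `TrackedTransaction`, the kind labels "speculative" and "watch", and the method names are all hypothetical. The abstract also does not say what happens to the data written by a second-type transaction, so the sketch assumes such writes go straight to memory.

```python
class TrackedTransaction:
    """Toy model of the two transaction types (all names are hypothetical)."""

    def __init__(self, kind):
        if kind not in ("speculative", "watch"):
            raise ValueError("kind must be 'speculative' or 'watch'")
        self.kind = kind
        self.tracked = set()      # addresses marked as trackable
        self.write_buffer = {}    # speculative results, not yet committed
        self.aborted = False
        self.abort_cause = None

    def read(self, memory, address):
        # Both types mark read addresses as trackable.
        self.tracked.add(address)
        return self.write_buffer.get(address, memory[address])

    def write(self, memory, address, value):
        if self.kind == "speculative":
            # First type: track the address and hold the result back.
            self.tracked.add(address)
            self.write_buffer[address] = value
        else:
            # Second type: the write address is NOT marked as trackable.
            # Assumption of this sketch: the write is applied directly.
            memory[address] = value

    def observe_remote_access(self, address):
        # A memory access from another thread conflicts only with tracked addresses.
        if not self.aborted and address in self.tracked:
            self.aborted = True
            self.abort_cause = "conflict"
            self.write_buffer.clear()  # speculative results are discarded

    def commit(self, memory):
        # Transaction end: speculative results reach memory only if not aborted.
        if self.aborted:
            return False
        memory.update(self.write_buffer)
        return True


memory = {0x10: 1, 0x20: 2}

watch = TrackedTransaction("watch")
watch.read(memory, 0x10)
watch.write(memory, 0x20, 7)
watch.observe_remote_access(0x20)   # untracked write address: no abort
watch.observe_remote_access(0x10)   # tracked read address: abort
```

In this model the second transaction type behaves like a watch on the set of addresses it has read, which is the multi-word address watching use that the abstract mentions.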
Proceedings Article
17 Nov 2010
TL;DR: A criterion for selecting which transaction should be aborted that takes the data size in each log into account is proposed, along with another criterion that takes the degree of conflict into account.
Abstract: Lock-based synchronization techniques are commonly used in parallel programming on multi-core processors. However, locks can cause deadlocks and poor scalability. Hence, LogTM has been proposed and studied for lock-free synchronization. LogTM is a kind of hardware transactional memory. In LogTM, transactions are executed speculatively to ensure serializability and atomicity. LogTM stores original values in a log before they are modified by a transaction. If a transaction accesses a shared datum that has been accessed by another transaction running in parallel, LogTM detects this as a conflict, restores all data from the associated log, and restarts the transaction. This is called aborting. On abort, the cost of restoring data from a log increases in proportion to the data size in the log. However, LogTM selects which transaction should be aborted by initiation time. Hence, if conflicts occur frequently, performance may degrade. This paper proposes a criterion for selecting which transaction should be aborted that takes the data size in each log into account. In addition, another criterion that takes the degree of conflict into account is also proposed. Experimental results with SPLASH-2 benchmark suite programs show that the proposed methods improve performance by up to 27%.
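The undo-log mechanism and the two victim-selection policies contrasted in this abstract can be sketched in a few lines. This is a minimal software illustration under assumptions, not LogTM itself or the paper's implementation. The names `Transaction`, `pick_victim_by_age`, and `pick_victim_by_log_size` are hypothetical. The sketch reads "by initiation time" as "abort the younger transaction" and reads the proposed criterion as "abort the transaction with the smaller log, since it is cheaper to restore".

```python
_MISSING = object()  # marks an address that did not exist before the write


class Transaction:
    """Toy transaction that saves original values in an undo log."""

    def __init__(self, name, start_time, memory):
        self.name = name
        self.start_time = start_time
        self.memory = memory
        self.undo_log = []  # (address, original value), in write order

    def write(self, address, value):
        # Store the original value in the log before modifying memory in place.
        self.undo_log.append((address, self.memory.get(address, _MISSING)))
        self.memory[address] = value

    def abort(self):
        # Restore newest-first so the oldest logged value ends up in memory.
        # The return value shows that abort cost grows with log size.
        restored = 0
        while self.undo_log:
            address, original = self.undo_log.pop()
            if original is _MISSING:
                del self.memory[address]
            else:
                self.memory[address] = original
            restored += 1
        return restored


def pick_victim_by_age(a, b):
    # Assumed baseline: the transaction that started later is aborted.
    return a if a.start_time > b.start_time else b


def pick_victim_by_log_size(a, b):
    # Assumed reading of the proposal: abort the cheaper-to-restore transaction.
    if len(a.undo_log) != len(b.undo_log):
        return a if len(a.undo_log) < len(b.undo_log) else b
    return pick_victim_by_age(a, b)  # tie: fall back to the baseline


memory = {address: 0 for address in range(8)}
older = Transaction("older", 0, memory)
younger = Transaction("younger", 1, memory)
older.write(0, 10)                   # small log: one entry
for address in range(1, 6):
    younger.write(address, 99)       # large log: five entries

age_victim = pick_victim_by_age(older, younger)        # the younger transaction
size_victim = pick_victim_by_log_size(older, younger)  # the one with the smaller log
restored_entries = size_victim.abort()
```

In this example the age-based policy would roll back five log entries, while the size-based policy rolls back one, which is the cost difference the abstract points to.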
01 Jan 2014
TL;DR: This thesis presents techniques to measure power consumption of computer systems at various levels, and performs a detailed performance and energy characterization of Intel’s Restricted Transactional Memory (RTM).
Abstract: Society’s increasing dependence on information technology has resulted in the deployment of vast compute resources. The energy costs of operating these resources, coupled with environmental concerns, have made power-aware computing one of the primary challenges for the IT sector. Making energy-efficient computing a rule rather than an exception requires that researchers and system designers use the right set of techniques and tools. These involve measuring, modeling, and characterizing the energy consumption of computers at varying degrees of granularity. In this thesis, we present techniques to measure power consumption of computer systems at various levels. We compare them for accuracy and sensitivity and discuss their effectiveness. We test Intel’s hardware power model for estimation accuracy and show that it is fairly accurate for estimating energy consumption when sampled at a temporal granularity of more than tens of milliseconds. We present a methodology to estimate per-core processor power consumption using performance counter and temperature-based power modeling and validate it across multiple platforms. We show that our model exhibits negligible computation overhead, and that the median estimation errors range from 0.3% to 10.1% for applications from the SPEC2006, SPEC-OMP, and NAS benchmarks. We test the usefulness of the model in a meta-scheduler to enforce a power constraint on a system. Finally, we perform a detailed performance and energy characterization of Intel’s Restricted Transactional Memory (RTM). We use the TinySTM software transactional memory (STM) system to benchmark RTM’s performance against competing STM alternatives. We use microbenchmarks and the STAMP benchmark suite to compare RTM versus STM performance and energy behavior. We quantify the RTM hardware limitations that affect its success rate. We show that RTM performs better than TinySTM when the working set fits inside the cache and that RTM is better at handling high-contention workloads.

Network Information
Related Topics (5)
Compiler: 26.3K papers, 578.5K citations, 87% related
Cache: 59.1K papers, 976.6K citations, 86% related
Parallel algorithm: 23.6K papers, 452.6K citations, 84% related
Model checking: 16.9K papers, 451.6K citations, 84% related
Programming paradigm: 18.7K papers, 467.9K citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    16
2022    40
2021    29
2020    63
2019    70
2018    88