Topic

Transactional memory

About: Transactional memory is a research topic. Over its lifetime, 2,365 publications have appeared on this topic, receiving 60,818 citations.


Papers
Proceedings Article
01 Apr 2010
TL;DR: A novel HTM, called LiteTM, is proposed that completely eliminates the per-block read-sharer count and thread identifier and uses software to infer the lost information; LiteTM reduces TokenTM's state overhead by about 87% while performing within 4% on average, and 10% in the worst case, of TokenTM.
Abstract: Transactional memory (TM) has been proposed to address some of the programmability issues of chip multiprocessors. Hardware implementations of transactional memory (HTMs) have made significant progress in providing support for features such as long transactions that spill out of the cache, and context switches, page and thread migration in the middle of transactions. While essential for the adoption of HTMs in real products, supporting these features has resulted in significant state overhead. For instance, TokenTM adds at least 16 bits per block in the caches which is significant in absolute terms, and steals 16 of 64 (25%) memory ECC bits per block, weakening error protection. Also, the state bits nearly double the tag array size. These significant and practical concerns may impede the adoption of HTMs, squandering the progress achieved by HTMs. The overhead comes from tracking the thread identifier and the transactional read-sharer count at the L1-block granularity. The thread identifier is used to identify the transaction, if only one, to which an L1-evicted block belongs. The read-sharer count is used to identify conflicts involving multiple readers (i.e., write to a block with non-zero count). To reduce this overhead, we observe that the thread identifiers and read-sharer counts are not needed in a majority of cases. (1) Repeated misses to the same blocks are rare within a transaction (i.e., locality holds). (2) Transactional read-shared blocks that both are evicted from multiple sharers' L1s and are involved in conflicts are rare. Exploiting these observations, we propose a novel HTM, called LiteTM, which completely eliminates the count and identifier and uses software to infer the lost information. Using simulations of the STAMP benchmarks running on 8 cores, we show that LiteTM reduces TokenTM's state overhead by about 87% while performing within 4%, on average, and 10%, in the worst case, of TokenTM.
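As a rough illustration of the per-block metadata this abstract describes, the following Python sketch models a read-sharer count and a thread identifier that is meaningful only while a block has a single reader. The names `BlockState`, `record_read`, and `write_conflicts` are hypothetical, and the model assumes every call to `record_read` comes from a distinct transaction; it is a software toy, not the TokenTM or LiteTM hardware design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockState:
    """Hypothetical per-block transactional metadata (a software model only)."""
    read_sharers: int = 0            # transactional read-sharer count
    thread_id: Optional[int] = None  # set only while exactly one reader exists


def record_read(state: BlockState, tid: int) -> None:
    # Assumption: every call comes from a different transaction.
    state.read_sharers += 1
    state.thread_id = tid if state.read_sharers == 1 else None


def write_conflicts(state: BlockState) -> bool:
    # Per the abstract: a write to a block with a non-zero count is a conflict.
    # A writer upgrading its own read is deliberately not modelled here.
    return state.read_sharers > 0


block = BlockState()
no_readers = write_conflicts(block)   # no transactional readers yet
record_read(block, tid=3)
single_owner = block.thread_id        # one reader, so the identifier is known
record_read(block, tid=5)
shared_owner = block.thread_id        # several readers, identifier is dropped
has_conflict = write_conflicts(block)
```

According to the abstract, these two fields account for the state overhead that LiteTM eliminates, with software inferring the lost information in the rare cases where it is needed.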

12 citations

Proceedings Article
04 Oct 2009
TL;DR: The proposed transactional memory workload characterization techniques will help TM architects select a small, diverse set of TM workloads for their design evaluation, and the methods presented in this paper can be used to identify specific feature subsets.
Abstract: Programming to exploit the resources in a multicore system remains a major obstacle for both computer and software engineers. Transactional memory offers an attractive alternative to traditional concurrent programming, but implementations emerged before the programming model, leaving a gap in the design process. In previous research, transactional microbenchmarks have been used to evaluate designs or lock-based multithreaded workloads have been manually converted into their transactional equivalents; others have even created dedicated transactional benchmarks. Yet, throughout all of the investigations, transactional memory researchers have not settled on a way to describe the runtime characteristics that these programs exhibit; nor has there been any attempt to unify the way transactional memory implementations are evaluated. In addition, the similarity (or redundancy) of these workloads is largely unknown. Evaluating transactional memory designs using workloads that exhibit similar characteristics will unnecessarily increase the number of simulations without contributing new insight. On the other hand, arbitrarily choosing a subset of transactional memory workloads for evaluation can miss important features and lead to biased or incorrect conclusions. In this work, we propose a set of architecture-independent transaction-oriented workload characteristics that can accurately capture the behavior of transactional code. We apply principal component analysis and clustering algorithms to analyze the proposed workload characteristics collected from a set of SPLASH-2, STAMP, and PARSEC transactional memory programs. Our results show that using transactional characteristics to cluster the chosen benchmarks can reduce the number of required simulations by almost half. We also show that the methods presented in this paper can be used to identify specific feature subsets. With the increasing number of TM workloads in the future, we believe that the proposed transactional memory workload characterization techniques will help TM architects select a small, diverse set of TM workloads for their design evaluation.

12 citations

Book Chapter
25 Aug 2014
TL;DR: This study identifies several issues associated with the employment of techniques originally conceived for STM, and proposes an innovative machine learning based technique explicitly designed to take into account peculiarities of HTM systems, and demonstrates its advantages in terms of higher accuracy and shorter learning times using the STAMP benchmark suite.
Abstract: Transactional Memory (TM) is an emerging paradigm that promises to ease the development of parallel applications. Due to its inherently speculative nature, however, TM can suffer from performance degradation in the presence of conflict-intensive workloads. A key technique to tackle this issue consists in dynamically regulating the number of concurrent threads, which allows for selecting the concurrency level that best fits the intrinsic parallelism of specific applications. In this area, several self-tuning approaches have been proposed for software-based implementations of TM (STM). In this paper we investigate the effectiveness of these techniques when applied to Hardware TM (HTM), a theme that is particularly relevant and timely given the recent integration of hardware support for TM in the next generation of mainstream Intel processors. Our study, conducted on Intel’s implementation of HTM, identifies several issues associated with the employment of techniques originally conceived for STM. Motivated by these findings, we propose an innovative machine learning based technique explicitly designed to take into account peculiarities of HTM systems, and demonstrate its advantages, in terms of higher accuracy and shorter learning times, using the STAMP benchmark suite.
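To make the idea of regulating the concurrency level concrete, here is a minimal Python sketch of a generic hill-climbing tuner that raises the thread count while measured throughput keeps improving. It is a stand-in chosen for brevity, not the machine-learning technique this chapter proposes; `tune_concurrency` and the synthetic `throughput` function are hypothetical.

```python
from typing import Callable


def tune_concurrency(measure: Callable[[int], float], max_threads: int) -> int:
    """Return the thread count at which throughput stops improving."""
    best = 1
    best_tp = measure(best)
    while best < max_threads:
        candidate = best + 1
        tp = measure(candidate)
        if tp <= best_tp:      # more threads no longer help (e.g. more aborts)
            break
        best, best_tp = candidate, tp
    return best


def throughput(threads: int) -> float:
    # Synthetic curve (an assumption): gains from parallelism are eventually
    # outweighed by conflicts, so throughput peaks at 5 threads.
    return threads * (10 - threads)


chosen = tune_concurrency(throughput, max_threads=8)
```

With the synthetic curve above, the tuner settles on 5 threads. A real tuner would replace `throughput` with measurements taken from the running application.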

12 citations

Patent
26 Jun 2009
TL;DR: A method and system for acquiring multiple software locks in bulk is disclosed, for example for atomic transactions in transactional memory systems; the techniques may be applied to consolidate computationally expensive memory barrier operations across the lock acquisitions.
Abstract: A method and system for acquiring multiple software locks in bulk is disclosed. When multiple locks need to be acquired, such as for atomic transactions in transactional memory systems, the disclosed techniques may be applied to consolidate computationally expensive memory barrier operations across the lock acquisitions. A system may acquire multiple locks in bulk, at least in part, by modifying values in one or more fields of multiple locks and by then performing a memory barrier operation to ensure that the modified values in the multiple locks are visible to other application threads. The technique may be repeated for locks that the system fails to acquire during earlier iterations until all required locks are acquired. The described technique may be applied to various scenarios including static and/or dynamic transactional locking protocols.
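The bulk-acquisition loop described in this abstract can be sketched roughly as follows. This is a single-threaded Python model under stated assumptions: Python exposes no user-level memory barrier, so `BarrierCounter.memory_barrier` is a hypothetical stand-in that only counts how often a barrier would be issued, and the plain check-then-set on `owner` stands in for whatever atomic field update a real implementation would use. `SoftwareLock`, `acquire_bulk`, and `max_rounds` are likewise illustrative names, not taken from the patent.

```python
class SoftwareLock:
    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner   # None means the lock is free


class BarrierCounter:
    """Stand-in for a hardware memory barrier; it only counts invocations."""
    def __init__(self):
        self.count = 0

    def memory_barrier(self):
        self.count += 1


def acquire_bulk(locks, owner, barrier, max_rounds=10):
    """Try to take all locks; return the locks still not held by `owner`."""
    pending = list(locks)
    for _ in range(max_rounds):
        if not pending:
            break
        for lock in pending:
            if lock.owner is None:
                lock.owner = owner      # modify the lock field, no barrier yet
        barrier.memory_barrier()        # one barrier for the whole batch
        # Retry only the locks that were not acquired in this round.
        pending = [lock for lock in pending if lock.owner != owner]
    return pending


a, b, c = SoftwareLock("a"), SoftwareLock("b"), SoftwareLock("c")
first = BarrierCounter()
left_first = acquire_bulk([a, b, c], "T1", first)

d, e = SoftwareLock("d", owner="T2"), SoftwareLock("e")
second = BarrierCounter()
left_second = acquire_bulk([d, e], "T1", second, max_rounds=2)
```

In the first call three free locks are taken with a single counted barrier rather than one per lock. In the second call lock `d` stays held by another owner, so it remains pending after two rounds while `e` is acquired.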

12 citations

Proceedings Article
15 Sep 2007
TL;DR: A relaxed software transactional memory (RSTM) model is proposed that allows the programmer to specify atomicity constraints for transactions with greater flexibility and precision.
Abstract: Software transactional memory (STM) systems have been proposed in order to make parallel programs easier to develop and verify compared to conventional lock-based programming techniques. However, conventional STMs do not scale in performance to a large number of concurrent threads for several classes of applications. While the atomicity semantics of traditional STMs greatly simplify the correct sharing of data between threads, these same atomicity semantics incur a large penalty in program execution time. We propose a relaxed software transactional memory (RSTM) model that allows the programmer to specify atomicity constraints for transactions with greater flexibility and precision. In an RSTM system, shared data accessed by a transaction is classified into several "consistency groups". A single consistency policy is enforced on each consistency group. Both membership of a group and the policy to be applied are declaratively specified by the programmer.

12 citations


Network Information
Related Topics (5)
Compiler: 26.3K papers, 578.5K citations, 87% related
Cache: 59.1K papers, 976.6K citations, 86% related
Parallel algorithm: 23.6K papers, 452.6K citations, 84% related
Model checking: 16.9K papers, 451.6K citations, 84% related
Programming paradigm: 18.7K papers, 467.9K citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  16
2022  40
2021  29
2020  63
2019  70
2018  88