Book Chapter

Reconstructing hardware transactional memory for workload optimized systems

TLDR
It is argued that Hardware Transactional Memory (HTM) can be a suitable implementation choice for workload optimized systems, and that knowledge about the workload is extremely useful for making appropriate design choices in a workload optimized HTM.
Abstract
Workload optimized systems consisting of a large number of general and special purpose cores, with support for shared memory programming, are slowly becoming prevalent. One of the major impediments to effective parallel programming on these systems is lock-based synchronization. An alternative synchronization solution called Transactional Memory (TM) is currently being explored. We observe that most of the TM design proposals in the literature are tailored to the constraints of general purpose computing platforms. Given that workload optimized systems utilize wider hardware design spaces and on-chip parallelism, we argue that Hardware Transactional Memory (HTM) can be a suitable implementation choice for these systems. We re-evaluate the criteria to be satisfied by an HTM and identify possible scope for relaxations in the context of workload optimized systems. Based on the relaxed criteria, we demonstrate the scope for building HTM design variants such that each variant caters to a specific workload requirement. We carry out suitable experiments to bring out the trade-offs between the design variants. Overall, we show how knowledge about the workload is extremely useful for making appropriate design choices in a workload optimized HTM.


Citations
Journal Article

Parallel Scientific Computation: A Structured Approach using BSP and MPI

TL;DR: This is the first textbook that provides a comprehensive overview of the technical aspects of building parallel programs using BSP and BSPlib; it is contemporary, well presented, and balanced between concepts and the technical depth required for developing parallel algorithms.
References
Proceedings Article

Scratchpad memory management in a multitasking environment

TL;DR: This paper introduces a dynamic scratchpad memory code allocation technique for code that supports dynamically created processes and analyzes several sharing strategies with regard to several preferable properties of multiprocess SPM allocation schemes.
Proceedings Article

Performance study of mapping irregular computations on GPUs

TL;DR: This paper presents implementations of the Breadth First Search algorithm and of a Matrix Parenthesization algorithm that showcase similar synchronization behavior when implemented on a GPU using CUDA, enabling a more direct comparison between them.
Proceedings Article

eNVy: a NonVolatile main memory storage system

TL;DR: It is argued that technological advances will soon make it possible to build high-end file servers and database servers with large nonvolatile solid state memories, at a price/performance ratio competitive with disk- or RAID-based designs.

OS Support for Virtualizing Hardware Transactional Memory

TL;DR: It is found that aborting a transaction is generally faster than virtualizing it, and hence preferable in some cases, but it is also shown that virtualizing transactions can be necessary for system stability and to support code that voluntarily context switches.
Journal Article

Semantic-driven parallelization of loops operating on user-defined containers

TL;DR: ROSE is a C++ infrastructure for source-to-source translation that provides an interface for programmers to easily write their own translators for optimizing the use of high-level abstractions.