Book Chapter
Reconstructing hardware transactional memory for workload optimized systems
Kunal Korgaonkar, Prabhat Jain, Deepak Tomar, Kashyap Garimella, V. Kamakoti +4 more
- pp 1-15
TLDR
It is argued that Hardware Transactional Memory (HTM) can be a suitable implementation choice for workload optimized systems, and that knowledge about the workload is extremely useful for making appropriate design choices in a workload optimized HTM.
Abstract:
Workload optimized systems, consisting of a large number of general and special purpose cores and supporting shared memory programming, are slowly becoming prevalent. One of the major impediments to effective parallel programming on these systems is lock-based synchronization. An alternative synchronization solution called Transactional Memory (TM) is currently being explored. We observe that most of the TM design proposals in the literature are tailored to the constraints of general purpose computing platforms. Given that workload optimized systems utilize wider hardware design spaces and on-chip parallelism, we argue that Hardware Transactional Memory (HTM) can be a suitable implementation choice for these systems. We re-evaluate the criteria to be satisfied by an HTM and identify possible scope for relaxations in the context of workload optimized systems. Based on the relaxed criteria, we demonstrate the scope for building HTM design variants such that each variant caters to a specific workload requirement. We carry out suitable experiments to bring out the trade-offs between the design variants. Overall, we show how knowledge about the workload is extremely useful for making appropriate design choices in a workload optimized HTM.