Journal Article

Partitioning and Data Mapping in Reconfigurable Cache and Scratchpad Memory--Based Architectures

TLDR
A novel graph-based structure to tackle data allocation in an application is introduced and a data allocation heuristic to map program objects for a fixed-size SPM-cache hybrid system that targets whole program optimization is presented.
Abstract
Scratchpad memory (SPM) is considered a useful component in the memory hierarchy, alone or alongside caches, for meeting power and energy constraints as performance ceases to be the sole criterion for processor design. Although the efficiency of SPM is well known, its use has been restricted owing to difficulties in programmability. Real applications usually have regions that are amenable to exploitation by either SPM or cache and hence can benefit if the two are used in conjunction. Dynamically adjusting the local memory resources to suit application demand can significantly improve the efficiency of the overall system. In this article, we propose a compiler technique to map application data objects to the SPM-cache and also partition the local memory between the SPM and cache depending on the dynamic requirement of the application. First, we introduce a novel graph-based structure to tackle data allocation in an application. Second, we use this to present a data allocation heuristic to map program objects for a fixed-size SPM-cache hybrid system that targets whole program optimization. Finally, we extend this formulation to adapt the SPM and cache sizes, as well as the data allocation, to the requirements of different application regions. We study the applicability of the technique on various workloads targeted at both SPM-only and hardware reconfigurable memory systems, observing an average 18% energy-delay improvement over state-of-the-art techniques.
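To make the idea of mapping data objects to a fixed-size SPM concrete, here is a minimal sketch. It is not the graph-based heuristic the article proposes, whose details the abstract does not give. It is a plain greedy baseline that ranks objects by profiled accesses per byte and leaves whatever does not fit to the cache. The `DataObject` type, the object names, sizes, and access counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    name: str
    size_bytes: int  # assumed positive
    accesses: int    # hypothetical profiled access count


def greedy_spm_allocation(objects, spm_bytes):
    """Greedy baseline (an assumption, not the article's heuristic):
    fill the SPM with the objects that have the most accesses per byte;
    everything that does not fit is served through the cache."""
    ranked = sorted(objects, key=lambda o: o.accesses / o.size_bytes, reverse=True)
    in_spm, in_cache, free = [], [], spm_bytes
    for obj in ranked:
        if obj.size_bytes <= free:
            in_spm.append(obj.name)
            free -= obj.size_bytes
        else:
            in_cache.append(obj.name)
    return in_spm, in_cache


# Hypothetical profile of four program objects and a 1 KiB SPM.
objects = [
    DataObject("coeffs", 256, 40_000),
    DataObject("frame_buf", 4096, 50_000),
    DataObject("lookup", 512, 30_000),
    DataObject("scratch", 1024, 2_000),
]
spm, cached = greedy_spm_allocation(objects, spm_bytes=1024)
print("SPM:", spm)
print("cache:", cached)
```

A static ranking like this ignores how access patterns change between program regions, which is the gap the article's region-aware partitioning of SPM and cache is meant to address.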


Citations

Parallel CAD: Algorithm Design and Programming Special Section Call for Papers TODAES: ACM Transactions on Design Automation of Electronic Systems

TL;DR: This journal special section will cover recent progress on parallel CAD research, including algorithm foundations, programming models, parallel architectural-specific optimization, and verification, as well as other topics relevant to the design of parallel CAD algorithms and software tools.
Journal Article

SAM: Software-Assisted Memory Hierarchy for Scalable Manycore Embedded Systems

TL;DR: This letter proposes a system architecture for a scalable software-assisted memory hierarchy for emerging manycore embedded systems that overcomes the coherence overhead and inflexibility of purely hardware-managed memory hierarchies in adapting to variable workloads.

Software Assists to On-chip Memory Hierarchy of Manycore Embedded Systems

TL;DR: This thesis presents the required hardware and software support for a software-assisted memory hierarchy that is composed of distributed memories which can be partitioned between caches and software-programmable memories (SPMs) at runtime, and proposes approximation techniques for major building blocks of this hybrid cache-SPM memory hierarchy.
Journal Article

OSM: Off-Chip Shared Memory for GPUs

TL;DR: This work proposes Off-Chip Shared Memory (OSM), which allocates shared memory space in off-chip memory and accelerates accesses to it via a small on-chip cache. It also designs a unified cache for the shared memory and global address spaces, providing more caching space for the global memory address space even for workloads with high shared memory utilization.
Journal Article

Mapi-Pro: An Energy Efficient Memory Mapping Technique for Intermittent Computing

TL;DR: In this paper, an ILP-based memory mapping technique was proposed to reduce the system's energy-delay product (EDP) by using non-volatile memory (NVM) for saving the system state during power failure.
References
Journal Article

Pin: building customized program analysis tools with dynamic instrumentation

TL;DR: The goals are to provide easy-to-use, portable, transparent, and efficient instrumentation, and to illustrate Pin's versatility, two Pintools in daily use to analyze production software are described.
Proceedings Article

MediaBench: a tool for evaluating and synthesizing multimedia and communications systems

TL;DR: The MediaBench benchmark suite is designed to fill the gap between the compiler community and embedded applications developers. It was constructed through a three-step process: intuition and market driven initial selection, experimental measurement, and integration with system synthesis algorithms to establish usefulness.
Journal Article

Introduction to the cell multiprocessor

TL;DR: This paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation of the Cell multiprocessor.
Proceedings Article

Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation

TL;DR: Interval simulation provides a balance between detailed cycle-accurate simulation and one-IPC simulation, allowing long-running simulations to be modeled much faster than with detailed cycle-accurate simulation, while still providing the detail necessary to observe core-uncore interactions across the entire system.
Proceedings Article

Scratchpad memory: a design alternative for cache on-chip memory in embedded systems

TL;DR: The results clearly establish scratch pad memory as a low power alternative in most situations with an average energy reduction of 40% and the average area-time reduction for the scratchpad memory was 46% of the cache memory.