Open Access Book Chapter

A Tool Suite for Simulation Based Analysis of Memory Access Behavior

Abstract
In this paper, two tools are presented: an execution-driven cache simulator that relates event metrics to a dynamically built-up call graph, and a graphical front end that visualizes the generated data in various ways. To obtain a general-purpose, easy-to-use tool suite, we rely on simulation, which lets us take advantage of runtime instrumentation (i.e., no preparation of application code is needed) and enables sophisticated preprocessing of the data already in the simulation phase. In an ongoing project, research on advanced cache analysis is based on these tools. Taking a multigrid solver as an example, we present the results obtained from the cache simulation together with real data measured by hardware performance counters.
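
To make the central idea concrete, the following is a minimal sketch of how simulated cache events could be attributed to a call graph that is built up while the program runs. It is an illustration only, not the tool described in the paper: the class names `LRUCache` and `CallGraphProfiler`, the cache parameters, and the explicit `enter`/`leave`/`mem_access` calls are all hypothetical stand-ins for what runtime instrumentation would supply automatically.

```python
from collections import OrderedDict, defaultdict


class LRUCache:
    """Set-associative cache model with LRU replacement (sizes are illustrative)."""

    def __init__(self, size=1024, line_size=64, assoc=2):
        self.line_size = line_size
        self.assoc = assoc
        self.num_sets = size // (line_size * assoc)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, addr):
        """Return True on a hit, False on a miss."""
        line = addr // self.line_size
        cache_set = self.sets[line % self.num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)      # mark as most recently used
            return True
        if len(cache_set) >= self.assoc:
            cache_set.popitem(last=False)    # evict the least recently used line
        cache_set[line] = True
        return False


class CallGraphProfiler:
    """Attributes simulated events to (caller, callee) edges seen at run time."""

    def __init__(self, cache):
        self.cache = cache
        self.stack = []                      # hypothetical shadow call stack
        self.edges = defaultdict(lambda: {"accesses": 0, "misses": 0})

    def enter(self, func):
        self.stack.append(func)

    def leave(self):
        self.stack.pop()

    def mem_access(self, addr):
        callee = self.stack[-1] if self.stack else "<root>"
        caller = self.stack[-2] if len(self.stack) > 1 else "<root>"
        cost = self.edges[(caller, callee)]
        cost["accesses"] += 1
        if not self.cache.access(addr):
            cost["misses"] += 1


# Usage: eight 8-byte loads inside smooth(), called from main().
prof = CallGraphProfiler(LRUCache(size=256, line_size=64, assoc=2))
prof.enter("main")
prof.enter("smooth")
for i in range(8):
    prof.mem_access(i * 8)
prof.leave()
prof.leave()
print(dict(prof.edges))
```

Keeping the counters per call-graph edge rather than per function is one way to do the preprocessing during simulation: a front end can then sum edge costs to show inclusive costs per caller without storing a full address trace.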
