A Tool Suite for Simulation Based Analysis of Memory Access Behavior
Josef Weidendorfer, Markus Kowarschik, Carsten Trinitis
- pp 440-447
TLDR
An execution-driven cache simulator, which relates event metrics to a dynamically built call graph, and a graphical front end that visualizes the generated data in various ways are presented.
Abstract:
In this paper, two tools are presented: an execution-driven cache simulator that relates event metrics to a dynamically built call graph, and a graphical front end that visualizes the generated data in various ways. To obtain a general-purpose, easy-to-use tool suite, the simulation approach takes advantage of runtime instrumentation, i.e., no preparation of application code is needed, and it enables sophisticated preprocessing of the data during the simulation phase itself. These tools form the basis of an ongoing research project on advanced cache analysis. Taking a multigrid solver as an example, we present the results obtained from the cache simulation together with real data measured by hardware performance counters.
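The idea of relating cache events to a dynamically built call graph can be sketched as follows. This is a minimal illustration, not the simulator described in the paper: the class names, the cache parameters, and the synthetic trace in `run_demo` (including the function names `smooth` and `residual`) are all assumptions chosen for the example. A small set-associative LRU cache model receives memory accesses, and every miss is charged both to the currently executing function and to each call edge on the active call stack.

```python
from collections import OrderedDict, defaultdict


class LRUCache:
    """Set-associative cache model with LRU replacement (hypothetical parameters)."""

    def __init__(self, size_bytes=1024, line_bytes=64, ways=2):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, address):
        """Return True on a hit and False on a miss, updating the LRU state."""
        line = address // self.line_bytes
        cache_set = self.sets[line % self.num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)  # mark as most recently used
            return True
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)  # evict the least recently used line
        cache_set[line] = True
        return False


class CallGraphProfiler:
    """Attributes cache misses to functions and to call-graph edges."""

    def __init__(self, cache):
        self.cache = cache
        self.stack = ["<root>"]
        self.self_misses = defaultdict(int)  # function -> misses in its own body
        self.edge_misses = defaultdict(int)  # (caller, callee) -> inclusive misses

    def enter(self, function):
        self.stack.append(function)

    def leave(self):
        self.stack.pop()

    def memory_access(self, address):
        if self.cache.access(address):
            return
        self.self_misses[self.stack[-1]] += 1
        # Charge each distinct edge on the stack once, so that recursion
        # does not count the same miss twice for one edge.
        for edge in set(zip(self.stack, self.stack[1:])):
            self.edge_misses[edge] += 1


def run_demo():
    """Replay a small synthetic trace (assumed for illustration only)."""
    profiler = CallGraphProfiler(LRUCache(size_bytes=256, line_bytes=64, ways=1))
    profiler.enter("main")
    profiler.enter("smooth")
    for address in range(0, 512, 8):  # streams over eight cache lines
        profiler.memory_access(address)
    profiler.leave()
    profiler.enter("residual")
    for address in range(0, 128, 8):  # revisits two lines evicted by "smooth"
        profiler.memory_access(address)
    profiler.leave()
    profiler.leave()
    return profiler


if __name__ == "__main__":
    result = run_demo()
    for (caller, callee), misses in sorted(result.edge_misses.items()):
        print(f"{caller} -> {callee}: {misses} misses")
```

Keeping per-function (self) and per-edge (inclusive) counters separate is one simple way to support the kind of call-graph view a graphical front end could display; a real tool would obtain the enter, leave, and access events from runtime instrumentation rather than from a hand-written trace.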
Citations
Journal ArticleDOI
Fast parallel image registration on CPU and GPU for diagnostic classification of Alzheimer's disease.
Denis P Shamonin, Esther E. Bron, Boudewijn P. F. Lelieveldt, Marion Smits, Stefan Klein, Marius Staring
TL;DR: The accelerated registration tool elastix is employed in a study on diagnostic classification of Alzheimer's disease and cognitively normal controls based on T1-weighted MRI and has nearly identical results to the non-optimized version.
Journal ArticleDOI
DynamO: a free O(N) general event-driven molecular dynamics simulator
TL;DR: DynamO, a general event-driven simulation package, is presented; it displays the optimal $\mathcal{O}(N)$ asymptotic scaling of the computational cost with the number of particles $N$, rather than the scaling found in most standard algorithms.
Proceedings ArticleDOI
State of the Art of Performance Visualization
Katherine E. Isaacs, Alfredo Gimenez, Ilir Jusufi, Todd Gamblin, Abhinav Bhatele, Martin Schulz, Bernd Hamann, Peer-Timo Bremer
TL;DR: This work discusses performance as it relates to visualization, surveys existing approaches in performance visualization, develops a taxonomy for the contexts in which different performance visualizations reside, and describes the state-of-the-art research pertaining to each.
Journal ArticleDOI
A Cache-Aware Algorithm for PDEs on Hierarchical Data Structures Based on Space-Filling Curves
TL;DR: Data access becomes very fast—even faster than the common access to nonhierarchical data stored in matrices—and, in particular, cache misses are reduced considerably.
Proceedings ArticleDOI
Input-sensitive profiling
TL;DR: A building-block technique and a toolkit for the automatic discovery of workload-dependent performance bottlenecks are presented; they can reveal bottlenecks that other profilers may fail to detect and can provide useful characterizations of the workload and behavior of individual routines in the context of mainstream applications.
References
Proceedings ArticleDOI
Gprof: A call graph execution profiler
TL;DR: The gprof profiler accounts for the running time of called routines in the running time of the routines that call them; the design and use of this profiler are described.
Proceedings ArticleDOI
ATOM: a system for building customized program analysis tools
Amitabh Srivastava, Alan Eustace
TL;DR: ATOM is a single framework for building a wide range of customized program analysis tools, including block counting, profiling, dynamic memory recording, instruction and data cache simulation, pipeline simulation, evaluating branch prediction, and instruction scheduling.
Journal ArticleDOI
The Paradyn parallel performance measurement tool
Barton P. Miller, Mark Callaghan, J.M. Cargille, Jeffrey K. Hollingsworth, R.B. Irvin, Karen L. Karavanic, Krishna Kunchithapadam, Tia Newhall
TL;DR: Dynamic instrumentation lets us defer insertion until the moment it is needed (and remove it when it is no longer needed); Paradyn's Performance Consultant decides when and where to insert instrumentation.
Proceedings ArticleDOI
Shade: a fast instruction-set simulator for execution profiling
Bob Cmelik, David Keppel
TL;DR: Shade, a tool that combines efficient instruction-set simulation with a flexible, extensible trace generation capability, is described, and instruction-set emulation in general is discussed.
Journal ArticleDOI
A Portable Programming Interface for Performance Evaluation on Modern Processors
TL;DR: The purpose of the PAPI project is to specify a standard application programming interface for accessing the hardware performance counters available on most modern microprocessors; these counters exist as a small set of registers that count events.