Author

David Whalley

Bio: David Whalley is an academic researcher from Florida State University. The author has contributed to research in topics including Compiler and Cache, has an h-index of 35, and has co-authored 121 publications receiving 5,533 citations. Previous affiliations of David Whalley include Humboldt University of Berlin and Lawrence Livermore National Laboratory.


Papers
Journal ArticleDOI
TL;DR: Different approaches to the determination of upper bounds on execution times are described and several commercially available tools and research prototypes are surveyed.
Abstract: The determination of upper bounds on execution times, commonly called worst-case execution times (WCETs), is a necessary step in the development and validation process for hard real-time systems. This problem is hard if the underlying processor architecture has components such as caches, pipelines, branch prediction, and other speculative features. This article describes different approaches to this problem and surveys several commercially available tools and research prototypes.
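As a rough illustration of the kind of bound such analyses produce (a minimal sketch, not taken from the article; the node encoding and the fixed per-block cycle counts are assumptions), a tree-based computation bounds a structured program by adding sequences, taking the maximum over branches, and multiplying a loop body's bound by a known iteration bound:

    # Toy structural (tree-based) WCET bound; node shapes and cycle costs are assumed.
    def wcet(node):
        kind = node[0]
        if kind == "block":              # ("block", cycles)
            return node[1]
        if kind == "seq":                # ("seq", [children...])
            return sum(wcet(child) for child in node[1])
        if kind == "if":                 # ("if", condition, then_part, else_part)
            return wcet(node[1]) + max(wcet(node[2]), wcet(node[3]))
        if kind == "loop":               # ("loop", max_iterations, body)
            return node[1] * wcet(node[2])
        raise ValueError("unknown node kind: " + kind)

    # Example: a loop of at most 10 iterations around a branch.
    program = ("loop", 10,
               ("seq", [("block", 4),
                        ("if", ("block", 1), ("block", 6), ("block", 2))]))
    print(wcet(program))                 # 10 * (4 + 1 + 6) = 110 cycles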

1,946 citations

Journal ArticleDOI
TL;DR: This paper describes an approach for bounding the worst-case and best-case performance of large code segments on machines that exploit both pipelining and instruction caching, and indicates that the timing analyzer efficiently produces tight predictions of worst-case and best-case performance for pipelining and instruction caching.
Abstract: Predicting the execution time of code segments in real-time systems is challenging. Most recently designed machines contain pipelines and caches. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst and best case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program's control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst and best-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program. The results indicate that the timing analyzer efficiently produces tight predictions of worst and best-case performance for pipelining and instruction caching.
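As a concrete but deliberately simplified sketch of how a cache categorization can feed a timing bound, the fragment below charges each instruction its base cost plus an assumed miss penalty according to its category; the real analysis additionally models overlap between pipeline stages, which is omitted here, and all cycle counts are made up:

    # Simplified sketch: bound a straight-line path from per-instruction
    # categorizations.  MISS_PENALTY and the base cycle counts are assumptions.
    MISS_PENALTY = 10

    def path_bounds(instructions):
        """instructions: list of (base_cycles, category) pairs, where category
        is 'always-hit', 'always-miss', or 'first-miss'."""
        worst = best = 0
        for base, category in instructions:
            worst += base
            best += base
            if category == "always-miss":        # misses on every execution
                worst += MISS_PENALTY
                best += MISS_PENALTY
            elif category == "first-miss":       # miss on the first traversal only,
                worst += MISS_PENALTY            # so the best case adds no penalty
        return worst, best

    print(path_bounds([(1, "first-miss"), (1, "always-hit"), (2, "always-miss")]))
    # -> (24, 14) with the assumed costs above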

223 citations

Proceedings Article
01 Jan 1994
TL;DR: In this paper, static cache simulation is used to analyze a program's control flow to statically categorize the caching behavior of each instruction, and a timing analyzer, which uses the categorization information, then estimates the worst-case instruction cache performance for each loop and function in the program.
Abstract: The use of caches poses a difficult tradeoff for architects of real-time systems. While caches provide significant performance advantages, they have also been viewed as inherently unpredictable since the behavior of a cache reference depends upon the history of the previous references. The use of caches will only be suitable for real-time systems if a reasonably tight bound on the performance of programs using cache memory can be predicted. This paper describes an approach for bounding the worst-case instruction cache performance of large code segments. First, a new method called Static Cache Simulation is used to analyze a program's control flow to statically categorize the caching behavior of each instruction. A timing analyzer, which uses the categorization information, then estimates the worst-case instruction cache performance for each loop and function in the program.
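To give a flavor of the categorization step, here is a sketch only: the actual static cache simulation propagates abstract cache states over the control-flow graph, whereas this fragment merely checks for conflicts inside a single loop body, and the cache geometry and addresses are made up:

    # Toy categorization of the instructions in one loop body for a
    # direct-mapped instruction cache.  LINE_SIZE, NUM_LINES, and the
    # example addresses are assumptions for illustration only.
    LINE_SIZE = 16
    NUM_LINES = 8

    def categorize_loop(addresses):
        cache_line = [(a // LINE_SIZE) % NUM_LINES for a in addresses]
        memory_line = [a // LINE_SIZE for a in addresses]
        categories = []
        for i in range(len(addresses)):
            conflict = any(cache_line[j] == cache_line[i] and memory_line[j] != memory_line[i]
                           for j in range(len(addresses)))
            if conflict:
                categories.append("always-miss")   # the line may be evicted each iteration
            elif memory_line.index(memory_line[i]) == i:
                categories.append("first-miss")    # first reference loads the line
            else:
                categories.append("always-hit")    # line already loaded by an earlier instruction
        return categories

    print(categorize_loop([0x100, 0x104, 0x110, 0x184]))
    # -> ['always-miss', 'always-miss', 'first-miss', 'always-miss']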

192 citations

Proceedings ArticleDOI
05 Dec 1995
TL;DR: This paper describes an approach for bounding the worst-case performance of large code segments on machines that exploit both pipelining and instruction caching, and a graphical user interface is invoked that allows a user to request timing predictions on portions of the program.
Abstract: Recently designed machines contain pipelines and caches. While both features provide significant performance advantages, they also pose problems for predicting execution time of code segments in real-time systems. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program's control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program.
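One reason these penalties "are not independent" is that adjacent paths can overlap in the pipeline. A minimal sketch of that effect follows; the cycle counts and the fixed per-iteration overlap are assumed, whereas a real analyzer derives the overlap from the pipeline structure:

    # Toy model: each later loop iteration overlaps 'overlap' cycles with
    # its predecessor, so the loop's worst case is below iterations * path time.
    def loop_worst_case(path_cycles, overlap, iterations):
        return path_cycles + (iterations - 1) * (path_cycles - overlap)

    print(loop_worst_case(path_cycles=12, overlap=2, iterations=100))  # 1002, not 1200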

186 citations

Proceedings ArticleDOI
09 Jun 1997
TL;DR: Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches.
Abstract: The contributions of this paper are twofold. First, an automatic tool-based approach is described to bound worst-case data cache performance. The given approach works on fully optimized code, performs the analysis over the entire control flow of a program, detects and exploits both spatial and temporal locality within data references, produces results typically within a few seconds, and estimates, on average, 30% tighter WCET bounds than can be predicted without analyzing data cache behavior. Results obtained by running the system on representative programs are presented and indicate that timing analysis of data cache behavior can result in significantly tighter worst-case performance predictions. Second, a framework to bound worst-case instruction cache performance for set-associative caches is formally introduced and operationally described. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches. The cache simulation overhead scales linearly with increasing associativity.
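For reference, the set-associative LRU behavior such an analysis has to bound looks roughly like the sketch below (cache geometry and the address trace are assumed); note that the per-access work grows with the associativity, consistent with the linear scaling of simulation overhead reported above:

    # Minimal set-associative cache with LRU replacement (illustrative only).
    class SetAssociativeCache:
        def __init__(self, num_sets, associativity, line_size):
            self.num_sets = num_sets
            self.associativity = associativity
            self.line_size = line_size
            self.sets = [[] for _ in range(num_sets)]   # per set: LRU order, most recent last

        def access(self, address):
            line = address // self.line_size
            ways = self.sets[line % self.num_sets]
            hit = line in ways
            if hit:
                ways.remove(line)                 # refresh its LRU position
            elif len(ways) == self.associativity:
                ways.pop(0)                       # evict the least recently used line
            ways.append(line)
            return hit

    cache = SetAssociativeCache(num_sets=4, associativity=2, line_size=16)
    print([cache.access(a) for a in [0x00, 0x40, 0x00, 0x80, 0x40]])
    # -> [False, False, True, False, False]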

166 citations


Cited by
Journal ArticleDOI
TL;DR: Different approaches to the determination of upper bounds on execution times are described and several commercially available tools and research prototypes are surveyed.
Abstract: The determination of upper bounds on execution times, commonly called worst-case execution times (WCETs), is a necessary step in the development and validation process for hard real-time systems. This problem is hard if the underlying processor architecture has components such as caches, pipelines, branch prediction, and other speculative features. This article describes different approaches to this problem and surveys several commercially available tools and research prototypes.

1,946 citations

Journal ArticleDOI
TL;DR: The authors argue that public managers should look inside the "black box" of collaboration processes and find a complex construct of five variable dimensions: governance, administration, organizational autonomy, mutuality, and norms.
Abstract: Social science research contains a wealth of knowledge for people seeking to understand collaboration processes. The authors argue that public managers should look inside the “black box” of collaboration processes. Inside, they will find a complex construct of five variable dimensions: governance, administration, organizational autonomy, mutuality, and norms. Public managers must know these five dimensions and manage them intentionally in order to collaborate effectively.

1,115 citations

Journal ArticleDOI
TL;DR: The delta debugging algorithm generalizes and simplifies the failing test case to a minimal test case that still produces the failure, and isolates the difference between a passing and a failing test case.
Abstract: Given some test case, a program fails. Which circumstances of the test case are responsible for the particular failure? The delta debugging algorithm generalizes and simplifies the failing test case to a minimal test case that still produces the failure. It also isolates the difference between a passing and a failing test case. In a case study, the Mozilla Web browser crashed after 95 user actions. Our prototype implementation automatically simplified the input to three relevant user actions. Likewise, it simplified 896 lines of HTML to the single line that caused the failure. The case study required 139 automated test runs or 35 minutes on a 500 MHz PC.
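The core of the minimization step can be sketched in a few lines (a simplified ddmin that only tries subsets and their complements; 'test' stands for any predicate that reports whether the failure still reproduces):

    def ddmin(failing_input, test):
        n = 2
        while len(failing_input) >= 2:
            chunk = max(1, len(failing_input) // n)
            subsets = [failing_input[i:i + chunk]
                       for i in range(0, len(failing_input), chunk)]
            reduced = False
            for i, subset in enumerate(subsets):
                complement = [x for j, s in enumerate(subsets) if j != i for x in s]
                if test(subset):                     # the subset alone reproduces the failure
                    failing_input, n, reduced = subset, 2, True
                    break
                if test(complement):                 # the failure survives without this subset
                    failing_input, n, reduced = complement, max(n - 1, 2), True
                    break
            if not reduced:
                if n >= len(failing_input):
                    break
                n = min(len(failing_input), 2 * n)   # refine the granularity
        return failing_input

    # Toy example: the failure needs both 'a' and 'b' to be present.
    failure = lambda chars: "a" in chars and "b" in chars
    print(ddmin(list("xxaxxbxx"), failure))          # -> ['a', 'b']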

980 citations

Journal ArticleDOI
TL;DR: The survey outlines fundamental results about multiprocessor real-time scheduling that hold independent of the scheduling algorithms employed, and provides a taxonomy of the different scheduling methods, and considers the various performance metrics that can be used for comparison purposes.
Abstract: This survey covers hard real-time scheduling algorithms and schedulability analysis techniques for homogeneous multiprocessor systems. It reviews the key results in this field from its origins in the late 1960s to the latest research published in late 2009. The survey outlines fundamental results about multiprocessor real-time scheduling that hold independent of the scheduling algorithms employed. It provides a taxonomy of the different scheduling methods, and considers the various performance metrics that can be used for comparison purposes. A detailed review is provided covering partitioned, global, and hybrid scheduling algorithms, approaches to resource sharing, and the latest results from empirical investigations. The survey identifies open issues, key research challenges, and likely productive research directions.
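As a small illustration of the partitioned family of approaches covered by the survey (a sketch only: it assumes independent, implicit-deadline periodic tasks and uses first-fit assignment with the per-processor EDF utilization test; the task set is made up):

    # First-fit partitioning with the EDF utilization test (load per CPU <= 1.0).
    def partition_first_fit(tasks, num_processors):
        """tasks: list of (wcet, period) pairs.  Returns the per-processor
        assignment, or None if some task cannot be placed."""
        loads = [0.0] * num_processors
        assignment = [[] for _ in range(num_processors)]
        for wcet, period in tasks:
            utilization = wcet / period
            for cpu in range(num_processors):
                if loads[cpu] + utilization <= 1.0:   # EDF keeps this CPU schedulable
                    loads[cpu] += utilization
                    assignment[cpu].append((wcet, period))
                    break
            else:
                return None                           # unschedulable by this heuristic
        return assignment

    tasks = [(2, 5), (3, 10), (4, 8), (1, 4), (6, 12)]
    print(partition_first_fit(tasks, num_processors=2))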

910 citations

Proceedings ArticleDOI
06 Oct 2003
TL;DR: A method for performing fault localization using similar program spectra and a generic method for establishing the quality of a report, based on the way an "ideal user" would navigate the program using the report to save effort during debugging.
Abstract: We present a method for performing fault localization using similar program spectra. Our method assumes the existence of a faulty run and a larger number of correct runs. It then selects, according to a distance criterion, the correct run that most resembles the faulty run, compares the spectra corresponding to these two runs, and produces a report of "suspicious" parts of the program. Our method is widely applicable because it does not require any knowledge of the program input and no more information from the user than a classification of the runs as either "correct" or "faulty". To experimentally validate the viability of the method, we implemented it in a tool, WHITHER, using basic block profiling spectra. We experimented with two different similarity measures and the Siemens suite of 132 programs with injected bugs. To measure the success of the tool, we developed a generic method for establishing the quality of a report. The method is based on the way an "ideal user" would navigate the program using the report to save effort during debugging. The best results we obtained were, on average, above 50%, meaning that our ideal user would avoid looking at half of the program.
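The nearest-neighbor idea can be sketched compactly (illustrative only: the block names, the symmetric-difference distance, and the toy spectra below are assumptions, not the exact measures used by WHITHER):

    # Pick the passing run whose coverage spectrum is closest to the failing
    # run, then report what only the failing run executed.
    def nearest_passing(failing_spectrum, passing_spectra):
        def distance(a, b):
            return len(a ^ b)                      # symmetric difference of covered blocks
        return min(passing_spectra, key=lambda p: distance(failing_spectrum, p))

    def suspicious_blocks(failing_spectrum, passing_spectra):
        neighbor = nearest_passing(failing_spectrum, passing_spectra)
        return failing_spectrum - neighbor         # blocks only the failing run executed

    failing = {"b1", "b2", "b4", "b7"}             # basic blocks covered by the failing run
    passing = [{"b1", "b2", "b3"}, {"b1", "b2", "b4"}]
    print(suspicious_blocks(failing, passing))     # -> {'b7'}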

730 citations