
Measuring empirical computational complexity

TLDR
The Trend Profiler (trend-prof), a tool for constructing models of empirical computational complexity that predict how many times each basic block in a program runs as a linear or a power-law function of user-specified features of the program's workloads, is described.
Abstract
The standard language for describing the asymptotic behavior of algorithms is theoretical computational complexity. We propose a method for describing the asymptotic behavior of programs in practice by measuring their empirical computational complexity. Our method involves running a program on workloads spanning several orders of magnitude in size, measuring their performance, and fitting these observations to a model that predicts performance as a function of workload size. Comparing these models to the programmer's expectations or to theoretical asymptotic bounds can reveal performance bugs or confirm that a program's performance scales as expected. Grouping and ranking program locations based on these models focuses attention on scalability-critical code. We describe our tool, the Trend Profiler (trend-prof), for constructing models of empirical computational complexity that predict how many times each basic block in a program runs as a linear (y = a + bx) or a power-law (y = ax^b) function of user-specified features of the program's workloads. We ran trend-prof on several large programs and report cases where a program scaled as expected, beat its worst-case theoretical complexity bound, or had a performance bug.
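
To make the fitting step concrete, here is a minimal sketch (not trend-prof's actual implementation) of how one basic block's execution counts might be fit to the two model forms named above. The workload feature x, the counts y, and the use of NumPy's polyfit are illustrative assumptions; the power-law fit is done as a straight line in log-log space.

import numpy as np

# Hypothetical data: a user-specified workload feature x (e.g., input size)
# and the number of times one basic block executed on each workload.
x = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
y = np.array([4.1e2, 3.9e3, 4.2e4, 4.0e5, 4.1e6])

# Linear model y = a + b*x, fit by ordinary least squares.
b_lin, a_lin = np.polyfit(x, y, 1)

# Power-law model y = a * x^b, fit as a line in log-log space:
# log y = log a + b * log x.
b_pow, log_a = np.polyfit(np.log(x), np.log(y), 1)
a_pow = np.exp(log_a)

# Compare the two fits by residual error (trend-prof's actual selection
# criterion may differ; this is only an illustration).
rms = lambda r: np.sqrt(np.mean(r ** 2))
print("linear:    y = %.3g + %.3g*x   (RMS residual %.3g)"
      % (a_lin, b_lin, rms(y - (a_lin + b_lin * x))))
print("power law: y = %.3g * x^%.3g   (RMS residual %.3g)"
      % (a_pow, b_pow, rms(y - a_pow * x ** b_pow)))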


Measuring Empirical Computational Complexity
by
Simon Fredrick Goldsmith
B.S. (Carnegie Mellon University) 2001
A dissertation submitted in partial satisfaction of the
requirements for the degree of
Doctor of Philosophy
in
Computer Science
in the
GRADUATE DIVISION
of the
UNIVERSITY OF CALIFORNIA, BERKELEY
Committee in charge:
Professor Alex Aiken, Co-chair
Professor Koushik Sen, Co-chair
Professor Rastislav Bodik
Professor Dor Abrahamson
Fall 2008

The dissertation of Simon Fredrick Goldsmith is approved:
Co-chair Date
Co-chair Date
Date
Date
University of California, Berkeley
Fall 2008

Measuring Empirical Computational Complexity
Copyright 2008
by
Simon Fredrick Goldsmith

Abstract
Measuring Empirical Computational Complexity
by
Simon Fredrick Goldsmith
Doctor of Philosophy in Computer Science
University of California, Berkeley
Professor Alex Aiken, Co-chair
Professor Koushik Sen, Co-chair
Scalability is a fundamental problem in computer science. Computer scientists often describe the scalability of algorithms in the language of theoretical computational complexity, bounding the number of operations an algorithm performs as a function of the size of its input. The main contribution of this dissertation is to provide an analogous description of the scalability of actual software implementations run on realistic workloads.

We propose a method for describing the asymptotic behavior of programs in practice by measuring their empirical computational complexity. Our method involves running a program on workloads spanning several orders of magnitude in size, measuring their performance, and fitting these observations to a model that predicts performance as a function of workload size. Comparing these models to the programmer's expectations or to theoretical asymptotic bounds can reveal performance bugs or confirm that a program's performance scales as expected.
We develop our methodology for constructing these models of empirical complexity as we describe and evaluate two techniques. Our first technique, BB-TrendProf, constructs models that predict how many times each basic block runs as a linear (y = a + bx) or a power-law (y = ax^b) function of user-specified features of the program's workloads. To present output succinctly and focus attention on scalability-critical code, BB-TrendProf groups and ranks program locations based on these models. We demonstrate the power of BB-TrendProf compared to existing tools by running it on several large programs and reporting cases where its models show (1) an implementation of a complex algorithm scaling as expected, (2) two complex algorithms beating their worst-case theoretical complexity bounds when run on realistic inputs, and (3) a performance bug.

Our second technique, CF-TrendProf, models the performance of loops and functions both per-function-invocation and per-workload. It improves upon the precision of BB-TrendProf's models by using control flow to generate candidates from a richer family of models and a novel model selection criterion to select among these candidates. We show that CF-TrendProf's improvements to model generation and selection allow it to correctly characterize or closely approximate the empirical scalability of several well-known algorithms and data structures and to diagnose several synthetic, but realistic, scalability problems without observing an egregiously expensive workload. We also show that CF-TrendProf deals with multiple workload features better than BB-TrendProf. We qualitatively compare the output of BB-TrendProf and CF-TrendProf and discuss their relative strengths and weaknesses.
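
As a rough illustration of the candidate-generation-and-selection idea behind CF-TrendProf (the dissertation's actual model family and selection criterion are not reproduced on this page, so the candidate set, the x*log(x) form, and the held-out-error rule below are assumptions), one can fit several candidate forms to per-workload measurements and keep whichever predicts held-out workloads best:

import numpy as np

def fit_linear(x, y):
    b, a = np.polyfit(x, y, 1)
    return (lambda t: a + b * t), "y = %.3g + %.3g*x" % (a, b)

def fit_powerlaw(x, y):
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    a = np.exp(log_a)
    return (lambda t: a * t ** b), "y = %.3g*x^%.3g" % (a, b)

def fit_nlogn(x, y):
    # One example of a richer candidate form: y = a + b * x*log(x).
    z = x * np.log(x)
    b, a = np.polyfit(z, y, 1)
    return (lambda t: a + b * t * np.log(t)), "y = %.3g + %.3g*x*log(x)" % (a, b)

def select_model(x_train, y_train, x_held, y_held):
    # Fit every candidate on the training workloads and keep the one with
    # the smallest mean relative error on the held-out workloads.
    best = None
    for fit in (fit_linear, fit_powerlaw, fit_nlogn):
        predict, description = fit(x_train, y_train)
        err = np.mean(np.abs(predict(x_held) - y_held) / y_held)
        if best is None or err < best[0]:
            best = (err, description)
    return best

# Hypothetical per-workload counts for one loop; the data are made up.
x = np.array([1e2, 1e3, 1e4, 1e5, 1e6, 1e7])
y = 7.0 * x * np.log(x) + 50.0
print(select_model(x[:4], y[:4], x[4:], y[4:]))

Holding out workloads is one simple guard against the overfitting danger mentioned in the FAQ section below; it is not necessarily how CF-TrendProf guards against it.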

Citations
Proceedings ArticleDOI

SPEED: precise and efficient static estimation of program computational complexity

TL;DR: An inter-procedural technique for computing symbolic bounds on the number of statements a procedure executes, in terms of its scalar inputs and user-defined quantitative functions of input data structures, and an algorithm for automating this proof methodology are introduced.
Proceedings ArticleDOI

Control-flow refinement and progress invariants for bound analysis

TL;DR: This paper describes two techniques, control-flow refinement and progress invariants, that together enable estimation of precise bounds for procedures with nested and multi-path loops, and presents an algorithm that uses progress invariants to compute precise bounds for nested loops.
Proceedings ArticleDOI

PerfFuzz: automatically generating pathological inputs

TL;DR: PerfFuzz, a method to automatically generate inputs that exercise pathological behavior across program locations without any domain knowledge, is presented; it outperforms prior work by generating inputs that exercise the most-hit program branch 5x to 69x more often and result in 1.9x to 24.7x longer total execution paths.
Journal ArticleDOI

Empirical hardness models: Methodology and a case study on combinatorial auctions

TL;DR: The use of supervised machine learning is proposed to build models that predict an algorithm's runtime given a problem instance, and techniques for interpreting these models are described to gain understanding of the characteristics that cause instances to be hard or easy.
Proceedings Article

Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression

TL;DR: This paper proposes the SPORE (Sparse POlynomial REgression) methodology to build accurate prediction models of program performance using feature data collected from program execution on sample inputs, and shows that SPORE methods can give accurate predictions with relative error of less than 7% using a moderate number of training samples.
Frequently Asked Questions (9)
Q1. What is the effect of clustering on performance?

Clustering dramatically reduces the number of degrees of freedom of the overall performance model; that is, it simplifies the presentation of program performance by reducing the number of program components whose costs the authors model.
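
As a sketch of what such clustering could look like (an illustration only: the grouping rule used here, proportional count vectors across workloads, and the data are assumptions, not trend-prof's actual algorithm):

import numpy as np

def cluster_locations(counts):
    # Group program locations whose per-workload count vectors are
    # proportional to one another, so each group needs only one model.
    # `counts` maps a location name to its counts, one entry per workload.
    clusters = []  # list of (representative vector, [location names])
    for loc, vec in counts.items():
        vec = np.asarray(vec, dtype=float)
        for rep, members in clusters:
            scale = vec.sum() / rep.sum()
            if np.allclose(vec, scale * rep, rtol=1e-6):
                members.append(loc)
                break
        else:
            clusters.append((vec, [loc]))
    return [members for _, members in clusters]

# Hypothetical counts for three basic blocks over four workloads.
counts = {
    "block_12": [10, 100, 1000, 10000],
    "block_13": [20, 200, 2000, 20000],   # proportional to block_12
    "block_47": [5, 40, 900, 16000],      # scales differently
}
print(cluster_locations(counts))  # [['block_12', 'block_13'], ['block_47']]

With proportional counts collapsed into one group, only one trend per group needs to be fit and reported, which is the reduction in degrees of freedom the answer above refers to.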

Contexts are useful for apportioning the performance cost of a library to different callers, or even the costs of data-structure operations to different instances of the data structure. 

CF-TrendProf allows the user to mark function invocations with contexts based on the call graph or on arbitrary runtime data values. 

The log-log scatter plot would seem to be more appropriate for understanding the relationship between performance and workload features since, unlike on a linear-linear scatter plot, a constant relative error corresponds to a constant distance. 
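
To spell out the geometric point (a short derivation, not quoted from the dissertation): if an observed count y deviates from a prediction p by relative error r, so that y = (1 + r) * p, then its vertical offset on a log-log plot is

log(y) - log(p) = log((1 + r) * p) - log(p) = log(1 + r),

which does not depend on p. On a linear-linear plot the offset is y - p = r * p, which grows with the prediction, so the same 5% error looks negligible for small workloads and enormous for large ones.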

There is a danger that if the authors allow their models to grow gratuitously complex, they will overfit the training data and not generalize to other data. 

The mechanism is to annotate each function with types that describe how many steps the function takes to compute its result and use a dependent type system to verify these bounds. 

In order to realize such an ideal situation in the context of a larger algorithm, a user would likely have to provide CF-TrendProf with suitable context annotations and invocation features. 

This dependence of performance on such subtle properties means that the apparent scalability of an algorithm that CF-TrendProf measures is as much a consequence of the code being measured as it is of the empirical distribution of workloads. 

In a setting where performance need not be a clean function of workload features, reducing the number of unknown entities is useful.