Symbolic execution for software testing: three decades later

doi:10.1145/2408776.2408795

Home
/
Papers
/
Symbolic execution for software testing: three decades later

Journal Article•DOI•

Symbolic execution for software testing: three decades later

Cristian Cadar¹, Koushik Sen²•Institutions (2)

Imperial College London¹, University of California, Berkeley²

01 Feb 2013-Communications of The ACM (ACM)-Vol. 56, Iss: 2, pp 82-90

TL;DR: The challenges---and great promise---of modern symbolic execution techniques, and the tools to help implement them.

read less

Abstract: The challenges---and great promise---of modern symbolic execution techniques, and the tools to help implement them.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Proceedings Article•DOI•

VUzzer: Application-aware Evolutionary Fuzzing.

[...]

Sanjay Rawat, Vivek Jain¹, Ashish Kumar², Lucian Cojocar, Cristiano Giuffrida, Herbert Bos - Show less +2 more•Institutions (2)

Thapar University¹, STMicroelectronics²

27 Feb 2017

TL;DR: This paper presents an application - aware evolutionary fuzzing strategy that does not require any prior knowledge of the application or input format, and leverages control - and data - flow features based on static and dynamic analysis to infer fundamental prop - erties of the applications.

...read moreread less

Abstract: See, stats, and : https : / / www . researchgate . net / publication / 311886374 VUzzer : Application - aware Conference DOI : 10 . 14722 / ndss . 2017 . 23404 CITATIONS 0 READS 17 6 , including : Some : Systems Sanjay Vrije , Amsterdam , Netherlands 38 SEE Ashish International 1 SEE Cristiano VU 51 SEE Herbert VU 163 , 836 SEE All . The . All - text and , letting . Abstract—Fuzzing is an effective software testing technique to find bugs . Given the size and complexity of real - world applications , modern fuzzers tend to be either scalable , but not effective in exploring bugs that lie deeper in the execution , or capable of penetrating deeper in the application , but not scalable . In this paper , we present an application - aware evolutionary fuzzing strategy that does not require any prior knowledge of the application or input format . In order to maximize coverage and explore deeper paths , we leverage control - and data - flow features based on static and dynamic analysis to infer fundamental prop - erties of the application . This enables much faster generation of interesting inputs compared to an application - agnostic approach . We implement our fuzzing strategy in VUzzer and evaluate it on three different datasets : DARPA Grand Challenge binaries (CGC) , a set of real - world applications (binary input parsers) , and the recently released LAVA dataset . On all of these datasets , VUzzer yields significantly better results than state - of - the - art fuzzers , by quickly finding several existing and new bugs .

...read moreread less

532 citations

Cites background from "Symbolic execution for software tes..."

...However, fuzzers like AFL are designed to target arbitrarily large programs and, in spite of several advancements, the application of symbolic/concolic techniques to such programs remains a challenge [10]....
[...]

Journal Article•DOI•

Metamorphic Testing: A Review of Challenges and Opportunities

[...]

Tsong Yueh Chen¹, Fei-Ching Kuo¹, Huai Liu², Pak-Lok Poon³, Dave Towey⁴, T. H. Tse⁵, Zhi Quan Zhou⁶ - Show less +3 more•Institutions (6)

Swinburne University of Technology¹, Victoria University, Australia², RMIT University³, The University of Nottingham Ningbo China⁴, University of Hong Kong⁵, University of Wollongong⁶

04 Jan 2018-ACM Computing Surveys

TL;DR: The current research of metamorphic testing is reviewed and the challenges yet to be addressed are discussed, and visions for further improvement are presented and opportunities for new research are highlighted.

...read moreread less

Abstract: Metamorphic testing is an approach to both test case generation and test result verification. A central element is a set of metamorphic relations, which are necessary properties of the target function or algorithm in relation to multiple inputs and their expected outputs. Since its first publication, we have witnessed a rapidly increasing body of work examining metamorphic testing from various perspectives, including metamorphic relation identification, test case generation, integration with other software engineering techniques, and the validation and evaluation of software systems. In this article, we review the current research of metamorphic testing and discuss the challenges yet to be addressed. We also present visions for further improvement of metamorphic testing and highlight opportunities for new research.

...read moreread less

392 citations

Proceedings Article•DOI•

Angora: Efficient Fuzzing by Principled Search

[...]

Peng Chen¹, Hao Chen²•Institutions (2)

ShanghaiTech University¹, University of California, Davis²

20 May 2018

TL;DR: Angora as discussed by the authors is a new mutation-based fuzzer that outperforms the state-of-the-art fuzzers by a wide margin, which aims to increase branch coverage by solving path constraints without symbolic execution.

...read moreread less

Abstract: Fuzzing is a popular technique for finding software bugs. However, the performance of the state-of-the-art fuzzers leaves a lot to be desired. Fuzzers based on symbolic execution produce quality inputs but run slow, while fuzzers based on random mutation run fast but have difficulty producing quality inputs. We propose Angora, a new mutation-based fuzzer that outperforms the state-of-the-art fuzzers by a wide margin. The main goal of Angora is to increase branch coverage by solving path constraints without symbolic execution. To solve path constraints efficiently, we introduce several key techniques: scalable byte-level taint tracking, context-sensitive branch count, search based on gradient descent, and input length exploration. On the LAVA-M data set, Angora found almost all the injected bugs, found more bugs than any other fuzzer that we compared with, and found eight times as many bugs as the second-best fuzzer in the program who. Angora also found 103 bugs that the LAVA authors injected but could not trigger. We also tested Angora on eight popular, mature open source programs. Angora found 6, 52, 29, 40 and 48 new bugs in file, jhead, nm, objdump and size, respectively. We measured the coverage of Angora and evaluated how its key techniques contribute to its impressive performance.

...read moreread less

375 citations

Journal Article•DOI•

A Survey on Metamorphic Testing

[...]

Sergio Segura¹, Gordon Fraser², Ana B. Sánchez¹, Antonio Ruiz-Cortés¹•Institutions (2)

University of Seville¹, University of Sheffield²

01 Sep 2016-IEEE Transactions on Software Engineering

TL;DR: This article provides a comprehensive survey on metamorphic testing, which summarises the research results and application areas, and analyses common practice in empirical studies of metamorphIC testing as well as the main open challenges.

...read moreread less

Abstract: A test oracle determines whether a test execution reveals a fault, often by comparing the observed program output to the expected output. This is not always practical, for example when a program's input-output relation is complex and difficult to capture formally. Metamorphic testing provides an alternative, where correctness is not determined by checking an individual concrete output, but by applying a transformation to a test input and observing how the program output “morphs” into a different one as a result. Since the introduction of such metamorphic relations in 1998, many contributions on metamorphic testing have been made, and the technique has seen successful applications in a variety of domains, ranging from web services to computer graphics. This article provides a comprehensive survey on metamorphic testing: It summarises the research results and application areas, and analyses common practice in empirical studies of metamorphic testing as well as the main open challenges.

...read moreread less

362 citations

Additional excerpts

...Fault-based testing uses symbolic evaluation [132], [133] and constraint solving [133] techniques to prove the absence of certain types of faults in...
[...]

Journal Article•DOI•

A Survey of Symbolic Execution Techniques

[...]

Roberto Baldoni¹, Emilio Coppa¹, Daniele Cono D'Elia¹, Camil Demetrescu¹, Irene Finocchi¹ - Show less +1 more•Institutions (1)

Sapienza University of Rome¹

23 May 2018-ACM Computing Surveys

TL;DR: A survey of the main challenges, challenges, and solutions for symbolic execution can be found in this paper, where the authors provide an overview of main ideas, challenges and solutions developed in the area.

...read moreread less

Abstract: Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program’s authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the past four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience.

...read moreread less

271 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146

Collapse

References

PDF

Open Access

More filters

Book Chapter•DOI•

Z3: an efficient SMT solver

[...]

Leonardo de Moura¹, Nikolaj Bjørner¹•Institutions (1)

Microsoft¹

29 Mar 2008

TL;DR: Z3 is a new and efficient SMT Solver freely available from Microsoft Research that is used in various software verification and analysis applications.

...read moreread less

Abstract: Satisfiability Modulo Theories (SMT) problem is a decision problem for logical first order formulas with respect to combinations of background theories such as: arithmetic, bit-vectors, arrays, and uninterpreted functions. Z3 is a new and efficient SMT Solver freely available from Microsoft Research. It is used in various software verification and analysis applications.

...read moreread less

6,859 citations

Proceedings Article•DOI•

LLVM: a compilation framework for lifelong program analysis & transformation

[...]

Chris Lattner¹, Vikram Adve¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

20 Mar 2004

TL;DR: The design of the LLVM representation and compiler framework is evaluated in three ways: the size and effectiveness of the representation, including the type information it provides; compiler performance for several interprocedural problems; and illustrative examples of the benefits LLVM provides for several challenging compiler problems.

...read moreread less

Abstract: We describe LLVM (low level virtual machine), a compiler framework designed to support transparent, lifelong program analysis and transformation for arbitrary programs, by providing high-level information to compiler transformations at compile-time, link-time, run-time, and in idle time between runs. LLVM defines a common, low-level code representation in static single assignment (SSA) form, with several novel features: a simple, language-independent type-system that exposes the primitives commonly used to implement high-level language features; an instruction for typed address arithmetic; and a simple mechanism that can be used to implement the exception handling features of high-level languages (and setjmp/longjmp in C) uniformly and efficiently. The LLVM compiler framework and code representation together provide a combination of key capabilities that are important for practical, lifelong analysis and transformation of programs. To our knowledge, no existing compilation approach provides all these capabilities. We describe the design of the LLVM representation and compiler framework, and evaluate the design in three ways: (a) the size and effectiveness of the representation, including the type information it provides; (b) compiler performance for several interprocedural problems; and (c) illustrative examples of the benefits LLVM provides for several challenging compiler problems.

...read moreread less

4,841 citations

Journal Article•DOI•

Symbolic execution and program testing

[...]

James C. King¹•Institutions (1)

IBM¹

01 Jul 1976-Communications of The ACM

TL;DR: A particular system called EFFIGY which provides symbolic execution for program testing and debugging is described, which interpretively executes programs written in a simple PL/I style programming language.

...read moreread less

Abstract: This paper describes the symbolic execution of programs. Instead of supplying the normal inputs to a program (e.g. numbers) one supplies symbols representing arbitrary values. The execution proceeds as in a normal execution except that values may be symbolic formulas over the input symbols. The difficult, yet interesting issues arise during the symbolic execution of conditional branch type statements. A particular system called EFFIGY which provides symbolic execution for program testing and debugging is also described. It interpretively executes programs written in a simple PL/I style programming language. It includes many standard debugging features, the ability to manage and to prove things about symbolic expressions, a simple program testing manager, and a program verifier. A brief discussion of the relationship between symbolic execution and program proving is also included.

...read moreread less

2,941 citations

"Symbolic execution for software tes..." refers background in this paper

...While the key idea behind symbolic execution was introduced more than three decades ago [5, 12, 22, 25], it has only recently been made practical, as a result of significant advances in constraint satisfiability [14], and of more scalable dynamic approaches which combine concrete and symbolic execution [9, 20]....
[...]
...The key idea behind symbolic execution [12, 25] is to use symbolic values, instead of concrete data values as input and to represent the values of program variables as symbolic expressions over the symbolic input values....
[...]

Proceedings Article•DOI•

KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs

[...]

Cristian Cadar¹, Daniel Dunbar¹, Dawson Engler¹•Institutions (1)

Stanford University¹

08 Dec 2008

TL;DR: A new symbolic execution tool, KLEE, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs, and significantly beat the coverage of the developers' own hand-written test suite is presented.

...read moreread less

Abstract: We present a new symbolic execution tool, KLEE, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs. We used KLEE to thoroughly check all 89 stand-alone programs in the GNU COREUTILS utility suite, which form the core user-level environment installed on millions of Unix systems, and arguably are the single most heavily tested set of open-source programs in existence. KLEE-generated tests achieve high line coverage -- on average over 90% per tool (median: over 94%) -- and significantly beat the coverage of the developers' own hand-written test suite. When we did the same for 75 equivalent tools in the BUSYBOX embedded system suite, results were even better, including 100% coverage on 31 of them.We also used KLEE as a bug finding tool, applying it to 452 applications (over 430K total lines of code), where it found 56 serious bugs, including three in COREUTILS that had been missed for over 15 years. Finally, we used KLEE to crosscheck purportedly identical BUSYBOX and COREUTILS utilities, finding functional correctness errors and a myriad of inconsistencies.

...read moreread less

2,896 citations

Journal Article•DOI•

DART: directed automated random testing

[...]

Patrice Godefroid¹, Nils Klarlund¹, Koushik Sen²•Institutions (2)

Alcatel-Lucent¹, University of Illinois at Urbana–Champaign²

12 Jun 2005

TL;DR: DART is a new tool for automatically testing software that combines three main techniques, automated extraction of the interface of a program with its external environment using static source-code parsing, and dynamic analysis of how the program behaves under random testing and automatic generation of new test inputs to direct systematically the execution along alternative program paths.

...read moreread less

Abstract: We present a new tool, named DART, for automatically testing software that combines three main techniques: (1) automated extraction of the interface of a program with its external environment using static source-code parsing; (2) automatic generation of a test driver for this interface that performs random testing to simulate the most general environment the program can operate in; and (3) dynamic analysis of how the program behaves under random testing and automatic generation of new test inputs to direct systematically the execution along alternative program paths. Together, these three techniques constitute Directed Automated Random Testing, or DART for short. The main strength of DART is thus that testing can be performed completely automatically on any program that compiles -- there is no need to write any test driver or harness code. During testing, DART detects standard errors such as program crashes, assertion violations, and non-termination. Preliminary experiments to unit test several examples of C programs are very encouraging.

...read moreread less

2,346 citations

"Symbolic execution for software tes..." refers background or methods in this paper

...DART [20] is the first concolic testing tool that combines dynamic test generation with random testing and model checking techniques with the goal of systematically executing all (or as many as possible) execution paths of a program, while checking each execution for various types of errors....
[...]
...CUTE (A Concolic Unit Testing Engine) and jCUTE (CUTE for Java)31,33,35 extend DART to handle multithreaded programs that manipulate dynamic data structures using pointer operations....
[...]
...DART, CUTE, and CREST....
[...]
...DART19 is the first concolic testing tool that combines dynamic test generation with random testing and model checking techniques with the goal of systematically executing all (or as many as possible) feasible paths of a program, while checking each execution for various types of errors....
[...]
...On the one end of the spectrum is a system like DART that only reasons about concrete pointers, or systems like CUTE and CREST that support only equality and inequality constraints for pointers, which can be efficiently solved.35 At the other end are systems like EXE, and more recently KLEE and SAGE10,17,35 that model pointers using the theory of arrays with selections and updates implemented by solvers like STP or Z3....
[...]