A Survey of Symbolic Execution Techniques

doi:10.1145/3182657

Journal Article•DOI•

A Survey of Symbolic Execution Techniques

Roberto Baldoni¹, Emilio Coppa¹, Daniele Cono D'Elia¹, Camil Demetrescu¹, Irene Finocchi¹

Sapienza University of Rome¹

23 May 2018-ACM Computing Surveys (ACM)-Vol. 51, Iss: 3, pp 50

TL;DR: This survey provides an overview of the main ideas, challenges, and solutions developed in the area of symbolic execution, distilled for a broad audience.

Abstract: Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program’s authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the past four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience.
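The idea in the abstract can be illustrated with a minimal sketch. It is not taken from the survey: the toy program, the names `PROGRAM`, `explore`, `solve`, and `find_inputs`, and the backdoor constant are all hypothetical. A bounded brute-force search stands in for a real constraint solver, which would reason about the constraints symbolically instead of enumerating candidates.

```python
# Toy program under analysis, written as a decision tree. Each inner node is
# (condition, then_branch, else_branch); each leaf is an outcome label.
# Conditions are predicates over the single symbolic input `pin`.
PROGRAM = (
    lambda pin: pin * 3 + 7 == 1234,          # hypothetical hidden backdoor
    "granted (backdoor)",
    (lambda pin: pin == 1111, "granted", "denied"),
)


def explore(node, path=()):
    """Yield (path_constraints, outcome) for every path through the tree.

    No concrete input is needed: each branch is followed both ways and the
    assumed truth value of its condition is recorded as a constraint.
    """
    if isinstance(node, str):
        yield path, node
        return
    cond, then_branch, else_branch = node
    yield from explore(then_branch, path + ((cond, True),))
    yield from explore(else_branch, path + ((cond, False),))


def solve(path, domain=range(10_000)):
    """Stand-in for a constraint solver: return the first value in `domain`
    that satisfies every recorded constraint, or None if there is none."""
    for candidate in domain:
        if all(cond(candidate) == expected for cond, expected in path):
            return candidate
    return None


def find_inputs(program):
    """Map each reachable outcome to one concrete input that triggers it."""
    return {outcome: solve(path) for path, outcome in explore(program)}


results = find_inputs(PROGRAM)
for outcome, witness in results.items():
    print(f"{outcome}: pin = {witness}")
```

Random testing would be unlikely to hit the backdoor value among 10,000 candidates, whereas path exploration reaches that branch directly and asks the solver for a witness.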

Citations

Journal Article•DOI•

Survey of machine learning techniques for malware analysis

Daniele Ucci¹, Leonardo Aniello², Roberto Baldoni¹

Sapienza University of Rome¹, University of Southampton²

01 Mar 2019-Computers & Security

TL;DR: This survey aims at providing an overview on the way machine learning has been used so far in the context of malware analysis in Windows environments, i.e. for the analysis of Portable Executables.

316 citations

Proceedings Article•DOI•

DeepCT: Tomographic Combinatorial Testing for Deep Learning Systems

Lei Ma¹, Felix Juefei-Xu², Minhui Xue³, Bo Li⁴, Li Li⁵, Yang Liu⁶, Jianjun Zhao⁷

Harbin Institute of Technology¹, Carnegie Mellon University², Macquarie University³, University of Illinois at Urbana–Champaign⁴, Monash University⁵, Nanyang Technological University⁶, Kyushu University⁷

15 Mar 2019

TL;DR: This paper proposes a set of combinatorial testing criteria specialized for DL systems, as well as a CT coverage guided test generation technique, and demonstrates that CT provides a promising avenue for testing DL systems.

Abstract: Deep learning (DL) has achieved remarkable progress over the past decade and has been widely applied to many industry domains. However, the robustness of DL systems recently becomes great concerns, where minor perturbation on the input might cause the DL malfunction. These robustness issues could potentially result in severe consequences when a DL system is deployed to safety-critical applications and hinder the real-world deployment of DL systems. Testing techniques enable the robustness evaluation and vulnerable issue detection of a DL system at an early stage. The main challenge of testing a DL system attributes to the high dimensionality of its inputs and large internal latent feature space, which makes testing each state almost impossible. For traditional software, combinatorial testing (CT) is an effective testing technique to balance the testing exploration effort and defect detection capabilities. In this paper, we perform an exploratory study of CT on DL systems. We propose a set of combinatorial testing criteria specialized for DL systems, as well as a CT coverage guided test generation technique. Our evaluation demonstrates that CT provides a promising avenue for testing DL systems.

146 citations

Proceedings Article•DOI•

Manticore: a user-friendly symbolic execution framework for binaries and smart contracts

Mark Mossberg, Felipe Manzano, Eric Hennenfent, Alex Groce, Gustavo Grieco, Josselin Feist, Trent Brunson, Artem Dinaburg

10 Nov 2019

TL;DR: Manticore is an open-source dynamic symbolic execution framework for analyzing binaries and Ethereum smart contracts; the authors discuss its architecture and the capabilities they have used to find bugs and verify the correctness of code for commercial clients.

Abstract: An effective way to maximize code coverage in software tests is through dynamic symbolic execution, a technique that uses constraint solving to systematically explore a program's state space. We introduce an open-source dynamic symbolic execution framework called Manticore for analyzing binaries and Ethereum smart contracts. Manticore's flexible architecture allows it to support both traditional and exotic execution environments, and its API allows users to customize their analysis. Here, we discuss Manticore's architecture and demonstrate the capabilities we have used to find bugs and verify the correctness of code for our commercial clients.

132 citations

Journal Article•DOI•

Program Analysis of Commodity IoT Applications for Security and Privacy: Challenges and Opportunities

Z. Berkay Celik¹, Earlence Fernandes², Eric Pauley¹, Gang Tan¹, Patrick McDaniel¹

Pennsylvania State University¹, University of Washington²

30 Aug 2019-ACM Computing Surveys

TL;DR: This article studies privacy and security issues in IoT that require program-analysis techniques, with an emphasis on identified attacks against these systems and the defenses implemented so far, and relates the efficacy of program-analysis techniques to security and privacy issues.

Abstract: Recent advances in Internet of Things (IoT) have enabled myriad domains such as smart homes, personal monitoring devices, and enhanced manufacturing. IoT is now pervasive—new applications are being used in nearly every conceivable environment, which leads to the adoption of device-based interaction and automation. However, IoT has also raised issues about the security and privacy of these digitally augmented spaces. Program analysis is crucial in identifying those issues, yet the application and scope of program analysis in IoT remains largely unexplored by the technical community. In this article, we study privacy and security issues in IoT that require program-analysis techniques with an emphasis on identified attacks against these systems and defenses implemented so far. Based on a study of five IoT programming platforms, we identify the key insights that result from research efforts in both the program analysis and security communities and relate the efficacy of program-analysis techniques to security and privacy issues. We conclude by studying recent IoT analysis systems and exploring their implementations. Through these explorations, we highlight key challenges and opportunities in calibrating for the environments in which IoT systems will be used.

106 citations

Posted Content•

Manticore: A User-Friendly Symbolic Execution Framework for Binaries and Smart Contracts

Mark Mossberg, Felipe Manzano, Eric Hennenfent, Alex Groce, Gustavo Grieco, Josselin Feist, Trent Brunson, Artem Dinaburg

08 Jul 2019-arXiv: Software Engineering

TL;DR: Introduces Manticore, an open-source dynamic symbolic execution framework for analyzing binaries and Ethereum smart contracts, whose flexible architecture supports both traditional and exotic execution environments and whose API allows users to customize their analysis.

Abstract: An effective way to maximize code coverage in software tests is through dynamic symbolic execution, a technique that uses constraint solving to systematically explore a program's state space. We introduce an open-source dynamic symbolic execution framework called Manticore for analyzing binaries and Ethereum smart contracts. Manticore's flexible architecture allows it to support both traditional and exotic execution environments, and its API allows users to customize their analysis. Here, we discuss Manticore's architecture and demonstrate the capabilities we have used to find bugs and verify the correctness of code for our commercial clients.

76 citations

References

Journal Article•DOI•

Program Slicing

Mark Weiser¹

University of Maryland, College Park¹

01 Jul 1984-IEEE Transactions on Software Engineering

TL;DR: Program slicing is a method for automatically decomposing programs by analyzing their data flow and control flow. Finding statement-minimal slices is in general unsolvable, but data flow analysis is sufficient to find approximate slices.

Abstract: Program slicing is a method for automatically decomposing programs by analyzing their data flow and control flow. Starting from a subset of a program's behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a "slice," is an independent program guaranteed to represent faithfully the original program within the domain of the specified subset of behavior. Some properties of slices are presented. In particular, finding statement-minimal slices is in general unsolvable, but using data flow analysis is sufficient to find approximate slices. Potential applications include automatic slicing tools for debugging and parallel processing of slices.

3,163 citations

Journal Article•DOI•

Symbolic execution and program testing

James C. King¹

IBM¹

01 Jul 1976-Communications of The ACM

TL;DR: Describes the symbolic execution of programs and EFFIGY, a system that provides symbolic execution for program testing and debugging by interpretively executing programs written in a simple PL/I-style programming language.

Abstract: This paper describes the symbolic execution of programs. Instead of supplying the normal inputs to a program (e.g. numbers) one supplies symbols representing arbitrary values. The execution proceeds as in a normal execution except that values may be symbolic formulas over the input symbols. The difficult, yet interesting issues arise during the symbolic execution of conditional branch type statements. A particular system called EFFIGY which provides symbolic execution for program testing and debugging is also described. It interpretively executes programs written in a simple PL/I style programming language. It includes many standard debugging features, the ability to manage and to prove things about symbolic expressions, a simple program testing manager, and a program verifier. A brief discussion of the relationship between symbolic execution and program proving is also included.

2,941 citations

Proceedings Article•DOI•

KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs

Cristian Cadar¹, Daniel Dunbar¹, Dawson Engler¹

Stanford University¹

08 Dec 2008

TL;DR: Presents KLEE, a symbolic execution tool capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs and that significantly beat the coverage of the developers' own hand-written test suite.

Abstract: We present a new symbolic execution tool, KLEE, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs. We used KLEE to thoroughly check all 89 stand-alone programs in the GNU COREUTILS utility suite, which form the core user-level environment installed on millions of Unix systems, and arguably are the single most heavily tested set of open-source programs in existence. KLEE-generated tests achieve high line coverage -- on average over 90% per tool (median: over 94%) -- and significantly beat the coverage of the developers' own hand-written test suite. When we did the same for 75 equivalent tools in the BUSYBOX embedded system suite, results were even better, including 100% coverage on 31 of them. We also used KLEE as a bug finding tool, applying it to 452 applications (over 430K total lines of code), where it found 56 serious bugs, including three in COREUTILS that had been missed for over 15 years. Finally, we used KLEE to crosscheck purportedly identical BUSYBOX and COREUTILS utilities, finding functional correctness errors and a myriad of inconsistencies.

2,896 citations

Proceedings Article•DOI•

Separation logic: a logic for shared mutable data structures

John C. Reynolds¹

Carnegie Mellon University¹

22 Jul 2002

TL;DR: An extension of Hoare logic that permits reasoning about low-level imperative programs that use shared mutable data structure is developed, including extensions that permit unrestricted address arithmetic, dynamically allocated arrays, and recursive procedures.

Abstract: In joint work with Peter O'Hearn and others, based on early ideas of Burstall, we have developed an extension of Hoare logic that permits reasoning about low-level imperative programs that use shared mutable data structure. The simple imperative programming language is extended with commands (not expressions) for accessing and modifying shared structures, and for explicit allocation and deallocation of storage. Assertions are extended by introducing a "separating conjunction" that asserts that its subformulas hold for disjoint parts of the heap, and a closely related "separating implication". Coupled with the inductive definition of predicates on abstract data structures, this extension permits the concise and flexible description of structures with controlled sharing. In this paper, we survey the current development of this program logic, including extensions that permit unrestricted address arithmetic, dynamically allocated arrays, and recursive procedures. We also discuss promising future directions.

2,348 citations

Journal Article•DOI•

DART: directed automated random testing

Patrice Godefroid¹, Nils Klarlund¹, Koushik Sen²

Alcatel-Lucent¹, University of Illinois at Urbana–Champaign²

12 Jun 2005

TL;DR: DART is a tool for automatically testing software that combines three main techniques: automated extraction of the interface of a program with its external environment using static source-code parsing, automatic generation of a test driver that performs random testing of this interface, and dynamic analysis of how the program behaves under random testing together with automatic generation of new test inputs that systematically direct execution along alternative program paths.

Abstract: We present a new tool, named DART, for automatically testing software that combines three main techniques: (1) automated extraction of the interface of a program with its external environment using static source-code parsing; (2) automatic generation of a test driver for this interface that performs random testing to simulate the most general environment the program can operate in; and (3) dynamic analysis of how the program behaves under random testing and automatic generation of new test inputs to direct systematically the execution along alternative program paths. Together, these three techniques constitute Directed Automated Random Testing, or DART for short. The main strength of DART is thus that testing can be performed completely automatically on any program that compiles -- there is no need to write any test driver or harness code. During testing, DART detects standard errors such as program crashes, assertion violations, and non-termination. Preliminary experiments to unit test several examples of C programs are very encouraging.

2,346 citations