Modeling and Discovering Vulnerabilities with Code Property Graphs
Fabian Yamaguchi, Nico Golde, Daniel Arp, Konrad Rieck
- pp 590-604
TLDR
This paper introduces a novel representation of source code called a code property graph, which merges concepts of classic program analysis, namely abstract syntax trees, control flow graphs and program dependence graphs, into a joint data structure. This representation makes it possible to elegantly model templates for common vulnerabilities with graph traversals that can identify buffer overflows, integer overflows, format string vulnerabilities, or memory disclosures.
Abstract:
The vast majority of security breaches encountered today are a direct result of insecure code. Consequently, the protection of computer systems critically depends on the rigorous identification of vulnerabilities in software, a tedious and error-prone process requiring significant expertise. Unfortunately, a single flaw suffices to undermine the security of a system and thus the sheer amount of code to audit plays into the attacker's cards. In this paper, we present a method to effectively mine large amounts of source code for vulnerabilities. To this end, we introduce a novel representation of source code called a code property graph that merges concepts of classic program analysis, namely abstract syntax trees, control flow graphs and program dependence graphs, into a joint data structure. This comprehensive representation enables us to elegantly model templates for common vulnerabilities with graph traversals that, for instance, can identify buffer overflows, integer overflows, format string vulnerabilities, or memory disclosures. We implement our approach using a popular graph database and demonstrate its efficacy by identifying 18 previously unknown vulnerabilities in the source code of the Linux kernel.
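The idea of one graph holding syntax, control flow and dependence edges, queried by traversal, can be illustrated with a minimal sketch. This is not the authors' implementation (the abstract states that they use a graph database). It is a toy in-memory graph, and every name in it (`PropertyGraph`, `tainted_sinks`, `build_example`, the `read_len` source and the three-statement program) is a hypothetical chosen for illustration. The traversal flags statements that call a sink such as `memcpy` and are data-dependent on a statement that calls an attacker-controlled source.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy property graph: nodes carry properties, edges carry a label."""

    def __init__(self):
        self.nodes = {}                  # node id -> property dict
        self.edges = defaultdict(list)   # (source id, label) -> [target ids]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, dst, label):
        self.edges[(src, label)].append(dst)

    def reachable(self, start, label):
        """All nodes reachable from start over edges with the given label."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.edges.get((node, label), []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


def calls_any(graph, stmt, names):
    """True if the statement's syntax subtree contains a call to one of names."""
    return any(
        graph.nodes[n].get("type") == "call" and graph.nodes[n].get("name") in names
        for n in graph.reachable(stmt, "AST")
    )


def tainted_sinks(graph, sources, sinks):
    """Pairs (source statement, sink statement) linked by dependence edges."""
    hits = []
    for stmt, props in graph.nodes.items():
        if props.get("type") != "statement" or not calls_any(graph, stmt, sources):
            continue
        for dep in graph.reachable(stmt, "PDG"):
            if calls_any(graph, dep, sinks):
                hits.append((stmt, dep))
    return sorted(hits)


def build_example():
    """Hypothetical program: s1 reads a length, s2 and s3 both call memcpy."""
    g = PropertyGraph()
    g.add_node("s1", type="statement", code="n = read_len()")
    g.add_node("s1.call", type="call", name="read_len")
    g.add_node("s2", type="statement", code="memcpy(dst, src, 16)")
    g.add_node("s2.call", type="call", name="memcpy")
    g.add_node("s3", type="statement", code="memcpy(buf, src, n)")
    g.add_node("s3.call", type="call", name="memcpy")
    for stmt in ("s1", "s2", "s3"):
        g.add_edge(stmt, stmt + ".call", "AST")   # syntax: statement -> call
    g.add_edge("s1", "s2", "CFG")                 # control flow order
    g.add_edge("s2", "s3", "CFG")
    g.add_edge("s1", "s3", "PDG")                 # s3 uses n defined in s1
    return g


if __name__ == "__main__":
    print(tainted_sinks(build_example(), {"read_len"}, {"memcpy"}))
```

In this sketch only `s3` is reported: `s2` also calls `memcpy`, but no dependence edge connects it to the source statement. Keeping all edge types in one structure is what lets a single query combine a syntactic condition (the call name) with a dependence condition.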
Citations
Proceedings Article
SOK: (State of) The Art of War: Offensive Techniques in Binary Analysis
Yan Shoshitaishvili, Ruoyu Wang, Christopher Salls, Nick Stephens, Mario Polino, Andrew Dutcher, John Grosen, Siji Feng, Christophe Hauser, Christopher Kruegel, Giovanni Vigna
TL;DR: This paper presents a unifying binary analysis framework that implements a number of previously proposed analysis techniques, allowing other researchers to compose them and develop new approaches.
Proceedings Article
Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs
TL;DR: Proposes a novel semantics-based approach that classifies Android malware via dependency graphs; it can detect zero-day malware with a low false negative rate and an acceptable false positive rate while tolerating minor implementation differences.
Proceedings Article
LAVA: Large-Scale Automated Vulnerability Addition
Brendan Dolan-Gavitt, Patrick Hulin, Engin Kirda, Tim Leek, Andrea Mambretti, Wil Robertson, Frederick Ulrich, Ryan Whelan
TL;DR: LAVA, a novel dynamic taint analysis-based technique for producing ground-truth corpora by quickly and automatically injecting large numbers of realistic bugs into program source code, forms the basis of an approach for generating large ground-truth vulnerability corpora on demand, enabling rigorous tool evaluation and providing a high-quality target for tool developers.
Proceedings Article
discovRE: Efficient Cross-Architecture Identification of Bugs in Binary Code.
TL;DR: A new approach to efficiently search for similar functions in binary code, called discovRE, that supports four instruction set architectures (x86, x64, ARM, MIPS) and is four orders of magnitude faster than the state-of-the-art academic approach for cross-architecture bug search in binaries.
Journal Article
Software Vulnerability Analysis and Discovery Using Machine-Learning and Data-Mining Techniques: A Survey
TL;DR: Provides an extensive review of the many works in software vulnerability analysis and discovery that use machine-learning and data-mining techniques, covering both advantages and shortcomings in this domain.
References
Book
Compilers: Principles, Techniques, and Tools
TL;DR: This book discusses the design of a Code Generator, the role of the Lexical Analyzer, and other topics related to code generation and optimization.
Journal Article
Program Slicing
TL;DR: Program slicing is a method for automatically decomposing programs by analyzing their data flow and control flow. Finding statement-minimal slices is in general unsolvable, but data flow analysis is sufficient to find approximate slices.
Journal Article
The program dependence graph and its use in optimization
TL;DR: An intermediate program representation, called the program dependence graph (PDG), that makes explicit both the data and control dependences for each operation in a program, allowing transformations to be triggered by one another and applied only to affected dependences.
Program slicing
Keith Gallagher, David Binkley
TL;DR: Applications of program slicing are surveyed, ranging from its first use as a debugging technique to current applications in property verification using finite state models, and a summary of research challenges for the slicing community is discussed.
Journal Article
Interprocedural slicing using dependence graphs
TL;DR: A new kind of graph to represent programs is introduced, called a system dependence graph, which extends previous dependence representations to incorporate collections of procedures (with procedure calls) rather than just monolithic programs.