scispace - formally typeset
Search or ask a question
Author

Arie Gurfinkel

Bio: Arie Gurfinkel is an academic researcher from Carnegie Mellon University. The author has contributed to research in topics: Model checking & Predicate abstraction. The author has an hindex of 13, co-authored 17 publications receiving 336 citations. Previous affiliations of Arie Gurfinkel include SEI Investments Company & Software Engineering Institute.

Papers
More filters
Proceedings ArticleDOI
12 Dec 2012
TL;DR: This paper proposes a scheme that captures the semantics of functions as semantic hashes, each of which represent the input-output behavior of a basic block, and uses a form of locality-sensitive hashing known as Min Hashing, functions with many common features can be quickly identified, and the complexity of clustering is reduced to O(N).
Abstract: The ability to identify semantically-related functions, in large collections of binary executables, is important for malware detection. Intuitively, two pieces of code are similar if they have the same effect on a machine's state. Current state-of-the-art tools employ a variety of pair wise comparisons (e.g., template matching using SMT solvers, Value-Set analysis at critical program points, API call matching, etc.) However, these methods are unshakable for clustering large datasets, of size N, since they require O(N^2) comparisons. In this paper, we present an alternative approach based upon "hashing". We propose a scheme that captures the semantics of functions as semantic hashes. Our approach treats a function as a set of features, each of which represent the input-output behavior of a basic block. Using a form of locality-sensitive hashing known as Min Hashing, functions with many common features can be quickly identified, and the complexity of clustering is reduced to O(N). Experiments on functions extracted from the CERT malware catalog indicate that we are able to cluster closely related code with a low false positive rate.

52 citations

Proceedings ArticleDOI
22 Jan 2014
TL;DR: A static approach that uses symbolic execution and inter-procedural data flow analysis to discover object instances, data members, and methods of a common class and helps malware reverse engineers to understand how classes are laid out and to identify their methods.
Abstract: Object-oriented programming complicates the already difficult task of reverse engineering software, and is being used increasingly by malware authors. Unlike traditional procedural-style code, reverse engineers must understand the complex interactions between object-oriented methods and the shared data structures with which they operate on, a tedious manual process.In this paper, we present a static approach that uses symbolic execution and inter-procedural data flow analysis to discover object instances, data members, and methods of a common class. The key idea behind our work is to track the propagation and usage of a unique object instance reference, called a this pointer. Our goal is to help malware reverse engineers to understand how classes are laid out and to identify their methods. We have implemented our approach in a tool called ObJDIGGER, which produced encouraging results when validated on real-world malware samples.

37 citations

Journal ArticleDOI
TL;DR: The vacuity detection tool, VaqTree, uses a characteristic of resolution proofs— peripherality—and proves that if a variable is a source of vacuity, then there exists a resolution proof in which this variable is peripheral.
Abstract: When model-checking reports that a property holds on a model, vacuity detection increases user confidence in this result by checking that the property is satisfied in the intended way. While vacuity detection is effective, it is a relatively expensive technique requiring many additional model-checking runs. We address the problem of efficient vacuity detection for Bounded Model Checking (BMC) of linear temporal logic properties, presenting three partial vacuity detection methods based on the efficient analysis of the resolution proof produced by a successful BMC run. In particular, we define a characteristic of resolution proofs— peripherality—and prove that if a variable is a source of vacuity, then there exists a resolution proof in which this variable is peripheral. Our vacuity detection tool, VaqTree, uses these methods to detect vacuous variables, decreasing the total number of model-checking runs required to detect all sources of vacuity.

34 citations

ReportDOI
01 Nov 2012
TL;DR: A framework for reliability validation and improvement is proposed that integrates several recommended technology solutions and provides the basis for a set of metrics for cost-effective reliability improvement that overcome the challenges of existing software complexity, reliability, and cost metrics.
Abstract: : Software-reliant systems such as rotorcraft and other aircraft have experienced exponential growth in software size and complexity. The current software engineering practice of build then test has made them unaffordable to build and qualify. This report discusses the challenges of qualifying such systems, presenting the findings of several government and industry studies. It identifies several root cause areas and proposes a framework for reliability validation and improvement that integrates several recommended technology solutions: validation of formalized requirements; an architecture-centric, model-based engineering approach that uncovers system-level problems early through analysis; use of static analysis for validating system behavior and other system properties; and managed confidence in qualification through system assurance. This framework also provides the basis for a set of metrics for cost-effective reliability improvement that overcome the challenges of existing software complexity, reliability, and cost metrics.

29 citations

Proceedings ArticleDOI
30 Oct 2011
TL;DR: This work construct (and verify) a sequential program S that over-approximates all executions of C up to time W, while respecting priorities and bounds on the number of preemptions implied by RMS.
Abstract: Real-Time Embedded Software (RTES) constitutes an important sub-class of concurrent safety-critical programs We consider the problem of verifying functional correctness of periodic RTES, a popular variant of RTES that execute periodic tasks in an order determined by Rate Monotonic Scheduling (RMS) A computational model of a periodic RTES is a finite collection of terminating tasks that arrive periodically and must complete before their next arrival We present an approach for time-bounded verification of safety properties in periodic RTES Our approach is based on sequentialization Given an RTES C and a time-bound W, we construct (and verify) a sequential program S that over-approximates all executions of C up to time W, while respecting priorities and bounds on the number of preemptions implied by RMS Our algorithm supports partial-order reduction, preemption locks, and priority locks We implemented our approach for C programs, with properties specified via user-provided assertions We evaluated our tool on several realistic examples, and were able to detect a subtle concurrency issue in a robot controller

26 citations


Cited by
More filters
Journal Article
TL;DR: In benchmark studies using a set of large industrial circuit verification instances, this method is greatly more efficient than BDD-based symbolic model checking, and compares favorably to some recent SAT-based model checking methods on positive instances.
Abstract: We consider a fully SAT-based method of unbounded symbolic model checking based on computing Craig interpolants. In benchmark studies using a set of large industrial circuit verification instances, this method is greatly more efficient than BDD-based symbolic model checking, and compares favorably to some recent SAT-based model checking methods on positive instances.

775 citations

Proceedings Article
20 Aug 2014
TL;DR: The first public, large-scale analysis of firmware images is presented, which discovered a total of 38 previously unknown vulnerabilities in over 693 firmware images and extended some of those vulnerabilities to over 123 different products.
Abstract: As embedded systems are more than ever present in our society, their security is becoming an increasingly important issue. However, based on the results of many recent analyses of individual firmware images, embedded systems acquired a reputation of being insecure. Despite these facts, we still lack a global understanding of embedded systems' security as well as the tools and techniques needed to support such general claims. In this paper we present the first public, large-scale analysis of firmware images. In particular, we unpacked 32 thousand firmware images into 1.7 million individual files, which we then statically analyzed. We leverage this large-scale analysis to bring new insights on the security of embedded devices and to underline and detail several important challenges that need to be addressed in future research. We also show the main benefits of looking at many different devices at the same time and of linking our results with other large-scale datasets such as the ZMap's HTTPS survey. In summary, without performing sophisticated static analysis, we discovered a total of 38 previously unknown vulnerabilities in over 693 firmware images. Moreover, by correlating similar files inside apparently unrelated firmware images, we were able to extend some of those vulnerabilities to over 123 different products. We also confirmed that some of these vulnerabilities altogether are affecting at least 140K devices accessible over the Internet. It would not have been possible to achieve these results without an analysis at such wide scale. We believe that this project, which we plan to provide as a firmware unpacking and analysis web service, will help shed some light on the security of embedded devices.

342 citations

Book ChapterDOI
18 Jul 2015
TL;DR: The key distinguishing feature of SeaHorn is its modular design that separates the concerns of the syntax of the programming language, its operational semantics, and the verification semantics that simplifies interfacing with multiple verification tools based on Horn-clauses.
Abstract: In this paper, we present SeaHorn, a software verification framework. The key distinguishing feature of SeaHorn is its modular design that separates the concerns of the syntax of the programming language, its operational semantics, and the verification semantics. SeaHorn encompasses several novelties: it (a) encodes verification conditions using an efficient yet precise inter-procedural technique, (b) provides flexibility in the verification semantics to allow different levels of precision, (c) leverages the state-of-the-art in software model checking and abstract interpretation for verification, and (d) uses Horn-clauses as an intermediate language to represent verification conditions which simplifies interfacing with multiple verification tools based on Horn-clauses. SeaHorn provides users with a powerful verification tool and researchers with an extensible and customizable framework for experimenting with new software verification techniques. The effectiveness and scalability of SeaHorn are demonstrated by an extensive experimental evaluation using benchmarks from SV-COMP 2015 and real avionics code.

298 citations