scispace - formally typeset
Search or ask a question
Author

Cristina Cifuentes

Bio: Cristina Cifuentes is an academic researcher from Oracle Corporation. The author has contributed to research in topics: Executable & Binary translation. The author has an hindex of 28, co-authored 85 publications receiving 2711 citations. Previous affiliations of Cristina Cifuentes include Queensland University of Technology & University of Tasmania.


Papers
More filters
Proceedings ArticleDOI
14 Jun 2006
TL;DR: The design and implementation of the Squawk VM is described as applied to the Sun™ Small Programmable Object Technology (SPOT) wireless device; a device developed at Sun Microsystems Laboratories for experimentation with wireless sensor and actuator applications.
Abstract: The Squawk virtual machine is a small Java™ virtual machine (VM) written mostly in Java that runs without an operating system on a wireless sensor platform. Squawk translates standard class file into an internal pre-linked, position independent format that is compact and allows for efficient execution of bytecodes that have been placed into a read-only memory. In addition, Squawk implements an application isolation mechanism whereby applications are represented as object and are therefore treated as first class objects (i.e., they can be reified). Application isolation also enables Squawk to run multiple applications at once with all immutable state being shared between the applications. Mutable state is not shared. The combination of these features reduce the memory footprint of the VM, making it ideal for deployment on small devices.Squawk provides a wireless API that allows developers to write applications for wireless sensor networks (WSNs), this API is an extension of the generic connection framework (GCF). Authentication of deployed files on the wireless device and migration of applications between devices is also performed by the VM.This paper describes the design and implementation of the Squawk VM as applied to the Sun™ Small Programmable Object Technology (SPOT) wireless device; a device developed at Sun Microsystems Laboratories for experimentation with wireless sensor and actuator applications.

238 citations

Journal ArticleDOI
TL;DR: The structure of a decompiler is presented, along with a thorough description of the different modules that form part of a decomiler, and the type of analyses that are performed on the machine code to regenerate high‐level language code.
Abstract: The structure of a decompiler is presented, along with a thorough description of the different modules that form part of a decompiler, and the type of analyses that are performed on the machine code to regenerate high-level language code. The phases of the decompiler have been grouped into three main modules: front-end, universal decompiling machine, and back-end. The front-end is a machine dependent module that performs the loading, parsing and semantic analysis of the input program, as well as generating an intermediate representation of the program. The universal decompiling machine is a machine and language independent module that performs data and control flow analysis of the program based on the intermediate representation, and the program''s control flow graph. The back-end is a language dependent module that deals with the details of the target high-level language.

223 citations

Dissertation
01 Jan 1994
TL;DR: Techniques for writing reverse compilers or decompilers are presented in this thesis, based on compiler and optimization theory, and applied to decompilation in a unique way; these techniques have never before been published.
Abstract: Techniques for writing reverse compilers or decompilers are presented in this thesis. These techniques are based on compiler and optimization theory, and are applied to decompilation in a unique way; these techniques have never before been published. A decompiler is composed of several phases which are grouped into modules dependent on language or machine features. The front-end is a machine dependent module that parses the binary program, analyzes the semantics of the instructions in the program, and generates an intermediate low-level representation of the program, as well as a control flow graph of each subroutine. The universal decompiling machine is a language and machine independent module that analyzes the low-level intermediate code and transforms it into a high-level representation available in any high-level language, and analyzes the structure of the control flow graph(s) and transform them into graphs that make use of high-level control structures. Finally, the back-end is a target language dependent module that generates code for the target language. Decompilation is a process that involves the use of tools to load the binary program into memory, parse or disassemble such a program, and decompile or analyze the program to generate a high-level language program. This process benefits from compiler and library signatures to recognize particular compilers and library subroutines. Whenever a compiler signature is recognized in the binary program, all compiler start-up and library subroutines are not decompiled; in the former case, the routines are eliminated from the final target program and the entry point to the main program is used for the decompiler analysis, in the latter case the subroutines are replaced by their library name. The presented techniques were implemented in a prototype decompiler for the Intel i80286 architecture running under the DOS operating system, dcc, which produces target C programs for source .exe or .com files. Sample decompiled programs, comparisons against the initial high-level language program, and an analysis of results is presented in Chapter 9. Chapter 1 gives an introduction to decompilation from a compiler point of view, Chapter 2 gives an overview of the history of decompilation since its appearance in the early 1960s, Chapter 3 presents the relations between the static binary code of the source binary program and the actions performed at run-time to implement the program, Chapter 4 describes the phases of the front-end module, Chapter 5 defines data optimization techniques to analyze the intermediate code and transform it into a higher-representation, Chapter 6 defines control structure transformation techniques to analyze the structure of the control flow graph and transform it into a graph of high-level control structures, Chapter 7 describes the back-end module, Chapter 8 presents the decompilation tool programs, Chapter 9 gives an overview of the implementation of dcc and the results obtained, and Chapter 10 gives the conclusions and future work of this research.

192 citations

Patent
04 Feb 2003
TL;DR: In this article, a method for compiling a logic design is presented, where the logic design comprises a plurality of modules, compiling separately the plurality of module files into object files, and linking the plurality files to execute the logic.
Abstract: A method for compiling a logic design includes inputting a logic design and an input file into a plurality of compilers, respectively, where the logic design comprises a plurality of modules, compiling separately the plurality of modules into a plurality of object files, and linking the plurality of object files to execute the logic design.

132 citations

Proceedings ArticleDOI
01 Oct 1997
TL;DR: In this article, the authors apply conventional intra-procedural static analysis to binary executables for the purposes of static analysis of machine code and assembly code, such as debugging code and determining the instructions that affect an indexed jump or an indirect call on a register.
Abstract: Program slicing is a technique for determining the set of statements of a program that potentially affect the value of a variable at some point in the program. Intra and interprocedural slicing of high level languages has greatly been studied in the literature; both static and dynamic techniques have been used to aid in the debugging, maintenance, parallelization, program integration, and dataflow testing of programs. We explain how to apply conventional intraprocedural static analysis to binary executables for the purposes of static analysis of machine code and assembly code, such as debugging code and determining the instructions that affect an indexed jump or an indirect call on a register. This analysis is useful in the decoding of machine instructions phase of reverse engineering tools of binary executables, such as binary translators, disassemblers, binary profilers and binary debuggers

121 citations


Cited by
More filters
01 Jul 1997
TL;DR: It is argued that automatic code obfuscation is currently the most viable method for preventing reverse engineering and the design of a code obfuscator is described, a tool which converts a program into an equivalent one that is more diicult to understand and reverse engineer.
Abstract: It has become more and more common to distribute software in forms that retain most or all of the information present in the original source code. An important example is Java bytecode. Since such codes are easy to decompile, they increase the risk of malicious reverse engineering attacks. In this paper we review several techniques for technical protection of software secrets. We will argue that automatic code obfuscation is currently the most viable method for preventing reverse engineering. We then describe the design of a code obfuscator, a tool which converts a program into an equivalent one that is more diicult to understand and reverse engineer. The obfuscator is based on the application of code transformations, in many cases similar to those used by compiler optimizers. We describe a large number of such transformations, classify them, and evaluate them with respect to their potency (To what degree is a human reader confused?), resilience (How well are automatic deobfuscation attacks resisted?), and cost (How much overhead is added to the application?). We nally discuss some possible deobfuscation techniques (such as program slicing) and possible countermeasures an obfuscator could employ against them.

1,019 citations

Proceedings ArticleDOI
01 Dec 2007
TL;DR: A binary obfuscation scheme that relies on opaque constants, which are primitives that allow us to load a constant into a register such that an analysis tool cannot determine its value, demonstrates that static analysis techniques alone might no longer be sufficient to identify malware.
Abstract: Malicious code is an increasingly important problem that threatens the security of computer systems. The traditional line of defense against malware is composed of malware detectors such as virus and spyware scanners. Unfortunately, both researchers and malware authors have demonstrated that these scanners, which use pattern matching to identify malware, can be easily evaded by simple code transformations. To address this shortcoming, more powerful malware detectors have been proposed. These tools rely on semantic signatures and employ static analysis techniques such as model checking and theorem proving to perform detection. While it has been shown that these systems are highly effective in identifying current malware, it is less clear how successful they would be against adversaries that take into account the novel detection mechanisms. The goal of this paper is to explore the limits of static analysis for the detection of malicious code. To this end, we present a binary obfuscation scheme that relies on the idea of opaque constants, which are primitives that allow us to load a constant into a register such that an analysis tool cannot determine its value. Based on opaque constants, we build obfuscation transformations that obscure program control flow, disguise access to local and global variables, and interrupt tracking of values held in processor registers. Using our proposed obfuscation approach, we were able to show that advanced semantics-based malware detectors can be evaded. Moreover, our opaque constant primitive can be applied in a way such that is provably hard to analyze for any static code analyzer. This demonstrates that static analysis techniques alone might no longer be sufficient to identify malware.

838 citations

Journal ArticleDOI
TL;DR: This paper outlines a set of requirements for IoT middleware, and presents a comprehensive review of the existing middleware solutions against those requirements, and open research issues, challenges, and future research directions are highlighted.
Abstract: The Internet of Things (IoT) envisages a future in which digital and physical things or objects (e.g., smartphones, TVs, cars) can be connected by means of suitable information and communication technologies, to enable a range of applications and services. The IoT’s characteristics, including an ultra-large-scale network of things, device and network level heterogeneity, and large numbers of events generated spontaneously by these things, will make development of the diverse applications and services a very challenging task. In general, middleware can ease a development process by integrating heterogeneous computing and communications devices, and supporting interoperability within the diverse applications and services. Recently, there have been a number of proposals for IoT middleware. These proposals mostly addressed wireless sensor networks (WSNs), a key component of IoT, but do not consider RF identification (RFID), machine-to-machine (M2M) communications, and supervisory control and data acquisition (SCADA), other three core elements in the IoT vision. In this paper, we outline a set of requirements for IoT middleware, and present a comprehensive review of the existing middleware solutions against those requirements. In addition, open research issues, challenges, and future research directions are highlighted.

805 citations

Proceedings ArticleDOI
22 May 2016
TL;DR: This paper presents a binary analysis framework that implements a number of analysis techniques that have been proposed in the past and implements these techniques in a unifying framework, which allows other researchers to compose them and develop new approaches.
Abstract: Finding and exploiting vulnerabilities in binary code is a challenging task. The lack of high-level, semantically rich information about data structures and control constructs makes the analysis of program properties harder to scale. However, the importance of binary analysis is on the rise. In many situations binary analysis is the only possible way to prove (or disprove) properties about the code that is actually executed. In this paper, we present a binary analysis framework that implements a number of analysis techniques that have been proposed in the past. We present a systematized implementation of these techniques, which allows other researchers to compose them and develop new approaches. In addition, the implementation of these techniques in a unifying framework allows for the direct comparison of these apporaches and the identification of their advantages and disadvantages. The evaluation included in this paper is performed using a recent dataset created by DARPA for evaluating the effectiveness of binary vulnerability analysis techniques. Our framework has been open-sourced and is available to the security community.

758 citations

Proceedings ArticleDOI
27 Oct 2003
TL;DR: Experimental results indicate that significant portions of executables that have been obfuscated using the techniques described are disassembled incorrectly, thereby showing the efficacy of the methods.
Abstract: A great deal of software is distributed in the form of executable code. The ability to reverse engineer such executables can create opportunities for theft of intellectual property via software piracy, as well as security breaches by allowing attackers to discover vulnerabilities in an application. The process of reverse engineering an executable program typically begins with disassembly, which translates machine code to assembly code. This is then followed by various decompilation steps that aim to recover higher-level abstractions from the assembly code. Most of the work to date on code obfuscation has focused on disrupting or confusing the decompilation phase. This paper, by contrast, focuses on the initial disassembly phase. Our goal is to disrupt the static disassembly process so as to make programs harder to disassemble correctly. We describe two widely used static disassembly algorithms, and discuss techniques to thwart each of them. Experimental results indicate that significant portions of executables that have been obfuscated using our techniques are disassembled incorrectly, thereby showing the efficacy of our methods.

694 citations