Author

Stelios Sidiroglou-Douskos

Bio: Stelios Sidiroglou-Douskos is an academic researcher from the Massachusetts Institute of Technology. The author has contributed to research in topics: Integer overflow & Source code. The author has an h-index of 10 and has co-authored 16 publications receiving 1165 citations.

Papers
Proceedings ArticleDOI
09 Sep 2011
TL;DR: The results indicate that, for a range of applications, this approach typically delivers performance increases of over a factor of two (and up to a factor of seven) while changing the result that the application produces by less than 10%.
Abstract: Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad-hoc, domain-specific techniques developed specifically for the computation at hand. Loop perforation provides a general technique to trade accuracy for performance by transforming loops to execute a subset of their iterations. A criticality testing phase filters out critical loops (whose perforation produces unacceptable behavior) to identify tunable loops (whose perforation produces more efficient and still acceptably accurate computations). A perforation space exploration algorithm perforates combinations of tunable loops to find Pareto-optimal perforation policies. Our results indicate that, for a range of applications, this approach typically delivers performance increases of over a factor of two (and up to a factor of seven) while changing the result that the application produces by less than 10%.
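As a rough illustration of the loop perforation idea described above (a sketch, not the paper's compiler transformation), the following C fragment executes only every k-th iteration of a reduction loop and rescales the result; the function names and the choice of k = 4 are illustrative.

```c
#include <stdio.h>

/* Exact version: averages all n samples. */
static double mean_exact(const double *x, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += x[i];
    return sum / n;
}

/* Perforated version: visits only every k-th iteration, trading accuracy
 * for roughly a factor-of-k reduction in loop work. */
static double mean_perforated(const double *x, int n, int k) {
    double sum = 0.0;
    int visited = 0;
    for (int i = 0; i < n; i += k) {   /* perforated loop: stride k */
        sum += x[i];
        visited++;
    }
    return sum / visited;
}

int main(void) {
    double x[1000];
    for (int i = 0; i < 1000; i++)
        x[i] = (double)(i % 17);
    printf("exact      = %f\n", mean_exact(x, 1000));
    printf("perforated = %f (k = 4)\n", mean_perforated(x, 1000, 4));
    return 0;
}
```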

490 citations

Proceedings ArticleDOI
12 Oct 2015
TL;DR: It is shown that many popular code bases such as Apache and Nginx use coding practices that create flexibility in their intended control flow graph (CFG), even when a strong static analyzer is used to construct the CFG; this flexibility allows an attacker to gain control of the execution while strictly adhering to fine-grained CFI.
Abstract: Control flow integrity (CFI) has been proposed as an approach to defend against control-hijacking memory corruption attacks. CFI works by assigning tags to indirect branch targets statically and checking them at runtime. Coarse-grained enforcements of CFI that use a small number of tags to improve the performance overhead have been shown to be ineffective. As a result, a number of recent efforts have focused on fine-grained enforcement of CFI as it was originally proposed. In this work, we show that even a fine-grained form of CFI with an unlimited number of tags and a shadow stack (to check calls and returns) is ineffective in protecting against malicious attacks. We show that many popular code bases such as Apache and Nginx use coding practices that create flexibility in their intended control flow graph (CFG) even when a strong static analyzer is used to construct the CFG. These flexibilities allow an attacker to gain control of the execution while strictly adhering to a fine-grained CFI. We then construct two proof-of-concept exploits that attack an unlimited-tag CFI system with a shadow stack. We also evaluate the difficulties of generating a precise CFG using scalable static analysis for real-world applications. Finally, we perform an analysis on a number of popular applications that highlights the availability of such attacks.
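For intuition, the following toy C model sketches what a tag-based forward-edge check plus a shadow-stack backward-edge check might look like at a single call site. It is illustrative only (the paper attacks real compiler-inserted instrumentation, not this model); giving two legal targets the same tag models exactly the kind of CFG flexibility the paper's attacks exploit.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model only: each indirect-call target carries a statically assigned
 * tag, the call site checks it (forward edge), and a shadow stack checks
 * returns (backward edge).  Real CFI instruments the binary; this is not
 * the mechanism evaluated in the paper. */

typedef struct {
    unsigned tag;      /* tag assigned to this target "at compile time" */
    int (*fn)(int);    /* the actual function */
} tagged_fn;

static int double_it(int x) { return 2 * x; }
static int square_it(int x) { return x * x; }

/* Both functions share tag 0x11, i.e. both are legal targets of the same
 * call site -- the kind of CFG flexibility the paper's attacks exploit. */
static tagged_fn handlers[] = {
    { 0x11, double_it },
    { 0x11, square_it },
};

static int shadow[64];       /* toy shadow stack of return tokens */
static int shadow_top = 0;

static int cfi_indirect_call(tagged_fn *t, unsigned expected_tag, int arg) {
    assert(t->tag == expected_tag);       /* forward-edge CFI check */
    shadow[shadow_top++] = arg;           /* push a return token */
    int r = t->fn(arg);
    assert(shadow[--shadow_top] == arg);  /* backward-edge (shadow stack) check */
    return r;
}

int main(void) {
    printf("%d\n", cfi_indirect_call(&handlers[0], 0x11, 21));
    printf("%d\n", cfi_indirect_call(&handlers[1], 0x11, 6));
    return 0;
}
```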

216 citations

Proceedings ArticleDOI
17 May 2015
TL;DR: It is shown that, for architectures that do not support segmentation, on which CPI relies on information hiding, CPI's safe region can be leaked and then maliciously modified by using data pointer overwrites.
Abstract: Memory corruption attacks continue to be a major vector of attack for compromising modern systems. Numerous defenses have been proposed against memory corruption attacks, but they all have their limitations and weaknesses. Stronger defenses such as complete memory safety for legacy languages (C/C++) incur a large overhead, while weaker ones such as practical control flow integrity have been shown to be ineffective. A recent technique called code pointer integrity (CPI) promises to balance security and performance by focusing memory safety on code pointers, thus preventing most control-hijacking attacks while maintaining low overhead. CPI protects access to code pointers by storing them in a safe region that is protected by instruction-level isolation. On x86-32, this isolation is enforced by hardware; on x86-64 and ARM, isolation is enforced by information hiding. We show that, for architectures that do not support segmentation, in which CPI relies on information hiding, CPI's safe region can be leaked and then maliciously modified by using data pointer overwrites. We implement a proof-of-concept exploit against Nginx and successfully bypass CPI implementations that rely on information hiding in 6 seconds with 13 observed crashes. We also present an attack that generates no crashes and is able to bypass CPI in 98 hours. Our attack demonstrates the importance of adequately protecting secrets in security mechanisms and the dangers of relying on difficulty of guessing without guaranteeing the absence of memory leaks.
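The following C sketch models, in a deliberately simplified way, the style of scheme the paper attacks: code pointers are stored in a separate safe region reached only through a hidden base address, and ordinary data structures hold only indices into it. It is an assumption-laden illustration (a heap allocation stands in for an mmap at a randomized, unadvertised address), not the actual CPI implementation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified model: code pointers live in a separate safe region reached
 * only through a hidden base address; ordinary data structures hold only
 * indices into it.  calloc() stands in for an mmap at a randomized,
 * unadvertised address.  Not the actual CPI implementation. */

typedef void (*code_ptr)(void);

static code_ptr *safe_region;       /* the hidden base address */
static size_t    safe_region_used;

static size_t safe_region_store(code_ptr p) {
    safe_region[safe_region_used] = p;
    return safe_region_used++;      /* callers keep only the index */
}

static void safe_region_call(size_t idx) {
    safe_region[idx]();             /* dereference goes through the hidden base */
}

static void greet(void) { puts("hello from a protected code pointer"); }

int main(void) {
    safe_region = calloc(1024, sizeof(code_ptr));
    if (!safe_region)
        return 1;

    size_t idx = safe_region_store(greet);
    safe_region_call(idx);

    /* If the base address leaks -- e.g. via the data-pointer overwrites the
     * paper demonstrates -- an attacker can rewrite safe_region[idx] and
     * hijack control flow despite the isolation. */
    free(safe_region);
    return 0;
}
```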

165 citations

Proceedings ArticleDOI
03 Jun 2015
TL;DR: Experimental results using seven donor applications to eliminate ten errors in seven recipient applications highlight the ability of CP to transfer code across applications to eliminate out-of-bounds access, integer overflow, and divide-by-zero errors.
Abstract: We present Code Phage (CP), a system for automatically transferring correct code from donor applications into recipient applications that process the same inputs to successfully eliminate errors in the recipient. Experimental results using seven donor applications to eliminate ten errors in seven recipient applications highlight the ability of CP to transfer code across applications to eliminate out of bounds access, integer overflow, and divide by zero errors. Because CP works with binary donors with no need for source code or symbolic information, it supports a wide range of use cases. To the best of our knowledge, CP is the first system to automatically transfer code across multiple applications.
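As an illustration of the kind of guard such a transfer can introduce (not CP's actual mechanism or output), the sketch below adds an integer-overflow check before an allocation sized by a product. It assumes a GCC/Clang toolchain for the __builtin_mul_overflow builtin, and the function name alloc_pixels is invented.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustration of the kind of guard a donor might contribute: reject inputs
 * whose product overflows before using it as an allocation size.  Uses the
 * GCC/Clang __builtin_mul_overflow builtin; alloc_pixels is an invented
 * example, not code produced by CP. */
static void *alloc_pixels(size_t width, size_t height) {
    size_t nbytes;
    if (__builtin_mul_overflow(width, height, &nbytes)) {
        fprintf(stderr, "image dimensions overflow\n");
        return NULL;                 /* transferred check rejects the input */
    }
    return malloc(nbytes);           /* safe: nbytes did not wrap around */
}

int main(void) {
    void *ok  = alloc_pixels(1024, 768);
    void *bad = alloc_pixels((size_t)-1, 16);   /* triggers the guard */
    printf("ok=%p bad=%p\n", ok, bad);
    free(ok);
    return 0;
}
```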

129 citations

Proceedings ArticleDOI
09 Jun 2014
TL;DR: RCV, a system for enabling software applications to survive divide-by-zero and null-dereference errors, is presented; a manual analysis of the source code relevant to the benchmark errors indicates that for 11 of the 18 errors the RCV and later patched versions produce identical or equivalent results on all inputs.
Abstract: We present a system, RCV, for enabling software applications to survive divide-by-zero and null-dereference errors. RCV operates directly on off-the-shelf, production, stripped x86 binary executables. RCV implements recovery shepherding, which attaches to the application process when an error occurs, repairs the execution, tracks the repair effects as the execution continues, contains the repair effects within the application process, and detaches from the process after all repair effects are flushed from the process state. RCV therefore incurs negligible overhead during the normal execution of the application. We evaluate RCV on all divide-by-zero and null-dereference errors available in the CVE database [2] from January 2011 to March 2013 that 1) provide publicly-available inputs that trigger the error which 2) we were able to use to trigger the reported error in our experimental environment. We collected a total of 18 errors in seven real world applications, Wireshark, the FreeType library, Claws Mail, LibreOffice, GIMP, the PHP interpreter, and Chromium. For 17 of the 18 errors, RCV enables the application to continue to execute to provide acceptable output and service to its users on the error-triggering inputs. For 13 of the 18 errors, the continued RCV execution eventually flushes all of the repair effects and RCV detaches to restore the application to full clean functionality. We perform a manual analysis of the source code relevant to our benchmark errors, which indicates that for 11 of the 18 errors the RCV and later patched versions produce identical or equivalent results on all inputs.
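As a very rough in-process analogue of the "repair and continue" idea (RCV itself attaches to unmodified, stripped x86 binaries at error time and tracks and contains the repair effects), the following POSIX C sketch catches SIGFPE and substitutes zero for a division result so execution can continue; it is illustrative only.

```c
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

/* In-process sketch of "repair and continue" for divide-by-zero: catch
 * SIGFPE, jump back to a recovery point, and substitute a default result.
 * RCV does this for unmodified, stripped x86 binaries and also tracks and
 * contains the repair effects; none of that is modeled here. */

static sigjmp_buf recover_point;

static void fpe_handler(int sig) {
    (void)sig;
    siglongjmp(recover_point, 1);    /* unwind to the recovery point */
}

static int safe_div(int a, int b) {
    if (sigsetjmp(recover_point, 1) != 0)
        return 0;                    /* repair: treat a/0 as 0 and continue */
    volatile int d = b;              /* keep the division from being folded away */
    return a / d;                    /* raises SIGFPE when d == 0 */
}

int main(void) {
    signal(SIGFPE, fpe_handler);
    printf("10 / 2 = %d\n", safe_div(10, 2));
    printf("10 / 0 = %d (repaired)\n", safe_div(10, 0));
    return 0;
}
```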

108 citations


Cited by
01 Jan 2003
TL;DR: A super-peer is a node in a peer-to-peer network that operates both as a server to a set of clients, and as an equal in a network of super-peers.
Abstract: A super-peer is a node in a peer-to-peer network that operates both as a server to a set of clients, and as an equal in a network of super-peers. Super-peer networks strike a balance between the efficiency of centralized search, and the autonomy, load balancing and robustness to attacks provided by distributed search. Furthermore, they take advantage of the heterogeneity of capabilities (e.g., bandwidth, processing power) across peers, which recent studies have shown to be enormous. Hence, new and old P2P systems like KaZaA and Gnutella are adopting super-peers in their design. Despite their growing popularity, the behavior of super-peer networks is not well understood. For example, what are the potential drawbacks of super-peer networks? How can super-peers be made more reliable? How many clients should a super-peer take on to maximize efficiency? We examine super-peer networks in detail, gaining an understanding of their fundamental characteristics and performance tradeoffs. We also present practical guidelines and a general procedure for the design of an efficient super-peer network.
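A minimal C sketch of the dual role described above, with all names and sizes invented: each super-peer holds an index of its attached clients (server role) and links to other super-peers in the overlay (peer role).

```c
#include <stdio.h>

/* All names and sizes are invented.  A super-peer indexes its attached
 * clients (server role) and links to other super-peers in the overlay
 * (peer role); queries are answered locally first, then forwarded. */

#define MAX_CLIENTS   64   /* cluster size: a key design parameter in the paper */
#define MAX_NEIGHBORS 16   /* degree in the super-peer overlay */

struct super_peer {
    int client_ids[MAX_CLIENTS];                  /* server role */
    int n_clients;
    struct super_peer *neighbors[MAX_NEIGHBORS];  /* peer role */
    int n_neighbors;
};

int main(void) {
    struct super_peer a = { .n_clients = 0 }, b = { .n_clients = 0 };
    a.client_ids[a.n_clients++] = 101;   /* an ordinary peer attaches to a */
    a.neighbors[a.n_neighbors++] = &b;   /* a and b are equals in the overlay */
    b.neighbors[b.n_neighbors++] = &a;
    printf("super-peer a: %d client(s), %d neighbor(s)\n",
           a.n_clients, a.n_neighbors);
    return 0;
}
```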

916 citations

Journal ArticleDOI
TL;DR: A survey of techniques for approximate computing (AC), which discusses strategies for finding approximable program portions and monitoring output quality, techniques for using AC in different processing units, processor components, memory technologies, and so forth, as well as programming frameworks for AC.
Abstract: Approximate computing trades off computation quality with effort expended, and as rising performance demands confront plateauing resource budgets, approximate computing has become not merely attractive, but even imperative. In this article, we present a survey of techniques for approximate computing (AC). We discuss strategies for finding approximable program portions and monitoring output quality, techniques for using AC in different processing units (e.g., CPU, GPU, and FPGA), processor components, memory technologies, and so forth, as well as programming frameworks for AC. We classify these techniques based on several key characteristics to emphasize their similarities and differences. The aim of this article is to provide researchers with insights into the working of AC techniques and to inspire more efforts in this area to make AC the mainstream computing approach in future systems.

890 citations

Proceedings ArticleDOI
01 Dec 2012
TL;DR: A programming model is defined that allows programmers to identify approximable code regions -- code that can produce imprecise but acceptable results; offloading such regions to a neural processing unit (NPU) is faster and more energy efficient than executing the original code.
Abstract: This paper describes a learning-based approach to the acceleration of approximate programs. We describe the Parrot transformation, a program transformation that selects and trains a neural network to mimic a region of imperative code. After the learning phase, the compiler replaces the original code with an invocation of a low-power accelerator called a neural processing unit (NPU). The NPU is tightly coupled to the processor pipeline to accelerate small code regions. Since neural networks produce inherently approximate results, we define a programming model that allows programmers to identify approximable code regions -- code that can produce imprecise but acceptable results. Offloading approximable code regions to NPUs is faster and more energy efficient than executing the original code. For a set of diverse applications, NPU acceleration provides whole-application speedup of 2.3x and energy savings of 3.0x on average with quality loss of at most 9.6%.
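The shape of the transformation can be sketched in C as follows: an exact code region and a drop-in approximate replacement that hands its inputs to a small neural network. The 2-4-1 network below uses placeholder weights and runs in software; in the paper the network is trained to mimic the region and the invocation targets the NPU hardware, so everything here is illustrative.

```c
#include <math.h>
#include <stdio.h>

/* The 2-4-1 MLP below uses placeholder weights and runs in software; in the
 * paper the network is trained to mimic the region and the call targets the
 * NPU hardware.  Shapes, names, and weights are illustrative only. */

/* Original, exact region of imperative code. */
static float region_exact(float x, float y) {
    return sqrtf(x * x + y * y);            /* e.g. a distance computation */
}

/* Software stand-in for an NPU invocation: one hidden layer, tanh units. */
static float npu_invoke(const float in[2]) {
    static const float w1[4][2] = {
        { 0.9f, 0.1f }, { 0.1f, 0.9f }, { 0.5f, 0.5f }, { -0.3f, 0.7f }
    };
    static const float b1[4] = { 0.0f, 0.0f, 0.1f, 0.0f };
    static const float w2[4] = { 0.8f, 0.8f, 0.4f, 0.2f };
    float out = 0.0f;
    for (int j = 0; j < 4; j++) {
        float h = b1[j];
        for (int i = 0; i < 2; i++)
            h += w1[j][i] * in[i];
        out += w2[j] * tanhf(h);            /* hidden layer -> output */
    }
    return out;
}

/* Transformed region: same signature as the original, approximate result. */
static float region_approx(float x, float y) {
    float in[2] = { x, y };
    return npu_invoke(in);
}

int main(void) {
    printf("exact  (3,4) = %f\n", region_exact(3.0f, 4.0f));
    printf("approx (3,4) = %f (untrained placeholder weights)\n",
           region_approx(3.0f, 4.0f));
    return 0;
}
```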

532 citations

Proceedings ArticleDOI
11 Jan 2016
TL;DR: Experimental results show that, on a benchmark set of 69 real-world defects drawn from eight open-source projects, Prophet significantly outperforms the previous state-of-the-art patch generation system.
Abstract: We present Prophet, a novel patch generation system that works with a set of successful human patches obtained from open-source software repositories to learn a probabilistic, application-independent model of correct code. It generates a space of candidate patches, uses the model to rank the candidate patches in order of likely correctness, and validates the ranked patches against a suite of test cases to find correct patches. Experimental results show that, on a benchmark set of 69 real-world defects drawn from eight open-source projects, Prophet significantly outperforms the previous state-of-the-art patch generation system.
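A hedged C sketch of the generate-rank-validate loop described above: candidates are scored with a linear model over patch features, sorted by score, and validated in ranked order. The feature vectors, weights, and pass/fail flags below are invented placeholders, not Prophet's learned model or test suite.

```c
#include <stdio.h>
#include <stdlib.h>

/* The feature vectors, weights, and pass/fail flags below are invented
 * placeholders standing in for Prophet's learned model and test suite. */

#define N_FEATURES 3

struct candidate {
    const char *description;
    double features[N_FEATURES];   /* e.g. what kinds of edits the patch makes */
    int passes_tests;              /* stand-in for actually running the tests */
};

static const double weights[N_FEATURES] = { 1.5, -0.7, 0.3 };  /* "learned" model */

static double score(const struct candidate *c) {
    double s = 0.0;
    for (int i = 0; i < N_FEATURES; i++)
        s += weights[i] * c->features[i];
    return s;
}

static int by_score_desc(const void *a, const void *b) {
    double sa = score(a), sb = score(b);
    return (sa < sb) - (sa > sb);            /* higher score first */
}

int main(void) {
    struct candidate cands[] = {
        { "add null check",     { 1, 0, 1 }, 1 },
        { "delete statement",   { 0, 1, 0 }, 1 },
        { "tighten loop bound", { 1, 1, 0 }, 0 },
    };
    size_t n = sizeof cands / sizeof cands[0];

    qsort(cands, n, sizeof cands[0], by_score_desc);  /* rank by likely correctness */
    for (size_t i = 0; i < n; i++) {
        if (cands[i].passes_tests) {                  /* validate in ranked order */
            printf("first validated patch: %s\n", cands[i].description);
            return 0;
        }
    }
    puts("no candidate validated");
    return 1;
}
```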

495 citations

Proceedings ArticleDOI
29 May 2013
TL;DR: This work analyzes and characterizes the inherent application resilience present in a suite of 12 widely used applications from the domains of recognition, data mining, and search, and proposes a systematic framework for Application Resilience Characterization (ARC) that characterizes the resilient parts using approximation models abstracting a wide range of approximate computing techniques.
Abstract: Approximate computing is an emerging design paradigm that enables highly efficient hardware and software implementations by exploiting the inherent resilience of applications to inexactness in their computations. Previous work in this area has demonstrated the potential for significant energy and performance improvements, but largely consists of ad hoc techniques that have been applied to a small number of applications. Taking approximate computing closer to mainstream adoption requires (i) a deeper understanding of inherent application resilience across a broader range of applications, (ii) tools that can quantitatively establish the inherent resilience of an application, and (iii) methods to quickly assess the potential of various approximate computing techniques for a given application. We make two key contributions in this direction. Our primary contribution is the analysis and characterization of inherent application resilience present in a suite of 12 widely used applications from the domains of recognition, data mining, and search. Based on this analysis, we present several new insights into the nature of resilience and its relationship to various key application characteristics. To facilitate our analysis, we propose a systematic framework for Application Resilience Characterization (ARC) that (a) partitions an application into resilient and sensitive parts and (b) characterizes the resilient parts using approximation models that abstract a wide range of approximate computing techniques. We believe that the key insights that we present can help shape further research in the area of approximate computing, while automatic resilience characterization frameworks such as ARC can greatly aid designers in the adoption of approximate computing.
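The core measurement an ARC-like framework performs can be sketched as follows: run a kernel exactly and under an injected approximation model, then compare output quality against a threshold to label the kernel resilient or sensitive. The multiplicative-noise model and the 5% threshold below are placeholders, not the paper's approximation models.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Run a kernel exactly and under an injected error model, then compare
 * output quality against a threshold.  The multiplicative-noise model and
 * the 5% threshold are placeholders, not ARC's approximation models. */

#define N 1024

static double kernel(const double *x, int approximate) {
    double acc = 0.0;
    for (int i = 0; i < N; i++) {
        double v = x[i] * x[i];
        if (approximate)   /* inject up to +/-1% error into each term */
            v *= 1.0 + 0.02 * ((double)rand() / RAND_MAX - 0.5);
        acc += v;
    }
    return acc;
}

int main(void) {
    double x[N];
    for (int i = 0; i < N; i++)
        x[i] = sin(0.01 * i);

    srand(42);
    double exact   = kernel(x, 0);
    double approx  = kernel(x, 1);
    double rel_err = fabs(approx - exact) / fabs(exact);

    printf("relative output error: %.4f%%\n", 100.0 * rel_err);
    printf("kernel looks %s under this error model\n",
           rel_err < 0.05 ? "resilient" : "sensitive");
    return 0;
}
```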

464 citations