Showing papers presented at "Conference on Object-Oriented Programming Systems, Languages, and Applications in 2011"

Proceedings Article•DOI•

Variability-aware parsing in the presence of lexical macros and conditional compilation

Christian Kästner¹, Paolo G. Giarrusso¹, Tillmann Rendel¹, Sebastian Erdweg¹, Klaus Ostermann¹, Thorsten Berger²•Institutions (2)

University of Marburg¹, Leipzig University²

22 Oct 2011

TL;DR: Contributes a novel variability-aware parser that can parse almost all unpreprocessed code without heuristics in practicable time, paving the road for further analysis such as variability-aware type checking.

Abstract: In many projects, lexical preprocessors are used to manage different variants of the project (using conditional compilation) and to define compile-time code transformations (using macros). Unfortunately, while being a simple way to implement variability, conditional compilation and lexical macros hinder automatic analysis, even though such analysis is urgently needed to combat variability-induced complexity. To analyze code with its variability, we need to parse it without preprocessing it. However, current parsing solutions use unsound heuristics, support only a subset of the language, or suffer from exponential explosion. As part of the TypeChef project, we contribute a novel variability-aware parser that can parse almost all unpreprocessed code without heuristics in practicable time. Beyond the obvious task of detecting syntax errors, our parser paves the road for further analysis, such as variability-aware type checking. We implement variability-aware parsers for Java and GNU C and demonstrate practicability by parsing the product line MobileMedia and the entire X86 architecture of the Linux kernel with 6065 variable features.

248 citations

Proceedings Article•DOI•

Emscripten: an LLVM-to-JavaScript compiler

Alon Zakai

22 Oct 2011

TL;DR: This work presents and proves the validity of Emscripten's Relooper algorithm, which recreates high-level loop structures from low-level branching data, and opens up two avenues for running code written in languages other than JavaScript on the web.

Abstract: We present Emscripten, a compiler from LLVM (Low Level Virtual Machine) assembly to JavaScript. This opens up two avenues for running code written in languages other than JavaScript on the web: (1) Compile code directly into LLVM assembly, and then compile that into JavaScript using Emscripten, or (2) Compile a language's entire runtime into LLVM and then JavaScript, as in the previous approach, and then use the compiled runtime to run code written in that language. For example, the former approach can work for C and C++, while the latter can work for Python; all three examples open up new opportunities for running code on the web. Emscripten itself is written in JavaScript and is available under the MIT license (a permissive open source license), at http://www.emscripten.org. As a compiler from LLVM to JavaScript, the challenges in designing Emscripten are somewhat the reverse of the norm - one must go from a low-level assembly into a high-level language, and recreate parts of the original high-level structure of the code that were lost in the compilation to low-level LLVM. We detail the methods used in Emscripten to deal with those challenges, and in particular present and prove the validity of Emscripten's Relooper algorithm, which recreates high-level loop structures from low-level branching data.

225 citations

Proceedings Article•DOI•

SugarJ: library-based syntactic language extensibility

Sebastian Erdweg¹, Tillmann Rendel¹, Christian Kästner¹, Klaus Ostermann¹•Institutions (1)

University of Marburg¹

22 Oct 2011

TL;DR: To demonstrate the expressiveness and applicability of sugar libraries, SugarJ, a language on top of Java, SDF and Stratego, which supports syntactic extensibility is developed, and the utility of self-applicability is illustrated by embedding XML Schema, a metalanguage to define XML languages.

Abstract: Existing approaches to extend a programming language with syntactic sugar often leave a bitter taste, because they cannot be used with the same ease as the main extension mechanism of the programming language - libraries. Sugar libraries are a novel approach for syntactically extending a programming language within the language. A sugar library is like an ordinary library, but can, in addition, export syntactic sugar for using the library. Sugar libraries maintain the composability and scoping properties of ordinary libraries and are hence particularly well-suited for embedding a multitude of domain-specific languages into a host language. They also inherit self-applicability from libraries, which means that sugar libraries can provide syntactic extensions for the definition of other sugar libraries. To demonstrate the expressiveness and applicability of sugar libraries, we have developed SugarJ, a language on top of Java, SDF and Stratego, which supports syntactic extensibility. SugarJ employs a novel incremental parsing technique, which allows changing the syntax within a source file. We demonstrate SugarJ by five language extensions, including embeddings of XML and closures in Java, all available as sugar libraries. We illustrate the utility of self-applicability by embedding XML Schema, a metalanguage to define XML languages.

162 citations

Proceedings Article•DOI•

HOP: achieving efficient anonymity in MANETs by combining HIP, OLSR, and pseudonyms

Javier Campos¹, Carlos T. Calafate¹, Marga Nácher¹, Pietro Manzoni¹, Juan-Carlos Cano¹•Institutions (1)

Polytechnic University of Valencia¹

01 Jan 2011

TL;DR: This paper proposes and implements HOP, a novel solution based on cryptographic Host Identity Protocol (HIP) that offers security and user-level anonymity in MANET environments while maintaining good performance levels and introduces enhancements to the authentication process to achieve Host Identity Tag (HIT) relationship anonymity.

Abstract: Offering secure and anonymous communications in mobile ad hoc networking environments is essential to achieve confidence and privacy, thus promoting widespread adoption of this kind of network. In addition, some minimum performance levels must be achieved for any solution to be practical and become widely adopted. In this paper, we propose and implement HOP, a novel solution based on cryptographic Host Identity Protocol (HIP) that offers security and user-level anonymity in MANET environments while maintaining good performance levels. In particular, we introduce enhancements to the authentication process to achieve Host Identity Tag (HIT) relationship anonymity, along with source/destination HIT anonymity when combined with multihoming. Afterward we detail how we integrate our improved version of HIP with the OLSR routing protocol to achieve efficient support for pseudonyms. We implemented our proposal in an experimental testbed, and the results obtained show that performance levels achieved are quite good, and that the integration with OLSR is achieved with a low overhead.

150 citations

Proceedings Article•DOI•

Catch me if you can: performance bug detection in the wild

Milan Jovic¹, Andrea Adamoli¹, Matthias Hauswirth¹•Institutions (1)

University of Lugano¹

22 Oct 2011

TL;DR: This paper argues that -- especially in the case of interactive applications -- traditional profilers find irrelevant problems but fail to find relevant bugs, and introduces lag hunting, an approach that identifies perceptible performance bugs by monitoring the behavior of applications deployed in the wild.

Abstract: Profilers help developers to find and fix performance problems. But do they find performance bugs -- performance problems that real users actually notice? In this paper we argue that -- especially in the case of interactive applications -- traditional profilers find irrelevant problems but fail to find relevant bugs. We then introduce lag hunting, an approach that identifies perceptible performance bugs by monitoring the behavior of applications deployed in the wild. The approach transparently produces a list of performance issues, and for each issue provides the developer with information that helps in finding the cause of the problem. We evaluate our approach with an experiment where we monitor an application used by 24 users for 1958 hours over the course of 3 months. We characterize the resulting 881 issues, and we find and fix the causes of a set of representative examples.

108 citations

Proceedings Article•DOI•

F4F: taint analysis of framework-based web applications

Manu Sridharan¹, Shay Artzi¹, Marco Pistoia¹, Salvatore A. Guarnieri¹, Omer Tripp¹, Ryan Berg¹•Institutions (1)

IBM¹

22 Oct 2011

TL;DR: F4F (Framework For Frameworks), a system for effective taint analysis of framework-based web applications, is presented and F4F support to a state-of-the-art taint-analysis engine is added.

Abstract: This paper presents F4F (Framework For Frameworks), a system for effective taint analysis of framework-based web applications. Most modern web applications utilize one or more web frameworks, which provide useful abstractions for common functionality. Due to extensive use of reflective language constructs in framework implementations, existing static taint analyses are often ineffective when applied to framework-based applications. While previous work has included ad hoc support for certain framework constructs, adding support for a large number of frameworks in this manner does not scale from an engineering standpoint. F4F employs an initial analysis pass in which both application code and configuration files are processed to generate a specification of framework-related behaviors. A taint analysis engine can leverage these specifications to perform a much deeper, more precise analysis of framework-based applications. Our specification language has only a small number of simple but powerful constructs, easing analysis engine integration. With this architecture, new frameworks can be handled with no changes to the core analysis engine, yielding significant engineering benefits. We implemented specification generators for several web frameworks and added F4F support to a state-of-the-art taint-analysis engine. In an experimental evaluation, the taint analysis enhanced with F4F discovered 525 new issues across nine benchmarks, a harmonic mean of 2.10X more issues per benchmark. Furthermore, manual inspection of a subset of the new issues showed that many were exploitable or reflected bad security practice.

104 citations

Proceedings Article•DOI•

Da capo con Scala: design and analysis of a Scala benchmark suite for the Java virtual machine

Andreas Sewe¹, Mira Mezini¹, Aibek Sarimbekov², Walter Binder²•Institutions (2)

Technische Universität Darmstadt¹, University of Lugano²

22 Oct 2011

TL;DR: The design and analysis of the first full-fledged benchmark suite for Scala is presented and the benchmarks contained therein are compared with those from the well-known DaCapo 9.12 benchmark suite to show where the differences are between Scala and Java code---and where not.

Abstract: Originally conceived as the target platform for Java alone, the Java Virtual Machine (JVM) has since been targeted by other languages, one of which is Scala. This trend, however, is not yet reflected by the benchmark suites commonly used in JVM research. In this paper, we thus present the design and analysis of the first full-fledged benchmark suite for Scala. We furthermore compare the benchmarks contained therein with those from the well-known DaCapo 9.12 benchmark suite and show where the differences are between Scala and Java code---and where not.

99 citations

Proceedings Article•DOI•

PREFAIL: a programmable tool for multiple-failure injection

Pallavi Joshi¹, Haryadi S. Gunawi¹, Koushik Sen¹•Institutions (1)

University of California, Berkeley¹

22 Oct 2011

TL;DR: PreFail is presented, a programmable failure-injection tool that enables testers to write a wide range of policies to prune down the large space of multiple failures and finds all the bugs that one can find using exhaustive testing while spending 10X--200X less time than exhaustive testing.

Abstract: As hardware failures are no longer rare in the era of cloud computing, cloud software systems must "prevail" against multiple, diverse failures that are likely to occur. Testing software against multiple failures poses the problem of combinatorial explosion of multiple failures. To address this problem, we present PreFail, a programmable failure-injection tool that enables testers to write a wide range of policies to prune down the large space of multiple failures. We integrate PreFail to three cloud software systems (HDFS, Cassandra, and ZooKeeper), show a wide variety of useful pruning policies that we can write for them, and evaluate the speed-ups in testing time that we obtain by using the policies. In our experiments, our testing approach with appropriate policies found all the bugs that one can find using exhaustive testing while spending 10X--200X less time than exhaustive testing.
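The combinatorial explosion described above, and the effect of a pruning policy on it, can be illustrated with a minimal Python sketch. The failure points, the `one_failure_per_node` policy, and every function name here are hypothetical and chosen only for illustration; this is not PreFail's policy API.

```python
from itertools import combinations

# Hypothetical failure points: (node, I/O operation) pairs.
FAILURE_POINTS = [(node, op)
                  for node in ("n1", "n2", "n3")
                  for op in ("read", "write", "sync")]


def all_failure_sets(points, k):
    """Every way to pick k distinct failure points (exhaustive testing)."""
    return list(combinations(points, k))


def one_failure_per_node(failure_set):
    """Assumed example policy: never inject two failures on the same node."""
    nodes = [node for node, _ in failure_set]
    return len(nodes) == len(set(nodes))


def prune(failure_sets, policy):
    """Keep only the failure combinations that the policy accepts."""
    return [fs for fs in failure_sets if policy(fs)]


exhaustive = all_failure_sets(FAILURE_POINTS, 2)
pruned = prune(exhaustive, one_failure_per_node)
print(len(exhaustive), len(pruned))
```

With nine failure points and two injected failures, the exhaustive space has 36 combinations and this toy policy keeps 27 of them; the gap grows quickly as the number of injected failures increases.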

92 citations

Proceedings Article•DOI•

Testing atomicity of composed concurrent operations

Ohad Shacham¹, Nathan Bronson², Alex Aiken², Mooly Sagiv¹, Martin Vechev³, Eran Yahav⁴•Institutions (4)

Tel Aviv University¹, Stanford University², IBM³, Technion – Israel Institute of Technology⁴

22 Oct 2011

TL;DR: A novel technique based on modular testing of client code in the presence of an adversarial environment is presented, which uses commutativity specifications to drastically reduce the number of executions explored to detect a bug.

Abstract: We address the problem of testing atomicity of composed concurrent operations. Concurrent libraries help programmers exploit parallel hardware by providing scalable concurrent operations with the illusion that each operation is executed atomically. However, client code often needs to compose atomic operations in such a way that the resulting composite operation is also atomic while preserving scalability. We present a novel technique for testing the atomicity of client code composing scalable concurrent operations. The challenge in testing this kind of client code is that a bug may occur very rarely and only on a particular interleaving with a specific thread configuration. Our technique is based on modular testing of client code in the presence of an adversarial environment; we use commutativity specifications to drastically reduce the number of executions explored to detect a bug. We implemented our approach in a tool called COLT, and evaluated its effectiveness on a range of 51 real-world concurrent Java programs. Using COLT, we found 56 atomicity violations in Apache Tomcat, Cassandra, MyFaces Trinidad, and other applications.
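As a rough illustration of why composing atomic operations does not automatically give an atomic composite, the Python sketch below simulates one adversarial interleaving of a hand-written "put if absent" built from a lookup followed by an insert. The generator-based scheduler and all names are assumptions made for this example; it does not reproduce COLT or its commutativity specifications.

```python
def put_if_absent(table, key, value, results):
    """Composite operation built from two individually atomic steps.

    The bare `yield` marks a point where another thread may be scheduled.
    """
    present = key in table          # atomic step 1: lookup
    yield                           # scheduling point between the steps
    if not present:
        table[key] = value          # atomic step 2: insert
    results.append((value, not present))  # (value, "did I insert?")


def run_adversarial_interleaving():
    table, results = {}, []
    t1 = put_if_absent(table, "k", "A", results)
    t2 = put_if_absent(table, "k", "B", results)
    next(t1)        # t1 looks up "k": absent
    next(t2)        # t2 looks up "k": still absent
    next(t1, None)  # t1 inserts "A"
    next(t2, None)  # t2 overwrites it with "B"
    return table, results


table, results = run_adversarial_interleaving()
print(table, results)
```

Both calls report that they performed the insert, which no serial execution of an atomic put-if-absent could produce; that outcome is the kind of atomicity violation the abstract refers to.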

89 citations

Proceedings Article•DOI•

Synthesizing method sequences for high-coverage testing

Suresh Thummalapenta¹, Tao Xie², Nikolai Tillmann³, Jonathan de Halleux³, Zhendong Su⁴•Institutions (4)

IBM¹, North Carolina State University², Microsoft³, University of California, Davis⁴

22 Oct 2011

TL;DR: A novel approach to intelligently navigate the large search space, called Seeker, which synergistically combines static and dynamic analyses and achieves higher branch coverage and def-use coverage than existing state-of-the-art approaches.

Abstract: High-coverage testing is challenging. Modern object-oriented programs present additional challenges for testing. One key difficulty is the generation of proper method sequences to construct desired objects as method parameters. In this paper, we cast the problem as an instance of program synthesis that automatically generates candidate programs to satisfy a user-specified intent. In our setting, candidate programs are method sequences, and desired object states specify an intent. Automatic generation of desired method sequences is difficult due to its large search space---sequences often involve methods from multiple classes and require specific primitive values. This paper introduces a novel approach, called Seeker, to intelligently navigate the large search space. Seeker synergistically combines static and dynamic analyses: (1) dynamic analysis generates method sequences to cover branches; (2) static analysis uses dynamic analysis information for not-covered branches to generate candidate sequences; and (3) dynamic analysis explores and eliminates statically generated sequences. For evaluation, we have implemented Seeker and demonstrate its effectiveness on four subject applications totalling 28K LOC. We show that Seeker achieves higher branch coverage and def-use coverage than existing state-of-the-art approaches. We also show that Seeker detects 34 new defects missed by existing tools.

86 citations

Proceedings Article•DOI•

Why nothing matters: the impact of zeroing

Xi Yang¹, Stephen M. Blackburn¹, Daniel Frampton¹, Jennifer B. Sartor², Kathryn S. McKinley³•Institutions (3)

Australian National University¹, École Polytechnique Fédérale de Lausanne², University of Texas at Austin³

22 Oct 2011

TL;DR: This paper evaluates the two widely used zero initialization designs, showing that they make different tradeoffs to achieve very similar performance, and proposes three better designs: bulk zeroing with cache-bypassing (non-temporal) instructions, concurrent non-temporal bulk zeroing, and adaptive zeroing that dynamically chooses between the two.

Abstract: Memory safety defends against inadvertent and malicious misuse of memory that may compromise program correctness and security. A critical element of memory safety is zero initialization. The direct cost of zero initialization is surprisingly high: up to 12.7%, with average costs ranging from 2.7 to 4.5% on a high performance virtual machine on IA32 architectures. Zero initialization also incurs indirect costs due to its memory bandwidth demands and cache displacement effects. Existing virtual machines either: a) minimize direct costs by zeroing in large blocks, or b) minimize indirect costs by zeroing in the allocation sequence, which reduces cache displacement and bandwidth. This paper evaluates the two widely used zero initialization designs, showing that they make different tradeoffs to achieve very similar performance. Our analysis inspires three better designs: (1) bulk zeroing with cache-bypassing (non-temporal) instructions to reduce the direct and indirect zeroing costs simultaneously, (2) concurrent non-temporal bulk zeroing that exploits parallel hardware to move work off the application's critical path, and (3) adaptive zeroing, which dynamically chooses between (1) and (2) based on available hardware parallelism. The new software strategies offer speedups sometimes greater than the direct overhead, improving total performance by 3% on average. Our findings invite additional optimizations and microarchitectural support.

Proceedings Article•DOI•

Automated construction of JavaScript benchmarks

Gregor Richards¹, Andreas Gal², Brendan Eich², Jan Vitek¹•Institutions (2)

Purdue University¹, Mozilla Foundation²

22 Oct 2011

TL;DR: JSBench is described, a flexible tool for workload capture and benchmark generation, and its use in creating eight benchmarks based on popular sites is demonstrated, showing that workloads created by JSBench match the behavior of the original web applications.

Abstract: JavaScript is a highly dynamic language for web-based applications. Innovative implementation techniques for improving its speed and responsiveness have been developed in recent years. Industry benchmarks such as WebKit SunSpider are often cited as a measure of the efficacy of these techniques. However, recent studies have shown that these benchmarks fail to accurately represent the dynamic nature of modern JavaScript applications, and so may be poor predictors of real-world performance. Worse, they may guide the development of optimizations which are unhelpful for real applications. Our goal is to develop a tool and techniques to automate the creation of realistic and representative benchmarks from existing web applications. We propose a record-and-replay approach to capture JavaScript sessions which has sufficient fidelity to accurately recreate key characteristics of the original application, and at the same time is sufficiently flexible that a recording produced on one platform can be replayed on a different one. We describe JSBench, a flexible tool for workload capture and benchmark generation, and demonstrate its use in creating eight benchmarks based on popular sites. Using a variety of runtime metrics collected with instrumented versions of Firefox, Internet Explorer, and Safari, we show that workloads created by JSBench match the behavior of the original web applications.

Proceedings Article•DOI•

First-class state change in Plaid

Joshua Sunshine¹, Karl Naden¹, Sven Stork¹, Jonathan Aldrich¹, Éric Tanter²•Institutions (2)

Carnegie Mellon University¹, University of Chile²

22 Oct 2011

TL;DR: This paper evaluates Plaid through a series of examples taken from the Plaid compiler and the standard libraries of Smalltalk and Java, showing how Plaid can more closely model state-based designs, enhancing understandability, enhancing dynamic error checking, and providing reuse benefits.

Abstract: Objects model the world, and state is fundamental to a faithful modeling. Engineers use state machines to understand and reason about state transitions, but programming languages provide little support for building software based on state abstractions. We propose Plaid, a language in which objects are modeled not just in terms of classes, but in terms of changing abstract states. Each state may have its own representation, as well as methods that may transition the object into a new state. A formal model precisely defines the semantics of core Plaid constructs such as state transition and trait-like state composition. We evaluate Plaid through a series of examples taken from the Plaid compiler and the standard libraries of Smalltalk and Java. These examples show how Plaid can more closely model state-based designs, enhancing understandability, enhancing dynamic error checking, and providing reuse benefits.

Proceedings Article•DOI•

Declaratively programming the mobile web with Mobl

Zef Hemel¹, Eelco Visser¹•Institutions (1)

Delft University of Technology¹

22 Oct 2011

TL;DR: Mobl is a new language designed to declaratively construct mobile web applications, integrating languages for user interface design, styling, data modeling, querying and application logic into a single, unified language that is flexible, expressive, enables early detection of errors, and has good IDE support.

Abstract: A new generation of mobile touch devices, such as the iPhone, iPad and Android devices, are equipped with powerful, modern browsers. However, regular websites are not optimized for the specific features and constraints of these devices, such as limited screen estate, unreliable Internet access, touch-based interaction patterns, and features such as GPS. While recent advances in web technology enable web developers to build web applications that take advantage of the unique properties of mobile devices, developing such applications exposes a number of problems, specifically: developers are required to use many loosely coupled languages with limited tool support and application code is often verbose and imperative. We introduce mobl, a new language designed to declaratively construct mobile web applications. Mobl integrates languages for user interface design, styling, data modeling, querying and application logic into a single, unified language that is flexible, expressive, enables early detection of errors, and has good IDE support.

Proceedings Article•DOI•

RoleCast: finding missing security checks when you do not know what checks are

Sooel Son¹, Kathryn S. McKinley², Vitaly Shmatikov¹•Institutions (2)

University of Texas at Austin¹, Microsoft²

22 Oct 2011

TL;DR: ROLECAST is the first system capable of statically identifying security logic that mediates security-sensitive events in Web applications rather than taking a specification of this logic as input, and discovered 13 previously unreported, remotely exploitable vulnerabilities in 11 substantial PHP and JSP applications.

Abstract: Web applications written in languages such as PHP and JSP are notoriously vulnerable to accidentally omitted authorization checks and other security bugs. Existing techniques that find missing security checks in library and system code assume that (1) security checks can be recognized syntactically and (2) the same pattern of checks applies universally to all programs. These assumptions do not hold for Web applications. Each Web application uses different variables and logic to check the user's permissions. Even within the application, security logic varies based on the user's role, e.g., regular users versus administrators. This paper describes ROLECAST, the first system capable of statically identifying security logic that mediates security-sensitive events (such as database writes) in Web applications, rather than taking a specification of this logic as input. We observe a consistent software engineering pattern (the code that implements distinct user role functionality and its security logic resides in distinct methods and files) and develop a novel algorithm for discovering this pattern in Web applications. Our algorithm partitions the set of file contexts (a coarsening of calling contexts) on which security-sensitive events are control dependent into roles. Roles are based on common functionality and security logic. ROLECAST identifies security-critical variables and applies role-specific variable consistency analysis to find missing security checks. ROLECAST discovered 13 previously unreported, remotely exploitable vulnerabilities in 11 substantial PHP and JSP applications, with only 3 false positives. This paper demonstrates that (1) accurate inference of application- and role-specific security logic improves the security of Web applications without specifications, and (2) static analysis can discover security logic automatically by exploiting distinctive software engineering features.

Proceedings Article•DOI•

SHERIFF: precise detection and automatic mitigation of false sharing

Tongping Liu¹, Emery D. Berger¹•Institutions (1)

University of Massachusetts Amherst¹

22 Oct 2011

TL;DR: This paper presents two tools that attack the problem of false sharing: Sheriff-Detect, which finds instances of false sharing and ranks them by performance impact, and Sheriff-Protect, which mitigates false sharing by adaptively isolating shared updates from different threads into separate physical addresses, effectively eliminating most of the performance impact of false sharing.

Abstract: False sharing is an insidious problem for multithreaded programs running on multicore processors, where it can silently degrade performance and scalability. Previous tools for detecting false sharing are severely limited: they cannot distinguish false sharing from true sharing, have high false positive rates, and provide limited assistance to help programmers locate and resolve false sharing. This paper presents two tools that attack the problem of false sharing: Sheriff-Detect and Sheriff-Protect. Both tools leverage a framework we introduce here called Sheriff. Sheriff breaks out threads into separate processes, and exposes an API that allows programs to perform per-thread memory isolation and tracking on a per-page basis. We believe Sheriff is of independent interest. Sheriff-Detect finds instances of false sharing by comparing updates within the same cache lines by different threads, and uses sampling to rank them by performance impact. Sheriff-Detect is precise (no false positives), runs with low overhead (on average, 20%), and is accurate, pinpointing the exact objects involved in false sharing. We present a case study demonstrating Sheriff-Detect's effectiveness at locating false sharing in a variety of benchmarks. Rewriting a program to fix false sharing can be infeasible when source is unavailable, or undesirable when padding objects would unacceptably increase memory consumption or further worsen runtime performance. Sheriff-Protect mitigates false sharing by adaptively isolating shared updates from different threads into separate physical addresses, effectively eliminating most of the performance impact of false sharing. We show that Sheriff-Protect can improve performance for programs with catastrophic false sharing by up to 9×, without programmer intervention.
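The distinction between false and true sharing that the abstract relies on can be sketched in a few lines of Python. The write log, the 64-byte line size, and the function name are assumptions made for illustration; Sheriff itself works on processes and pages rather than on a log like this, and this sketch does no sampling or ranking.

```python
from collections import defaultdict

CACHE_LINE_SIZE = 64  # bytes; an assumed, typical cache-line size


def find_false_sharing(writes):
    """Return cache-line indices that look falsely shared.

    `writes` is an iterable of (thread_id, byte_address) pairs. A line is
    reported when at least two threads write to it but no single address
    is written by more than one thread. As a simplification, a line that
    mixes true and false sharing is treated as true sharing.
    """
    by_line = defaultdict(lambda: defaultdict(set))
    for thread_id, addr in writes:
        by_line[addr // CACHE_LINE_SIZE][thread_id].add(addr)

    suspects = []
    for line, per_thread in by_line.items():
        if len(per_thread) < 2:
            continue  # only one thread touched this line
        all_addrs = [a for addrs in per_thread.values() for a in addrs]
        if len(all_addrs) == len(set(all_addrs)):
            suspects.append(line)  # threads wrote disjoint addresses
    return sorted(suspects)


log = [(1, 0), (2, 8),     # line 0: different threads, different addresses
       (1, 64), (2, 64),   # line 1: both threads write the same address
       (1, 128)]           # line 2: a single thread
print(find_false_sharing(log))
```

Here only line 0 is reported: line 1 is true sharing, and line 2 is private to one thread.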

Proceedings Article•DOI•

Tool-supported refactoring for JavaScript

Asger Feldthaus¹, Todd Millstein², Anders Møller¹, Max Schäfer³, Frank Tip⁴•Institutions (4)

Aarhus University¹, University of California, Los Angeles², University of Oxford³, IBM⁴

22 Oct 2011

TL;DR: This work proposes a framework for specifying and implementing JavaScript refactorings based on pointer analysis, describes novel refactorings motivated by best practice recommendations for JavaScript programming, and demonstrates how they can be described concisely in terms of queries provided by the framework.

Abstract: Refactoring is a popular technique for improving the structure of existing programs while maintaining their behavior. For statically typed programming languages such as Java, a wide variety of refactorings have been described, and tool support for performing refactorings and ensuring their correctness is widely available in modern IDEs. For the JavaScript programming language, however, existing refactoring tools are less mature and often unable to ensure that program behavior is preserved. Refactoring algorithms that have been developed for statically typed languages are not applicable to JavaScript because of its dynamic nature. We propose a framework for specifying and implementing JavaScript refactorings based on pointer analysis. We describe novel refactorings motivated by best practice recommendations for JavaScript programming, and demonstrate how they can be described concisely in terms of queries provided by our framework. Experiments performed with a prototype implementation on a suite of existing applications show that our approach is well-suited for developing practical refactoring tools for JavaScript.

Proceedings Article•DOI•

Kismet: parallel speedup estimates for serial programs

Donghwan Jeon¹, Saturnino Garcia¹, Chris Louie¹, Michael Taylor¹•Institutions (1)

University of California, San Diego¹

22 Oct 2011

TL;DR: Kismet, a tool that creates parallel speedup estimates for unparallelized serial programs, builds upon the hierarchical critical path analysis technique, a recently developed dynamic analysis that localizes parallelism to each of the potentially nested regions in the target program.

Abstract: Software engineers now face the difficult task of refactoring serial programs for parallel execution on multicore processors. Currently, they are offered little guidance as to how much benefit may come from this task, or how close they are to the best possible parallelization. This paper presents Kismet, a tool that creates parallel speedup estimates for unparallelized serial programs. Kismet differs from previous approaches in that it does not require any manual analysis or modification of the program. This difference allows quick analysis of many programs, avoiding wasted engineering effort on those that are fundamentally limited. To accomplish this task, Kismet builds upon the hierarchical critical path analysis (HCPA) technique, a recently developed dynamic analysis that localizes parallelism to each of the potentially nested regions in the target program. It then uses a parallel execution time model to compute an approximate upper bound for performance, modeling constraints that stem from both hardware parameters and internal program structure. Our evaluation applies Kismet to eight high-parallelism NAS Parallel Benchmarks running on a 32-core AMD multicore system, five low-parallelism SpecInt benchmarks, and six medium-parallelism benchmarks running on the fine-grained MIT Raw processor. The results are compelling. Kismet is able to significantly improve the accuracy of parallel speedup estimates relative to prior work based on critical path analysis.
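
For orientation, the classic critical-path bound that this line of work starts from can be sketched in a few lines. This is not Kismet's HCPA or its execution time model; it is the textbook work/span estimate over a flat task graph, and the function name `speedup_upper_bound` and the example task names are hypothetical.

```python
def speedup_upper_bound(cost, deps, cores):
    """Classic work/span bound: speedup <= min(cores, work / critical path).

    cost  maps each task to its serial execution time.
    deps  maps a task to the tasks that must finish before it starts.
    The task graph is assumed to be acyclic.
    """
    finish = {}

    def finish_time(task):
        # Earliest finish = own cost + latest finish among predecessors.
        if task not in finish:
            preds = deps.get(task, ())
            finish[task] = cost[task] + max(
                (finish_time(p) for p in preds), default=0
            )
        return finish[task]

    work = sum(cost.values())                    # total serial time
    span = max(finish_time(t) for t in cost)     # critical path length
    return min(cores, work / span)


# Hypothetical four-task program: two independent halves between load and merge.
cost = {"load": 2, "left": 4, "right": 4, "merge": 2}
deps = {"left": ["load"], "right": ["load"], "merge": ["left", "right"]}
bound = speedup_upper_bound(cost, deps, cores=32)
```

Here the work is 12 and the critical path is 8, so no number of cores can give more than a 1.5x speedup. The abstract's point is that such a flat bound is often too optimistic, which is why Kismet localizes parallelism per region and models hardware constraints.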

Proceedings Article•DOI•

Trustworthy numerical computation in Scala

Eva Darulova¹, Viktor Kuncak¹•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

22 Oct 2011

TL;DR: This work presents a library solution for rigorous arithmetic computation that tracks a (double) floating point value, but also a guaranteed upper bound on the error between this value and the ideal value that would be computed in the real-value semantics.

Abstract: Modern computing has adopted the floating point type as a default way to describe computations with real numbers. Thanks to dedicated hardware support, such computations are efficient on modern architectures, even in double precision. However, rigorous reasoning about the resulting programs remains difficult. This is in part due to a large gap between the finite floating point representation and the infinite-precision real-number semantics that serves as the developers' mental model. Because programming languages do not provide support for estimating errors, some computations in practice are performed more and some less precisely than needed. We present a library solution for rigorous arithmetic computation. Our numerical data type library tracks a (double) floating point value, but also a guaranteed upper bound on the error between this value and the ideal value that would be computed in the real-value semantics. Our implementation involves a set of linear approximations based on an extension of affine arithmetic. The derived approximations cover most of the standard mathematical operations, including trigonometric functions, and are more comprehensive than any publicly available ones. Moreover, while interval arithmetic rapidly yields overly pessimistic estimates, our approach remains precise for several computational tasks of interest. We evaluate the library on a number of examples from numerical analysis and physical simulations. We found it to be a useful tool for gaining confidence in the correctness of the computation.
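
The idea of carrying an error bound alongside each double can be illustrated with a small sketch. It is written in Python rather than Scala, it does not use affine arithmetic, and it is not the authors' library: the class `Tracked` is a hypothetical stand-in that keeps the bound as an exact rational, supports only `+` and `*`, and assumes finite values.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Tracked:
    """A double plus an upper bound on |value - ideal real result|."""
    value: float
    err: Fraction = Fraction(0)

    @staticmethod
    def of(ideal: Fraction) -> "Tracked":
        # Round the ideal number to a double and record the rounding error.
        v = float(ideal)
        return Tracked(v, abs(Fraction(v) - ideal))

    def __add__(self, other: "Tracked") -> "Tracked":
        r = self.value + other.value
        # Exact rounding error of this one floating point addition.
        rounding = abs(Fraction(r) - (Fraction(self.value) + Fraction(other.value)))
        return Tracked(r, self.err + other.err + rounding)

    def __mul__(self, other: "Tracked") -> "Tracked":
        r = self.value * other.value
        a, b = Fraction(self.value), Fraction(other.value)
        rounding = abs(Fraction(r) - a * b)
        # |ab - AB| <= |a|*eb + |b|*ea + ea*eb for ideal values A, B.
        propagated = abs(a) * other.err + abs(b) * self.err + self.err * other.err
        return Tracked(r, propagated + rounding)


# Sum 0.1 ten times; the ideal real-number result is exactly 1.
tenth = Tracked.of(Fraction(1, 10))
total = tenth
for _ in range(9):
    total = total + tenth
```

After the loop, `total.err` bounds how far `total.value` can be from 1. Exact rationals keep the sketch short and sound but are far slower than the linear approximations the abstract describes.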

Proceedings Article•DOI•

Two for the price of one: a model for parallel and incremental computation

Sebastian Burckhardt¹, Daan Leijen¹, Caitlin Sadowski², Jaeheon Yi², Thomas Ball¹•Institutions (2)

Microsoft¹, University of California, Santa Cruz²

22 Oct 2011

TL;DR: This work presents a novel algorithm for parallel self-adjusting computation that extends a deterministic parallel programming model with support for recording and repeating computations and describes programming techniques that proved particularly useful to improve the performance of self-adjustment in practice.

Abstract: Parallel or incremental versions of an algorithm can significantly outperform their counterparts, but are often difficult to develop. Programming models that provide appropriate abstractions to decompose data and tasks can simplify parallelization. We show in this work that the same abstractions can enable both parallel and incremental execution. We present a novel algorithm for parallel self-adjusting computation. This algorithm extends a deterministic parallel programming model (concurrent revisions) with support for recording and repeating computations. On record, we construct a dynamic dependence graph of the parallel computation. On repeat, we reexecute only parts whose dependencies have changed. We implement and evaluate our idea by studying five example programs, including a realistic multi-pass CSS layout algorithm. We describe programming techniques that proved particularly useful to improve the performance of self-adjustment in practice. Our final results show significant speedups on all examples (up to 37x on an 8-core machine). These speedups are well beyond what can be achieved by parallelization alone, while requiring a comparable effort by the programmer.
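
The record and repeat steps can be illustrated with a deliberately small, sequential sketch. It is not the concurrent revisions algorithm and has no parallelism or nested tasks; the class `Incremental`, its `run` method, and the task names are hypothetical. Each task records which inputs it read, and a later pass re-executes a task only if one of those inputs changed.

```python
class Incremental:
    """Record which inputs each task reads; on repeat, rerun only stale tasks."""

    def __init__(self, inputs):
        self.inputs = dict(inputs)
        self.cache = {}      # task name -> (inputs read with their values, result)
        self.executed = []   # names of tasks that actually ran, in order

    def run(self, name, fn):
        entry = self.cache.get(name)
        if entry is not None:
            reads, result = entry
            # Repeat: reuse the result if every recorded dependency is unchanged.
            if all(self.inputs.get(k) == v for k, v in reads.items()):
                return result

        reads = {}

        def read(key):
            # Record: remember each input the task touches and the value seen.
            reads[key] = self.inputs[key]
            return reads[key]

        result = fn(read)
        self.cache[name] = (reads, result)
        self.executed.append(name)
        return result


inc = Incremental({"a": 1, "b": 2, "c": 3})


def compute():
    left = inc.run("sum_ab", lambda read: read("a") + read("b"))
    right = inc.run("scale_c", lambda read: read("c") * 10)
    return left + right


first = compute()            # both tasks execute and are recorded
inc.executed.clear()
inc.inputs["c"] = 4          # change a single input
second = compute()           # only the task that read "c" runs again
rerun = list(inc.executed)
```

The per-task read sets play the role of the dependence graph in this flat setting; the paper's contribution is doing this for a parallel computation.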

Proceedings Article•DOI•

Hybrid partial evaluation

Amin Shali¹, William R. Cook¹•Institutions (1)

University of Texas at Austin¹

22 Oct 2011

TL;DR: Civet, a straightforward implementation of HPE as a relatively simple extension of a Java compiler, is described and code optimized by Civet performs as well as the output of a state-of-the-art offline partial evaluator.

Abstract: Hybrid partial evaluation (HPE) is a pragmatic approach to partial evaluation that borrows ideas from both online and offline partial evaluation. HPE performs offline-style specialization using an online approach without static binding time analysis. The goal of HPE is to provide a practical and predictable level of optimization for programmers, with an implementation strategy that fits well within existing compilers or interpreters. HPE requires the programmer to specify where partial evaluation should be applied. It provides no termination guarantee and reports errors in situations that violate simple binding time rules, or have incorrect use of side effects in compile-time code. We formalize HPE for a small imperative object-oriented language and describe Civet, a straightforward implementation of HPE as a relatively simple extension of a Java compiler. Code optimized by Civet performs as well as the output of a state-of-the-art offline partial evaluator.

Proceedings Article•DOI•

Product lines of theorems

Benjamin Delaware¹, William R. Cook¹, Don Batory¹•Institutions (1)

University of Texas at Austin¹

22 Oct 2011

TL;DR: This paper shows how to engineer product lines with theorems and proofs built from feature modules, and formalizes a core calculus for Java in Coq which can be extended with any combination of casts, interfaces, or generics.

Abstract: Mechanized proof assistants are powerful verification tools, but proof development can be difficult and time-consuming. When verifying a family of related programs, the effort can be reduced by proof reuse. In this paper, we show how to engineer product lines with theorems and proofs built from feature modules. Each module contains proof fragments which are composed together to build a complete proof of correctness for each product. We consider a product line of programming languages, where each variant includes metatheory proofs verifying the correctness of its semantic definitions. This approach has been realized in the Coq proof assistant, with the proofs of each feature independently certifiable by Coq. These proofs are composed for each language variant, with Coq mechanically verifying that the composite proofs are correct. As validation, we formalize a core calculus for Java in Coq which can be extended with any combination of casts, interfaces, or generics.

Proceedings Article•DOI•

Freedom before commitment: a lightweight type system for object initialisation

Alexander J. Summers¹, Peter Mueller¹•Institutions (1)

ETH Zurich¹

22 Oct 2011

TL;DR: This work presents a type system that tracks which objects are fully initialised and which are still under initialisation, and believes it to be the first such system suitable for mainstream use.

Abstract: One of the main purposes of object initialisation is to establish invariants such as a field being non-null or an immutable data structure containing specific values. These invariants are then implicitly assumed by the rest of the implementation, for instance, to ensure that a field may be safely dereferenced or that immutable data may be accessed concurrently. Consequently, letting an object escape from its constructor is dangerous; the escaping object might not yet satisfy its invariants, leading to errors in code that relies on them. Nevertheless, preventing objects entirely from escaping from their constructors is too restrictive; it is often useful to call auxiliary methods on the object under initialisation or to pass it to another constructor to set up mutually-recursive structures. We present a type system that tracks which objects are fully initialised and which are still under initialisation. The system can be used to prevent objects from escaping, but also to allow safe escaping by making explicit which objects might not yet satisfy their invariants. We designed, formalised and implemented our system as an extension to a non-null type system, but it is not limited to this application. Our system is conceptually simple and requires little annotation overhead; it is sound and sufficiently expressive for many common programming idioms. Therefore, we believe it to be the first such system suitable for mainstream use.

Proceedings Article•DOI•

Enhancing locality for recursive traversals of recursive structures

Youngjoon Jo¹, Milind Kulkarni¹•Institutions (1)

Purdue University¹

22 Oct 2011

TL;DR: This paper develops a novel optimization, inspired by the classic tiling loop transformation, and shows that it can substantially enhance temporal locality in traversal codes and presents a transformation and optimization framework called TreeTiler, which automatically detects opportunities for applying point blocking and applies the transformation.

Abstract: While there have been decades of work on developing automatic, locality-enhancing transformations for regular programs that operate over dense matrices and arrays, there has been little investigation of such transformations for irregular programs, which operate over pointer-based data structures such as graphs, trees and lists. In this paper, we argue that, for a class of irregular applications we call traversal codes, there exists substantial data reuse and hence opportunity for locality exploitation. We develop a novel optimization called point blocking, inspired by the classic tiling loop transformation, and show that it can substantially enhance temporal locality in traversal codes. We then present a transformation and optimization framework called TreeTiler that automatically detects opportunities for applying point blocking and applies the transformation. TreeTiler uses autotuning techniques to determine appropriate parameters for the transformation. For a series of traversal algorithms drawn from real-world applications, we show that TreeTiler is able to deliver performance improvements of up to 245% over an optimized (but non-transformed) parallel baseline, and in several cases, significantly better scalability.

Proceedings Article•DOI•

Gradual typing for generics

Lintaro Ina¹, Atsushi Igarashi¹•Institutions (1)

Kyoto University¹

22 Oct 2011

TL;DR: A gradual type system for class-based object-oriented languages with generics is developed with a special type to denote dynamically typed parts of a program; unlike dynamic types introduced to C# 4.0, this type system allows for more seamless integration of dynamically and statically typed code.

Abstract: Gradual typing is a framework to combine static and dynamic typing in a single programming language. In this paper, we develop a gradual type system for class-based object-oriented languages with generics. We introduce a special type to denote dynamically typed parts of a program; unlike dynamic types introduced to C# 4.0, however, our type system allows for more seamless integration of dynamically and statically typed code. We formalize a gradual type system for Featherweight GJ with a semantics given by a translation that inserts explicit run-time checks. The type system guarantees that statically typed parts of a program do not go wrong, even if it includes dynamically typed parts. We also describe a basic implementation scheme for Java and report a preliminary performance evaluation.

Proceedings Article•DOI•

Benefits and barriers of user evaluation in software engineering research

Raymond P.L. Buse¹, Caitlin Sadowski², Westley Weimer¹•Institutions (2)

University of Virginia¹, University of California, Santa Cruz²

22 Oct 2011

TL;DR: From a corpus of over 3,000 papers spanning ten years, a set of perceived barriers to performing user evaluations is identified and the external measures of impact that are correlated with the presence of user evaluations are identified.

Abstract: In this paper, we identify trends about, benefits from, and barriers to performing user evaluations in software engineering research. From a corpus of over 3,000 papers spanning ten years, we report on various subtypes of user evaluations (e.g., coding tasks vs. questionnaires) and relate user evaluations to paper topics (e.g., debugging vs. technology transfer). We identify the external measures of impact, such as best paper awards and citation counts, that are correlated with the presence of user evaluations. We complement this with a survey of over 100 researchers from over 40 different universities and labs in which we identify a set of perceived barriers to performing user evaluations.

Proceedings Article•DOI•

Virtual values for language extension

Thomas H. Austin¹, Tim Disney¹, Cormac Flanagan¹•Institutions (1)

University of California, Santa Cruz¹

22 Oct 2011

TL;DR: This paper formalizes the semantics of virtual values, and shows how they enable the definition of a variety of language extensions, including additional numeric types; delayed evaluation; taint tracking; contracts; revokable membranes; and units of measure.

Abstract: This paper focuses on extensibility, the ability of a programmer using a particular language to extend the expressiveness of that language. This paper explores how to provide an interesting notion of extensibility by virtualizing the interface between code and data. A virtual value is a special value that supports behavioral intercession. When a primitive operation is applied to a virtual value, it invokes a trap on that virtual value. A virtual value contains multiple traps, each of which is a user-defined function that describes how that operation should behave on that value. This paper formalizes the semantics of virtual values, and shows how they enable the definition of a variety of language extensions, including additional numeric types; delayed evaluation; taint tracking; contracts; revokable membranes; and units of measure. We report on our experience implementing virtual values for JavaScript within an extension for the Firefox browser.
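
The trap mechanism can be approximated in a language with operator overloading. The sketch below is in Python, not the paper's JavaScript extension, and covers only `+` and `*`; the names `VirtualValue`, `tainted`, and `is_tainted` are hypothetical. It uses taint tracking, one of the extensions the abstract lists, as the example.

```python
import operator


class VirtualValue:
    """A value whose primitive operations are routed to user-defined traps."""

    def __init__(self, payload, traps):
        self.payload = payload
        self.traps = traps   # dict of trap name -> user-defined function

    # Each operator hands control to the "binary" trap instead of computing.
    def __add__(self, other):
        return self.traps["binary"](operator.add, self, other)

    def __radd__(self, other):
        return self.traps["binary"](operator.add, other, self)

    def __mul__(self, other):
        return self.traps["binary"](operator.mul, self, other)

    def __rmul__(self, other):
        return self.traps["binary"](operator.mul, other, self)


def unwrap(x):
    return x.payload if isinstance(x, VirtualValue) else x


def taint_binary(op, left, right):
    # Taint tracking: any result computed from a tainted operand is tainted.
    return tainted(op(unwrap(left), unwrap(right)))


TAINT_TRAPS = {"binary": taint_binary}


def tainted(value):
    return VirtualValue(value, TAINT_TRAPS)


def is_tainted(x):
    return isinstance(x, VirtualValue) and x.traps is TAINT_TRAPS


user_input = tainted(2)
derived = 10 * (user_input + 3)   # taint propagates through both operations
```

Swapping in a different trap table (for example, one that checks units before adding) changes the behavior of the same operators without touching the code that uses them, which is the kind of extensibility the abstract describes.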

Proceedings Article•DOI•

Accentuating the positive: atomicity inference and enforcement using correct executions

Dasarath Weeratunge¹, Xiangyu Zhang¹, Suresh Jaganathan¹•Institutions (1)

Purdue University¹

22 Oct 2011

TL;DR: A new technique to increase program robustness against Heisenbugs is presented, which profiles correct executions from provided test suites to infer fine-grained atomicity properties and injects additional deadlock-free locking into the program to guarantee these properties hold on production runs.

Abstract: Concurrency bugs are often due to inadequate synchronization that fails to prevent specific (undesirable) thread interleavings. Such errors, often referred to as Heisenbugs, are difficult to detect, prevent, and repair. In this paper, we present a new technique to increase program robustness against Heisenbugs. We profile correct executions from provided test suites to infer fine-grained atomicity properties. Additional deadlock-free locking is injected into the program to guarantee these properties hold on production runs. Notably, our technique does not rely on witnessing or analyzing erroneous executions. The end result is a scheme that only permits executions which are guaranteed to preserve the atomicity properties derived from the profile. Evaluation results on large, real-world, open-source programs show that our technique can effectively suppress subtle concurrency bugs, with small runtime overheads (typically less than 15%).

Proceedings Article•DOI•

JIT compilation policy for modern machines

Prasad A. Kulkarni¹•Institutions (1)

University of Kansas¹

22 Oct 2011

TL;DR: The effects on performance of increasing compiler aggressiveness for VMs with multiple compiler threads running on existing single/multi-core and future many-core machines are studied, and the results indicate that although more aggressive JIT compilation policies show no benefits on single-core machines, they can often improve program performance for multi/many-core machines.

Abstract: Dynamic or Just-in-Time (JIT) compilation is crucial to achieve acceptable performance for applications (written in managed languages, such as Java and C#) distributed as intermediate language binary codes for a virtual machine (VM) architecture. Since it occurs at runtime, JIT compilation needs to carefully tune its compilation policy to make effective decisions regarding 'if' and 'when' to compile different program regions to achieve the best overall program performance. Past research has extensively tuned JIT compilation policies, but mainly for VMs with a single compiler thread and for execution on single-processor machines. This work is driven by the need to explore the most effective JIT compilation strategies in their modern operational environment, where (a) processors have evolved from single to multi/many cores, and (b) VMs provide support for multiple concurrent compiler threads. Our results confirm that changing 'if' and 'when' methods are compiled has significant performance impacts. We construct several novel configurations in the HotSpot JVM to facilitate this study. The new configurations are necessitated by modern Java benchmarks that impede traditional static whole-program discovery, analysis and annotation, and are required for simulating future many-core hardware that is not yet widely available. We study the effects on performance of increasing compiler aggressiveness for VMs with multiple compiler threads running on existing single/multi-core and future many-core machines. Our results indicate that although more aggressive JIT compilation policies show no benefits on single-core machines, these can often improve program performance for multi/many-core machines. However, accurately prioritizing JIT method compilations is crucial to realize such benefits.

Proceedings Article•DOI•

Flexible object layouts: enabling lightweight language extensions by intercepting slot access

Toon Verwaest¹, Camillo Bruni², Mircea Lungu, Oscar Nierstrasz•Institutions (2)

University of Bern¹, French Institute for Research in Computer Science and Automation²

22 Oct 2011

TL;DR: This work proposes to extend the structural reflective model of the language with object layouts, layout scopes and slots, and shows how many idiomatic use cases that normally require boilerplate code can be more effectively supported.

Abstract: Programming idioms, design patterns and application libraries often introduce cumbersome and repetitive boilerplate code to a software system. Language extensions and external DSLs (domain specific languages) are sometimes introduced to reduce the need for boilerplate code, but they also complicate the system by introducing the need for language dialects and inter-language mediation. To address this, we propose to extend the structural reflective model of the language with object layouts, layout scopes and slots. Based on the new reflective language model we can 1) provide behavioral hooks to object layouts that are triggered when the fields of an object are accessed and 2) simplify the implementation of state-related language extensions such as stateful traits. By doing this we show how many idiomatic use cases that normally require boilerplate code can be more effectively supported. We present an implementation in Smalltalk, and illustrate its usage through a series of extended examples.
