Showing papers presented at "Conference on Object-Oriented Programming Systems, Languages, and Applications in 2007"

PDF

Open Access

Proceedings Article•DOI•

Statistically rigorous java performance evaluation

[...]

Andy Georges¹, Dries Buytaert¹, Lieven Eeckhout¹•Institutions (1)

21 Oct 2007

TL;DR: A survey of existing Java performance evaluation methodologies is presented and the importance of statistically rigorous data analysis for dealing with non-determinism is discussed, and approaches to quantify startup as well as steady-state performance are advocated.

...read moreread less

Abstract: Java performance is far from being trivial to benchmark because it is affected by various factors such as the Java application, its input, the virtual machine, the garbage collector, the heap size, etc. In addition, non-determinism at run-time causes the execution time of a Java program to differ from run to run. There are a number of sources of non-determinism such as Just-In-Time (JIT) compilation and optimization in the virtual machine (VM) driven by timer-based method sampling, thread scheduling, garbage collection, and various. There exist a wide variety of Java performance evaluation methodologies usedby researchers and benchmarkers. These methodologies differ from each other in a number of ways. Some report average performance over a number of runs of the same experiment; others report the best or second best performance observed; yet others report the worst. Some iterate the benchmark multiple times within a single VM invocation; others consider multiple VM invocations and iterate a single benchmark execution; yet others consider multiple VM invocations and iterate the benchmark multiple times. This paper shows that prevalent methodologies can be misleading, and can even lead to incorrect conclusions. The reason is that the data analysis is not statistically rigorous. In this paper, we present a survey of existing Java performance evaluation methodologies and discuss the importance of statistically rigorous data analysis for dealing with non-determinism. We advocate approaches to quantify startup as well as steady-state performance, and, in addition, we provide the JavaStats software to automatically obtain performance numbers in a rigorous manner. Although this paper focuses on Java performance evaluation, many of the issues addressed in this paper also apply to other programming languages and systems that build on a managed runtime system.

...read moreread less

576 citations

Proceedings Article•DOI•

Randoop: feedback-directed random testing for Java

[...]

Carlos Pacheco¹, Michael D. Ernst¹•Institutions (1)

Massachusetts Institute of Technology¹

20 Oct 2007

TL;DR: RANDOOP, which generates unit tests for Java code using feedback-directed random test generation, and RANDOOP, which is an annotation-based interface for specifying configuration parameters that affect R )'s behavior and output.

...read moreread less

Abstract: RANDOOP for Java generates unit tests for Java code using feedback-directed random test generation. Below we describe RANDOOP's input, output, and test generation algorithm. We also give an overview of RANDOOP's annotation-based interface for specifying configuration parameters that affect RANDOOP's behavior and output.

...read moreread less

438 citations

Proceedings Article•DOI•

Mop: an efficient and generic runtime verification framework

[...]

Feng Chen¹, Grigore Rosu¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

21 Oct 2007

TL;DR: This paper proposes a parametric specification formalism-independent extension of MOP, together with an implementation of JavaMOP that supports parameters, and devised and implemented a decentralized indexing optimization.

...read moreread less

Abstract: Monitoring-Oriented Programming (MOP1) [21, 18, 22, 19] is a formal framework for software development and analysis, in which the developer specifies desired properties using definable specification formalisms, along with code to execute when properties are violated or validated. The MOP framework automatically generates monitors from the specified properties and then integrates them together with the user-defined code into the original system. The previous design of MOP only allowed specifications without parameters, so it could not be used to state and monitor safety properties referring to two or more related objects. In this paper we propose a parametric specification formalism-independent extension of MOP, together with an implementation of JavaMOP that supports parameters. In our current implementation, parametric specifications are translated into AspectJ code and then weaved into the application using off-the-shelf AspectJ compilers; hence, MOP specifications can be seen as formal or logical aspects. Our JavaMOP implementation was extensively evaluated on two benchmarks, Dacapo [14] and Tracematches [8], showing that runtime verification in general and MOP in particular are feasible. In some of the examples, millions of monitor instances are generated, each observing a set of related objects. To keep the runtime overhead of monitoring and event observation low, we devised and implemented a decentralized indexing optimization. Less than 8% of the experiments showed more than 10% runtime overhead; in most cases our tool generates monitoring code as efficient as the hand-optimized code. Despite its genericity, JavaMOP is empirically shown to be more efficient than runtime verification systems specialized and optimized for particular specification formalisms. Many property violations were detected during our experiments; some of them are benign, others indicate defects in programs. Many of these are subtle and hard to find by ordinary testing.

...read moreread less

396 citations

Proceedings Article•DOI•

The jastadd extensible java compiler

[...]

Torbjörn Ekman¹, Görel Hedin²•Institutions (2)

University of Oxford¹, Lund University²

20 Oct 2007

TL;DR: The JastAdd Extensible Java Compiler is a high quality Java compiler that is easy to extend in order to build static analysis tools for Java, and to extend Java with new language constructs.

...read moreread less

Abstract: The JastAdd Extensible Java Compiler is a high quality Java compiler that is easy to extend in order to build static analysis tools for Java, and to extend Java with new language constructs. It is built modularly, with a Java 1.4 compiler that is extended to a Java 5 compiler. Example applications that are built as extensions include an alternative backend that generates Jimple, an extension of Java with AspectJ constructs, and the implementation of a pluggable type system for non-null checking and inferenc. The system is implemented using JastAdd, a declarative Java-like language. We describe the compiler architecture, the major design ideas for building and extending the compiler, in particular, for dealing with complex extensions that affect name and type analysis. Our extensible compiler compares very favorably concerning quality, speed and size with other extensible Java compiler frameworks. It also compares favorably in quality and size compared with traditional non-extensible Java compilers, and it runs within a factor of three compared to javac.

...read moreread less

312 citations

Proceedings Article•DOI•

Modular typestate checking of aliased objects

[...]

Kevin Bierhoff¹, Jonathan Aldrich¹•Institutions (1)

Carnegie Mellon University¹

21 Oct 2007

TL;DR: A sound modular protocol checking approach that allows a great deal of flexibility in aliasing while guaranteeing the absence of protocol violations at runtime, and a novel abstraction, access permissions, that combines typestate and object aliasing information.

...read moreread less

Abstract: Objects often define usage protocols that clients must follow inorder for these objects to work properly. Aliasing makes itnotoriously difficult to check whether clients and implementations are compliant with such protocols. Accordingly, existing approaches either operate globally or severely restrict aliasing. We have developed a sound modular protocol checking approach, based on typestates, that allows a great deal of flexibility in aliasing while guaranteeing the absence of protocol violations at runtime. The main technical contribution is a novel abstraction, access permissions, that combines typestate and object aliasing information. In our methodology, developers express their protocol design intent through annotations based on access permissions. Our checking approach then tracks permissions through method implementations. For each object reference the checker keeps track of the degree of possible aliasing and is appropriately conservativein reasoning about that reference. This helps developers account for object manipulations that may occur through aliases. The checking approach handles inheritance in a novel way, giving subclasses more flexibility in method overriding. Case studies on Java iterators and streams provide evidence that access permissions can model realistic protocols, and protocol checking based on access permissions can be used to reason precisely about the protocols that arise in practice.

...read moreread less

200 citations

Proceedings Article•DOI•

Scalable omniscient debugging

[...]

Guillaume Pothier¹, Éric Tanter¹, José M. Piquer¹•Institutions (1)

University of Chile¹

21 Oct 2007

TL;DR: This paper presents TOD, a portable Trace-Oriented Debugger for Java, which combines an efficient instrumentation for event generation, a specialized distributed database for scalable storage and efficient querying, support for partial traces in order to reduce the trace volume to relevant events, and innovative interface components for interactive trace navigation and analysis in the development environment.

...read moreread less

Abstract: Omniscient debuggers make it possible to navigate backwards in time within a program execution trace, drastically improving the task of debugging complex applications. Still, they are mostly ignored in practice due to the challenges raised by the potentially huge size of the execution traces. This paper shows that omniscient debugging can be realistically realized through the use of different techniques addressing efficiency, scalability and usability. We present TOD, a portable Trace-Oriented Debugger for Java, which combines an efficient instrumentation for event generation, a specialized distributed database for scalable storage and efficient querying, support for partial traces in order to reduce the trace volume to relevant events, and innovative interface components for interactive trace navigation and analysis in the development environment. Provided a reasonable infrastructure, the performance of TOD allows a responsive debugging experience in the face of large programs.

...read moreread less

156 citations

Proceedings Article•DOI•

The causes of bloat, the limits of health

[...]

Nick Mitchell¹, Gary Sevitsky¹•Institutions (1)

IBM¹

21 Oct 2007

TL;DR: This work introduces health signatures to enable them to form value judgments about whether a design or implementation choice is good or bad, and shows how being independent of any application eases comparison across disparate implementations.

...read moreread less

Abstract: Applications often have large runtime memory requirements. In some cases, large memory footprint helps accomplish an important functional, performance, or engineering requirement. A large cache,for example, may ameliorate a pernicious performance problem. In general, however, finding a good balance between memory consumption and other requirements is quite challenging. To do so, the development team must distinguish effective from excessive use of memory. We introduce health signatures to enable these distinctions. Using data from dozens of applications and benchmarks, we show that they provide concise and application-neutral summaries of footprint. We show how to use them to form value judgments about whether a design or implementation choice is good or bad. We show how being independent ofany application eases comparison across disparate implementations. We demonstrate the asymptotic nature of memory health: certain designsare limited in the health they can achieve, no matter how much the data size scales up. Finally, we show how to use health signatures to automatically generate formulas that predict this asymptotic behavior, and show how they enable powerful limit studies on memory health.

...read moreread less

128 citations

Proceedings Article•DOI•

Probabilistic calling context

[...]

Michael D. Bond¹, Kathryn S. McKinley¹•Institutions (1)

University of Texas at Austin¹

21 Oct 2007

TL;DR: Probabilistic calling context is introduced, a new online approach that continuously maintains a probabilistically unique value representing the current calling context and is efficient and accurate enough to use in deployed software for residual testing, bug detection, and intrusion detection.

...read moreread less

Abstract: Calling context enhances program understanding and dynamic analyses by providing a rich representation of program location. Compared to imperative programs, object-oriented programs use more interprocedural and less intraprocedural control flow, increasing the importance of context sensitivity for analysis. However, prior online methods for computing calling context, such as stack-walking or maintaining the current location in a calling context tree, are expensive in time and space. This paper introduces a new online approach called probabilistic calling context (PCC) that continuously maintains a probabilistically unique value representing the current calling context. For millions of unique contexts, a 32-bit PCC value has few conflicts. Computing the PCC value adds 3% average overhead to a Java virtual machine. PCC is well-suited to clients that detect new or anomalous behavior since PCC values from training and production runs can be compared easily to detect new context-sensitive behavior; clients that query the PCC value at every system call, Java utility call, and Java API call add 0-9% overhead on average. PCC adds space overhead proportional to the distinct contexts stored by the client (one word per context). Our results indicate PCC is efficient and accurate enough to use in deployed software for residual testing, bug detection, and intrusion detection.

...read moreread less

121 citations

Proceedings Article•DOI•

Making trace monitors feasible

[...]

Pavel Avgustinov¹, Julian Tibble¹, Oege de Moor¹•Institutions (1)

University of Oxford¹

21 Oct 2007

TL;DR: Two optimisations for generating feasible trace monitors from declarative specifications of the relevant event pattern are identified: the first is an important improvement over an earlier proposal in [2] to avoid space leaks, and the second is a form of indexing for partial matches.

...read moreread less

Abstract: A trace monitor observes an execution trace at runtime; when it recognises a specified sequence of events, the monitor runs extra code. In the aspect-oriented programming community, the idea originatedas a generalisation of the advice-trigger mechanism: instead of matchingon single events (joinpoints), one matches on a sequence of events. The runtime verification community has been investigating similar mechanisms for a number of years, specifying the event patterns in terms of temporal logic, and applying the monitors to hardware and software. In recent years trace monitors have been adapted for use with mainstream object-oriented languages. In this setting, a crucial feature is to allow the programmer to quantify over groups of related objects when expressing the sequence of events to match. While many language proposals exist for allowing such features, until now no implementation had scalable performance: execution on all but very simple examples was infeasible. This paper rectifies that situation, by identifying two optimisations for generating feasible trace monitors from declarative specifications of the relevant event pattern. We restrict ourselves to optimisations that do not have a significant impact on compile-time: they only analyse the event pattern, and not the monitored code itself. The first optimisation is an important improvement over an earlier proposal in [2] to avoid space leaks. The second optimisation is a form of indexing for partial matches. Such indexing needs to be very carefully designed to avoid introducing new space leaks, and the resulting data structure is highly non-trivial.

...read moreread less

113 citations

Proceedings Article•DOI•

Streamflex: high-throughput stream programming in java

[...]

Jesper Honig Spring¹, Jean Privat², Rachid Guerraoui¹, Jan Vitek²•Institutions (2)

École Polytechnique Fédérale de Lausanne¹, Purdue University²

21 Oct 2007

TL;DR: StreamFlex is an extension to Java which marries streams with objects and thus enables to combine, in the same Java virtual machine, stream processing code with traditional object-oriented components and the result is a rich and expressive language that can be implemented efficiently.

...read moreread less

Abstract: The stream programming paradigm aims to expose coarse-grained parallelism inapplications that must process continuous sequences of events. The appeal ofstream programming comes from its conceptual simplicity. A program is acollection of independent filters which communicate by the means ofuni-directional data channels. This model lends itself naturally toconcurrent and efficient implementations on modern multiprocessors. As theoutput behavior of filters is determined by the state of their inputchannels, stream programs have fewer opportunities for the errors (such asdata races and deadlocks) that plague shared memory concurrent programming. This paper introduces StreamFlex, an extension to Java which marries streams with objects and thus enables to combine, in the same Java virtual machine, stream processing code with traditional object-oriented components. StreamFlex targets high-throughput low-latency applications with stringent quality-of-service requirements. To achieve these goals, it must, at the same time, extend and restrict Java. To allow for program optimization and provide latency guarantees, the StreamFlex compiler restricts Java by imposing a stricter typing discipline on filters. On the other hand, StreamFlex extends the Java virtual machine with real-time capabilities, transactional memory and type-safe region-based allocation. The result is a rich and expressive language that can be implemented efficiently.

...read moreread less

110 citations

Proceedings Article•DOI•

Ownership transfer in universe types

[...]

Peter Müller¹, Arsenii Rudich²•Institutions (2)

Microsoft¹, ETH Zurich²

21 Oct 2007

TL;DR: TT combines ownership type checking with a modular static analysis to control references to transferable objects and guarantees statically that a cluster of objects is externally-unique when it is transferred and, thus, that ownership transfer is type safe.

...read moreread less

Abstract: Ownership simplifies reasoning about object-oriented programs by controlling aliasing and modifications of objects. Several type systems have been proposed to express and check ownership statically. For ownership systems to be practical, they must allow objects to migrate from one owner to another. This ownership transfer is common and occurs, for instance, during the initialization of data structures and when data structures are merged. However, existing ownership type systems either do not support ownership transfer at all or they are too restrictive, give rather weak static guarantees, or require a high annotation overhead. In this paper, we present UTT, an extension of Universe Types that supports ownership transfer. UTT combines ownership type checking with a modular static analysis to control references to transferable objects. UTT is very flexible because it permits temporary aliases, even across certain method calls. Nevertheless, it guarantees statically that a cluster of objects is externally-unique when it is transferred and, thus, that ownership transfer is type safe. UTT provides the same encapsulation as Universe Types and requires only negligible annotation overhead.

...read moreread less

Proceedings Article•

Companion to the 22nd ACM SIGPLAN conference on Object-oriented programming systems and applications companion

[...]

Richard P. Gabriel¹, David F. Bacon¹, Cristina V. Lopes², Guy L. Steele•Institutions (2)

IBM¹, University of California, Irvine²

20 Oct 2007

TL;DR: The committee met for two days, May 3rd and 4th, and set a goal for the committee to decide all papers by consensus, and in the event that a vote was required to break a stalemate for the vote to only be among those committee members who had provided formal written reviews, all papers were decided by consensus and without recourse to voting.

...read moreread less

Abstract: In conjunction with the program committee, it is my pleasure to present to you the research papers for the 2007 Conference on Object-Oriented Programming Systems, Languages, and Applications.After more than twenty years, ooPSLA remains a dynamic force for change and advance in the state of the art, as evidenced by the diverse program of 33 papers.I hope you find it as interesting and enjoyable as I do. This year we accepted 33 out of 156 submissions, the highest number over the past ten years. This was a deliberate choice. In recent years there has been much debate in the field about whether our conference system (which is now widely accepted as a publication venue for tenure cases) has become overly selective to the point that authors tend to submit more conservative papers, as ACM president David Patterson cogently argued [1]. ooPSLA has been at the vanguard of addressing this problem with its Onward! And Essays programs, but I felt that we should apply some of the same ideas to the research program as well. To implement this I first removed any particular limit or target for the number of papers accepted. This had the beneficial effect of allowing each paper to be considered independently, and avoiding considerations of whether one paper's acceptance would jeopardize another paper's chances. Secondly, I charged the committee to be "acceptance-positive," to forgive small faults (but correct them), and most importantly for the detractors of papers to give extra weight to the arguments of the proponents. However, continuity was also important. We continued the use of Oscar Nierstrasz's "Identify the Champion" paradigm, which tries to promote the selection of papers that are strongly advocated as opposed to those with good average scores [2]. Finally, I set a goal for the committee to decide all papers by consensus, and in the event that a vote was required to break a stalemate for the vote to only be among those committee members who had provided formal written reviews. The committee met for two days, May 3rd and 4th , at IBM Research in Hawthorne, New York. In the event, all papers were decided by consensus and without recourse to voting. William Cook, the past chair, chaired the discussion of papers with which I had a conflict of interest. The modified system resulted in the acceptance of some papers that might otherwise have been rejected as too controversial or as more intriguing but less fully developed. The acceptance rate was 21% (up from 17%in 2006), so while we accommodated additional papers, ooPSLA remains a highly selective conference -- a healthy balance. Two other issues regarding our community's conference system have recently been the subject of debate (and experimentation): double-blind review and submission of papers by members of the program committee. Allowing submissions by the committee increases the pool of submitters and increases the quality of the committee since it does not force them to choose between serving the community and publishing their own work, an especially difficult choice for academics who must consider not only their careers but those of their students. However, there is also the danger that such papers might receive preferential consideration. I chose the middle ground of allowing submissions by the committee but subjecting them to a quantitatively higher standard than other papers, and obtaining five reviews (rather than three for other submissions). Of the seven submissions by committee members, two were accepted. Of the rejected committee submissions, two had rankings that would otherwise likely have led to acceptance, but were not accepted according to the more stringent requirements (at least one A and no C's or D's, or else at least three A's and no more than one D). Several SIGPLAN conferences have recently begun using double-blind review, a practice that is prevalent in some other subfields of computer science. The purpose of double-blind review is to increase fairness by eliminating bias (either conscious or unconscious) based on the identity of the authors. However, double-blind review can introduce other fairness issues:the required anonymization can make it more difficult to evaluate the work in the context of its infrastructure, and there is the potential for primary or secondary reviewers to be unknowingly assigned to review a paper with which they have a conflict of interest. I chose to use non-blind submission for three reasons: first of all for continuity, since I had made other changes to the policies and processes, secondly, because of the fairness trade-offs mentioned above, and thirdly, in consultation with the chairs of other primary SIGPLAN conferences, to provide a basis for direct comparison of the two processes within a single year. I welcome your feedback on these and other issues regarding the review process.

...read moreread less

Proceedings Article•DOI•

Multiple ownership

[...]

Nicholas Cameron¹, Sophia Drossopoulou¹, James Noble², Matthew J. Smith¹•Institutions (2)

Imperial College London¹, Victoria University of Wellington²

21 Oct 2007

TL;DR: This work gives a straightforward model for multiple ownership, focusing in particular on how multiple ownership can support a powerful effects system that determines when two computations interfere-in spite of the DAG structure.

...read moreread less

Abstract: Existing ownership type systems require objects to have precisely one primary owner, organizing the heap into an ownership tree. Unfortunately, a tree structure is too restrictive for many programs, and prevents many common design patterns where multiple objects interact. Multiple Ownership is an ownership type system where objects can have more than one owner, and the resulting ownership structure forms a DAG. We give a straightforward model for multiple ownership, focusing in particular on how multiple ownership can support a powerful effects system that determines when two computations interfere-in spite of the DAG structure. We present a core programming language MOJO, Multiple ownership for Java-like Objects, including a type and effects system, and soundness proof. In comparison to other systems, MOJO imposes absolutely no restrictions on pointers, modifications or programs' structure, but in spite of this, MOJO's effects can be used to reason about or describe programs' behaviour.

...read moreread less

Proceedings Article•DOI•

Establishing object invariants with delayed types

[...]

Manuel Fähndrich¹, Songtao Xia¹•Institutions (1)

Microsoft¹

21 Oct 2007

TL;DR: Mainstream object-oriented languages such as C# and Java provide an initialization model for objects that does not guarantee programmer controlled initialization of fields, and delayed types are introduced to express and formalize prevalent initialization patterns in object- oriented languages.

...read moreread less

Abstract: Mainstream object-oriented languages such as C# and Java provide an initialization model for objects that does not guarantee programmer controlled initialization of fields. Instead, all fields are initialized to default values (0 for scalars and null for non-scalars) on allocation. This is in stark contrast to functional languages, where all parts of an allocation are initialized to programmer-provided values. These choices have a direct impact on two main issues: 1) the prevalence of null in object oriented languages (and its general absence in functional languages), and 2) the ability to initialize circular data structures. This paper explores connections between these differing approaches and proposes a fresh look at initialization. Delayed types are introduced to express and formalize prevalent initialization patterns in object-oriented languages.

...read moreread less

Proceedings Article•DOI•

The java module system: core design and semantic definition

[...]

Rok Strniša¹, Peter Sewell¹, Matthew Parkinson¹•Institutions (1)

University of Cambridge¹

21 Oct 2007

TL;DR: This work design and formalize a core module system for Java, and defines the syntax, the type system, and the operational semantics of an LJAM language, defining these rigorously in the Isabelle/HOL automated proof assistant.

...read moreread less

Abstract: Java has no module system. Its packages only subdivide the class name space, allowing only a very limited form of component-level information hiding and reuse. Two Java Community Processes have started addressing this problem: one describes the runtime system and has reached an early draft stage, while the other considers the developer's view and only has a straw-man proposal. Both are natural language documents, which inevitably contain ambiguities. In this work we design and formalize a core module system for Java. Where the JCP documents are complete, we follow them closely; elsewhere we make reasonable choices. We define the syntax, the type system, and the operational semantics of an LJAM language, defining these rigorously in the Isabelle/HOL automated proof assistant. Using this formalization, we identify various issues with the module system. We highlight the underlying design decisions, and discuss several alternatives and their benefits. Our Isabelle/HOL definitions should provide a basis for further consideration of the design alternatives, for reference implementations, and for proofs of soundness.

...read moreread less

Proceedings Article•DOI•

Can programming be liberated from the two-level style: multi-level programming with deepjava

[...]

Thomas Kühne¹, Daniel Schreiber¹•Institutions (1)

Technische Universität Darmstadt¹

21 Oct 2007

TL;DR: This paper conservatively extend the object-oriented programming paradigm to feature an unbounded number of domain classification levels to avoid the introduction of accidental complexity into programs caused by accommodating multiple domain levels within only two programming levels.

...read moreread less

Abstract: Since the introduction of object-oriented programming few programming languages have attempted to provide programmers with more than objects and classes, i.e., more than two levels. Those that did, almost exclusively aimed at describing language properties-i.e., their metaclasses exert linguistic control on language concepts and mechanisms-often in order to make the language extensible. In terms of supporting logical domain classification levels, however, they are still limited to two levels. In this paper we conservatively extend the object-oriented programming paradigm to feature an unbounded number of domain classification levels. We can therefore avoid the introduction of accidental complexity into programs caused by accommodating multiple domain levels within only two programming levels. We present a corresponding language design featuring ``deep instantiation'' and demonstrate its features with a running example. Finally, we outline the implementation of our compiler prototype and discuss the potentials of further developing our language design.

...read moreread less

Proceedings Article•DOI•

Using early phase termination to eliminate load imbalances at barrier synchronization points

[...]

Martin Rinard¹•Institutions (1)

Massachusetts Institute of Technology¹

21 Oct 2007

TL;DR: A general computational pattern that works well with early phase termination is identified and it is explained why computations that exhibit this pattern can tolerate the early termination of parallel tasks without producing unacceptable results.

...read moreread less

Abstract: We present a new technique, early phase termination, for eliminating idle processors in parallel computations that use barrier synchronization. This technique simply terminates each parallel phaseas soon as there are too few remaining tasks to keep all of the processors busy. Although this technique completely eliminates the idling that would other wise occur at barrier synchronization points, it may also change the computation and therefore the result that the computation produces. We address this issue by providing probabilistic distortion models that characterize how the use of early phase termination distorts the result that the computation produces. Our experimental results show that for our set of benchmark applications, 1) early phase termination can improve the performance of the parallel computation, 2) the distortion is small (or can be made to be small with the use of an appropriate compensation technique) and 3) the distortion models provide accurate and tight distortion bounds. These bounds can enable users to evaluate the effect of early phase termination and confidently accept results from parallel computations that use this technique if they find the distortion bounds to be acceptable. Finally, we identify a general computational pattern that works well with early phase termination and explain why computations that exhibit this pattern can tolerate the early termination of parallel tasks without producing unacceptable results.

...read moreread less

Proceedings Article•DOI•

Living it up with a live programming language

[...]

Sean McDirmid¹•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

21 Oct 2007

TL;DR: The design of a textual live language that is based on reactive data-flow values known as signals and dynamic inheritance is described, which supports live programming with responsive semantic feedback and is demonstrated with a working prototype.

...read moreread less

Abstract: A dynamic language promotes ease of use through flexible typing, a focus on high-level programming, and by streamlining the edit-compile-debug cycle. Live languages go beyond dynamic languages with more ease of use features. A live language supports live programming that provides programmers with responsive and continuous feedback about how their edits affect program execution. A live language is also based on high-level constructs such as declarative rules so that programmers can write less code. A live language could also provide programmers with responsive semantic feedback to enable time-saving services such as code completion. This paper describes the design of a textual live language that is based on reactive data-flow values known as signals and dynamic inheritance. Our language, SuperGlue, supports live programming with responsive semantic feedback, which we demonstrate with a working prototype.

...read moreread less

Proceedings Article•DOI•

Jeannie: granting java native interface developers their wishes

[...]

Martin Hirzel¹, Robert Grimm²•Institutions (2)

IBM¹, New York University²

21 Oct 2007

TL;DR: Jeannie is presented, a new language design for integrating Java with C that eliminates verbose boiler-plate code, enables static error detection across the language boundary, and simplifies dynamic resource management.

...read moreread less

Abstract: Higher-level languages interface with lower-level languages such as C to access platform functionality, reuse legacy libraries, or improve performance. This raises the issue of how to best integrate different languages while also reconciling productivity, safety, portability, and efficiency. This paper presents Jeannie, a new language design for integrating Java with C. In Jeannie, both Javaand C code are nested within each other in the same file and compile down to JNI, the Java platform's standard foreign function interface. By combining the two languages' syntax and semantics, Jeannie eliminates verbose boiler-plate code, enables static error detection across the language boundary, and simplifies dynamic resource management. We describe the Jeannie language and its compiler, while also highlighting lessons from composing two mature programming languages.

...read moreread less

Proceedings Article•DOI•

Tracking bad apples: reporting the origin of null and undefined value errors

[...]

Michael D. Bond¹, Nicholas Nethercote, Stephen W. Kent¹, Samuel Z. Guyer², Kathryn S. McKinley¹ - Show less +1 more•Institutions (2)

University of Texas at Austin¹, Tufts University²

21 Oct 2007

TL;DR: This paper presents efficient origin tracking of unusable values; it shows how to record where these values come into existence, correctly propagate them, and report them if they cause an error.

...read moreread less

Abstract: Programs sometimes crash due to unusable values, for example, when Java and C# programs dereference null pointers and when C and C++ programs use undefined values to affect program behavior. A stack trace produced on such a crash identifies the effect of the unusable value, not its cause, and is often not much help to the programmer. This paper presents efficient origin tracking of unusable values; it shows how to record where these values come into existence, correctly propagate them, and report them if they cause an error. The key idea is value piggybacking: when the original program stores an unusable value, value piggybacking instead stores origin information in the spare bits of the unusable value. Modest compiler support alters the program to propagate these modified values through operations such as assignments and comparisons. We evaluate two implementations: the first tracks null pointer origins in a JVM, and the second tracks undefined value origins in a memory-checking tool built with Valgrind. These implementations show that origin tracking via value piggybacking is fast and often useful, and in the Java case, has low enough overhead for use in a production environment.

...read moreread less

Proceedings Article•DOI•

Lost in translation: formalizing proposed extensions to c#

[...]

Gavin Bierman¹, Erik Meijer¹, Mads Torgersen¹•Institutions (1)

Microsoft¹

21 Oct 2007

TL;DR: This paper considers proposals for C# 3.0, the next version of the C# programming language, and gives both an informal introduction to the new language features, and a precise formal account by defining a translation from C#3.0 to C# 2.0.

...read moreread less

Abstract: Current real-world software applications typically involve heavy use of relational and XML data and their query languages. Unfortunately object-oriented languages and database query languages are based on different semantic foundations and optimization strategies. The resulting ''ROX (Relations, Objects, XML) impedance mismatc'' makes life very difficult for developers. Microsoft Corporation is developing extensions to the .NET framework to facilitate easier processing of non-object-oriented data models. Part of this project (known as "LINQ") includes various extensions to the .NET languages to leverage this support. In this paper we consider proposals for C# 3.0, the next version of the C# programming language. We give both an informal introduction to the new language features, and a precise formal account by defining a translation from C# 3.0 to C# 2.0. This translation also demonstrates how these language extensions do not require any changes to the underlying CLR.

...read moreread less

Proceedings Article•DOI•

Notation and representation in collaborative object-oriented design: an observational study

[...]

Uri Dekel¹, James D. Herbsleb¹•Institutions (1)

Carnegie Mellon University¹

21 Oct 2007

TL;DR: Observational studies of collaborative design exercises found that teams intentionally improviserepresentations and organize design information in responseto ad-hoc needs, which arise from the evolution of the design, and which are difficult to meet with fixed standard notations.

...read moreread less

Abstract: Software designers in the object-oriented paradigm can make use of modeling tools and standard notations such as UML. Nevertheless, casual observations from collocated design collaborations suggest that teams tend to use physical mediums to sketch a plethora of informal diagrams in varied representations that often diverge from UML. To better understand such collaborations and support them with tools, we need to understand the origins, roles, uses, and implications of these alternate representations. To this end we conducted observational studies of collaborative design exercises, in which we focused on representation use. Our primary finding is that teams intentionally improviserepresentations and organize design information in responseto ad-hoc needs, which arise from the evolution of the design, and which are difficult to meet with fixed standard notations. This behavior incurs orientation and grounding difficulties for which teams compensate by relying on memory, other communication mediums, and contextual cues. Without this additional information the artifacts are difficult to interpret and have limited documentation potential. Collaborative design tools and processes should therefore focus on preserving contextual information while permitting unconstrained mixing and improvising of notations.

...read moreread less

Proceedings Article•DOI•

Transactions with isolation and cooperation

[...]

Yannis Smaragdakis¹, Anthony Kay¹, Reimer Behrends¹, Michal Young¹•Institutions (1)

University of Oregon¹

21 Oct 2007

TL;DR: The TIC model protects against unanticipated interference by having the type system keep track of all operations that may (transitively) violate the atomicity of a transaction and require the programmer to establish consistency at appropriate points, resulting in a programming model that is both general and simple.

...read moreread less

Abstract: We present the TIC (Transactions with Isolation and Cooperation) model for concurrent programming. TIC adds to standard transactional memory the ability for a transaction to observe the effects of other threads at selected points. This allows transactions to cooperate, as well as to invoke nonrepeatable or irreversible operations, such as I/O. Cooperating transactions run the danger of exposing intermediate state and of having other threads change the transaction's state. The TIC model protects against unanticipated interference by having the type system keep track of all operations that may (transitively) violate the atomicity of a transaction and require the programmer to establish consistency at appropriate points. The result is a programming model that is both general and simple. We have used the TIC model to re-engineer existing lock-based applications including a substantial multi-threaded web mail server and a memory allocator with coarse-grained locking. Our experience confirms the features of the TIC model: It is convenient for the programmer, while maintaining the benefits of transactional memory.

...read moreread less

Proceedings Article•DOI•

Using FindBugs on production software

[...]

Nathaniel Ayewah¹, William Pugh¹, J. David Morgenthaler², John Penix², YuQian Zhou² - Show less +1 more•Institutions (2)

University of Maryland, College Park¹, Google²

20 Oct 2007

TL;DR: This poster will present the experiences using FindBugs in production software development environments, including both open source efforts and Google's internal code base, to summarize the defects found and describe the issue of real but trivial defects.

...read moreread less

Abstract: This poster will present our experiences using FindBugs in production software development environments, including both open source efforts and Google's internal code base. We summarize the defects found, describe the issue of real but trivial defects, and discuss the integration of FindBugs into Google's Mondrian code review system.

...read moreread less

Proceedings Article•DOI•

CodeGenie: a tool for test-driven source code search

[...]

Otávio Augusto Lazzarini Lemos¹, Sushil Bajracharya², Joel Ossher²•Institutions (2)

University of São Paulo¹, University of California, Irvine²

20 Oct 2007

TL;DR: This work presents CodeGenie, a tool that implements a test-driven approach to search and reuse of code available on largescale code repositories, which automatically searches for an existing implementation based on information available in the tests.

...read moreread less

Abstract: We present CodeGenie, a tool that implements a test-driven approach to search and reuse of code available on largescale code repositories. With CodeGenie, developers designtest cases for a desired feature first, similar to Test-driven Development (TDD). However, instead of implementing the feature from scratch, CodeGenie automatically searches foran existing implementation based on information available in the tests. To check the suitability of the candidate results in the local context, each result is automatically woven into the developer's project and tested using the original tests. The developer can then reuse the most suitable result. Later, reused code can also be unwoven from the project as wished. For the code searching and wrapping facilities, CodeGenie relies on Sourcerer, an Internet-scale source code infrastructure that we have developed.

...read moreread less

Proceedings Article•DOI•

The transactional memory / garbage collection analogy

[...]

Dan Grossman¹•Institutions (1)

University of Washington¹

21 Oct 2007

TL;DR: This essay presents remarkable similarities between transactional memory and garbage collection, and lets us better understand one technology by thinking about the corresponding issues for the other.

...read moreread less

Abstract: This essay presents remarkable similarities between transactional memory and garbage collection. The connections are fascinating in their own right, and they let us better understand one technology by thinking about the corresponding issues for the other.

...read moreread less

Proceedings Article•DOI•

Elephant 2000: a programming language based on speech acts

[...]

John J. McCarthy¹•Institutions (1)

Stanford University¹

20 Oct 2007

TL;DR: Elephant 2000 is a proposed programming language good for writing and verifying programs that interact with people or interact with programs belonging to other organizations and has both input-output specifications and accomplishment specifications concerning what the program accomplishes in the world.

...read moreread less

Abstract: Elephant 2000 is a proposed programming language good for writing and verifying programs that interact with people (e.g., transaction processing) or interact with programs belonging to other organizations (e.g., electronic data interchange). Communication inputs and outputs are in an I/O language whose sentences are meaningful speech acts identified in the language as questions, answers, offers, acceptances, declinations, requests, permissions, and promises. The correctness of programs is partly defined in terms of proper performance of the speech acts. Answers should be truthful and responsive, and promises should be kept. Sentences of logic expressing these forms of correctness can be generated automatically from the form of the program.Elephant source programs may not need data structures, because they can refer directly to the past. Thus a program can say that an airline passenger has a reservation if he has made one and hasn't cancelled it. Elephant programs themselves can be represented as sentences of logic. Their extensional properties follow from this representation without an intervening theory of programming or anything like Hoare axioms.Elephant programs that interact non-trivially with the outside world can have both input-output specifications, relating the programs inputs and outputs, and accomplishment specifications concerning what the program accomplishes in the world. These concepts are respectively generalizations of the philosophers' illocutionary and perlocutionary speech acts.Programs that engage in commercial transactions assume obligations on behalf of their owners in exchange for obligations assumed by other entities. It may be part of the specifications of an Elephant 2000 program that these obligations are exchanged as intended, and this too can be expressed by a logical sentence.Human speech acts involve intelligence. Elephant 2000 is on the borderline of AI, but the talk emphasizes the Elephant usages that do not require AI.

...read moreread less

Proceedings Article•DOI•

Type qualifier inference for java

[...]

David Greenfieldboyce¹, Jeffrey S. Foster¹•Institutions (1)

University of Maryland, College Park¹

21 Oct 2007

TL;DR: JQual is a tool that adds user-defined type qualifiers to Java, allowing programmers to quickly and easily incorporate lightweight, application-specific type checking into their programs, and type qualifier inference is provided.

...read moreread less

Abstract: Java's type system provides programmers with strong guarantees of type and memory safety, but there are many important properties not captured by standard Java types. We describe JQual, a tool that adds user-defined type qualifiers to Java, allowing programmers to quickly and easily incorporateextra lightweight, application-specific type checking into their programs. JQual provides type qualifier inference, so that programmers need only add a few key qualifier annotations to their program, and then JQual infers any remaining qualifiers and checks their consistency. We explore two applications of JQual. First, we introduce opaque and enumqualifiers to track C pointers and enumerations that flow through Java code via the JNI. In our benchmarks we found that these C values are treated correctly, but there are some places where a client could potentially violate safety. Second,we introduce a read only qualifier for annotating references that cannot be used to modify the objects they refer to. We found that JQual is able to automatically infer read only in many places on method signatures. These results suggest that type qualifiers and type qualifier inference are a useful addition to Java.

...read moreread less

Proceedings Article•DOI•

Ilea: inter-language analysis across java and c

[...]

Gang Tan¹, Greg Morrisett²•Institutions (2)

Boston College¹, Harvard University²

21 Oct 2007

TL;DR: ILEA (stands for Inter-LanguagE Analysis) is proposed, which is a framework that enables existing Java analyses to understand the behavior of C code and demonstrates the utility of the specifications generated, by modifying an existing non-null analysis to identify null-related bugs in Java applications that contain C libraries.

...read moreread less

Abstract: Java bug finders perform static analysis to find implementation mistakes that can lead to exploits and failures; Java compilers perform static analysis for optimization.allIf Java programs contain foreign function calls to C libraries, however, static analysis is forced to make either optimistic or pessimistic assumptions about the foreign function calls, since models of the C libraries are typically not available. We propose ILEA (stands for Inter-LanguagE Analysis), which is a framework that enables existing Java analyses to understand the behavior of C code. Our framework includes: (1) a novel specification language, which extends the Java Virtual Machine Language (JVML) with a few primitives that approximate the effects that the C code might have; (2) an automatic specification extractor, which builds models of the C code. Comparing to other possible specification languages, our language is expressive, yet facilitates construction of automatic specification extractors. Furthermore, because the specification language is based on the JVML, existing Java analyses can be easily migrated to utilize specifications in the language. We also demonstrate the utility of the specifications generated, by modifying an existing non-null analysis to identify null-related bugs in Java applications that contain C libraries. Our preliminary experiments identified dozens of null-related bugs.

...read moreread less

Proceedings Article•DOI•

BigLever software gears and the 3-tiered SPL methodology

[...]

Charles W. Krueger

20 Oct 2007

TL;DR: G Gears is designed to support and enable all three tiers in the new generation 3-Tiered Software Product Line Methodology, across the full SPL engineering lifecycle.

...read moreread less

Abstract: BigLever Software Gears is a software product line development tool that allows you to engineer your product line portfolio as though it is a single system. Gears is designed to support and enable all three tiers in the new generation 3-Tiered Software Product Line (SPL) Methodology, across the full SPL engineering lifecycle. Gears and the 3-Tiered SPL Methodology have played an instrumental role in some of the industry's most notable real-world success stories including Salion, 2004 Software Product line Hall of Fame Inductee, and Engenio/LSI Logic, 2006 Software Product Line Hall of Fame inductee.

...read moreread less