
Showing papers presented at "Conference on Object-Oriented Programming Systems, Languages, and Applications in 2005"


Proceedings ArticleDOI
12 Oct 2005
TL;DR: A modern object-oriented programming language, X10, is designed for high performance, high productivity programming of NUCC systems; an overview of the X10 programming model and language, experience with the reference implementation, and results from some initial productivity comparisons between the X10 and Java™ languages are presented.
Abstract: It is now well established that the device scaling predicted by Moore's Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the rate that had been sustained during the last two decades. As a result, future systems are rapidly moving from uniprocessor to multiprocessor configurations, so as to use parallelism instead of frequency scaling as the foundation for increased compute capacity. The dominant emerging multiprocessor structure for the future is a Non-Uniform Cluster Computing (NUCC) system with nodes that are built out of multi-core SMP chips with non-uniform memory hierarchies, and interconnected in horizontally scalable cluster configurations such as blade servers. Unlike previous generations of hardware evolution, this shift will have a major impact on existing software. Current OO language facilities for concurrent and distributed programming are inadequate for addressing the needs of NUCC systems because they do not support the notions of non-uniform data access within a node, or of tight coupling of distributed nodes. We have designed a modern object-oriented programming language, X10, for high performance, high productivity programming of NUCC systems. A member of the partitioned global address space family of languages, X10 highlights the explicit reification of locality in the form of places; lightweight activities embodied in async, future, foreach, and ateach constructs; a construct for termination detection (finish); the use of lock-free synchronization (atomic blocks); and the manipulation of cluster-wide global data structures. We present an overview of the X10 programming model and language, experience with our reference implementation, and results from some initial productivity comparisons between the X10 and Java™ languages.

1,469 citations
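As a rough illustration of the async/finish idiom the abstract describes, here is a plain-Java analogue (not X10 itself) built on java.util.concurrent; places and atomic blocks have no direct counterpart here, and the class and variable names are invented for the sketch.

```java
import java.util.concurrent.*;

// Approximates X10's 'finish { async S1; async S2; }': the latch plays the
// role of finish, waiting until both spawned activities terminate.
public class FinishAsyncDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch finish = new CountDownLatch(2);
        int[] partial = new int[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            pool.submit(() -> {            // ~ async: a lightweight activity
                partial[id] = id + 1;
                finish.countDown();
            });
        }
        finish.await();                    // ~ finish: termination detection
        System.out.println(partial[0] + partial[1]);   // prints 3
        pool.shutdown();
    }
}
```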


Proceedings ArticleDOI
12 Oct 2005
TL;DR: This paper presents a language called PQL (Program Query Language) that allows programmers to express questions about sequences of events on related objects easily in an application-specific context, and develops both static and dynamic techniques to find solutions to PQL queries.
Abstract: A number of effective error detection tools have been built in recent years to check if a program conforms to certain design rules. An important class of design rules deals with sequences of events associated with a set of related objects. This paper presents a language called PQL (Program Query Language) that allows programmers to express such questions easily in an application-specific context. A query looks like a code excerpt corresponding to the shortest amount of code that would violate a design rule. Details of the target application's precise implementation are abstracted away. The programmer may also specify actions to perform when a match is found, such as recording relevant information or even correcting an erroneous execution on the fly. We have developed both static and dynamic techniques to find solutions to PQL queries. Our static analyzer finds all potential matches conservatively using a context-sensitive, flow-insensitive, inclusion-based pointer alias analysis. Static results are also useful in reducing the number of instrumentation points for dynamic analysis. Our dynamic analyzer instruments the source program to catch all violations precisely as the program runs and to optionally perform user-specified actions. We have implemented the techniques described in this paper and found 206 errors in 6 large real-world open-source Java applications containing a total of nearly 60,000 classes. These errors are important security flaws, resource leaks, and violations of consistency invariants. The combination of static and dynamic analysis proves effective at addressing a wide range of debugging and program comprehension queries. We have found that dynamic analysis is especially suitable for preventing errors such as security vulnerabilities at runtime.

546 citations
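To make the kind of rule concrete (PQL's own query syntax is not reproduced here), the following invented Java excerpt exhibits an event sequence on related objects that such a query would match: a read issued on a reader after it has been closed.

```java
import java.io.*;

public class UseAfterClose {
    static String readTwoLines(File f) throws IOException {
        BufferedReader r = new BufferedReader(new FileReader(f));
        String first = r.readLine();
        r.close();
        return first + r.readLine();   // violation: readLine() after close()
    }
}
```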


Proceedings ArticleDOI
12 Oct 2005
TL;DR: A new history-based language feature called tracematches is presented that enables the programmer to trigger the execution of extra code by specifying a regular pattern of events in a computation trace, exploiting the introduction of free variables in the matching patterns.
Abstract: An aspect observes the execution of a base program; when certain actions occur, the aspect runs some extra code of its own. In the AspectJ language, the observations that an aspect can make are confined to the current action: it is not possible to directly observe the history of a computation. Recently, there have been several interesting proposals for new history-based language features, most notably by Douence et al. and by Walker and Viggers. In this paper, we present a new history-based language feature called tracematches that enables the programmer to trigger the execution of extra code by specifying a regular pattern of events in a computation trace. We have fully designed and implemented tracematches as a seamless extension of AspectJ. A key innovation in our tracematch approach is the introduction of free variables in the matching patterns. This enhancement enables a whole new class of applications in which events can be matched not only by the event kind, but also by the values associated with the free variables. We provide several examples of applications enabled by this feature. After introducing and motivating the idea of tracematches via examples, we present a detailed semantics of our language design, and we derive an implementation from that semantics. The implementation has been realised as an extension of the abc compiler for AspectJ.

456 citations
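As a sketch of what matching with free variables amounts to (hand-written here, not the compiler's actual output), the monitor below tracks per-binding state for the classic "collection updated while an iterator is live" pattern; all names are invented.

```java
import java.util.*;

class IterationMonitor {
    // State per free-variable binding: which iterators each collection created,
    // and whether a given iterator has been invalidated by an update event.
    private final Map<Collection<?>, List<Iterator<?>>> iters = new IdentityHashMap<>();
    private final Set<Iterator<?>> stale =
        Collections.newSetFromMap(new IdentityHashMap<>());

    void onCreate(Collection<?> c, Iterator<?> it) {   // c.iterator() observed
        iters.computeIfAbsent(c, k -> new ArrayList<>()).add(it);
    }
    void onUpdate(Collection<?> c) {                   // c.add/c.remove observed
        stale.addAll(iters.getOrDefault(c, List.of()));
    }
    void onNext(Iterator<?> it) {                      // it.next() observed
        if (stale.contains(it))                        // pattern matched: run extra code
            throw new ConcurrentModificationException();
    }
}
```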


Proceedings ArticleDOI
12 Oct 2005
TL;DR: An approach to managing the architecture of large software systems is presented: dependencies are extracted from the code by a conventional static analysis and shown in a tabular form known as the 'Dependency Structure Matrix' (DSM).
Abstract: An approach to managing the architecture of large software systems is presented. Dependencies are extracted from the code by a conventional static analysis, and shown in a tabular form known as the 'Dependency Structure Matrix' (DSM). A variety of algorithms are available to help organize the matrix in a form that reflects the architecture and highlights patterns and problematic dependencies. A hierarchical structure obtained in part by such algorithms, and in part by input from the user, then becomes the basis for 'design rules' that capture the architect's intent about which dependencies are acceptable. The design rules are applied repeatedly as the system evolves, to identify violations, and keep the code and its architecture in conformance with one another. The analysis has been implemented in a tool called LDM which has been applied in several commercial projects; in this paper, a case study application to Haystack, an information retrieval system, is described.

433 citations


Proceedings ArticleDOI
12 Oct 2005
TL;DR: Three programming language abstractions are identified for the construction of reusable components: abstract type members, explicit selftypes, and modular mixin composition, which enable an arbitrary assembly of static program parts with hard references between them to be transformed into a system of reusable components.
Abstract: We identify three programming language abstractions for the construction of reusable components: abstract type members, explicit selftypes, and modular mixin composition. Together, these abstractions enable us to transform an arbitrary assembly of static program parts with hard references between them into a system of reusable components. The transformation maintains the structure of the original system. We demonstrate this approach in two case studies, a subject/observer framework and a compiler front-end.

268 citations
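The paper's abstractions are native to Scala; as a hedged Java approximation only, recursive type parameters can stand in for abstract type members in a reusable subject/observer component (all names invented):

```java
import java.util.ArrayList;
import java.util.List;

// S and O play the role the paper assigns to abstract type members:
// subclasses "instantiate" them with a concrete subject/observer family.
abstract class Subject<S extends Subject<S, O>, O extends Observer<S, O>> {
    private final List<O> observers = new ArrayList<>();
    public void subscribe(O obs) { observers.add(obs); }
    @SuppressWarnings("unchecked")
    protected void publish() {
        for (O obs : observers) obs.update((S) this);
    }
}

interface Observer<S extends Subject<S, O>, O extends Observer<S, O>> {
    void update(S subject);
}
```

A concrete family then fixes both parameters at once, e.g. class Sensor extends Subject<Sensor, SensorDisplay>; the paper's point is that abstract type members and selftypes express this more directly than the generic encoding.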


Proceedings ArticleDOI
12 Oct 2005
TL;DR: A regularization and refinement approach achieves nearly the precision of field-sensitive Andersen's analysis in time budgets as small as 2ms per query, and can yield speedups of up to 16x over computing an exhaustive Andersen's analysis for some clients, with little to no precision loss.
Abstract: We present a points-to analysis technique suitable for environments with small time and memory budgets, such as just-in-time (JIT) compilers and interactive development environments (IDEs). Our technique is demand-driven, performing only the work necessary to answer each query (a request for a variable's points-to information) issued by a client. In cases where even the demand-driven approach exceeds the time budget for a query, we employ early termination, i.e., stopping the analysis prematurely and returning an over-approximated result to the client. Our technique improves on previous demand-driven points-to analysis algorithms [17, 33] by achieving much higher precision under small time budgets and early termination. We formulate Andersen's analysis [5] for Java as a CFL-reachability problem [33]. This formulation shows that Andersen's analysis for Java is a balanced-parentheses problem, an insight that enables our new techniques. We exploit the balanced parentheses structure to approximate Andersen's analysis by regularizing the CFL-reachability problem, yielding an asymptotically cheaper algorithm. We also show how to regain most of the precision lost in the regular approximation as needed through refinement. Our evaluation shows that our regularization and refinement approach achieves nearly the precision of field-sensitive Andersen's analysis in time budgets as small as 2ms per query. Our technique can yield speedups of up to 16x over computing an exhaustive Andersen's analysis for some clients, with little to no precision loss.

203 citations


Proceedings ArticleDOI
12 Oct 2005
TL;DR: The definition and implementation of safe futures for Java are explored and it is indicated that for programs with modest mutation rates on shared data, applications can use futures to profitably exploit parallelism, without sacrificing safety.
Abstract: A future is a simple and elegant abstraction that allows concurrency to be expressed often through a relatively small rewrite of a sequential program. In the absence of side-effects, futures serve as benign annotations that mark potentially concurrent regions of code. Unfortunately, when computation relies heavily on mutation, as is the case in Java, their meaning is less clear, and much of their intended simplicity is lost. This paper explores the definition and implementation of safe futures for Java. One can think of safe futures as truly transparent annotations on method calls, which designate opportunities for concurrency. Serial programs can be made concurrent simply by replacing standard method calls with future invocations. Most significantly, even though some parts of the program are executed concurrently and may indeed operate on shared data, the semblance of serial execution is nonetheless preserved. Thus, program reasoning is simplified since data dependencies present in a sequential program are not violated in a version augmented with safe futures. Besides presenting a programming model and API for safe futures, we formalize the safety conditions that must be satisfied to ensure equivalence between a sequential Java program and its future-annotated counterpart. A detailed implementation study is also provided. Our implementation exploits techniques such as object versioning and task revocation to guarantee necessary safety conditions. We also present an extensive experimental evaluation of our implementation to quantify overheads and limitations. Our experiments indicate that for programs with modest mutation rates on shared data, applications can use futures to profitably exploit parallelism, without sacrificing safety.

195 citations
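For contrast, here is the ordinary java.util.concurrent rewrite that the paper's safe futures are meant to make transparent; unlike safe futures, this version is only equivalent to the sequential code when compute() and the overlapped work do not interfere through shared data (names invented):

```java
import java.util.concurrent.*;

public class FutureRewrite {
    static int compute(int n) { return n * n; }

    public static void main(String[] args) throws Exception {
        // Sequential original:  int v = compute(7); int other = 10; print(v + other);
        ExecutorService ex = Executors.newSingleThreadExecutor();
        Future<Integer> v = ex.submit(() -> compute(7)); // future invocation
        int other = 10;                                  // runs concurrently
        System.out.println(v.get() + other);             // join at first use: 59
        ex.shutdown();
    }
}
```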


Proceedings ArticleDOI
12 Oct 2005
TL;DR: A catalog of 27 micro-patterns defined on Java classes and interfaces is presented that captures a wide spectrum of common programming practices, including a particular (and intentionally restricted) use of inheritance, immutability, data management and wrapping, restricted creation, and emulation of procedural, modular, and even functional programming paradigms with object-oriented constructs.
Abstract: Micro patterns are similar to design patterns, except that micro patterns stand at a lower level of abstraction, closer to the implementation. Micro patterns are also unique in that they are mechanically recognizable, since each such pattern can be expressed as a formal condition on the structure of a class. This paper presents a catalog of 27 micro-patterns defined on Java classes and interfaces. The catalog captures a wide spectrum of common programming practices, including a particular (and intentionally restricted) use of inheritance, immutability, data management and wrapping, restricted creation, and emulation of procedural, modular, and even functional programming paradigms with object-oriented constructs. Together, the patterns present a set of prototypes after which a large portion of all Java classes and interfaces are modeled. We provide empirical indication that this portion is as high as 75%. A statistical analysis of occurrences of micro patterns in a large software corpus, spanning some 70,000 Java classes drawn from a rich set of application domains, shows, with a high confidence level, that the use of these patterns is not random. These results indicate conscious and discernible design decisions, which are sustained through the software's evolution. With a high confidence level, we can also show that the use of these patterns is tied to the specification, or the purpose, that the software realizes. The traceability, abundance, and statistical significance of micro pattern occurrences raise the hope of using the classification of software into these patterns for a better-founded appreciation of design and code quality.

194 citations
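Two invented Java shapes of the mechanically recognizable kind the catalog formalizes (the catalog's own pattern names and formal conditions are in the paper): a class holding only static final constants, and an empty marker interface.

```java
// All members are static final constants and the constructor is hidden,
// so the class carries no instance state at all.
final class Colors {
    private Colors() {}
    static final int RED   = 0xFF0000;
    static final int GREEN = 0x00FF00;
    static final int BLUE  = 0x0000FF;
}

// An empty interface: its only role is to designate implementing classes,
// the way java.io.Serializable does.
interface Auditable {}
```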


Proceedings ArticleDOI
12 Oct 2005
TL;DR: This work proposes Program Trace Query Language (PTQL), a language based on relational queries over program traces, in which programmers can write expressive, declarative queries about program behavior, and describes the compiler, Partiqle, which instruments the program to execute the query on-line.
Abstract: Instrumenting programs with code to monitor runtime behavior is a common technique for profiling and debugging. In practice, instrumentation is either inserted manually by programmers, or automatically by specialized tools that monitor particular properties. We propose Program Trace Query Language (PTQL), a language based on relational queries over program traces, in which programmers can write expressive, declarative queries about program behavior. We also describe our compiler, Partiqle. Given a PTQL query and a Java program, Partiqle instruments the program to execute the query on-line. We apply several PTQL queries to a set of benchmark programs, including the Apache Tomcat Web server. Our queries reveal significant performance bugs in the jack SpecJVM98 benchmark, in Tomcat, and in the IBM Java class library, as well as some correct though uncomfortably subtle code in the Xerces XML parser. We present performance measurements demonstrating that our prototype system has usable performance.

165 citations
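Partiqle's actual output is not shown in the abstract; as an invented sketch, here is what compiling a relational trace query like "resources opened but never closed" down to online instrumentation boils down to: record events keyed by object identity, and evaluate the residual query at exit.

```java
import java.util.*;

class TraceQueryRuntime {
    private static final Set<Object> open =
        Collections.newSetFromMap(new IdentityHashMap<>());

    // Instrumentation hooks inserted at open()/close() call sites.
    static void onOpen(Object resource)  { open.add(resource); }
    static void onClose(Object resource) { open.remove(resource); }

    static {
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            open.forEach(r -> System.err.println("never closed: " + r))));
    }
}
```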


Proceedings ArticleDOI
12 Oct 2005
TL;DR: This work presents an approach in which mappings between legacy classes and their replacements are specified by the programmer, and an analysis based on type constraints determines where declarations and allocation sites can be updated.
Abstract: As object-oriented class libraries evolve, classes are occasionally deprecated in favor of others with roughly the same functionality. In Java's standard libraries, for example, class Hashtable has been superseded by HashMap, and Iterator is now preferred over Enumeration. Migrating client applications to use the new idioms is often desirable, but making the required changes to declarations and allocation sites can be quite labor-intensive. Moreover, migration becomes complicated---and sometimes impossible---if an application interacts with external components, if a legacy class is not completely equivalent to its replacement, or if multiple interdependent classes must be migrated simultaneously. We present an approach in which mappings between legacy classes and their replacements are specified by the programmer. Then, an analysis based on type constraints determines where declarations and allocation sites can be updated. The method was implemented in Eclipse, and evaluated on a number of Java applications. On average, our tool could migrate more than 90% of the references to legacy classes.

162 citations
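A hand-edited before/after of the kind of migration the tool performs on declarations and allocation sites (the tool's own edits are computed from programmer-specified mappings):

```java
import java.util.*;

class Before {
    Hashtable<String, Integer> table = new Hashtable<>();
    void dump() {
        Enumeration<String> keys = table.keys();
        while (keys.hasMoreElements()) System.out.println(keys.nextElement());
    }
}

class After {
    Map<String, Integer> table = new HashMap<>();   // declaration + allocation updated
    void dump() {
        Iterator<String> keys = table.keySet().iterator();
        while (keys.hasNext()) System.out.println(keys.next());
    }
}
```

Note that the classes are not completely equivalent (Hashtable is synchronized and rejects nulls), which is exactly why the paper's type-constraint analysis decides where such updates are safe.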


Proceedings ArticleDOI
16 Oct 2005
TL;DR: This paper discusses the rationale behind the decision for SOA, process choreography, and Web services, gives an overview of the BPEL-centric process choreography architecture, and features lessons learned and best practices identified during design, implementation, and rollout of the solution.
Abstract: Effective and affordable business-to-business process integration is a key success factor in the telecommunications industry. A large telecommunication wholesaler, supplying its services to more than 150 different service retailers, enhanced the process integration capabilities of its core order management system through widespread use of SOA, business process choreography and Web services concepts. This core order management system processes 120 different complex order types. On this project, challenging requirements such as complexity of business process models and multi-channel accessibility turned out to be true proof points for the applied SOA concepts, tools, and runtime environments. To implement an automated and secured business-to-business Web services channel and to introduce a process choreography layer into a large existing application were two of the key requirements that had to be addressed. The solution complies with the Web Services Interoperability Basic Profile 1.0 and makes use of executable business process models defined in the Business Process Execution Language (BPEL). This paper discusses the rationale behind the decision for SOA, process choreography, and Web services, and gives an overview of the BPEL-centric process choreography architecture. Furthermore, it features lessons learned and best practices identified during design, implementation, and rollout of the solution.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: This paper demonstrates how classboxes can be implemented in statically-typed languages like Java and shows how Classbox/J, a prototype implementation of classboxes for Java, is used to provide a cleaner implementation of Swing using local refinement rather than subclassing.
Abstract: Unanticipated changes to complex software systems can introduce anomalies such as duplicated code, suboptimal inheritance relationships and a proliferation of run-time downcasts. Refactoring to eliminate these anomalies may not be an option, at least in certain stages of software evolution. Classboxes are modules that restrict the visibility of changes to selected clients only, thereby offering more freedom in the way unanticipated changes may be implemented, and thus reducing the need for convoluted design anomalies. In this paper we demonstrate how classboxes can be implemented in statically-typed languages like Java. We also present an extended case study of Swing, a Java GUI package built on top of AWT, and we document the ensuing anomalies that Swing introduces. We show how Classbox/J, a prototype implementation of classboxes for Java, is used to provide a cleaner implementation of Swing using local refinement rather than subclassing.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: Improvements that are new in this paper include distinguishing the notions of assignability and mutability; integration with Java 5's generic types and with multi-dimensional arrays; a mutability polymorphism approach to avoiding code duplication; type-safe support for reflection and serialization; and formal type rules and type soundness proof for a core calculus.
Abstract: This paper describes a type system that is capable of expressing and enforcing immutability constraints. The specific constraint expressed is that the abstract state of the object to which an immutable reference refers cannot be modified using that reference. The abstract state is (part of) the transitively reachable state: that is, the state of the object and all state reachable from it by following references. The type system permits explicitly excluding fields from the abstract state of an object. For a statically type-safe language, the type system guarantees reference immutability. If the language is extended with immutability downcasts, then run-time checks enforce the reference immutability constraints. This research builds upon previous research in language support for reference immutability. Improvements that are new in this paper include distinguishing the notions of assignability and mutability; integration with Java 5's generic types and with multi-dimensional arrays; a mutability polymorphism approach to avoiding code duplication; type-safe support for reflection and serialization; and formal type rules and type soundness proof for a core calculus. Furthermore, it retains the valuable features of the previous dialect, including usability by humans (as evidenced by experience with 160,000 lines of Javari code) and interoperability with Java and existing JVMs.
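Plain Java can only approximate reference immutability at run time; this invented sketch shows the gap a Javari-style readonly reference closes statically:

```java
import java.util.*;

public class ReadonlyGap {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("a", "b"));
        List<String> view = Collections.unmodifiableList(names);
        // view.add("c");          // fails only at RUN time in Java; a Javari-style
        //                         // readonly reference would reject it at compile time
        names.add("c");            // the mutable reference can still mutate the state
        System.out.println(view);  // [a, b, c] -- the view sees the change
    }
}
```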

Proceedings ArticleDOI
12 Oct 2005
TL;DR: The results quantify the time-space tradeoff of garbage collection: with five times as much memory, an Appel-style generational collector with a non-copying mature space matches the performance of reachability-based explicit memory management.
Abstract: Garbage collection yields numerous software engineering benefits, but its quantitative impact on performance remains elusive. One can compare the cost of conservative garbage collection to explicit memory management in C/C++ programs by linking in an appropriate collector. This kind of direct comparison is not possible for languages designed for garbage collection (e.g., Java), because programs in these languages naturally do not contain calls to free. Thus, the actual gap between the time and space performance of explicit memory management and precise, copying garbage collection remains unknown. We introduce a novel experimental methodology that lets us quantify the performance of precise garbage collection versus explicit memory management. Our system allows us to treat unaltered Java programs as if they used explicit memory management by relying on oracles to insert calls to free. These oracles are generated from profile information gathered in earlier application runs. By executing inside an architecturally-detailed simulator, this "oracular" memory manager eliminates the effects of consulting an oracle while measuring the costs of calling malloc and free. We evaluate two different oracles: a liveness-based oracle that aggressively frees objects immediately after their last use, and a reachability-based oracle that conservatively frees objects just after they are last reachable. These oracles span the range of possible placement of explicit deallocation calls. We compare explicit memory management to both copying and non-copying garbage collectors across a range of benchmarks using the oracular memory manager, and present real (non-simulated) runs that lend further validity to our results. These results quantify the time-space tradeoff of garbage collection: with five times as much memory, an Appel-style generational collector with a non-copying mature space matches the performance of reachability-based explicit memory management. With only three times as much memory, the collector runs on average 17% slower than explicit memory management. However, with only twice as much memory, garbage collection degrades performance by nearly 70%. When physical memory is scarce, paging causes garbage collection to run an order of magnitude slower than explicit memory management.

Proceedings ArticleDOI
16 Oct 2005
TL;DR: A new field in distributed computing, called Ambient Intelligence, has emerged as a consequence of the increasing availability of wireless devices and the mobile networks they induce; developing software for such mobile networks is extremely hard in conventional programming languages because of new distribution issues related to volatile network connections, dynamic network topologies and partial failures.
Abstract: A new field in distributed computing, called Ambient Intelligence, has emerged as a consequence of the increasing availability of wireless devices and the mobile networks they induce. Developing software for such mobile networks is extremely hard in conventional programming languages because of new distribution issues related to volatile network connections, dynamic network topologies and partial failures.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: This paper revisits the abstractions comprising the Boost Graph Library in the context of distributed-memory parallelism, lifting away the implicit requirements of sequential execution and a single shared address space and develops general principles and patterns for using (and reusing) generic, object-oriented parallel software libraries.
Abstract: This paper describes the process used to extend the Boost Graph Library (BGL) for parallel operation with distributed memory. The BGL consists of a rich set of generic graph algorithms and supporting data structures, but it was not originally designed with parallelism in mind. In this paper, we revisit the abstractions comprising the BGL in the context of distributed-memory parallelism, lifting away the implicit requirements of sequential execution and a single shared address space. We illustrate our approach by describing the process as applied to one of the core algorithms in the BGL, breadth-first search. The result is a generic algorithm that is unchanged from the sequential algorithm, requiring only the introduction of external (distributed) data structures for parallel execution. More importantly, the generic implementation retains its interface and semantics, such that other distributed algorithms can be built upon it, just as algorithms are layered in the sequential case. By characterizing these extensions as well as the extension process, we develop general principles and patterns for using (and reusing) generic, object-oriented parallel software libraries. We demonstrate that the resulting algorithm implementations are both efficient and scalable with performance results for several algorithms.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: This paper examines a number of architectural patterns to discover those primitive abstractions that are common among the patterns, and demonstrates an initial set of primitives that participate in several well-known architectural patterns.
Abstract: Architectural patterns are a key point in architectural documentation. Regrettably, there is poor support for modeling architectural patterns, because the pattern elements are not directly matched by elements in modeling languages, and, at the same time, patterns support an inherent variability that is hard to model using a single modeling solution. This paper proposes tackling this problem by finding and representing architectural primitives, as the participants in the solutions that patterns convey. In particular, we examine a number of architectural patterns to discover those primitive abstractions that are common among the patterns, and at the same time demonstrate a degree of variability in each pattern. These abstractions belong in the components and connectors architectural view, though more abstractions can be found in other views. We have selected UML 2 as the language for representing these primitive abstractions as extensions of the standard UML elements. The added value of this approach is twofold: it proposes a generic and extensible approach for modeling architectural patterns by means of architectural primitives; it demonstrates an initial set of primitives that participate in several well-known architectural patterns.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: Subtext is a new medium in which the representation of a program is the same thing as its execution; it unifies traditionally distinct programming tools and concepts, and enables some novel ones.
Abstract: Representing programs as text strings makes programming harder than it has to be. The source text of a program is far removed from its behavior. Bridging this conceptual gulf is what makes programming so inhumanly difficult -- we are not compilers. Subtext is a new medium in which the representation of a program is the same thing as its execution. Like a spreadsheet, a program is visible and alive, constantly executing even as it is edited. Program edits are coherent semantic transformations. The essence of this new medium is copying. Programs are constructed by copying and executed by copy flow: the projection of changes through copies. The simple idea of copying develops into a rich theory of higher-order continual copying of trees. Notably absent are symbolic names, the workhorse of textual notation, replaced by immediately-bound explicit relationships. Subtext unifies traditionally distinct programming tools and concepts, and enables some novel ones. Ancestral structures are a new primitive data type that combines the features of lists and records, along with unproblematic multiple inheritance. Adaptive conditionals use first-class program edits to dynamically adapt behavior. A prototype implementation shows promise, but calls for much further research. Subtext suggests that we can make programming radically easier, if we are willing to be radical.

Proceedings ArticleDOI
16 Oct 2005
TL;DR: This article describes how a large multi-team software engineering organization (over 450 engineers) estimates project cost accurately and early in the software development lifecycle using Use Case Points, and the process of evaluating metrics to ensure the accuracy of the model.
Abstract: It is well documented that software product cost estimates are notoriously inaccurate across the software industry. Creating accurate cost estimates for software product development projects early in the product development lifecycle has always been a challenge for the industry. This article describes how a large multi-team software engineering organization (over 450 engineers) estimates project cost accurately and early in the software development lifecycle using Use Case Points, and the process of evaluating metrics to ensure the accuracy of the model. The engineering teams of Agilis Solutions, in partnership with FPT Software, provide our customers with accurate estimates for software product projects early in the product lifecycle. The bases for these estimates are initial definitions of Use Cases, given point factors and modified for technical and environmental factors according to the Use Case Point method defined within the Rational Unified Process. After applying the process across hundreds of sizable (60 man-months average) software projects, we have demonstrated metrics that prove an estimating accuracy of less than 9% deviation from actual to estimated cost on 95% of our projects. Our process and this success factor are documented over a period of five years, and across more than 200 projects.
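The abstract does not restate the formulas; for orientation, the standard Use Case Point computation from the method the article references (Karner's, as adopted in RUP) runs as follows, with purely illustrative numbers:

```latex
\begin{aligned}
UUCP &= UUCW + UAW = 100 + 12 = 112 \\
TCF  &= 0.6 + 0.01 \cdot T_{\text{factor}} = 0.6 + 0.01 \cdot 40 = 1.00 \\
ECF  &= 1.4 - 0.03 \cdot E_{\text{factor}} = 1.4 - 0.03 \cdot 15 = 0.95 \\
UCP  &= UUCP \cdot TCF \cdot ECF = 112 \cdot 1.00 \cdot 0.95 = 106.4
\end{aligned}
```

At a commonly used staffing rate of 20 person-hours per UCP, this example yields roughly 2,128 hours, i.e. about 13 person-months; the article's calibrated rates may differ.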

Proceedings ArticleDOI
16 Oct 2005
TL;DR: This short contribution motivates the idea of model-driven Software Product Line Engineering and briefly explains the concepts underlying feature-based model templates, which is a particular technique for modeling software product lines.
Abstract: Model-driven software product lines combine the abstraction capability of Model Driven Software Development (MDSD) and the variability management capability of Software Product Line Engineering (SPLE). This short contribution motivates the idea of model-driven software product lines and briefly explains the concepts underlying feature-based model templates, which is a particular technique for modeling software product lines.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: This paper describes and evaluates techniques for automating two significant activities of vertical profiling: trace alignment and correlation, and identifies highly-effective approaches for both activities.
Abstract: Last year at OOPSLA we presented a methodology, vertical profiling, for understanding the performance of object-oriented programs. The key insight behind this methodology is that modern programs run on top of many layers (virtual machine, middleware, etc.) and thus we need to collect and combine information from all layers in order to understand system performance. Although our methodology was able to explain previously unexplained performance phenomena, it was extremely labor intensive. In this paper we describe and evaluate techniques for automating two significant activities of vertical profiling: trace alignment and correlation. Trace alignment aligns traces obtained from separate runs so that one can reason across the traces. We are not aware of any prior approach that effectively and automatically aligns traces. Correlation sifts through hundreds of metrics to find ones that have a bearing on a performance anomaly of interest. In prior work we found that statistical correlation was only sometimes effective. We have identified highly effective approaches for both activities. For aligning traces we explore dynamic time warping, and for correlation we explore eight correlators based on statistical correlation, distance measures, and piecewise linear segmentation. Although we explore these activities in the context of vertical profiling, both activities are widely applicable in the performance analysis area.
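Since the paper leans on dynamic time warping for trace alignment, here is a minimal textbook DTW distance in Java (illustrative only, not the paper's implementation); the warp path used for the actual alignment is recovered by backtracking through d.

```java
import java.util.Arrays;

class Dtw {
    // Classic O(n*m) dynamic time warping distance between two metric traces.
    static double distance(double[] a, double[] b) {
        int n = a.length, m = b.length;
        double[][] d = new double[n + 1][m + 1];
        for (double[] row : d) Arrays.fill(row, Double.POSITIVE_INFINITY);
        d[0][0] = 0;
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= m; j++) {
                double cost = Math.abs(a[i - 1] - b[j - 1]);
                d[i][j] = cost + Math.min(d[i - 1][j - 1],
                                 Math.min(d[i - 1][j], d[i][j - 1]));
            }
        return d[n][m];   // lower = better alignment between the traces
    }
}
```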

Proceedings Article
07 Oct 2005
TL;DR: FTGreedy, a new contention manager that is able to cope with faulty transactions, is introduced; it has good performance in the face of failures and provable worst case properties even if transactions can fail, as long as the system features some synchrony assumptions.
Abstract: Software transactional memory (STM) systems use lightweight, in-memory software transactions to address concurrency in multi-threaded applications, ensuring safety at all times. A contention manager is responsible for ensuring that the system as a whole makes progress (liveness). In this paper, we study the impact of transaction failures on contention management in the context of STM systems. The failures we consider include page faults as well as actual process or thread crashes. We observe that, even with a small number of failures, many of the previously defined contention managers do not behave well in the average case, and none provides worst case guarantees. We introduce FTGreedy, a new contention manager that is able to cope with faulty transactions. In short, FTGreedy (a) compares well with previous contention managers when no failures occur, (b) has good performance in the face of failures, and (c) has provable worst case properties even if transactions can fail, as long as the system features some synchrony assumptions (which need only be weak and eventual).
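A minimal sketch of the timestamp-priority idea behind greedy contention management (FTGreedy's failure-handling machinery is omitted, and all names are invented): the transaction that started earlier wins every conflict, which is what yields worst-case progress guarantees.

```java
import java.util.concurrent.atomic.AtomicLong;

class Txn {
    private static final AtomicLong clock = new AtomicLong();
    final long birth = clock.getAndIncrement();   // priority fixed at start
    volatile boolean aborted = false;
}

class GreedyContentionManager {
    // Called when 'me' finds 'other' holding an object it needs.
    void resolve(Txn me, Txn other) {
        if (me.birth < other.birth) other.aborted = true;  // older transaction wins
        else Thread.yield();                               // younger backs off and retries
    }
}
```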

Proceedings ArticleDOI
16 Oct 2005
TL;DR: A suite of characteristics of future Ambient-Oriented Programming languages is postulated, and a simple programming language kernel, called AmbientTalk, that meets these characteristics is subsequently presented.
Abstract: A new field in distributed computing, called Ambient Intelligence, has emerged as a consequence of the increasing availability of wireless devices and the mobile networks they induce. Developing software for such mobile networks is extremely hard in conventional programming languages because the network is dynamically defined. This hardware phenomenon leads us to postulate a suite of characteristics of future Ambient-Oriented Programming languages. A simple reflective programming language kernel, called AmbientTalk, that meets these characteristics is subsequently presented. The power of the reflective kernel is illustrated by using it to conceive a collection of high level tentative ambient-oriented programming language features.

Proceedings ArticleDOI
16 Oct 2005
TL;DR: A new algorithm, namely SDD (Similar Data Detection), is devised that can detect duplicated parts of source code in huge software with high performance; it is adequate for large systems and detects not only exact but also similar parts of source code.
Abstract: Code clones in software increase maintenance cost and lower software quality. We have devised a new algorithm to detect duplicated parts of source code in large software. Our algorithm is adequate for large systems, detecting not only exact but also similar parts of source code. Our simulation of this new algorithm, namely SDD (Similar Data Detection), indicates that it can detect duplicated parts of source code in huge software with high performance.

Proceedings ArticleDOI
16 Oct 2005
TL;DR: This work is concurrently constructing two artifacts--a Self VM entirely in Self (the Klein VM), and a specialized development environment--with strict adherence to pure object-orientation, metacircularity, heavy code reuse, reactivity, and mirror-based reflection.
Abstract: Can virtual machine developers benefit from religiously observing the principles more often embraced for exploratory programming? To find out, we are concurrently constructing two artifacts--a Self VM entirely in Self (the Klein VM), and a specialized development environment--with strict adherence to pure object-orientation, metacircularity, heavy code reuse, reactivity, and mirror-based reflection. Although neither artifact is yet complete, the environment supports many remote debugging and incremental update operations, and the exported virtual machine has successfully run the whole compiler. As a result of our adherence to these principles, there have been both positive and negative consequences. We have been able to find and exploit many opportunities for parsimony. For example, the very same code creates objects in the bootstrap image, builds objects in the running VM, and implements a remote debugger. On the other hand, we have been forced to expend effort to optimize the performance of the environment. Overall, this approach trades off the performance of the environment against the architectural simplicity and ease of development of the resulting VM artifact. As computers continue to improve in performance, we believe that this approach will increase in value.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: A new model of interoperability is presented that builds on the ideas of mirrors and contracts, and an interoperable implementation of Java and Scheme that is guided by the model is described.
Abstract: As a value flows across the boundary between interoperating languages, it must be checked and converted to fit the types and representations of the target language. For simple forms of data, the checks and coercions can be immediate; for higher order data, such as functions and objects, some must be delayed until the value is used in a particular way. Typically, these coercions and checks are implemented by an ad-hoc mixture of wrappers, reflection, and dynamic predicates. We observe that 1) the wrapper and reflection operations fit the profile of mirrors, 2) the checks correspond to contracts, and 3) the timing and shape of mirror operations coincide with the timing and shape of contract operations. Based on these insights, we present a new model of interoperability that builds on the ideas of mirrors and contracts, and we describe an interoperable implementation of Java and Scheme that is guided by the model.
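To ground the wrapper/mirror correspondence, here is an invented Java sketch using a dynamic proxy: the check on a higher-order value that crossed a language boundary is delayed until the value is used, exactly the timing the paper attributes to contracts.

```java
import java.lang.reflect.Proxy;
import java.util.function.IntUnaryOperator;

class Boundary {
    // Wrap a function crossing the boundary; enforce "result >= 0" on each use.
    static IntUnaryOperator wrap(IntUnaryOperator f) {
        return (IntUnaryOperator) Proxy.newProxyInstance(
            Boundary.class.getClassLoader(),
            new Class<?>[] { IntUnaryOperator.class },
            (proxy, method, args) -> {
                Object result = method.invoke(f, args);   // reflective ("mirror") step
                if (method.getName().equals("applyAsInt") && (Integer) result < 0)
                    throw new IllegalStateException("contract violated: result < 0");
                return result;
            });
    }

    public static void main(String[] args) {
        System.out.println(wrap(x -> x * x).applyAsInt(5));   // 25, check passes
    }
}
```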

Proceedings ArticleDOI
16 Oct 2005
TL;DR: The Squawk virtual machine is a small Java(TM) VM written in Java that runs without an OS on small devices and implements an isolate mechanism allowing applications to be reified.
Abstract: The Squawk virtual machine is a small Java(TM) VM written in Java that runs without an OS on small devices. Squawk implements an isolate mechanism allowing applications to be reified. Multiple isolates can run in the one VM, and isolates can be migrated between different instances of the VM.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: This paper presents a fully automated architecture for exploiting cross-run profile data in virtual machines, and applies this architecture to address the problem of selective optimization, and describes the implementation in IBM's J9 Java virtual machine.
Abstract: Virtual machines for languages such as the Java programming language make extensive use of online profiling and dynamic optimization to improve program performance. But despite the important role that profiling plays in achieving high performance, current virtual machines discard a program's profile data at the end of execution, wasting the opportunity to use past knowledge to improve future performance. In this paper, we present a fully automated architecture for exploiting cross-run profile data in virtual machines. Our work addresses a number of challenges that previously limited the practicality of such an approach. We apply this architecture to address the problem of selective optimization, and describe our implementation in IBM's J9 Java virtual machine. Our results demonstrate substantial performance improvements on a broad suite of Java programs, with average performance improvements ranging from 8.8% to 16.6% depending on the execution scenario.

Proceedings ArticleDOI
12 Oct 2005
TL;DR: This work proposes a generalization of the type constraint mechanisms of C# and Java that avoids both the need for casts in GADT programs and higher-order contortions in PADT programs, presents a Visitor pattern for GADTs, and describes a refined switch construct as an alternative to virtual dispatch on datatypes.
Abstract: Generalized algebraic data types (GADTs) have received much attention recently in the functional programming community. They generalize the (type) parameterized algebraic datatypes (PADTs) of ML and Haskell by permitting value constructors to return specific, rather than parametric, type-instantiations of their own datatype. GADTs have a number of applications, including strongly-typed evaluators, generic pretty-printing, generic traversals and queries, and typed LR parsing. We show that existing object-oriented programming languages such as Java and C# can express GADT definitions, and a large class of GADT-manipulating programs, through the use of generics, subclassing, and virtual dispatch. However, some programs can be written only through the use of redundant runtime casts. Moreover, instantiation-specific, yet safe, operations on ordinary PADTs only admit indirect cast-free implementations, via higher-order encodings. We propose a generalization of the type constraint mechanisms of C# and Java to avoid both the need for casts in GADT programs and higher-order contortions in PADT programs; we present a Visitor pattern for GADTs, and describe a refined switch construct as an alternative to virtual dispatch on datatypes. We formalize both extensions and prove type soundness.
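The cast-free encoding the paper describes, in its classic typed-evaluator form: each value constructor returns a specific instantiation of its own datatype, and virtual dispatch replaces pattern matching (class names invented).

```java
abstract class Expr<T> {
    abstract T eval();
}

class IntLit extends Expr<Integer> {          // IntLit : Expr<Integer>
    final int n;
    IntLit(int n) { this.n = n; }
    Integer eval() { return n; }
}

class BoolLit extends Expr<Boolean> {         // BoolLit : Expr<Boolean>
    final boolean b;
    BoolLit(boolean b) { this.b = b; }
    Boolean eval() { return b; }
}

class Plus extends Expr<Integer> {            // arguments fixed at Expr<Integer>
    final Expr<Integer> l, r;
    Plus(Expr<Integer> l, Expr<Integer> r) { this.l = l; this.r = r; }
    Integer eval() { return l.eval() + r.eval(); }
}

class If<T> extends Expr<T> {                 // parametric, like a PADT constructor
    final Expr<Boolean> c; final Expr<T> t, e;
    If(Expr<Boolean> c, Expr<T> t, Expr<T> e) { this.c = c; this.t = t; this.e = e; }
    T eval() { return c.eval() ? t.eval() : e.eval(); }
}
```

For example, new If<>(new BoolLit(true), new IntLit(1), new IntLit(2)).eval() type-checks as Integer with no casts.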

Proceedings Article
01 Jan 2005
TL;DR: This paper describes the first results of a project on exchanging models between two major industrial platforms, EMF and Microsoft DSL, and presents the lessons learnt in this work.
Abstract: Model Driven Engineering is based on a number of principles that may be applied in different contexts. Nowadays several environments employ the MDE principles: Model Driven Architecture (MDA™), Eclipse Modeling Framework (EMF), Microsoft Domain-Specific Language tools (MS/DSL), and many more. Focusing only on one context and ignoring other environments and platforms, based on different conventions, standards or protocols would be unwise because one of the desired properties of models is their ability to be exchanged between different contexts. Due to their abstraction expression level, models should ideally be more adaptable to various operational environments than conventional code. In other words, OMG models and Microsoft models among others should be able to be exchanged between the corresponding environments. In this paper we focus on exchange of models created in these two major industrial platforms: EMF and Microsoft DSL. The capability to exchange models between an EMF and a corresponding MS/DSL based system requires an abstract understanding of both architectures and a precise organization of the interoperability scheme. This paper describes the first results of a project in this area and presents the lessons learnt in this work.