Altering Java Semantics via Bytecode Manipulation

Éric Tanter¹, Marc Ségura-Devillechaise², Jacques Noyé², and José Piquer¹

¹ University of Chile, Computer Science Dept.
Avenida Blanco Encalada 2120, Santiago, Chile,
{etanter,jpiquer}@dcc.uchile.cl
² École des Mines de Nantes, OCM group
La Chantrerie, 4, rue Alfred Kastler. B.P. 20722,
F-44307 Nantes Cedex 3, France,
{msegura,noye}@emn.fr

Abstract. Altering the semantics of programs has become of major
interest. This is due to the necessity of adapting existing software, for
instance to achieve interoperability between off-the-shelf components. A
system allowing such alterations should operate at the bytecode level in
order to preserve portability and to be useful for pieces of software whose
source code is not available. Furthermore, working at the bytecode level
should be done while keeping high-level abstractions so that it can be
useful to a wide audience. In this paper, we present Jinline, a tool that
operates at load time through bytecode manipulation. Jinline makes it
possible to inline a method body before, after, or instead of occurrences of
language mechanisms within a method. It provides appropriate high-level
abstractions for fine-grained alterations while offering a good expressive
power and a great ease of use.
1 Introduction
Altering the semantics of programs serves many objectives in software engineer-
ing, related to software adaptation. A particular case of software adaptation,
highlighted by Keller and Hölzle in [1], is to make several off-the-shelf com-
ponents interoperable [2]. To this end, Keller and Hölzle proposed binary
component adaptation (BCA), a tool for performing coarse-grained alterations
on component binaries. However, coarse-grained alterations, usually limited
to modifications of the interface or of the type hierarchy, may turn out to be
insufficient. Another objective addressed by alteration of program semantics is
that of separation of concerns [3], as emphasized by the work carried out within
the reflection community [4,5,6], and more recently, by the emerging paradigm of
aspect-oriented programming (AOP) [7]. In both cases, an important objective
is to separate the development of the functional core of an application from the
implementation of its non-functional concerns, such as persistency, distribution,
or security. The complete application is then obtained by merging the different
parts together. Such merging requires performing fine-grained alterations
within method bodies. The purpose of the work we present in this paper is to
provide a tool enabling such alterations with the appropriate level of abstraction.

D. Batory, C. Consel, and W. Taha (Eds.): GPCE 2002, LNCS 2487, pp. 283–298, 2002.
© Springer-Verlag Berlin Heidelberg 2002

In Java, portable transformation mechanisms require code rewriting. This
usually automated rewriting can be performed on source code or on bytecode.
The Java community has already developed an impressive set of tools trans-
forming source code: AspectJ [8] to support AOP, Sun’s JavaScope project to
instrument source code, a Dylan-like macro system called Java Syntactic Exten-
der [9] and a class-based macro system, OpenJava [10]. Nevertheless, in many
contexts, expecting source code availability is a mistake: off-the-shelf compo-
nents usually ship in binary form, and sophisticated distributed systems, like
mobile agent platforms, usually rely on dynamic class loading. Therefore, while
still interesting in themselves, these tools are not generally applicable. This is
why we claim that transformation tools should operate on bytecode.
Available transformation tools based on bytecode rewriting are usually
inadequate for a wide and generic use. First, most of these tools offer bytecode-
level abstractions. This is inadequate if the tool has to be used by a wide
audience, since precise knowledge of the bytecode language is required. This
point has been addressed by Javassist [11], which offers high-level abstrac-
tions. Though targeted to structural reflection, Javassist can be used to
perform fine-grained alterations. However, in this domain, Javassist suffers from
a limited expressive power and a lack of generality, as we will discuss in section 2.
In this perspective, we propose Jinline, a tool for altering Java semantics.
Jinline operates on bytecode, keeps high-level abstractions, and offers good
expressive power and generality. To summarize, Jinline makes it possible to inline
a method body before, after, or instead of a language mechanism occurrence¹
within a method.
Traditionally, inlining means replacing a call to a function by an instance of
the function body [12]. What Jinline actually does is insert or replace code.
The new code is defined by a method, so the inserted code is conceptually a
method call, except that Jinline actually inlines this new method. Hence,
although Jinline cannot strictly be called an inliner, most of its job consists
of inlining pieces of code into others. In addition to this, Jinline provides two
different sets of information:
1. Static information at inlining time. Jinline provides static information
that can be used to drive the inlining process. For instance, in the case of
a message send, it will provide the signature of the invoked method. This
helps to decide whether inlining should occur or not, which method should
be inlined and where (before, after, instead of).
2. Dynamic information at run time. Jinline ensures that the inlined
method will receive as arguments all the useful dynamic information that
can be extracted. This point is very important since it makes the tool
particularly suited for implementing generic extensions, as we will exemplify
in the rest of this paper. In the case of a message send, the dynamic information
includes the method invoked, the method from which the invocation is done,
references to the caller and the callee, in addition to the actual arguments
of the invocation.

¹ By language mechanisms we refer to the standard mechanisms offered by the
language, such as message sending, accessing fields, casting, etc. A language mechanism
occurrence is a particular instance of a language mechanism in a piece of code.

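To make the dynamic information for a message send more concrete, the following plain-Java sketch shows what a call site might conceptually look like once a method has been inserted before it. All names here (MessageSendInfo, Cursor, Client, beforeSend) are hypothetical illustrations and are not Jinline's actual interface; the sketch only mirrors the kinds of data listed above.

```java
import java.util.Arrays;

// Hypothetical container for the dynamic information of a message send.
// It is NOT part of Jinline's interface; it only mirrors the data listed above.
final class MessageSendInfo {
    final String invokedMethod;    // the method invoked
    final String enclosingMethod;  // the method from which the invocation is done
    final Object caller;           // reference to the caller
    final Object callee;           // reference to the callee
    final Object[] arguments;      // actual arguments, primitives wrapped

    MessageSendInfo(String invokedMethod, String enclosingMethod,
                    Object caller, Object callee, Object[] arguments) {
        this.invokedMethod = invokedMethod;
        this.enclosingMethod = enclosingMethod;
        this.caller = caller;
        this.callee = callee;
        this.arguments = arguments;
    }

    String describe() {
        return enclosingMethod + " -> " + invokedMethod + Arrays.toString(arguments);
    }
}

// Hypothetical base-level class.
final class Cursor {
    int x, y;
    void move(int dx, int dy) { x += dx; y += dy; }
}

final class Client {
    final StringBuilder log = new StringBuilder();

    // Stand-in for a method whose body would be inserted before the send.
    void beforeSend(MessageSendInfo info) {
        log.append(info.describe());
    }

    // Conceptual result of the alteration: the original body only
    // contained the call c.move(1, 2).
    void run(Cursor c) {
        beforeSend(new MessageSendInfo("move(int,int)", "run(Cursor)",
                                       this, c, new Object[] {1, 2}));
        c.move(1, 2);
    }
}
```

Because the inserted method receives the invoked method and the packed arguments as data, a single method of this shape can serve every message send, which is what makes generic extensions possible.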
Applications of such an alteration tool are manifold. We have already
mentioned the issue of off-the-shelf components integration. Two of the authors
are actually working on an open implementation of a run-time MetaObject
Protocol (MOP), Reflex [13]. Many transformers for the Reflex framework can
be implemented with Jinline, thus increasing its expressiveness with caller-side
interceptions. Jinline is also particularly adapted for implementing custom
extensions and AOP systems.
The rest of this paper is organized as follows: in section 2, we will review
the different Java bytecode manipulation tools and relate our work to them.
In section 3 we will present Jinline, its interface to the outside world and an
overview of its architecture. In section 4 we will present a simple example of
applying Jinline. Section 5 will conclude the paper.
2 An Overview of Bytecode Manipulation Tools
One way of modifying a program is to alter its semantics by using reflection [14,
15]. However, the Java programming language does not provide support for
altering the semantics of programs. Since the class model is closed (class Class
and all the classes of the Reflection API are final), it is not possible to refine
the semantics of language mechanisms by specializing the class model, as can
be done in Smalltalk [16]. Therefore, alterations have to be implemented either
at the virtual machine level, as in VM-based run-time metaobject protocols
such as Metaxa [17], Guaraná [18], and Iguana/J [19], thus sacrificing portability, or
at the code level, through code transformation. We have already discarded the
possibility of operating on source code for reasons of availability of the source
code itself. This is why a number of proposals have been made to transform
bytecode. These proposals differ in terms of the abstraction level of the entities
a user is expected to program with, and in the expressive power or granularity
of the transformations permitted.
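The closed nature of the class model can be checked directly with the standard reflection API. The short sketch below (the class name ClosedClassModel is ours) tests whether Class and a few core reflection classes are declared final, which is what prevents specializing them.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class ClosedClassModel {
    // True when the given class is declared final and thus cannot be subclassed.
    static boolean isFinal(Class<?> c) {
        return Modifier.isFinal(c.getModifiers());
    }

    public static void main(String[] args) {
        Class<?>[] coreClasses = {Class.class, Method.class, Field.class, Constructor.class};
        for (Class<?> c : coreClasses) {
            System.out.println(c.getName() + " is final: " + isFinal(c));
        }
    }
}
```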
2.1 Transformations Based on Bytecode-Level Abstractions
A number of extensions allow programmers to transform classes at load time at
the expense of manipulating abstractions representing bytecode.
BIT [20] suffers from a too restricted scope: it only offers the possibility to
insert before/after methods, but does not address transformation of interfaces
or method bodies.

There are several general-purpose implementations of bytecode manipulation
available: BCEL [21], JikesBT [22], and JOIE [23]. All of them translate the
class file data structure into an intermediate representation, allow the user to
perform modifications, and finally regenerate a valid class file data structure
from the transformed intermediate representation. The bytecode-level API of
Javassist [11] could fit into this category, although bytecode instructions are not
reified: the programmer is just provided with an iterator over a sequence of
bytes. The main strength of these general-purpose extensions is their expressive
power, since they are able to express anything that can be written in bytecode.
However, their main drawback is that they are low-level and therefore difficult to use.
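To give a feel for what working at this level involves, here is a minimal sketch that decodes only the fixed ten-byte header of a class file using the standard library. It does not use BCEL, JikesBT, JOIE, or Javassist; the class name ClassFileHeader is a hypothetical helper introduced for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: decodes the first ten bytes of a class file
// (magic, minor_version, major_version, constant_pool_count).
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] classFile) {
        if (classFile.length < 10) {
            throw new IllegalArgumentException("truncated class file");
        }
        // ByteBuffer is big-endian by default, like the class file format.
        ByteBuffer in = ByteBuffer.wrap(classFile);
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("bad magic number");
        }
        int minor = in.getShort() & 0xFFFF;    // u2 minor_version
        int major = in.getShort() & 0xFFFF;    // u2 major_version
        int cpCount = in.getShort() & 0xFFFF;  // u2 constant_pool_count
        return new ClassFileHeader(minor, major, cpCount);
    }
}
```

Everything past this header (constant pool entries, member tables, instruction operands) has to be decoded and kept consistent in the same manual fashion, which illustrates why bytecode-level abstractions demand precise knowledge of the format.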
2.2 Transformations Based on Source-Level Abstractions
Metaobject protocols (MOPs) are a natural framework for reifying high-level
language entities [24]. Run-time MOPs are an approach to enable the run-time
alteration of program semantics. Compared to static transformation systems
(such as macro systems, inlining systems, and compile-time MOPs), where the
link between the modifier and the modified entity is merged at some point, run-
time MOPs maintain this link, known as the causal connection link [14,15], at
run time, thus enabling dynamic updates of this link at the expense of a certain
overhead.
Reflex [13] and Kava [25] are run-time MOPs for Java that rely on load-
time insertion of pieces of code (hooks) to transfer control to the metalevel
at run time. These systems are bound to behavioral reflection, which is the
ability to dynamically alter the behavior of objects. This approach is in fact
complementary to static code transformation approaches in cases where dynamic
adaptability or instance-specific alterations are needed (see for instance [26]).
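The idea of a hook that transfers control to the metalevel can be sketched in plain Java as follows. This is only an illustration under our own assumptions: the Metaobject interface and the Account class are hypothetical, and neither Reflex nor Kava is claimed to expose this interface.

```java
import java.util.function.IntSupplier;

// Hypothetical metalevel interface; Reflex and Kava define their own.
interface Metaobject {
    int intercept(String operation, IntSupplier baseBehavior);
}

final class Account {
    // The link to the metalevel is kept at run time and can be replaced
    // per instance; the default metaobject simply runs the base behavior.
    private volatile Metaobject meta = (operation, base) -> base.getAsInt();
    private int balance;

    void setMetaobject(Metaobject newMeta) {
        meta = newMeta;
    }

    // Base-level behavior as originally written.
    private int doDeposit(int amount) {
        balance += amount;
        return balance;
    }

    // Conceptual result of hook insertion: every call goes through the
    // metaobject, which costs an extra indirection.
    int deposit(int amount) {
        return meta.intercept("deposit", () -> doDeposit(amount));
    }
}
```

Since the link is a per-object field that survives until run time, it can be swapped for one instance without touching the others; a static transformation would have merged the new behavior into the method body once and for all.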
BCA [1] is a bytecode modification tool with a high-level interface, but it
only deals with external interfaces and class hierarchies, ignoring method
bodies. Javassist [11] is a mature tool for load-time structural reflection in Java.
Structural reflection is the ability of a program to alter the definitions of data
structures such as classes and methods. With Javassist, the transformations that
can be made are at the granularity of classes or members. The main goal achieved
by Javassist is a high-level and easy-to-use interface. To allow finer-grained
transformations, Javassist has recently made public its bytecode-level API, which we
mentioned in subsection 2.1. Recall that it lacks a concrete reification of bytecode
instructions. To bridge the gap between its high-level and low-level APIs,
Javassist offers a code converter to instrument method bodies through a high-level
interface.
2.3 Limitations of the Code Converter of Javassist
The code converter of Javassist, the closest tool to our proposal, offers a simple
high-level API to alter method bodies. This API allows inserting before/after
methods, redirecting method invocations or field accesses, and replacing object
creations. We claim that its expressiveness is limited and that it lacks generality.

Its limited expressiveness is in fact not much of an issue, since it can
be upgraded and since, in many cases, it is sufficient to alter such mechanisms
as method invocations, field accesses and object creations. A more annoying
problem is the restriction on the possible transformations: for instance, a field
access can only be replaced by a static method call, and a method invocation
can only be replaced by another method invocation on the same object with the
same parameters.
But all in all, the major drawback of the code converter lies in the fact that
it is not well-suited to designing generic solutions. Since Javassist lacks semantic
information in the process of modifying bytecode (remember that Javassist
does not reify bytecode instructions as such), the possible transformations
are limited. The code converter does not perform any reification of what is
actually occurring. For instance, an object creation can be replaced by a method
invocation, but this method will not receive as argument the name of the class
that was to be instantiated: it has to be specific to a type. This limitation is
common to all transformations.
To illustrate this limitation, consider the following simple example: we want
to set up a factory pattern [27] for instantiating any class in an existing appli-
cation. That is to say, instead of calling directly new, we want to call a factory
method. Designed with generality and extensibility in mind, the factory method
would be:
public Object getInstance(String classname, Object[] args){...}
Then we want to transform all the instantiations so that they call this unique
factory method, for instance:
new Point(1, 2);  ⟹  Factory.getInstance("Point", [1, 2]);
This is not feasible with the code converter. The only possible replacement is:
new Point(1, 2);  ⟹  Factory.getPoint(1, 2);
The following issues come to light:
- First, the name of the instantiated class is not passed as a parameter, which
  implies that we need a method per class (a getPoint method, a getTriangle
  method, etc.).
- Second, the arguments are not packed, which means we need a method
  per set of parameters (a method getPoint(int, int), another method
  getPoint(Point), etc.).
It is easy to see that such an approach does not scale to real-world cases.
What is needed is a tool that can systematically provide run-time information
in a cost-effective manner to the newly inserted code. In addition to this, more
flexibility with respect to what code can be inserted is highly desirable. This
is exactly what Jinline is about.
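As an illustration of why receiving the class name and the packed arguments matters, here is one possible way to write the generic factory method with standard Java reflection. This is a sketch under our own assumptions, not code from Jinline: the method is made static to match the call Factory.getInstance(...), the Point class is a hypothetical example, and constructor selection is deliberately simplistic (first constructor whose parameter types accept the arguments, no null arguments).

```java
import java.lang.reflect.Constructor;

// Hypothetical example class with two constructors.
final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
    Point(Point other) { this(other.x, other.y); }
}

final class Factory {
    // One method serves every class and every constructor signature,
    // because the class name and the packed arguments arrive as data.
    static Object getInstance(String classname, Object[] args) {
        try {
            Class<?> target = Class.forName(classname);
            for (Constructor<?> c : target.getDeclaredConstructors()) {
                if (accepts(c.getParameterTypes(), args)) {
                    c.setAccessible(true);
                    return c.newInstance(args);  // wrapped primitives are unwrapped
                }
            }
            throw new IllegalArgumentException("no matching constructor in " + classname);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static boolean accepts(Class<?>[] params, Object[] args) {
        if (params.length != args.length) return false;
        for (int i = 0; i < params.length; i++) {
            if (!box(params[i]).isInstance(args[i])) return false;
        }
        return true;
    }

    // Maps a primitive parameter type to its wrapper class.
    private static Class<?> box(Class<?> t) {
        if (!t.isPrimitive()) return t;
        if (t == int.class) return Integer.class;
        if (t == long.class) return Long.class;
        if (t == double.class) return Double.class;
        if (t == float.class) return Float.class;
        if (t == boolean.class) return Boolean.class;
        if (t == char.class) return Character.class;
        if (t == byte.class) return Byte.class;
        return Short.class;
    }
}
```

With such a method, both new Point(1, 2) and new Point(p) can be redirected to the same entry point, for example Factory.getInstance(Point.class.getName(), new Object[] {1, 2}), instead of requiring one factory method per class and per parameter list.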

References
[7] Aspect-oriented programming.
[8] An Overview of AspectJ.
[16] Smalltalk-80: The Language and its Implementation.
[27] Design Patterns: Elements of Reusable Object-Oriented Software.