Featherweight Java: a minimal core calculus for Java and GJ



P1: IBD
CM026A-03 ACM-TRANSACTION January 23, 2002 17:39
Featherweight Java: A Minimal Core
Calculus for Java and GJ
ATSUSHI IGARASHI
University of Tokyo
BENJAMIN C. PIERCE
University of Pennsylvania
and
PHILIP WADLER
Avaya Labs
Several recent studies have introduced lightweight versions of Java: reduced languages in which
complex features like threads and reflection are dropped to enable rigorous arguments about
key properties such as type safety. We carry this process a step further, omitting almost all fea-
tures of the full language (including interfaces and even assignment) to obtain a small calculus,
Featherweight Java, for which rigorous proofs are not only possible but easy. Featherweight
Java bears a similar relation to Java as the lambda-calculus does to languages such as ML
and Haskell. It offers a similar computational “feel,” providing classes, methods, fields, inheritance,
and dynamic typecasts with a semantics closely following Java’s. A proof of type safety for
Featherweight Java thus illustrates many of the interesting features of a safety proof for the full
language, while remaining pleasingly compact. The minimal syntax, typing rules, and operational
semantics of Featherweight Java make it a handy tool for studying the consequences of extensions
and variations. As an illustration of its utility in this regard, we extend Featherweight Java with
generic classes in the style of GJ (Bracha, Odersky, Stoutamire, and Wadler) and give a detailed
proof of type safety. The extended system formalizes for the first time some of the key features
of GJ.
Categories and Subject Descriptors: D.3.1 [Programming Languages]: Formal Definitions and
Theory; D.3.2 [Programming Languages]: Language Classifications—Object-oriented languages;
D.3.3 [Programming Languages]: Language Constructs and Features—Classes and objects;
This is a revised and extended version of a paper presented in the Proceedings of the ACM
SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications
(OOPSLA’99), ACM SIGPLAN Notices volume 34 number 10, pages 132–146, October 1999. This
work was done while Igarashi was visiting the University of Pennsylvania as a research fellow of the
Japan Society for the Promotion of Science. Pierce was supported by the University of Pennsylvania
and the National Science Foundation under grant CCR-9701826, Principled Foundations for Pro-
gramming with Objects.
Authors’ addresses: A. Igarashi, Department of Graphics and Computer Science, Graduate School
of Arts and Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan;
email: igarashi@graco.c.u-tokyo.ac.jp; B. C. Pierce, Department of Computer and Information Sci-
ence, University of Pennsylvania, 200 South 33rd Street, Philadelphia, PA 19104-6389; email:
bcpierce@cis.upenn.edu; P. Wadler, 233 Mount Airy Road, Basking Ridge, NJ 07920; email:
wadler@avaya.com.
Permission to make digital/hard copy of all or part of this material without fee for personal or class-
room use provided that the copies are not made or distributed for profit or commercial advantage,
the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given
that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers,
or to redistribute to lists requires prior specific permission and/or a fee.
© 2001 ACM 0098-3500/01/0500–0396 $5.00
ACM Transactions on Programming Languages and Systems, Vol. 23, No. 3, May 2001, Pages 396–450.

Polymorphism; F.3.3 [Logics and Meaning of Programs]: Studies of Program Constructs—
Object-oriented constructs
General Terms: Design, Languages, Theory
Additional Key Words and Phrases: Compilation, generic classes, Java, language design, language
semantics
1. INTRODUCTION
Inside every large language is a small language struggling to get out...
T. Hoare (footnote 1)
Formal modeling can offer a significant boost to the design of complex real-world
artifacts such as programming languages. A formal model may be used to de-
scribe some aspect of a design precisely, to state and prove its properties, and
to direct attention to issues that might otherwise be overlooked. In formulating
a model, however, there is a tension between completeness and compactness:
The more aspects the model addresses at the same time, the more unwieldy
it becomes. Often it is sensible to choose a model that is less complete but
more compact, offering maximum insight for minimum investment. This strat-
egy may be seen in a flurry of recent papers on the formal properties of Java,
which omit advanced features such as concurrency and reflection and concen-
trate on fragments of the full language to which well-understood theory can
be applied.
We propose Featherweight Java, or FJ, as a new contender for a minimal core
calculus for modeling Java’s type system. The design of FJ favors compactness
over completeness almost obsessively, having just five forms of expression: ob-
ject creation, method invocation, field access, casting, and variables. Its syntax,
typing rules, and operational semantics fit comfortably on a few pages. Indeed,
our aim has been to omit as many features as possible—even assignment—
while retaining the core features of Java typing. There is a direct correspon-
dence between FJ and a purely functional core of Java, in the sense that every
FJ program is literally an executable Java program.
FJ is only a little larger than Church’s lambda calculus [Barendregt 1984]
or Abadi and Cardelli’s object calculus [1996], and is significantly smaller
than previous formal models of class-based languages like Java, including
those put forth by Drossopoulou et al. [1999], Syme [1997], Nipkow and
von Oheimb [1998], and Flatt et al. [1998a; 1998b]. Being smaller, FJ lets
us focus on just a few key issues. For example, we have discovered that
capturing the behavior of Java’s cast construct in a traditional “small-step”
operational semantics is trickier than we would have expected, a point that
has been overlooked or underemphasized in other models.
Footnote 1: We thank Tony Hoare, to whom the first quote below is attributed, for informing us of the second
one:
Inside every large program is a small program struggling to get out...
T. Hoare, Efficient Production of Large Programs (1970)
I’m fat, but I’m thin inside.
Has it ever struck you that there’s a thin man inside every fat man?
—George Orwell, Coming Up For Air (1939)
One use of FJ is as a starting point for modeling languages that extend Java.
Because FJ is so compact, we can focus attention on essential aspects of the
extension. Moreover, because the proof of soundness for pure FJ is very sim-
ple, a rigorous soundness proof for even a significant extension may remain
manageable. The second part of the article illustrates this utility by enriching
FJ with generic classes and methods à la GJ [Bracha et al. 1998]. The model
omits some important aspects of GJ (such as “raw types” and type argument
inference for generic method calls). Nonetheless, it led to the discovery and re-
pair of one bug in the GJ compiler and, more importantly, has been a useful
tool in clarifying our thought. Because the model is small, it is easy to con-
template further extensions, and we have begun the work of adding raw types
to the model; so far, this has revealed at least one corner of the design that
was underspecified.
Our main goal in designing FJ was to make a proof of type soundness (“well-
typed programs do not get stuck”) as concise as possible, while still capturing
the essence of the soundness argument for the full Java language. Any lan-
guage feature that made the soundness proof longer without making it sig-
nificantly different was a candidate for omission; we also dropped features
that did not appear to interact with polymorphism in significant ways. As in
previous studies of type soundness in Java, we do not treat advanced mecha-
nisms such as concurrency, inner classes, and reflection. In addition, the Java
features omitted from FJ include assignment, interfaces, overloading, mes-
sages to super, null pointers, base types (int, bool, etc.), abstract method
declarations, shadowing of superclass fields by subclass fields, access control
(public, private, etc.), and exceptions. The features of Java that we do model in-
clude mutually recursive class definitions, object creation, field access, method
invocation, method override, method recursion through this, subtyping,
and casting.
One key simplification in FJ is the omission of assignment. In essence, all
fields and method parameters in FJ are implicitly marked final: we assume
that an object’s fields are initialized by its constructor and never changed after-
ward. This restricts FJ to a “functional” fragment of Java, in which many com-
mon Java idioms, such as use of enumerations, cannot be represented. Nonethe-
less, this fragment is computationally complete (it is easy to encode the lambda
calculus into it), and is large enough to include many useful programs (many of
the programs in Felleisen and Friedman’s Java text [1998] use a purely func-
tional style). Moreover, most of the tricky typing issues in both Java and GJ are
independent of assignment. An important exception is that the type inference
algorithm for generic method invocation in GJ has some twists imposed on it
by the need to maintain soundness in the presence of assignment. This article
treats a simplified version of GJ without type inference.
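The remark that the lambda calculus is easy to encode in this functional fragment can be made concrete with a small sketch. The class names below (Fun, K, ConstK, LambdaDemo) are hypothetical and do not come from the paper; the encoding shown, with one class per abstraction and fields standing in for captured variables, is only one plausible choice. Apart from the demo harness, the classes stay within FJ's stylized syntax.

```java
// Hypothetical encoding: every lambda abstraction becomes a subclass of Fun.
// The base class itself behaves as the identity function, \x. x.
class Fun extends Object {
    Fun() { super(); }
    Object apply(Object x) { return x; }
}

// The closure (\y. x): the free variable x is captured in a field.
class ConstK extends Fun {
    Object x;
    ConstK(Object x) { super(); this.x = x; }
    Object apply(Object y) { return this.x; }
}

// K = \x. \y. x
class K extends Fun {
    K() { super(); }
    Object apply(Object x) { return new ConstK(x); }
}

// Marker classes used as distinguishable values, as in the paper's examples.
class A extends Object { A() { super(); } }
class B extends Object { B() { super(); } }

// Demo harness (ordinary Java, outside the FJ fragment).
class LambdaDemo {
    public static void main(String[] args) {
        // (K a) b reduces to a. The cast is needed because apply returns Object.
        Object r = ((Fun) new K().apply(new A())).apply(new B());
        System.out.println(r instanceof A); // true
    }
}
```

Application is an ordinary method invocation, and the downcast to Fun plays the role that the paper's cast expressions play whenever a value has been stored at type Object.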
The remainder of this article is organized as follows. Section 2 intro-
duces the main ideas of Featherweight Java, presents its syntax, type rules,
and reduction rules, and develops a type soundness proof. Section 3 extends
Featherweight Java to Featherweight GJ, which includes generic classes and
methods. Section 4 presents an erasure map from FGJ to FJ, modeling the
techniques used to compile GJ into Java. Section 5 discusses related work, and
Section 6 concludes.
2. FEATHERWEIGHT JAVA
In FJ, a program consists of a collection of class definitions plus an expression
to be evaluated. (This expression corresponds to the body of the main method
in full Java.) Here are some typical class definitions in FJ.
class A extends Object {
    A() { super(); }
}
class B extends Object {
    B() { super(); }
}
class Pair extends Object {
    Object fst;
    Object snd;
    Pair(Object fst, Object snd) {
        super(); this.fst=fst; this.snd=snd;
    }
    Pair setfst(Object newfst) {
        return new Pair(newfst, this.snd);
    }
}
For the sake of syntactic regularity, we always (1) include the supertype (even
when it is Object); (2) write out the constructor (even for the trivial classes A
and B); and (3) write the receiver for a field access (as in this.snd) or a method
invocation, even when the receiver is this. Constructors always take the same
stylized form: there is one parameter for each field, with the same name as
the field; the super constructor is invoked on the fields of the supertype; and
the remaining fields are initialized to the corresponding parameters. In this
example the supertype is always Object, which has no fields, so the invocations
of super have no arguments. Constructors are the only place where super or =
appears in an FJ program. Since FJ provides no side-effecting operations, a
method body always consists of return followed by an expression, as in the
body of setfst().
In the context of the above definitions, the expression
new Pair(new A(), new B()).setfst(new B())
evaluates to the expression
new Pair(new B(), new B()).
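Because every FJ program is also a Java program, the example can be checked by running it. The sketch below repeats the class definitions so that it is self-contained. Triple and SetfstDemo are hypothetical names added for illustration: Triple shows the stylized constructor form for a class whose supertype has fields, and the demo harness uses local variables and printing, which lie outside FJ.

```java
class A extends Object { A() { super(); } }
class B extends Object { B() { super(); } }

class Pair extends Object {
    Object fst;
    Object snd;
    Pair(Object fst, Object snd) { super(); this.fst = fst; this.snd = snd; }
    Pair setfst(Object newfst) { return new Pair(newfst, this.snd); }
}

// Hypothetical subclass, not in the paper: super receives the supertype's
// fields, and the remaining field is initialized from its own parameter.
class Triple extends Pair {
    Object thd;
    Triple(Object fst, Object snd, Object thd) { super(fst, snd); this.thd = thd; }
}

// Demo harness (ordinary Java, outside the FJ fragment).
class SetfstDemo {
    public static void main(String[] args) {
        Pair p = new Pair(new A(), new B());
        Pair q = p.setfst(new B());
        System.out.println(q.fst instanceof B); // true
        System.out.println(q.snd instanceof B); // true
        System.out.println(p.fst instanceof A); // true: p itself is never modified
        System.out.println(new Triple(new A(), new B(), new A()).thd instanceof A); // true
    }
}
```

Note that setfst builds a fresh Pair rather than updating the receiver, which is how a method expresses an update in a language without assignment.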
There are five forms of expression in FJ. Here, new A(), new B(), and
new Pair(e1, e2) are object constructors, and e3.setfst(e4) is a method
invocation. In the body of setfst, the expression this.snd is a field access,
and the occurrences of newfst and this are variables. (The syntax of FJ differs
from Java in that this is a variable rather than a keyword). The remaining
form of expression is a cast. The expression
((Pair)new Pair(new Pair(new A(), new B()), new A()).fst).snd
evaluates to the expression
new B().
Here, ((Pair)e5), where e5 is new Pair(...).fst, is a cast. The cast is required
because e5 is a field access to fst, which is declared to contain an Object,
whereas the next field access, to snd, is only valid on a Pair. At run time, it is
checked whether the Object stored in the fst field is a Pair (and in this case
the check succeeds).
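The run-time check can be observed directly in Java. The sketch below, with a hypothetical CastDemo harness, evaluates the expression from the text and then a variant in which the fst field holds an A rather than a Pair. In full Java the failed check surfaces as a ClassCastException; the try/catch and printing belong to the harness and are not part of FJ.

```java
class A extends Object { A() { super(); } }
class B extends Object { B() { super(); } }

class Pair extends Object {
    Object fst;
    Object snd;
    Pair(Object fst, Object snd) { super(); this.fst = fst; this.snd = snd; }
    Pair setfst(Object newfst) { return new Pair(newfst, this.snd); }
}

// Demo harness (ordinary Java, outside the FJ fragment).
class CastDemo {
    public static void main(String[] args) {
        // fst holds a Pair, so the downcast succeeds and snd can be selected.
        Object ok = ((Pair) new Pair(new Pair(new A(), new B()), new A()).fst).snd;
        System.out.println(ok instanceof B); // true

        // fst holds an A, so the same downcast fails at run time.
        try {
            Object bad = ((Pair) new Pair(new A(), new B()).fst).snd;
            System.out.println("unreachable");
        } catch (ClassCastException e) {
            System.out.println("cast failed");
        }
    }
}
```

Both expressions are accepted by the compiler, since a downcast from Object to Pair is always statically permitted; only the dynamic check distinguishes them.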
In Java, we may prefix a field or parameter declaration with the keyword
final to indicate that it may not be assigned to, and all parameters accessed
from an inner class must be declared final. Since FJ contains no assignment
and no inner classes, it matters little whether or not final appears, so we omit
it for brevity.
Dropping side effects has a pleasant side effect: evaluation can be easily for-
malized entirely within the syntax of FJ, with no additional mechanisms for
modeling the heap. Moreover, in the absence of side effects, the order in which
expressions are evaluated does not affect the final outcome (modulo nonter-
mination), so we can define the operational semantics of FJ straightforwardly
using a nondeterministic small-step reduction relation, following long-standing
tradition in the lambda calculus. Of course, Java’s call-by-value evaluation
strategy is subsumed by this more general relation, so the soundness properties
we prove for reduction will hold for Java’s evaluation strategy as a special case.
There are three basic computation rules: one for field access, one for method
invocation, and one for casts. Recall that, in the lambda calculus, the beta-
reduction rule for applications assumes that the function is first simplified to
a lambda abstraction. Similarly, in FJ the reduction rules assume the object
operated upon is first simplified to a new expression. Thus, just as the slogan for
the lambda calculus is “everything is a function,” here the slogan is “everything
is an object.”
The following example shows the rule for field access in action:
new Pair(new A(), new B()).snd → new B()
Due to the stylized form for object constructors, we know that the constructor
has one parameter for each field, in the same order that the fields are declared.
Here the fields are fst and snd, and an access to the snd field selects the second
parameter.
Here is the rule for method invocation in action (/ denotes substitution):
new Pair(new A(), new B()).setfst(new B())
  → [new B()/newfst, new Pair(new A(), new B())/this] new Pair(newfst, this.snd)
  i.e., new Pair(new B(), new Pair(new A(), new B()).snd)
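The two rules illustrated so far can be sketched as a tiny reducer over FJ expressions written in Java. Everything here is an assumption made for illustration rather than the paper's formal definition: the AST classes (Expr, Var, New, FieldAccess, Invoke), the Reducer class, and the class table hard-coded for Pair. Casts and the rules for reducing inside subexpressions are left out, so step only fires when the redex sits at the root.

```java
import java.util.*;

// Hypothetical mini-AST for FJ expressions; casts are omitted for brevity.
abstract class Expr {
    static String join(List<Expr> es) {
        List<String> parts = new ArrayList<>();
        for (Expr e : es) parts.add(e.toString());
        return String.join(", ", parts);
    }
}
final class Var extends Expr {
    final String name;
    Var(String name) { this.name = name; }
    @Override public String toString() { return name; }
}
final class New extends Expr {
    final String cls; final List<Expr> args;
    New(String cls, Expr... args) { this.cls = cls; this.args = Arrays.asList(args); }
    @Override public String toString() { return "new " + cls + "(" + join(args) + ")"; }
}
final class FieldAccess extends Expr {
    final Expr target; final String field;
    FieldAccess(Expr target, String field) { this.target = target; this.field = field; }
    @Override public String toString() { return target + "." + field; }
}
final class Invoke extends Expr {
    final Expr target; final String method; final List<Expr> args;
    Invoke(Expr target, String method, Expr... args) {
        this.target = target; this.method = method; this.args = Arrays.asList(args);
    }
    @Override public String toString() { return target + "." + method + "(" + join(args) + ")"; }
}

final class Reducer {
    // Class table hard-coded for Pair (an assumption of this sketch).
    static final List<String> PAIR_FIELDS = Arrays.asList("fst", "snd");
    static final Expr SETFST_BODY =
        new New("Pair", new Var("newfst"), new FieldAccess(new Var("this"), "snd"));

    // Simultaneous substitution of expressions for variables.
    static Expr subst(Expr e, Map<String, Expr> s) {
        if (e instanceof Var) {
            Expr r = s.get(((Var) e).name);
            return r != null ? r : e;
        }
        if (e instanceof New) {
            New n = (New) e;
            return new New(n.cls, substAll(n.args, s));
        }
        if (e instanceof FieldAccess) {
            FieldAccess f = (FieldAccess) e;
            return new FieldAccess(subst(f.target, s), f.field);
        }
        Invoke i = (Invoke) e;
        return new Invoke(subst(i.target, s), i.method, substAll(i.args, s));
    }
    static Expr[] substAll(List<Expr> es, Map<String, Expr> s) {
        Expr[] out = new Expr[es.size()];
        for (int k = 0; k < out.length; k++) out[k] = subst(es.get(k), s);
        return out;
    }

    // One step at the root of e, or null when no rule applies there.
    static Expr step(Expr e) {
        if (e instanceof FieldAccess && ((FieldAccess) e).target instanceof New) {
            FieldAccess f = (FieldAccess) e;
            New n = (New) f.target;
            int idx = PAIR_FIELDS.indexOf(f.field);
            // Field access selects the constructor argument at the field's position.
            return (n.cls.equals("Pair") && idx >= 0 && idx < n.args.size()) ? n.args.get(idx) : null;
        }
        if (e instanceof Invoke && ((Invoke) e).target instanceof New) {
            Invoke i = (Invoke) e;
            New n = (New) i.target;
            if (!n.cls.equals("Pair") || !i.method.equals("setfst") || i.args.size() != 1) return null;
            // Invocation substitutes the argument for newfst and the receiver for this.
            Map<String, Expr> s = new HashMap<>();
            s.put("newfst", i.args.get(0));
            s.put("this", n);
            return subst(SETFST_BODY, s);
        }
        return null;
    }
}

class ReducerDemo {
    public static void main(String[] args) {
        Expr pair = new New("Pair", new New("A"), new New("B"));
        System.out.println(Reducer.step(new FieldAccess(pair, "snd")));
        System.out.println(Reducer.step(new Invoke(pair, "setfst", new New("B"))));
    }
}
```

Running the demo should print new B() for the field access and new Pair(new B(), new Pair(new A(), new B()).snd) for the invocation, matching the two reductions shown in the text. The remaining field access inside the result is not at the root, so finishing the evaluation would need the congruence rules that this sketch omits.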
References
[Abadi and Cardelli 1996] A Theory of Objects.
[Barendregt 1984] The Lambda Calculus.
[Bracha et al. 1998] Making the Future Safe for the Past: Adding Genericity to the Java Programming Language.
A Syntactic Approach to Type Soundness.
Types and Programming Languages.