Featherweight Java: a minimal core calculus for Java and GJ

doi:10.1145/503502.503505

P1: IBD

CM026A-03 ACM-TRANSACTION January 23, 2002 17:39

Featherweight Java: A Minimal Core Calculus for Java and GJ

ATSUSHI IGARASHI

University of Tokyo

BENJAMIN C. PIERCE

University of Pennsylvania

and

PHILIP WADLER

Avaya Labs

Several recent studies have introduced lightweight versions of Java: reduced languages in which complex features like threads and reflection are dropped to enable rigorous arguments about key properties such as type safety. We carry this process a step further, omitting almost all features of the full language (including interfaces and even assignment) to obtain a small calculus, Featherweight Java, for which rigorous proofs are not only possible but easy. Featherweight Java bears a similar relation to Java as the lambda-calculus does to languages such as ML and Haskell. It offers a similar computational “feel,” providing classes, methods, fields, inheritance, and dynamic typecasts with a semantics closely following Java’s. A proof of type safety for Featherweight Java thus illustrates many of the interesting features of a safety proof for the full language, while remaining pleasingly compact. The minimal syntax, typing rules, and operational semantics of Featherweight Java make it a handy tool for studying the consequences of extensions and variations. As an illustration of its utility in this regard, we extend Featherweight Java with generic classes in the style of GJ (Bracha, Odersky, Stoutamire, and Wadler) and give a detailed proof of type safety. The extended system formalizes for the first time some of the key features of GJ.

Categories and Subject Descriptors: D.3.1 [Programming Languages]: Formal Definitions and Theory; D.3.2 [Programming Languages]: Language Classifications—Object-oriented languages; D.3.3 [Programming Languages]: Language Constructs and Features—Classes and objects; Polymorphism; F.3.3 [Logics and Meaning of Programs]: Studies of Program Constructs—Object-oriented constructs

General Terms: Design, Languages, Theory

Additional Key Words and Phrases: Compilation, generic classes, Java, language design, language semantics

This is a revised and extended version of a paper presented in the Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA’99), ACM SIGPLAN Notices volume 34 number 10, pages 132–146, October 1999. This work was done while Igarashi was visiting the University of Pennsylvania as a research fellow of the Japan Society for the Promotion of Science. Pierce was supported by the University of Pennsylvania and the National Science Foundation under grant CCR-9701826, Principled Foundations for Programming with Objects.

Authors’ addresses: A. Igarashi, Department of Graphics and Computer Science, Graduate School of Arts and Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan; email: igarashi@graco.c.u-tokyo.ac.jp; B. C. Pierce, Department of Computer and Information Science, University of Pennsylvania, 200 South 33rd Street, Philadelphia, PA 19104-6389; email: bcpierce@cis.upenn.edu; P. Wadler, 233 Mount Airy Road, Basking Ridge, NJ 07920; email: wadler@avaya.com.

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

© 2001 ACM 0098-3500/01/0500–0396 $5.00

ACM Transactions on Programming Languages and Systems, Vol. 23, No. 3, May 2001, Pages 396–450.

1. INTRODUCTION

“Inside every large language is a small language struggling to get out...”

T. Hoare¹

Formal modeling can offer a significant boost to the design of complex real-world artifacts such as programming languages. A formal model may be used to describe some aspect of a design precisely, to state and prove its properties, and to direct attention to issues that might otherwise be overlooked. In formulating a model, however, there is a tension between completeness and compactness: The more aspects the model addresses at the same time, the more unwieldy it becomes. Often it is sensible to choose a model that is less complete but more compact, offering maximum insight for minimum investment. This strategy may be seen in a flurry of recent papers on the formal properties of Java, which omit advanced features such as concurrency and reflection and concentrate on fragments of the full language to which well-understood theory can be applied.

We propose Featherweight Java, or FJ, as a new contender for a minimal core calculus for modeling Java’s type system. The design of FJ favors compactness over completeness almost obsessively, having just five forms of expression: object creation, method invocation, field access, casting, and variables. Its syntax, typing rules, and operational semantics fit comfortably on a few pages. Indeed, our aim has been to omit as many features as possible (even assignment) while retaining the core features of Java typing. There is a direct correspondence between FJ and a purely functional core of Java, in the sense that every FJ program is literally an executable Java program.

FJ is only a little larger than Church’s lambda calculus [Barendregt 1984] or Abadi and Cardelli’s object calculus [1996], and is significantly smaller than previous formal models of class-based languages like Java, including those put forth by Drossopoulou et al. [1999], Syme [1997], Nipkow and von Oheimb [1998], and Flatt et al. [1998a; 1998b]. Being smaller, FJ lets us focus on just a few key issues. For example, we have discovered that capturing the behavior of Java’s cast construct in a traditional “small-step” operational semantics is trickier than we would have expected, a point that has been overlooked or underemphasized in other models.

¹ We thank Tony Hoare, to whom the first quote below is attributed, for informing us of the second one:

Inside every large program is a small program struggling to get out...
T. Hoare, Efficient Production of Large Programs (1970)

I’m fat, but I’m thin inside. Has it ever struck you that there’s a thin man inside every fat man?
George Orwell, Coming Up For Air (1939)

One use of FJ is as a starting point for modeling languages that extend Java. Because FJ is so compact, we can focus attention on essential aspects of the extension. Moreover, because the proof of soundness for pure FJ is very simple, a rigorous soundness proof for even a significant extension may remain manageable. The second part of the article illustrates this utility by enriching FJ with generic classes and methods à la GJ [Bracha et al. 1998]. The model omits some important aspects of GJ (such as “raw types” and type argument inference for generic method calls). Nonetheless, it led to the discovery and repair of one bug in the GJ compiler and, more importantly, has been a useful tool in clarifying our thought. Because the model is small, it is easy to contemplate further extensions, and we have begun the work of adding raw types to the model; so far, this has revealed at least one corner of the design that was underspecified.

Our main goal in designing FJ was to make a proof of type soundness (“well-typed programs do not get stuck”) as concise as possible, while still capturing the essence of the soundness argument for the full Java language. Any language feature that made the soundness proof longer without making it significantly different was a candidate for omission; we also dropped features that did not appear to interact with polymorphism in significant ways. As in previous studies of type soundness in Java, we do not treat advanced mechanisms such as concurrency, inner classes, and reflection. In addition, the Java features omitted from FJ include assignment, interfaces, overloading, messages to super, null pointers, base types (int, bool, etc.), abstract method declarations, shadowing of superclass fields by subclass fields, access control (public, private, etc.), and exceptions. The features of Java that we do model include mutually recursive class definitions, object creation, field access, method invocation, method override, method recursion through this, subtyping, and casting.

One key simplification in FJ is the omission of assignment. In essence, all fields and method parameters in FJ are implicitly marked final: we assume that an object’s fields are initialized by its constructor and never changed afterward. This restricts FJ to a “functional” fragment of Java, in which many common Java idioms, such as use of enumerations, cannot be represented. Nonetheless, this fragment is computationally complete (it is easy to encode the lambda calculus into it), and is large enough to include many useful programs (many of the programs in Felleisen and Friedman’s Java text [1998] use a purely functional style). Moreover, most of the tricky typing issues in both Java and GJ are independent of assignment. An important exception is that the type inference algorithm for generic method invocation in GJ has some twists imposed on it by the need to maintain soundness in the presence of assignment. This article treats a simplified version of GJ without type inference.
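The remark that the lambda calculus is easy to encode can be made concrete with a small sketch. It assumes the usual closure-as-object encoding, in which each function becomes a class with one apply method and each free variable of a function body becomes a field. The names Fun, Identity, K, KApplied, A, and B are hypothetical, chosen only for this illustration, and the encoding shown is one plausible reading of the remark rather than a construction given in the text.

```java
// Hypothetical illustration: lambda terms as objects, written in FJ style
// (explicit supertype, explicit constructor, explicit receiver, no assignment
// outside constructors).
class A extends Object { A() { super(); } }
class B extends Object { B() { super(); } }

// Base class of all encoded functions. FJ has no abstract methods, so the
// base method needs some body; returning this is an arbitrary placeholder.
class Fun extends Object {
    Fun() { super(); }
    Object apply(Object arg) { return this; }
}

// \x. x
class Identity extends Fun {
    Identity() { super(); }
    Object apply(Object arg) { return arg; }
}

// \x. \y. x  : applying it yields an object that remembers x in a field.
class K extends Fun {
    K() { super(); }
    Object apply(Object x) { return new KApplied(x); }
}

// The inner function \y. x, with the free variable x stored as a field.
class KApplied extends Fun {
    Object x;
    KApplied(Object x) { super(); this.x = x; }
    Object apply(Object y) { return this.x; }
}
```

Under these assumptions, the expression ((Fun) new K().apply(new A())).apply(new B()) yields an A object. The cast is needed because apply is declared to return Object.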

The remainder of this article is organized as follows. Section 2 introduces the main ideas of Featherweight Java, presents its syntax, type rules, and reduction rules, and develops a type soundness proof. Section 3 extends Featherweight Java to Featherweight GJ (FGJ), which includes generic classes and methods. Section 4 presents an erasure map from FGJ to FJ, modeling the techniques used to compile GJ into Java. Section 5 discusses related work, and Section 6 concludes.

2. FEATHERWEIGHT JAVA

In FJ, a program consists of a collection of class definitions plus an expression to be evaluated. (This expression corresponds to the body of the main method in full Java.) Here are some typical class definitions in FJ.

  class A extends Object {
    A() { super(); }
  }

  class B extends Object {
    B() { super(); }
  }

  class Pair extends Object {
    Object fst;
    Object snd;
    Pair(Object fst, Object snd) {
      super(); this.fst=fst; this.snd=snd;
    }
    Pair setfst(Object newfst) {
      return new Pair(newfst, this.snd);
    }
  }

For the sake of syntactic regularity, we always (1) include the supertype (even when it is Object); (2) write out the constructor (even for the trivial classes A and B); and (3) write the receiver for a field access (as in this.snd) or a method invocation, even when the receiver is this. Constructors always take the same stylized form: there is one parameter for each field, with the same name as the field; the super constructor is invoked on the fields of the supertype; and the remaining fields are initialized to the corresponding parameters. In this example the supertype is always Object, which has no fields, so the invocations of super have no arguments. Constructors are the only place where super or = appears in an FJ program. Since FJ provides no side-effecting operations, a method body always consists of return followed by an expression, as in the body of setfst().
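The stylized constructor form is easier to see when the supertype has fields of its own. The following sketch adds a hypothetical subclass Triple (not part of the original example, and with setfst omitted from Pair for brevity) whose constructor passes the inherited fields to super and initializes only its own field.

```java
class A extends Object { A() { super(); } }
class B extends Object { B() { super(); } }

class Pair extends Object {
    Object fst;
    Object snd;
    Pair(Object fst, Object snd) { super(); this.fst = fst; this.snd = snd; }
}

// Hypothetical subclass used only to illustrate the constructor convention.
class Triple extends Pair {
    Object thd;
    Triple(Object fst, Object snd, Object thd) {
        super(fst, snd);   // fields of the supertype go to the super constructor
        this.thd = thd;    // the remaining field is set from its own parameter
    }
    // As with setfst, an "update" builds a fresh object instead of assigning.
    Triple setthd(Object newthd) {
        return new Triple(this.fst, this.snd, newthd);
    }
}
```

Note that the parameter list of Triple lists the inherited fields first and the new field last, matching the order in which the fields are declared along the class hierarchy.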

In the context of the above deﬁnitions, the expression

new Pair(new A(), new B()).setfst(new B())

evaluates to the expression

new Pair(new B(), new B()).
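Because every FJ program is also a Java program, this evaluation can be checked by compiling the classes unchanged. The sketch below wraps them in a small harness; PairDemo and its show method are hypothetical helpers that use Java features outside FJ (static methods, strings, instanceof) purely to print the result in FJ notation.

```java
class A extends Object { A() { super(); } }
class B extends Object { B() { super(); } }

class Pair extends Object {
    Object fst;
    Object snd;
    Pair(Object fst, Object snd) { super(); this.fst = fst; this.snd = snd; }
    Pair setfst(Object newfst) { return new Pair(newfst, this.snd); }
}

// Hypothetical harness, not part of FJ.
class PairDemo {
    // Renders an object built from A, B, and Pair as the FJ value denoting it.
    static String show(Object o) {
        if (o instanceof Pair) {
            Pair p = (Pair) o;
            return "new Pair(" + show(p.fst) + ", " + show(p.snd) + ")";
        }
        return "new " + o.getClass().getSimpleName() + "()";
    }

    public static void main(String[] args) {
        // Prints: new Pair(new B(), new B())
        System.out.println(show(new Pair(new A(), new B()).setfst(new B())));
    }
}
```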

There are five forms of expression in FJ. Here, new A(), new B(), and new Pair(e1, e2) are object constructors, and e3.setfst(e4) is a method invocation. In the body of setfst, the expression this.snd is a field access, and the occurrences of newfst and this are variables. (The syntax of FJ differs from Java in that this is a variable rather than a keyword.) The remaining form of expression is a cast. The expression

((Pair)new Pair(new Pair(new A(), new B()), new A()).fst).snd

evaluates to the expression

new B().

Here, ((Pair)e5), where e5 is new Pair(...).fst, is a cast. The cast is required because e5 is a field access to fst, which is declared to contain an Object, whereas the next field access, to snd, is only valid on a Pair. At run time, it is checked whether the Object stored in the fst field is a Pair (and in this case the check succeeds).
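The run-time check can be observed directly in Java. The sketch below runs the cast from the example and also a variant in which the fst field holds an A, so the check fails; in full Java a failed cast raises ClassCastException. CastDemo and its methods are hypothetical helpers, and the try/catch is outside FJ, which has no exceptions.

```java
class A extends Object { A() { super(); } }
class B extends Object { B() { super(); } }

class Pair extends Object {
    Object fst;
    Object snd;
    Pair(Object fst, Object snd) { super(); this.fst = fst; this.snd = snd; }
}

// Hypothetical harness, not part of FJ.
class CastDemo {
    // The expression from the text: a downcast followed by a field access.
    static Object sndOfFst(Pair p) {
        return ((Pair) p.fst).snd;
    }

    // Reports whether casting p.fst to Pair fails at run time.
    static boolean castFails(Pair p) {
        try {
            Pair q = (Pair) p.fst;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

Note that a direct cast such as (Pair) new A() is rejected by the Java compiler because A and Pair are unrelated classes; the failing case above only compiles because the static type of p.fst is Object.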

In Java, we may prefix a field or parameter declaration with the keyword final to indicate that it may not be assigned to, and all parameters accessed from an inner class must be declared final. Since FJ contains no assignment and no inner classes, it matters little whether or not final appears, so we omit it for brevity.

Dropping side effects has a pleasant side effect: evaluation can be easily formalized entirely within the syntax of FJ, with no additional mechanisms for modeling the heap. Moreover, in the absence of side effects, the order in which expressions are evaluated does not affect the final outcome (modulo nontermination), so we can define the operational semantics of FJ straightforwardly using a nondeterministic small-step reduction relation, following long-standing tradition in the lambda calculus. Of course, Java’s call-by-value evaluation strategy is subsumed by this more general relation, so the soundness properties we prove for reduction will hold for Java’s evaluation strategy as a special case.

There are three basic computation rules: one for field access, one for method invocation, and one for casts. Recall that, in the lambda calculus, the beta-reduction rule for applications assumes that the function is first simplified to a lambda abstraction. Similarly, in FJ the reduction rules assume the object operated upon is first simplified to a new expression. Thus, just as the slogan for the lambda calculus is “everything is a function,” here the slogan is “everything is an object.”

The following example shows the rule for field access in action:

new Pair(new A(), new B()).snd → new B()

Due to the stylized form for object constructors, we know that the constructor has one parameter for each field, in the same order that the fields are declared. Here the fields are fst and snd, and an access to the snd field selects the second parameter.
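The field access rule can be sketched as a function on syntax trees: find the position of the field in the class's declared field list and select the constructor argument at that position. The representation below (Expr, NewExpr, FieldExpr, FieldAccessRule) is a hypothetical one invented for this illustration, and the field table is hard-coded for the running example rather than derived from class declarations.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical mini-AST covering two FJ expression forms.
abstract class Expr { }

// new C(e1, ..., en)
class NewExpr extends Expr {
    final String className;
    final List<Expr> args;
    NewExpr(String className, Expr... args) {
        this.className = className;
        this.args = Arrays.asList(args);
    }
    @Override public String toString() {
        StringBuilder sb = new StringBuilder("new " + className + "(");
        for (int i = 0; i < args.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(args.get(i));
        }
        return sb.append(")").toString();
    }
}

// e.f
class FieldExpr extends Expr {
    final Expr target;
    final String fieldName;
    FieldExpr(Expr target, String fieldName) {
        this.target = target;
        this.fieldName = fieldName;
    }
    @Override public String toString() { return target + "." + fieldName; }
}

class FieldAccessRule {
    // Declared field order per class; an assumption hard-coded for the example.
    static List<String> fields(String className) {
        if (className.equals("Pair")) return Arrays.asList("fst", "snd");
        return Collections.<String>emptyList(); // A, B, and Object have no fields
    }

    // One reduction step: (new C(e1, ..., en)).fi -> ei.
    // Returns null when the target is not a new expression or has no such field.
    static Expr step(FieldExpr e) {
        if (!(e.target instanceof NewExpr)) return null;
        NewExpr obj = (NewExpr) e.target;
        int i = fields(obj.className).indexOf(e.fieldName);
        return i < 0 ? null : obj.args.get(i);
    }
}
```

Applied to the tree for new Pair(new A(), new B()).snd, step returns the tree for new B(), since snd is the second declared field.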

Here is the rule for method invocation in action (/ denotes substitution):

new Pair(new A(), new B()).setfst(new B())
  → [new B()/newfst, new Pair(new A(), new B())/this] new Pair(newfst, this.snd)

i.e., new Pair(new B(), new Pair(new A(), new B()).snd)
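The substitution step can be sketched over a small syntax tree. The representation below (Term, Var, New, Field, Invoke, InvokeRule) is hypothetical, casts are left out, and the body of setfst is hard-coded instead of being looked up in a class table. Since FJ expressions contain no binders, substitution needs no renaming to avoid variable capture.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mini-AST for FJ expressions (casts omitted).
abstract class Term {
    // Simultaneously replaces variables by terms; unmapped variables are kept.
    abstract Term subst(Map<String, Term> env);

    static Term[] substAll(List<Term> ts, Map<String, Term> env) {
        List<Term> out = new ArrayList<Term>();
        for (Term t : ts) out.add(t.subst(env));
        return out.toArray(new Term[0]);
    }

    static String commaSep(List<Term> ts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ts.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(ts.get(i));
        }
        return sb.toString();
    }
}

class Var extends Term {
    final String name;
    Var(String name) { this.name = name; }
    Term subst(Map<String, Term> env) { return env.containsKey(name) ? env.get(name) : this; }
    @Override public String toString() { return name; }
}

class New extends Term {
    final String className;
    final List<Term> args;
    New(String className, Term... args) { this.className = className; this.args = Arrays.asList(args); }
    Term subst(Map<String, Term> env) { return new New(className, substAll(args, env)); }
    @Override public String toString() { return "new " + className + "(" + commaSep(args) + ")"; }
}

class Field extends Term {
    final Term target;
    final String name;
    Field(Term target, String name) { this.target = target; this.name = name; }
    Term subst(Map<String, Term> env) { return new Field(target.subst(env), name); }
    @Override public String toString() { return target + "." + name; }
}

class Invoke extends Term {
    final Term target;
    final String method;
    final List<Term> args;
    Invoke(Term target, String method, Term... args) {
        this.target = target; this.method = method; this.args = Arrays.asList(args);
    }
    Term subst(Map<String, Term> env) { return new Invoke(target.subst(env), method, substAll(args, env)); }
    @Override public String toString() { return target + "." + method + "(" + commaSep(args) + ")"; }
}

class InvokeRule {
    // One step for Pair.setfst only: substitute the argument for newfst and the
    // receiver for this in the method body. Returns null if the rule does not apply.
    static Term step(Invoke e) {
        if (!(e.target instanceof New)) return null;
        New receiver = (New) e.target;
        if (!receiver.className.equals("Pair") || !e.method.equals("setfst") || e.args.size() != 1) return null;
        Term body = new New("Pair", new Var("newfst"), new Field(new Var("this"), "snd"));
        Map<String, Term> env = new HashMap<String, Term>();
        env.put("newfst", e.args.get(0));
        env.put("this", receiver);
        return body.subst(env);
    }
}
```

For the invocation in the example, step produces the tree for new Pair(new B(), new Pair(new A(), new B()).snd), in which the inner field access is still unreduced.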
