Implementation of the typed call-by-value λ-calculus using a stack of regions

Mads Tofte, University of Copenhagen
Jean-Pierre Talpin, European Computer-Industry Research Center
Abstract
We present a translation scheme for the polymorphically typed call-by-value λ-calculus. All runtime values, including function closures, are put into regions. The store consists of a stack of regions. Region inference and effect inference are used to infer where regions can be allocated and de-allocated. Recursive functions are handled using a limited form of polymorphic recursion. The translation is proved correct with respect to a store semantics, which models a region-based run-time system. Experimental results suggest that regions tend to be small, that region allocation is frequent and that overall memory demands are usually modest, even without garbage collection.
1 Introduction
The stack allocation scheme for block-structured languages[9] often gives economical use of memory resources. Part of the reason for this is that the stack discipline is eager to reuse dead memory locations (i.e. locations whose contents are of no consequence to the rest of the computation). Every point of allocation is matched by a point of de-allocation and these points can easily be identified in the source program.
In heap-based storage management schemes[4,19,18], allocation is separate from de-allocation, the latter being handled by garbage collection. This separation is useful when the lifetime of values is not apparent from the source program. Heap-based schemes are less eager to reuse memory. Generational garbage collection collects young objects when the allocation space is used up. Hayes[11] discusses how to reclaim large, old objects.

Postal address: Department of Computer Science (DIKU), University of Copenhagen, Universitetsparken 1, DK-2100 Copenhagen Ø, Denmark; email: tofte@diku.dk.
Work done while at Ecole des Mines de Paris. Current address: European Computer-Industry Research Center (ECRC GmbH), Arabella Straße 17, D-81925 München; email: jp@ecrc.de
Copyright 1994 ACM. Appeared in the Proceedings of the 21st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 1994, pp. 188–201. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

Garbage collection can be very fast. Indeed, there is a much quoted argument that the amortized cost of copying garbage collection tends to zero, as memory tends to infinity[2, page 206]. Novice functional programmers often report that on their machines, memory is a constant, not a variable, and that this constant has to be uncomfortably large for their programs to run well. The practical ambition of our work is to reduce the required size of this constant significantly. We shall present measurements that indicate that our translation scheme holds some promise in this respect.
In this paper, we propose a translation scheme for Milner's call-by-value λ-calculus with recursive functions and polymorphic let[22,7]. The key features of our scheme are:
1. It determines lexically scoped lifetimes for all runtime values, including function closures, base values and records;
2. It is provably safe;
3. It is able to distinguish the lifetimes of different invocations of the same recursive function.
This last feature is essential for obtaining good memory usage (see Section 5).
Our model of the runtime system involves a stack of regions, see Figure 1. We do not expect always to be able to determine the size of a region when we allocate it. Part of the reason for this is that we consider recursive datatypes, such as lists, a must; the size of a region which is supposed to hold the spine of a list, say, cannot in general be determined when the region is allocated. Therefore, not all regions can be allocated on a hardware stack, although regions of known size can.
Our allocation scheme differs from the classical stack allocation scheme in that it admits functions as first-class values and is intended to work for recursive datatypes. (So far, the only recursive datatype we have dealt with is lists.)
[Figure 1 image: a row of boxes labelled r0, r1, r2, r3, ...]
Figure 1: The store is a stack of regions; a region is a box in the picture.
Ruggieri and Murtagh[28] propose a stack of regions in conjunction with a traditional heap. Each region is associated with an activation record (this is not necessarily the case in our scheme). They use a combination of interprocedural and intraprocedural data-flow analysis to find suitable regions to put values in. We use a type-inference based analysis. They consider updates, which we do not. However, we deal with polymorphism and higher-order functions, which they do not.
Inoue et al.[15] present an interesting technique for compile-time analysis of runtime garbage cells in lists. Their method inserts pairs of HOLD and $\mathrm{RECLAIM}_\eta$ instructions in the target language. HOLD holds on to a pointer, $p$ say, to the root cell of its argument and $\mathrm{RECLAIM}_\eta$ collects those cells that are reachable from $p$ and fit the path description $\eta$. HOLD and RECLAIM pairs are nested, so the HOLD pointers can be held in a stack, not entirely unlike our stack of regions. In our scheme, however, the unit of collection is one entire region, i.e., there is no traversal of values in connection with region collection. The path descriptions of Inoue et al. make it possible to distinguish between the individual members of a list. This is not possible in our scheme, as we treat all the elements of the same list as equal. Inoue et al. report a 100% reclamation rate for garbage list cells produced by Quicksort[15, page 575]. We obtain a 100% reclamation rate (but for 1 word) for all garbage produced by Quicksort, without garbage collection (see Section 5).
Hudak[13] describes a reference counting scheme for a first-order call-by-value functional language. Reference counting may give more precise use information than our scheme, as we only distinguish between “no use” and “perhaps some use.”
George[10] describes an implementation scheme for typed lambda expressions in so-called simple form together with a transformation of expressions into simple form. The transformation can result in an increase in the number of evaluation steps by an arbitrarily large factor[10, page 618]. George also presents an implementation scheme which does not involve translation, although this relies on not using call-by-value reduction, when actual parameters are functions.
We translate every well-typed source language expression, $e$, into a target language expression, $e'$, which is identical with $e$, except for certain region annotations. The evaluation of $e'$ corresponds, step for step, to the evaluation of $e$. Two forms of annotations are

$$e_1\ \mathtt{at}\ \rho$$
$$\mathtt{letregion}\ \rho\ \mathtt{in}\ e_2\ \mathtt{end}$$

The first form is used whenever $e_1$ is an expression which directly produces a value. (Constant expressions, λ-abstractions and tuple expressions fall into this category.) The $\rho$ is a region variable; it indicates that the value of $e_1$ is to be put in the region bound to $\rho$.
The second form introduces a region variable $\rho$ with local scope $e_2$. At runtime, first an unused region, $r$, is allocated and bound to $\rho$. Then $e_2$ is evaluated (probably using $r$). Finally, $r$ is de-allocated. The letregion expression is the only way of introducing and eliminating regions. Hence regions are allocated and de-allocated in a stack-like manner.
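
The stack-like behaviour of letregion can be illustrated with a minimal Python sketch. It is not the runtime system of the paper: the class `RegionStack`, its methods `letregion`, `put` and `get`, and the use of list positions as region names are all assumptions made for illustration.

```python
from contextlib import contextmanager


class RegionStack:
    """Hypothetical store: a stack of regions, each region a growable list."""

    def __init__(self):
        self.regions = []      # position in the stack serves as the region name
        self.max_depth = 0     # deepest the stack has been so far

    @contextmanager
    def letregion(self):
        # Allocate an unused region and bind it for the duration of the body.
        self.regions.append([])
        self.max_depth = max(self.max_depth, len(self.regions))
        try:
            yield len(self.regions) - 1
        finally:
            # De-allocate the whole region at once, without traversing it.
            self.regions.pop()

    def put(self, value, r):
        """Model 'value at r': store value in region r, return its address."""
        self.regions[r].append(value)
        return (r, len(self.regions[r]) - 1)

    def get(self, addr):
        r, o = addr
        return self.regions[r][o]


store = RegionStack()
with store.letregion() as r_out:
    with store.letregion() as r_tmp:
        one = store.put(1, r_tmp)
        five = store.put(5, r_tmp)
        result = store.put(store.get(five) + store.get(one), r_out)
    depth_after_inner = len(store.regions)   # r_tmp is gone, r_out remains
    value = store.get(result)
```

Because the `with` blocks nest, the most recently allocated region is always the first to be popped, which is the discipline the text describes.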
The device we use for grouping values according to regions is unification of region variables, using essentially the idea of Baker[3], namely that two value-producing expressions $e_1$ and $e_2$ should be given the same “at $\rho$” annotation, if and only if type checking, directly or indirectly, unifies the type of $e_1$ and $e_2$. Baker does not prove safety, however, nor does he deal with polymorphism.
To obtain good separation of lifetimes, we introduce explicit region polymorphism, by which we mean that regions can be given as arguments to functions at runtime. For example, the successor function succ = λx.x + 1 is compiled into

$$\Lambda[\rho,\rho'].\,\lambda x.\,\mathtt{letregion}\ \rho''\ \mathtt{in}\ (x + (1\ \mathtt{at}\ \rho''))\ \mathtt{at}\ \rho'\ \mathtt{end}$$

which has the type scheme

$$\forall\rho,\rho'.\,(\mathtt{int},\rho) \xrightarrow{\{\mathbf{get}(\rho),\,\mathbf{put}(\rho')\}} (\mathtt{int},\rho')$$

meaning that, for any $\rho$ and $\rho'$, the function accepts an integer at $\rho$ and produces an integer at $\rho'$ (performing a get operation on region $\rho$ and a put operation on region $\rho'$ in the process). Now succ will put its result in different regions, depending on the context:

$$\cdots\ \mathit{succ}[\rho_{12},\rho_9](5\ \mathtt{at}\ \rho_{12})\ \cdots\ \mathit{succ}[\rho_1,\rho_4](x)$$
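
As a rough illustration of region polymorphism, the following Python sketch passes regions to succ as ordinary arguments. The `Store` class and its method names are hypothetical, and the sketch ignores where the closure for succ itself is stored; it only shows that one function body can read from one region and write to another chosen by the caller.

```python
class Store:
    """Hypothetical store mapping region names to lists of values."""

    def __init__(self):
        self.regions = {}
        self.counter = 0

    def new_region(self):
        self.counter += 1
        name = f"r{self.counter}"
        self.regions[name] = []
        return name

    def drop_region(self, name):
        del self.regions[name]

    def put(self, name, value):
        self.regions[name].append(value)
        return (name, len(self.regions[name]) - 1)

    def get(self, addr):
        name, off = addr
        return self.regions[name][off]


def succ(store, rho, rho_prime):
    """Region application succ[rho, rho_prime]; returns the function of x."""
    def body(x):
        assert x[0] == rho                  # the argument lives in rho
        rho2 = store.new_region()           # letregion rho''
        one = store.put(rho2, 1)            # 1 at rho''
        result = store.put(rho_prime, store.get(x) + store.get(one))
        store.drop_region(rho2)             # end of letregion
        return result
    return body


store = Store()
r_a, r_b, r_c = store.new_region(), store.new_region(), store.new_region()
five = store.put(r_a, 5)
six = succ(store, r_a, r_b)(five)      # result is placed in r_b
seven = succ(store, r_b, r_c)(six)     # same function, result placed in r_c
```

The temporary region holding the constant 1 is local to each call and is gone by the time the call returns.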
Moreover, we make the special provision that a recursive function, $f$, can call itself with region arguments which are different from its formal region parameters and which may well be local to the body of the recursive function. Such local regions resemble the activation records of the classical stack discipline.
We use effect inference[20,21,14] to find out where to wrap letregion $\rho$ in . . . end around an expression. Most work on effect inference uses the word “effect” as a short-hand for “side-effect”. We have no side-effects in our source language; our effects are side-effects relative to an underlying region-based store model.
The idea that effect inference makes it possible to delimit regions of memory and delimit their lifetimes goes back to early work on effect systems[6]. Lucassen and Gifford[21] call it effect masking; they prove that effect masking is sound with respect to a store semantics where regions are not reused. Talpin[29] and Talpin and Jouvelot[30] present a polymorphic effect system with effect masking and prove that it is sound, with respect to a store semantics where regions are not reused.
We have found the notion of memory reuse surprisingly subtle, due to, among other things, pointers into de-allocated regions. Since memory reuse is at the heart of our translation scheme, we prove that our translation rules are sound with respect to a region-based operational semantics, where regions explicitly are allocated and de-allocated. This is the main technical contribution of this paper.
The rest of this paper is organised as follows. The source and target languages are presented in Section 2. The translation scheme is presented in Section 3. The correctness proof is in Section 4. In Section 5 we discuss strengths and weaknesses of the translation and give experimental results.
Due to space limitations, most proofs have been omitted. Detailed proofs (and an inference algorithm) are available in a technical report[31].
2 Source and target languages
2.1 Source language
We assume a denumerably infinite set Var of variables. Each variable is either an ordinary variable, $x$, or a letrec variable, $f$. The grammar for the source language is

$$e ::= x \mid f \mid \lambda x.e \mid e_1\,e_2 \mid \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2\ \mathtt{end} \mid \mathtt{letrec}\ f(x) = e_1\ \mathtt{in}\ e_2\ \mathtt{end}$$

(Footnote 1: For brevity, we omit pairs and projections from this paper. They are treated in [31].)

A finite map is a map with finite domain. The domain and range of a finite map $f$ are denoted $\mathrm{Dom}(f)$ and $\mathrm{Rng}(f)$, respectively. When $f$ and $g$ are finite maps, $f + g$ is the finite map whose domain is $\mathrm{Dom}(f) \cup \mathrm{Dom}(g)$ and whose value is $g(x)$, if $x \in \mathrm{Dom}(g)$, and $f(x)$ otherwise. $f \downarrow A$ means the restriction of $f$ to $A$.
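
A small Python sketch may help fix the two operations on finite maps. Representing finite maps as dictionaries, and the function names `plus` and `restrict`, are assumptions made here for illustration.

```python
def plus(f, g):
    """f + g: the domain is Dom(f) union Dom(g); g wins where both are defined."""
    return {**f, **g}


def restrict(f, a):
    """Restriction of f to the set a."""
    return {x: v for x, v in f.items() if x in a}


f = {"x": 1, "y": 2}
g = {"y": 20, "z": 30}
h = plus(f, g)
h_restricted = restrict(h, {"x", "z", "w"})
```

Note that `plus` is not symmetric: the right-hand map overrides the left-hand one, which is what the evaluation rules rely on when a binding is updated.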
A (non-recursive) closure is a triple $\langle x, e, E\rangle$, where $E$ is an environment, i.e. a finite map from variables to values. A (recursive) closure takes the form $\langle x, e, E, f\rangle$ where $f$ is the name of the function in question. A value is either an integer or a closure. Evaluation rules appear below.
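Source Expressions    $E \vdash e \rightarrow v$

$$\frac{E(x) = v}{E \vdash x \rightarrow v} \qquad \frac{E(f) = v}{E \vdash f \rightarrow v} \tag{1}$$

$$E \vdash \lambda x.e \rightarrow \langle x, e, E\rangle \tag{2}$$

$$\frac{E \vdash e_1 \rightarrow \langle x_0, e_0, E_0\rangle \quad E \vdash e_2 \rightarrow v_2 \quad E_0 + \{x_0 \mapsto v_2\} \vdash e_0 \rightarrow v}{E \vdash e_1\,e_2 \rightarrow v} \tag{3}$$

$$\frac{\begin{array}{c} E \vdash e_1 \rightarrow \langle x_0, e_0, E_0, f\rangle \quad E \vdash e_2 \rightarrow v_2 \\ E_0 + \{f \mapsto \langle x_0, e_0, E_0, f\rangle\} + \{x_0 \mapsto v_2\} \vdash e_0 \rightarrow v \end{array}}{E \vdash e_1\,e_2 \rightarrow v} \tag{4}$$

$$\frac{E \vdash e_1 \rightarrow v_1 \quad E + \{x \mapsto v_1\} \vdash e_2 \rightarrow v}{E \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2\ \mathtt{end} \rightarrow v} \tag{5}$$

$$\frac{E + \{f \mapsto \langle x, e_1, E, f\rangle\} \vdash e_2 \rightarrow v}{E \vdash \mathtt{letrec}\ f(x) = e_1\ \mathtt{in}\ e_2\ \mathtt{end} \rightarrow v} \tag{6}$$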
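
Rules (1) to (6) can be read as a recursive evaluator. The following Python sketch is one way to transcribe them; the class names (`Var`, `Lam`, `App`, `Let`, `LetRec`, `Closure`) and the function `evaluate` are hypothetical, and environments are represented as dictionaries. Since the grammar has no integer constants, integers only enter through the initial environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object


@dataclass(frozen=True)
class LetRec:
    fname: str
    param: str
    fbody: object
    body: object


@dataclass(frozen=True)
class Closure:
    param: str
    body: object
    env: dict
    fname: Optional[str] = None   # set only for recursive closures


def evaluate(env, e):
    if isinstance(e, Var):                    # rule (1)
        return env[e.name]
    if isinstance(e, Lam):                    # rule (2)
        return Closure(e.param, e.body, env)
    if isinstance(e, App):                    # rules (3) and (4)
        clo = evaluate(env, e.fn)
        arg = evaluate(env, e.arg)
        env0 = dict(clo.env)
        if clo.fname is not None:             # rule (4): rebind f to the closure
            env0[clo.fname] = clo
        env0[clo.param] = arg
        return evaluate(env0, clo.body)
    if isinstance(e, Let):                    # rule (5)
        v1 = evaluate(env, e.bound)
        return evaluate({**env, e.name: v1}, e.body)
    if isinstance(e, LetRec):                 # rule (6)
        clo = Closure(e.param, e.fbody, env, e.fname)
        return evaluate({**env, e.fname: clo}, e.body)
    raise TypeError(f"not a source expression: {e!r}")


# let id = (lambda y. y) in id n end, evaluated with n bound to 5
prog = Let("id", Lam("y", Var("y")), App(Var("id"), Var("n")))
answer = evaluate({"n": 5}, prog)

# letrec f(x) = f in f n end: the body of f can see f itself
rec = LetRec("f", "x", Var("f"), App(Var("f"), Var("n")))
rec_answer = evaluate({"n": 5}, rec)
```

The second program only succeeds because rule (4) re-inserts the binding for `f` before the function body is evaluated.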
2.2 Target language
Let ρ range over a denumerably infinite set RegVa r of
region varia bles. Let r range over a denumera bly in-
finite set RegName = {r1, r2, . . .} of region names.
Region names serve to identify regions uniquely at
runtime, i.e. a store, s, is a finite map from reg ion
names to regions. Let p range over the set of pl a c es, a
place being either a region variable or a region name.
The grammar for the target language is:
p ::= ρ | r
e ::= x | λx.e at p | e
1
e
2
| let x = e
1
in e
2
end
| letrec f[$ρ ] (x) at p = e
1
in e
2
end
| f [$p ] at p
| letregion ρ in e end
190

where $ρ ranges over finite sequences ρ
1
,···,ρ
k
of region
variables and $p ran g es over finite sequences p
1
,···,p
k
of places (k 0). We write |$p | for the length of a se-
quence $p. For any finite set {ρ
1
, . . . , ρ
k
} of region vari-
ables (k 0), we write letregio n $ρ in e end for
letregion ρ
1
in ··· letregion ρ
k
in e end ···
end
A region is a finite map from offsets, $o$, to storable values. A storable value, $sv$, is either an integer or a closure. A (plain) closure is a triple $\langle x, e, VE\rangle$, where $e$ is a target expression and $VE$ is a variable environment, i.e. a finite map from variables to addresses. A region closure takes the form $\langle\vec{\rho}, x, e, VE\rangle$ where $\vec{\rho}$ is a (possibly empty) sequence $\rho_1,\cdots,\rho_k$ of distinct region variables, called the formal region parameters of the closure. Region closures represent region-polymorphic functions. For any sequence $\vec{p} = p_1,\cdots,p_k$, the simultaneous substitution of $p_i$ for free occurrences of $\rho_i$ in $e$ ($i = 1 \ldots k$), is written $e[\vec{p}/\vec{\rho}\,]$.
For simplicity, we assume that all values are boxed. Hence a value $v$ is an address $a = (r, o)$, where $r$ is a region name and $o$ is an offset.
A region-based operational semantics appears below. We are brief about indirect addressing. Thus, whenever $a$ is an address $(r, o)$, we write $s(a)$ to mean $s(r)(o)$ and we write $a \in \mathrm{Dom}(s)$ as a shorthand for $r \in \mathrm{Dom}(s)$ and $o \in \mathrm{Dom}(s(r))$. Similarly, when $s$ is a store and $sv$ is a storable value, we write $s + \{(r, o) \mapsto sv\}$ as a shorthand for $s + \{r \mapsto (s(r) + \{o \mapsto sv\})\}$. We express the popping of an entire region $r$ from a store $s$ by writing “$s \mathbin{\backslash\!\backslash} \{r\}$”, which formally means the restriction of $s$ to $\mathrm{Dom}(s) \setminus \{r\}$.
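Target Expressions    $s, VE \vdash e \rightarrow v, s'$

$$\frac{VE(x) = v}{s, VE \vdash x \rightarrow v, s} \tag{7}$$

$$\frac{\begin{array}{c} VE(f) = a, \quad s(a) = \langle\vec{\rho}, x, e, VE_0\rangle \quad |\vec{\rho}\,| = |\vec{p}\,| \\ o \notin \mathrm{Dom}(s(r)) \quad sv = \langle x, e[\vec{p}/\vec{\rho}\,], VE_0\rangle \end{array}}{s, VE \vdash f[\vec{p}\,]\ \mathtt{at}\ r \rightarrow (r, o),\ s + \{(r, o) \mapsto sv\}} \tag{8}$$

$$\frac{o \notin \mathrm{Dom}(s(r)) \quad a = (r, o)}{s, VE \vdash \lambda x.e\ \mathtt{at}\ r \rightarrow a,\ s + \{a \mapsto \langle x, e, VE\rangle\}} \tag{9}$$

$$\frac{\begin{array}{c} s, VE \vdash e_1 \rightarrow a_1, s_1 \quad s_1(a_1) = \langle x_0, e_0, VE_0\rangle \\ s_1, VE \vdash e_2 \rightarrow v_2, s_2 \quad s_2, VE_0 + \{x_0 \mapsto v_2\} \vdash e_0 \rightarrow v, s' \end{array}}{s, VE \vdash e_1\,e_2 \rightarrow v, s'} \tag{10}$$

$$\frac{s, VE \vdash e_1 \rightarrow v_1, s_1 \quad s_1, VE + \{x \mapsto v_1\} \vdash e_2 \rightarrow v, s'}{s, VE \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2\ \mathtt{end} \rightarrow v, s'} \tag{11}$$

$$\frac{\begin{array}{c} o \notin \mathrm{Dom}(s(r)) \quad VE' = VE + \{f \mapsto (r, o)\} \\ s + \{(r, o) \mapsto \langle\vec{\rho}, x, e_1, VE'\rangle\},\ VE' \vdash e_2 \rightarrow v, s' \end{array}}{s, VE \vdash \mathtt{letrec}\ f[\vec{\rho}\,](x)\ \mathtt{at}\ r = e_1\ \mathtt{in}\ e_2\ \mathtt{end} \rightarrow v, s'} \tag{12}$$

$$\frac{r \notin \mathrm{Dom}(s) \quad s + \{r \mapsto \{\}\},\ VE \vdash e[r/\rho] \rightarrow v, s_1}{s, VE \vdash \mathtt{letregion}\ \rho\ \mathtt{in}\ e\ \mathtt{end} \rightarrow v,\ s_1 \mathbin{\backslash\!\backslash} \{r\}} \tag{13}$$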
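
Rules (7) to (13) can likewise be sketched as a small Python evaluator. Several choices here are assumptions and not part of the paper: expressions are encoded as tagged tuples, the store is a mutable dictionary updated in place, a region environment `R` stands in for the syntactic substitutions $e[\vec{p}/\vec{\rho}\,]$ and $e[r/\rho]$, and an extra form `("int", n, p)` models an integer constant "n at p", which the grammar above omits.

```python
from itertools import count


def place(R, p):
    """Resolve a place: region variables are looked up in R, names are kept."""
    return R.get(p, p)


def alloc(s, r, sv):
    """Store sv at an unused offset of region r and return the address."""
    o = len(s[r])
    s[r][o] = sv
    return (r, o)


def ev(s, VE, R, e):
    tag = e[0]
    if tag == "var":                                   # rule (7)
        return VE[e[1]]
    if tag == "int":                                   # assumed extension
        _, n, p = e
        return alloc(s, place(R, p), n)
    if tag == "lam":                                   # rule (9)
        _, x, body, p = e
        return alloc(s, place(R, p), ("clos", x, body, VE, R))
    if tag == "rapp":                                  # rule (8)
        _, f, ps, p = e
        r0, o0 = VE[f]
        _, rhos, x, body, VE0, R0 = s[r0][o0]
        assert len(rhos) == len(ps)
        R1 = {**R0, **{rho: place(R, q) for rho, q in zip(rhos, ps)}}
        return alloc(s, place(R, p), ("clos", x, body, VE0, R1))
    if tag == "app":                                   # rule (10)
        _, e1, e2 = e
        r1, o1 = ev(s, VE, R, e1)
        _, x0, e0, VE0, R0 = s[r1][o1]
        v2 = ev(s, VE, R, e2)
        return ev(s, {**VE0, x0: v2}, R0, e0)
    if tag == "let":                                   # rule (11)
        _, x, e1, e2 = e
        v1 = ev(s, VE, R, e1)
        return ev(s, {**VE, x: v1}, R, e2)
    if tag == "letrec":                                # rule (12)
        _, f, rhos, x, p, e1, e2 = e
        r = place(R, p)
        a = (r, len(s[r]))
        VE1 = {**VE, f: a}
        s[r][a[1]] = ("rclos", rhos, x, e1, VE1, R)
        return ev(s, VE1, R, e2)
    if tag == "letregion":                             # rule (13)
        _, rho, body = e
        r = next(f"r{i}" for i in count(1) if f"r{i}" not in s)
        s[r] = {}
        v = ev(s, VE, {**R, rho: r}, body)
        del s[r]                                       # pop the entire region
        return v
    raise ValueError(tag)


# letregion rho in (lambda y. y at rho) (5 at r1) end
s1 = {"r1": {}}
e1 = ("letregion", "rho",
      ("app", ("lam", "y", ("var", "y"), "rho"), ("int", 5, "r1")))
v1 = ev(s1, {}, {}, e1)

# letrec f[rho](x) at r1 = (7 at rho) in (f[r2] at r1) (0 at r1) end
s2 = {"r1": {}, "r2": {}}
e2 = ("letrec", "f", ("rho",), "x", "r1", ("int", 7, "rho"),
      ("app", ("rapp", "f", ("r2",), "r1"), ("int", 0, "r1")))
v2 = ev(s2, {}, {}, e2)
```

In the first program the closure lives in a temporary region that is popped before the result is returned; in the second, the region argument `r2` decides where the result of the region-polymorphic function is stored.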
For arbitrary finite maps $f_1$ and $f_2$, we say that $f_2$ extends $f_1$, written $f_1 \subseteq f_2$, if $\mathrm{Dom}(f_1) \subseteq \mathrm{Dom}(f_2)$ and for all $x \in \mathrm{Dom}(f_1)$, $f_1(x) = f_2(x)$. We then say that $s_2$ succeeds $s_1$, written $s_2 \sqsupseteq s_1$ (or $s_1 \sqsubseteq s_2$), if $\mathrm{Dom}(s_1) \subseteq \mathrm{Dom}(s_2)$ and $s_1(r) \subseteq s_2(r)$, for all $r \in \mathrm{Dom}(s_1)$.
Lemma 2.1 If $s, VE \vdash e \rightarrow v, s'$ then $\mathrm{Dom}(s) = \mathrm{Dom}(s')$ and $s \sqsubseteq s'$.
The proof is a straightforward induction on the depth of inference of $s, VE \vdash e \rightarrow v, s'$.
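
The two relations used in Lemma 2.1 can be checked mechanically on concrete stores. The sketch below assumes stores are dictionaries from region names to dictionaries from offsets to values; the function names `extends` and `succeeds` and the sample stores are made up for illustration.

```python
def extends(f1, f2):
    """True if f2 extends f1: f2 agrees with f1 on all of Dom(f1)."""
    return all(x in f2 and f2[x] == f1[x] for x in f1)


def succeeds(s2, s1):
    """True if s2 succeeds s1: each region of s1 is in s2 and has only grown."""
    return all(r in s2 and extends(s1[r], s2[r]) for r in s1)


s_before = {"r1": {0: 5}, "r2": {}}
s_after = {"r1": {0: 5, 1: 6}, "r2": {0: 7}}

same_domain = set(s_before) == set(s_after)
grew = succeeds(s_after, s_before)
shrank = succeeds(s_before, s_after)
```

Here `s_after` has the same regions as `s_before` and each region has only gained entries, which is the shape of store change the lemma permits.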
Example The source expression

$$\mathtt{let}\ x = (2,3)\ \mathtt{in}\ \lambda y.(\mathtt{fst}\ x, y)\ \mathtt{end}\ 5$$

translates into

$$\begin{array}{l}
e' \equiv \mathtt{letregion}\ \rho_4, \rho_5 \\
\qquad \mathtt{in}\ \mathtt{letregion}\ \rho_6 \\
\qquad\quad \mathtt{in}\ \mathtt{let}\ x = (2\ \mathtt{at}\ \rho_2,\ 3\ \mathtt{at}\ \rho_6)\ \mathtt{at}\ \rho_4 \\
\qquad\qquad \mathtt{in}\ (\lambda y.(\mathtt{fst}\ x, y)\ \mathtt{at}\ \rho_1)\ \mathtt{at}\ \rho_5 \\
\qquad\qquad \mathtt{end} \\
\qquad\quad \mathtt{end} \\
\qquad\quad 5\ \mathtt{at}\ \rho_3 \\
\qquad \mathtt{end}
\end{array}$$

Notice that $\rho_1$, $\rho_2$ and $\rho_3$ occur free in this expression. That is because they will hold the final result (a fact which the translation infers). To start the evaluation of $e'$, we first allocate three regions, say r1, r2 and r3. Then we substitute r$i$ for $\rho_i$ in $e'$ ($i = 1..3$). Figure 2 shows three snapshots from the evaluation that follows, namely (a) just after the closure has been allocated; (b) just before the closure is applied and (c) at the end. The maximal depth of the region stack is 6 regions and the final depth is 3 regions. Notice the dangling, but harmless, pointer at (b).
[Figure 2 panels: (a) regions r1 to r6, with the closure $\langle y, (\mathtt{fst}\ x, y)\ \mathtt{at}\ r_1, \{x \mapsto \bullet\}\rangle$ allocated; (b) regions r1 to r5, with the same closure; (c) regions r1 to r3, holding the final pair of 2 and 5.]
Figure 2: Three snapshots of an evaluation
3 The Translation
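Let $\alpha$ and $\epsilon$ range over denumerably infinite sets of type variables and effect variables, respectively. We assume that the sets of type variables, effect variables, region variables and region names are all pairwise disjoint. An effect, $\varphi$, is a finite set of atomic effects. An atomic effect is either a token of the form $\mathbf{get}(\rho)$ or $\mathbf{put}(\rho)$, or it is an effect variable. Types, $\tau$, decorated types, $\mu$, simple type schemes, $\sigma$, and compound type schemes, $\pi$, take the form:

$$\begin{array}{rcl}
\tau &::=& \mathtt{int} \mid \mu \xrightarrow{\epsilon.\varphi} \mu \mid \alpha \\
\mu &::=& (\tau, p) \\
\sigma &::=& \tau \mid \forall\alpha.\sigma \mid \forall\epsilon.\sigma \\
\pi &::=& \underline{\tau} \mid \forall\alpha.\pi \mid \forall\epsilon.\pi \mid \forall\rho.\pi
\end{array}$$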
The reason for the apparent redundancy between the productions for simple and compound type schemes is this. In the target language there is a distinction between plain functions and region-polymorphic functions. The former are represented by plain closures, the latter by region closures that must be applied to zero or more places in order to yield a plain closure. Thus plain functions and region-polymorphic functions have different effects (even in the case of region-polymorphic functions that are applied to zero places). We choose to represent this distinction in the type system by a complete separation of simple and compound type schemes. Only region-polymorphic functions have compound type schemes. (The underlining in the production $\pi ::= \underline{\tau}$ makes it possible always to tell which kind a type scheme is.) When a type $\tau$ is regarded as a type scheme, it is always regarded as a simple type scheme.
An object of the form $\epsilon.\varphi$ (formally a pair $(\epsilon, \varphi)$) is called an arrow effect. Here $\varphi$ is the effect of calling the function. The “$\epsilon.$” is useful for type-checking purposes, as explained in more detail in Appendix A. A type environment, $TE$, is a finite map from ordinary variables to pairs of the form $(\sigma, p)$ and from letrec variables to pairs of the form $(\pi, p)$.
A substitution $S$ is a triple $(S_r, S_t, S_e)$, where $S_r$ is a finite map from region variables to places, $S_t$ is a finite map from type variables to types and $S_e$ is a finite map from effect variables to arrow effects. Its effect is to carry out the three substitutions simultaneously on the three kinds of variables.