Formal verification of a realistic compiler
Summary (5 min read)
1. Introduction
- Compilers are generally assumed to be semantically transparent: the compiled code should behave as prescribed by the semantics of the source program.
- For low-assurance software, validated only by testing, the impact of compiler bugs is low: what is tested is the executable code produced by the compiler; rigorous testing should expose compiler-introduced errors along with errors already present in the source program.
- CompCert compiles a language commonly used for critical embedded software: neither Java nor ML nor assembly code, but a large subset of the C language.
- Section 3 describes the structure of the CompCert compiler, its performance, and how the Coq proof assistant was used not only to prove its correctness but also to program most of it.
2. Approaches to trusted compilation
2.1 Notions of semantic preservation
- The authors' aim is to prove that the semantics of S was preserved during compilation.
- In all cases, behaviors also include a trace of the input-output operations (system calls) performed during the execution of the program.
- If the source language is not deterministic, compilers are allowed to select one of the possible behaviors of the source program.
- Under these conditions, there exists exactly one behavior B such that S ⇓ B, and similarly for C.
- Having proved properties (2) or (3) provides the same guarantee without having to equip the target and intermediate languages with sound type systems and to prove type preservation for the compiler.
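The notions above can be sketched in a few lines of Python. The names (`Behavior`, `preserves`) are illustrative encodings, not CompCert's; the point is property (3): if the source program does not go wrong, the compiled code must exhibit the same behavior, including its I/O trace.

```python
# Toy model of CompCert-style behaviors and semantic preservation.
# (Hypothetical names; a sketch of property (3), not the Coq development.)
from dataclasses import dataclass

@dataclass(frozen=True)
class Behavior:
    trace: tuple          # I/O events (system calls) observed during execution
    outcome: str          # "terminates", "diverges", or "wrong"

def preserves(source_behavior, compiled_behavior):
    """Property (3): for any source behavior not in Wrong,
    the compiled code must exhibit that same (unique) behavior."""
    if source_behavior.outcome == "wrong":
        return True       # the compiler may do anything for wrong programs
    return compiled_behavior == source_behavior

ok = Behavior(trace=("print 42",), outcome="terminates")
assert preserves(ok, ok)
assert preserves(Behavior((), "wrong"), Behavior(("anything",), "terminates"))
```

Note how determinism matters: because S has exactly one behavior, "preserving the behavior" is a meaningful equality rather than a set inclusion.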
2.2 Verified, validated, certifying compilers
- The authors now discuss several approaches to establishing that a compiler preserves the semantics of the compiled programs, in the sense of section 2.1.
- The authors model the compiler as a total function Comp from source programs to either compiled code (written Comp(S) = OK(C)) or a compile-time error (written Comp(S) = Error).
- Notice that a compiler that always fails (Comp(S) = Error for all S) is indeed verified, although useless.
- Validation can be performed in several ways, ranging from symbolic interpretation and static analysis of S and C to the generation of verification conditions followed by model checking or automatic theorem proving.
- Translation validation generates additional confidence in the correctness of the compiled code, but by itself does not provide formal guarantees as strong as those provided by a verified compiler: the validator could itself be incorrect.
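The structure of translation validation can be sketched as follows: an untrusted transformation is paired with a validator, and the combined compiler reports a compile-time error whenever validation fails, so the formal guarantee rests only on the validator. This is a hypothetical illustration, not CompCert's actual code.

```python
# Sketch of a translation-validation wrapper (hypothetical names).
# transform() is untrusted; validate() is the (ideally verified) checker.
def comp_validated(source, transform, validate):
    compiled = transform(source)          # untrusted pass, may be buggy
    if validate(source, compiled):        # a-posteriori check of this run
        return ("OK", compiled)
    return ("Error", None)                # fail safely rather than miscompile

# Toy pass: "compile" a list of numbers by doubling each one.
transform = lambda xs: [2 * x for x in xs]
validate = lambda s, c: c == [2 * x for x in s]   # independently re-checks
assert comp_validated([1, 2, 3], transform, validate) == ("OK", [2, 4, 6])
```

If the validator itself is proved correct (as in CompCert's verified-validator passes), the scheme gives the same end-to-end guarantee as verifying the transformation directly.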
Proof-carrying code and certifying compilers
- The proof-carrying code (PCC) approach [19, 1] does not attempt to establish semantic preservation between a source program and some compiled code.
- Instead, PCC focuses on the generation of independently-checkable evidence that the compiled code C satisfies a behavioral specification Spec such as type and memory safety.
- The proof π, also called a certificate, can be checked independently by the code user; there is no need to trust the code producer, nor to formally verify the compiler itself.
- Symmetrically, a certifying compiler can be constructed, at least theoretically, from a verified compiler, provided that the verification was conducted in a logic that follows the "propositions as types, proofs as programs" paradigm.
- The construction is detailed in [11, section 2].
2.3 Composition of compilation passes
- Compilers are naturally decomposed into several passes that communicate through intermediate languages.
- Assume that the semantic preservation property ≈ is transitive.
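The pass-composition argument can be illustrated with a small sketch (hypothetical encoding): each pass either produces a result or fails, errors propagate, and if each pass preserves semantics and the preservation relation ≈ is transitive, the whole pipeline preserves semantics.

```python
# Sketch: composing compiler passes that may fail. Assuming each pass
# preserves semantics and ≈ is transitive, the composition does too.
def compose(*passes):
    def comp(program):
        for p in passes:
            status, program = p(program)
            if status == "Error":         # any failing pass fails the compiler
                return ("Error", None)
        return ("OK", program)
    return comp

# Two toy passes over strings; each returns ("OK", result) or ("Error", None).
strip = lambda s: ("OK", s.strip())
upper = lambda s: ("OK", s.upper())
pipeline = compose(strip, upper)
assert pipeline("  hello ") == ("OK", "HELLO")
```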
2.4 Summary
- The conclusions of this discussion are simple and define the methodology the authors have followed to verify the CompCert compiler back-end.
- All intermediate languages must be given appropriate formal semantics.
- Finally, for each pass, the authors have a choice between proving the code that implements this pass or performing the transformation via untrusted code, then verifying its results using a verified validator.
- The latter approach can reduce the amount of code that needs to be verified.
3. Overview of the CompCert compiler
3.1 The source language
- The source language of the CompCert compiler, called Clight [5], is a large subset of the C programming language, comparable to the subsets commonly recommended for writing critical embedded software.
- The semantics of Clight is formally defined in big-step operational style.
- The semantics is deterministic and makes precise a number of behaviors left unspecified or undefined in the ISO C standard, such as the sizes of data types, the results of signed arithmetic operations in case of overflow, and the evaluation order.
- Other undefined C behaviors, such as dereferencing the null pointer or accessing an array out of bounds, are consistently turned into "going wrong" behaviors.
- Memory is modeled as a collection of disjoint blocks, each block being accessed through byte offsets; pointer values are pairs of a block identifier and a byte offset.
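This block-based memory model can be sketched directly (hypothetical class and method names): memory is a collection of disjoint blocks, a pointer is a (block id, byte offset) pair, and there is no flat address space through which one block could be reached from another.

```python
# Sketch of a CompCert-style block memory model (illustrative names).
class Memory:
    def __init__(self):
        self.blocks = {}
        self.next_id = 0

    def alloc(self, size):
        b = self.next_id              # fresh block id: blocks are disjoint
        self.next_id += 1
        self.blocks[b] = bytearray(size)
        return b

    def store(self, ptr, value):
        block, ofs = ptr              # a pointer is (block id, byte offset)
        self.blocks[block][ofs] = value

    def load(self, ptr):
        block, ofs = ptr              # an out-of-range ofs raises here,
        return self.blocks[block][ofs]  # modeling a "going wrong" behavior

m = Memory()
b = m.alloc(4)
m.store((b, 0), 42)
assert m.load((b, 0)) == 42
```

Because offsets are confined to their block, pointer arithmetic cannot wander from one object into another, which is exactly what makes out-of-bounds accesses detectable as "going wrong".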
3.2 Compilation passes and intermediate languages
- The formally verified part of the CompCert compiler translates from Clight abstract syntax to PPC abstract syntax, PPC being a subset of PowerPC assembly language.
- The translation from C#minor to Cminor therefore recognizes scalar local variables whose addresses are never taken, assigning them to Cminor local variables and making them candidates for register allocation later; other local variables are stack-allocated in the activation record.
- Unlike the other two optimizations, lazy code motion is implemented following the verified validator approach [24].
- Finally, the "stacking" pass lays out the activation records of functions, assigning offsets within this record to abstract stack locations and to saved callee-save registers, and replacing references to abstract stack locations by explicit memory loads and stores relative to the stack pointer.
- The final compilation pass expands Mach instructions into canned sequences of PowerPC instructions, dealing with special registers such as the condition registers and with irregularities in the PowerPC instruction set.
3.3 Proving the compiler
- The added value of CompCert lies not in the compilation technology implemented, but in the fact that each of the source, intermediate and target languages has formally defined semantics, and that each of the transformation and optimization passes is proved to preserve semantics in the sense of section 2.4.
- Coq implements the Calculus of Inductive and Coinductive Constructions, a powerful constructive, higher-order logic which supports equally well three familiar styles of writing specifications: by functions and pattern-matching, by inductive or coinductive predicates representing inference rules, and by ordinary predicates in first-order logic.
- Internally, Coq builds proof terms that are later re-checked by a small kernel verifier, thus generating very high confidence in the validity of proofs.
- Of these 42000 lines, 14% define the compilation algorithms implemented in CompCert, and 10% specify the semantics of the languages involved.
- The remaining 76% correspond to the correctness proof itself.
3.4 Programming and running the compiler
- The authors use Coq not only as a prover to conduct semantic preservation proofs, but also as a programming language to write all verified parts of the CompCert compiler.
- With some ingenuity, this language suffices to write a compiler.
- The authors use persistent data structures based on balanced trees, which support efficient updates without modifying data in-place.
- The main advantage of this unconventional approach, compared with implementing the compiler in a conventional imperative language, is that the authors do not need a program logic (such as Hoare logic) to connect the compiler's code with its logical specifications.
- The Coq functions implementing the compiler are first-class citizens of Coq's logic and can be reasoned on directly by induction, simplifications and equational reasoning.
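The persistent-data-structure style mentioned above can be illustrated with a small sketch: insertion returns a new tree that shares most of its structure with the old one, never mutating in place. (Unbalanced here for brevity; CompCert's extracted code uses balanced trees, and the names below are illustrative.)

```python
# Sketch of a persistent tree map: updates build new nodes along the
# search path and share all other subtrees with the original tree.
def insert(tree, key, value):
    if tree is None:
        return (key, value, None, None)
    k, v, left, right = tree
    if key < k:
        return (k, v, insert(left, key, value), right)   # copy path only
    if key > k:
        return (k, v, left, insert(right, key, value))
    return (key, value, left, right)     # replace; old tree is untouched

def lookup(tree, key):
    while tree is not None:
        k, v, left, right = tree
        if key == k:
            return v
        tree = left if key < k else right
    return None

t1 = insert(None, "x", 1)
t2 = insert(t1, "x", 2)      # t1 still maps "x" to 1: no in-place update
assert lookup(t1, "x") == 1 and lookup(t2, "x") == 2
```

Because old versions are never destroyed, such functions are ordinary mathematical maps, which is what lets Coq reason about them by plain equational reasoning rather than a program logic.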
3.5 Performance
- Since standard benchmark suites use features of C not supported by CompCert, the authors had to roll their own small suite, which contains some computational kernels, cryptographic primitives, text compressors, a virtual machine interpreter and a ray tracer.
- As the timings in figure 2 show, CompCert generates code that is more than twice as fast as that generated by GCC without optimizations, and competitive with GCC at optimization levels 1 and 2.
- The test suite is too small to draw definitive conclusions, but these results strongly suggest that while CompCert is not going to win a prize in high performance computing, its performance is adequate for critical embedded code.
- Compilation times of CompCert are within a factor of 2 of those of gcc -O1, which is reasonable and shows that the overheads introduced to facilitate verification (many small passes, no imperative data structures, etc.) are acceptable.
4.1 The RTL intermediate language
- Register allocation is performed over the RTL intermediate representation, which represents functions as a control-flow graph (CFG) of abstract instructions, corresponding roughly to machine instructions but operating over pseudo-registers (also called "temporaries").
- Every function has an unlimited supply of pseudo-registers, and their values are preserved across function calls.
- In the following, r ranges over pseudo-registers and l over labels of CFG nodes.
Instructions:
- Instructions include arithmetic operations op (with an important special case op(move, r, r ′ , l) representing a register-to-register copy), memory loads and stores (of a quantity κ at the address obtained by applying addressing mode mode to registers r), conditional branches (with two successors), and function calls, tail-calls, and returns.
- Internal functions are defined within RTL by their CFG, entry point in the CFG, and parameter registers.
- Functions and call instructions carry signatures sig specifying the number and register classes (int or float) of their arguments and results.
- The register state R maps pseudo-registers to their current values (discriminated union of 32-bit integers, 64-bit floats, and pointers).
- Two slightly different forms of execution states, call states and return states, appear when modeling function calls and returns, but will not be described here.
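An RTL-style execution state and its transition relation can be sketched as a tiny interpreter (hypothetical encoding: instruction tuples and a dict CFG, not CompCert's Coq definitions). A state is a program point l in the CFG together with a register state R mapping pseudo-registers to values.

```python
# Sketch of RTL-style small-step execution over a CFG of instructions.
def step(cfg, l, R):
    instr = cfg[l]
    if instr[0] == "op":                          # ("op", fn, args, dest, next)
        _, fn, args, dest, nxt = instr
        R = {**R, dest: fn(*(R[a] for a in args))}  # update one pseudo-register
        return nxt, R
    if instr[0] == "cond":                        # ("cond", fn, args, l_true, l_false)
        _, fn, args, l_true, l_false = instr
        return (l_true if fn(*(R[a] for a in args)) else l_false), R

cfg = {
    0: ("op", lambda: 10, (), "r1", 1),             # r1 := 10
    1: ("op", lambda x: x + 1, ("r1",), "r2", 2),   # r2 := r1 + 1
    2: ("cond", lambda x: x > 5, ("r2",), 3, 4),    # if r2 > 5 goto 3 else 4
}
l, R = 0, {}
for _ in range(3):
    l, R = step(cfg, l, R)
assert (l, R["r2"]) == (3, 11)
```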
4.2 The register allocation algorithm
- The goal of the register allocation pass is to replace the pseudo-registers r that appear in unbounded quantity in the original RTL code by locations ℓ, which are either hardware registers (available in small, fixed quantity) or abstract stack slots in the activation record (available in unbounded quantity).
- The dataflow equations are solved iteratively using Kildall's worklist algorithm.
- The central step of register allocation consists in coloring the interference graph, assigning to each node r a "color" ϕ(r) that is either a hardware register or a stack slot, under the constraint that two nodes connected by an interference edge are assigned different colors.
- Since this heuristic is difficult to prove correct directly, the authors implement it as unverified Caml code, then validate its results a posteriori using a simple verifier written and proved correct in Coq.
- Additionally, coalescing and dead code elimination are performed.
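The two algorithmic ingredients above can be sketched together: liveness computed by a Kildall-style worklist over the CFG, then an a-posteriori check of a register assignment in the spirit of the verified validator. Everything here is a hypothetical, simplified encoding (the check below only inspects live-in sets, not the full Chaitin interference rules).

```python
# Sketch: backward liveness by worklist iteration, then validation of
# an untrusted register assignment phi (illustrative names throughout).
from collections import deque

# CFG: node -> (uses, defs, successors). Toy three-instruction program:
#   0: a := input      1: b := a + 1      2: return b
cfg = {
    0: (set(),  {"a"}, [1]),
    1: ({"a"},  {"b"}, [2]),
    2: ({"b"},  set(), []),
}

def liveness(cfg):
    live_in = {n: set() for n in cfg}
    work = deque(cfg)
    while work:
        n = work.popleft()
        uses, defs, succs = cfg[n]
        out = set().union(*(live_in[s] for s in succs)) if succs else set()
        new_in = uses | (out - defs)          # dataflow equation for liveness
        if new_in != live_in[n]:
            live_in[n] = new_in
            # re-examine predecessors, whose live-out just changed
            work.extend(p for p in cfg if n in cfg[p][2])
    return live_in

def check_assignment(cfg, live_in, phi):
    """Validator: pseudo-registers simultaneously live must be
    mapped to distinct locations."""
    for n in cfg:
        live = sorted(live_in[n])
        locs = [phi[v] for v in live]
        if len(set(locs)) != len(locs):
            return False
    return True

li = liveness(cfg)
assert li[1] == {"a"} and li[2] == {"b"}
# a and b are never live at the same time, so they may share a register:
assert check_assignment(cfg, li, {"a": "r3", "b": "r3"})
```

The validator is far simpler than the coloring heuristic it checks, which is exactly why proving the validator correct in Coq is cheaper than proving the heuristic.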
4.3 Proving semantic preservation
- In the case of register allocation, each original transition corresponds to exactly one transformed transition, resulting in a "lock-step" simulation diagram: every step of the source program is matched by one step of the transformed program, with related states before and after.
- This requirement is much too strong, as it essentially precludes any sharing of a location between two pseudo-registers whose live ranges are disjoint.
- Once the relation between states is set up, proving the simulation diagram above is a routine case inspection on the various transition rules of the RTL semantics.
- In doing so, one comes to the pleasant realization that the dataflow inequations defining liveness, as well as Chaitin's rules for constructing the interference graph, are the minimal sufficient conditions for the invariant between register states R, R ′ to be preserved in all cases.
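The weakened invariant suggested by the liveness discussion can be sketched as a small check (hypothetical encoding): the transformed register state R′ need only agree with the original state R on pseudo-registers that are live at the current point, which is what allows two disjoint live ranges to share one location.

```python
# Sketch of the register-state invariant used in the simulation proof:
# agreement is required only on currently live pseudo-registers.
def states_agree(R, R_prime, phi, live):
    return all(R_prime[phi[r]] == R[r] for r in live)

R = {"a": 1, "b": 2}                 # original pseudo-register state
R_prime = {"r3": 2}                  # after allocation: a and b share r3
phi = {"a": "r3", "b": "r3"}
assert states_agree(R, R_prime, phi, live={"b"})          # only b live: OK
assert not states_agree(R, R_prime, phi, live={"a", "b"})  # both live: clash
```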
5. Conclusions and perspectives
- The CompCert experiment described in this paper is still ongoing, and much work remains to be done: handle a larger subset of C (e.g. including goto); deploy and prove correct more optimizations; target other processors beyond PowerPC; extend the semantic preservation proofs to shared-memory concurrency; etc.
- The preliminary results obtained so far provide strong evidence that the initial goal of formally verifying a realistic compiler can be achieved, within the limitations of today's proof assistants, and using only elementary semantic and algorithmic approaches.
Frequently Asked Questions
Q2. What are the future works mentioned in the paper "Formal verification of a realistic compiler" ?
The CompCert experiment described in this paper is still ongoing, and much work remains to be done: handle a larger subset of C (e.g. including goto); deploy and prove correct more optimizations; target other processors beyond PowerPC; extend the semantic preservation proofs to shared-memory concurrency; etc. However, the preliminary results obtained so far provide strong evidence that the initial goal of formally verifying a realistic compiler can be achieved, within the limitations of today's proof assistants, and using only elementary semantic and algorithmic approaches. The techniques and tools the authors used are very far from perfect—more proof automation, higher-level semantics and more modern intermediate representations all have the potential to significantly reduce the proof effort—but good enough to achieve the goal. Composed with the CompCert back-end, these efforts could eventually result in a trusted execution path for programs written and verified in Coq, like CompCert itself, therefore increasing confidence further through a form of bootstrapping.
Q3. What are the behaviors the authors observe in CompCert?
The behaviors the authors observe in CompCert include termination, divergence, and “going wrong” (invoking an undefined operation that could crash, such as accessing an array out of bounds).
Q4. What is the strongest notion of semantic preservation in the CompCert experiment?
In the CompCert experiment and the remainder of this paper, the authors focus on source and target languages that are deterministic (programs change their behaviors only in response to different inputs but not because of internal choices) and on execution environments that are deterministic as well (the inputs given to the programs are uniquely determined by their previous outputs).
Q5. What is the impact of compiler bugs?
For low-assurance software, validated only by testing, the impact of compiler bugs is low: what is tested is the executable code produced by the compiler; rigorous testing should expose compiler-introduced errors along with errors already present in the source program.
Q6. What is the expected correctness property of the compiler?
The expected correctness property of the compiler is that it preserves the fact that the source code S satisfies the specification, a fact that has been established separately by formal verification of S: S |= Spec =⇒ C |= Spec (4). It is easy to show that property (2) implies property (4) for all specifications Spec.
Q7. How can a certifying compiler be constructed?
A certifying compiler can be constructed, at least theoretically, from a verified compiler, provided that the verification was conducted in a logic that follows the “propositions as types, proofs as programs” paradigm.
Q8. What is the correctness proof of the compiler?
Provided the target language of the compiler has deterministic semantics, an appropriate specification for the correctness proof of the compiler is the combination of definitions (3) and (6): ∀S, C, B ∉ Wrong, Comp(S) = OK(C) ∧ S ⇓ B =⇒ C ⇓ B.
Q9. What is the way to test a compiler?
Validation by testing reaches its limits and needs to be complemented or even replaced by the use of formal methods such as model checking, static analysis, and program proof.
Q10. What is the definition of a verified compiler?
By verified, the authors mean a compiler that is accompanied by a machine-checked proof of a semantic preservation property: the generated machine code behaves as prescribed by the semantics of the source program.
Q11. What is the way to test compilers?
Bugs in the compiler used to turn this formally verified source code into an executable can potentially invalidate all the guarantees so painfully obtained by the use of formal methods.