Journal ArticleDOI

Formal verification of a realistic compiler

TL;DR: This paper reports on the development and formal verification of CompCert, a compiler from Clight (a large subset of the C programming language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness.
Abstract: This paper reports on the development and formal verification (proof of semantic preservation) of CompCert, a compiler from Clight (a large subset of the C programming language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness. Such a verified compiler is useful in the context of critical software and its formal verification: the verification of the compiler guarantees that the safety properties proved on the source code hold for the executable compiled code as well.

Summary (5 min read)

1. Introduction

  • Compilers are generally assumed to be semantically transparent: the compiled code should behave as prescribed by the semantics of the source program.
  • For low-assurance software, validated only by testing, the impact of compiler bugs is low: what is tested is the executable code produced by the compiler; rigorous testing should expose compiler-introduced errors along with errors already present in the source program.
  • Namely, it compiles a language commonly used for critical embedded software: neither Java nor ML nor assembly code, but a large subset of the C language.
  • Section 3 describes the structure of the CompCert compiler, its performance, and how the Coq proof assistant was used not only to prove its correctness but also to program most of it.

2. Approaches to trusted compilation

2.1 Notions of semantic preservation

  • The authors' aim is to prove that the semantics of S is preserved during compilation.
  • In all cases, behaviors also include a trace of the input-output operations (system calls) performed during the execution of the program.
  • If the source language is not deterministic, compilers are allowed to select one of the possible behaviors of the source program.
  • Under these conditions, there exists exactly one behavior B such that S ⇓ B, and similarly for C. Having proved properties (2) or (3) provides the same guarantee without having to equip the target and intermediate languages with sound type systems and to prove type preservation for the compiler.

2.2 Verified, validated, certifying compilers

  • The authors now discuss several approaches to establishing that a compiler preserves semantics of the compiled programs, in the sense of section 2.1.
  • The authors model the compiler as a total function Comp from source programs to either compiled code (written Comp(S) = OK(C)) or a compile-time error (written Comp(S) = Error).
  • Notice that a compiler that always fails (Comp(S) = Error for all S) is indeed verified, although useless.
  • Validation can be performed in several ways, ranging from symbolic interpretation and static analysis of S and C to the generation of verification conditions followed by model checking or automatic theorem proving.
  • Translation validation generates additional confidence in the correctness of the compiled code, but by itself does not provide formal guarantees as strong as those provided by a verified compiler: the validator could itself be incorrect.

Proof-carrying code and certifying compilers

  • The proof-carrying code (PCC) approach [19, 1] does not attempt to establish semantic preservation between a source program and some compiled code.
  • Instead, PCC focuses on the generation of independently-checkable evidence that the compiled code C satisfies a behavioral specification Spec such as type and memory safety.
  • The proof π, also called a certificate, can be checked independently by the code user; there is no need to trust the code producer, nor to formally verify the compiler itself.
  • Symmetrically, a certifying compiler can be constructed, at least theoretically, from a verified compiler, provided that the verification was conducted in a logic that follows the "propositions as types, proofs as programs" paradigm.
  • The construction is detailed in [11, section 2].

2.3 Composition of compilation passes

  • Compilers are naturally decomposed into several passes that communicate through intermediate languages.
  • Assume that the semantic preservation property ≈ is transitive.

2.4 Summary

  • The conclusions of this discussion are simple and define the methodology the authors have followed to verify the CompCert compiler back-end.
  • All intermediate languages must be given appropriate formal semantics.
  • Finally, for each pass, the authors have a choice between proving the code that implements this pass or performing the transformation via untrusted code, then verifying its results using a verified validator.
  • The latter approach can reduce the amount of code that needs to be verified.

3. Overview of the CompCert compiler

3.1 The source language

  • The source language of the CompCert compiler, called Clight [5] , is a large subset of the C programming language, comparable to the subsets commonly recommended for writing critical embedded software.
  • The semantics of Clight is formally defined in big-step operational style.
  • The semantics is deterministic and makes precise a number of behaviors left unspecified or undefined in the ISO C standard, such as the sizes of data types, the results of signed arithmetic operations in case of overflow, and the evaluation order.
  • Other undefined C behaviors are consistently turned into "going wrong" behaviors, such as dereferencing the null pointer or accessing arrays out of bounds.
  • Memory is modeled as a collection of disjoint blocks, each block being accessed through byte offsets; pointer values are pairs of a block identifier and a byte offset.

3.2 Compilation passes and intermediate languages

  • The formally verified part of the CompCert compiler translates from Clight abstract syntax to PPC abstract syntax, PPC being a subset of PowerPC assembly language.
  • The translation from C#minor to Cminor therefore recognizes scalar local variables whose addresses are never taken, assigning them to Cminor local variables and making them candidates for register allocation later; other local variables are stack-allocated in the activation record.
  • Unlike the other two optimizations, lazy code motion is implemented following the verified validator approach [24] .
  • Finally, the "stacking" pass lays out the activation records of functions, assigning offsets within this record to abstract stack locations and to saved callee-save registers, and replacing references to abstract stack locations by explicit memory loads and stores relative to the stack pointer.
  • The final compilation pass expands Mach instructions into canned sequences of PowerPC instructions, dealing with special registers such as the condition registers and with irregularities in the PowerPC instruction set.

3.3 Proving the compiler

  • The added value of CompCert lies not in the compilation technology implemented, but in the fact that each of the source, intermediate and target languages has formally defined semantics, and that each of the transformation and optimization passes is proved to preserve semantics in the sense of section 2.4.
  • Coq implements the Calculus of Inductive and Coinductive Constructions, a powerful constructive, higher-order logic which supports equally well three familiar styles of writing specifications: by functions and pattern-matching, by inductive or coinductive predicates representing inference rules, and by ordinary predicates in first-order logic.
  • Internally, Coq builds proof terms that are later re-checked by a small kernel verifier, thus generating very high confidence in the validity of proofs.
  • Of the approximately 42,000 lines of Coq in the development, 14% define the compilation algorithms implemented in CompCert, and 10% specify the semantics of the languages involved.
  • The remaining 76% correspond to the correctness proof itself.

3.4 Programming and running the compiler

  • The authors use Coq not only as a prover to conduct semantic preservation proofs, but also as a programming language to write all verified parts of the CompCert compiler.
  • With some ingenuity, this language suffices to write a compiler.
  • The authors use persistent data structures based on balanced trees, which support efficient updates without modifying data in place (a minimal illustration follows this list).
  • The main advantage of this unconventional approach, compared with implementing the compiler in a conventional imperative language, is that the authors do not need a program logic (such as Hoare logic) to connect the compiler's code with its logical specifications.
  • The Coq functions implementing the compiler are first-class citizens of Coq's logic and can be reasoned on directly by induction, simplifications and equational reasoning.
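
As a rough illustration of this programming style (not CompCert's actual code), the OCaml sketch below uses the standard persistent Map module, itself implemented with balanced trees, to model an environment of pseudo-register values; every update returns a new map, so earlier states remain available and can be reasoned about equationally. All names are invented for the example.

(* Minimal sketch of a "register environment" built on persistent balanced
   trees.  Like the tree-based maps of the Coq development, OCaml's Map
   returns a fresh map on every update instead of mutating in place. *)
module IntMap = Map.Make (Int)

type value = Vint of int | Vfloat of float | Vundef

(* Assigning a pseudo-register yields a new environment; the old one
   remains valid, which keeps reasoning purely equational. *)
let assign (env : value IntMap.t) (reg : int) (v : value) : value IntMap.t =
  IntMap.add reg v env

let lookup (env : value IntMap.t) (reg : int) : value =
  match IntMap.find_opt reg env with
  | Some v -> v
  | None -> Vundef

let () =
  let env0 = IntMap.empty in
  let env1 = assign env0 1 (Vint 42) in
  (* env0 is unchanged: persistent updates never destroy earlier states. *)
  assert (lookup env0 1 = Vundef);
  assert (lookup env1 1 = Vint 42)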

3.5 Performance

  • Since standard benchmark suites use features of C not supported by CompCert, the authors had to roll their own small suite, which contains some computational kernels, cryptographic primitives, text compressors, a virtual machine interpreter and a ray tracer.
  • As the timings in figure 2 show, CompCert generates code that is more than twice as fast as that generated by GCC without optimizations, and competitive with GCC at optimization levels 1 and 2.
  • The test suite is too small to draw definitive conclusions, but these results strongly suggest that while CompCert is not going to win a prize in high performance computing, its performance is adequate for critical embedded code.
  • Compilation times of CompCert are within a factor of 2 of those of gcc -O1, which is reasonable and shows that the overheads introduced to facilitate verification (many small passes, no imperative data structures, etc.) are acceptable.

4.1 The RTL intermediate language

  • Register allocation is performed over the RTL intermediate representation, which represents functions as a control-flow graph (CFG) of abstract instructions, corresponding roughly to machine instructions but operating over pseudo-registers (also called "temporaries").
  • Every function has an unlimited supply of pseudo-registers, and their values are preserved across function calls.
  • In the following, r ranges over pseudo-registers and l over labels of CFG nodes.

Instructions:

  • Instructions include arithmetic operations op (with an important special case op(move, r, r′, l) representing a register-to-register copy), memory loads and stores (of a quantity κ at the address obtained by applying addressing mode mode to registers r), conditional branches (with two successors), and function calls, tail-calls, and returns (an illustrative OCaml rendering of such an instruction type follows this list).
  • Internal functions are defined within RTL by their CFG, entry point in the CFG, and parameter registers.
  • Functions and call instructions carry signatures sig specifying the number and register classes (int or float) of their arguments and results.
  • The register state R maps pseudo-registers to their current values (discriminated union of 32-bit integers, 64-bit floats, and pointers).
  • Two slightly different forms of execution states, call states and return states, appear when modeling function calls and returns, but will not be described here.
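
The following OCaml sketch suggests what such a CFG-of-instructions representation could look like. The constructors, their arguments, and the use of plain strings for operators, conditions, addressing modes and signatures are simplifications invented for the example; they do not match CompCert's Coq definitions.

(* A hypothetical OCaml rendering of an RTL-like instruction set.  Each
   instruction carries the label(s) of its successor(s) in the CFG. *)
type reg = int              (* pseudo-register *)
type label = int            (* CFG node label *)

type instruction =
  | Iop of string * reg list * reg * label             (* op, args, dest, next *)
  | Iload of string * string * reg list * reg * label  (* quantity, mode, args, dest, next *)
  | Istore of string * string * reg list * reg * label (* quantity, mode, args, src, next *)
  | Icall of string * reg list * reg * label           (* signature, args, dest, next *)
  | Itailcall of string * reg list                     (* signature, args *)
  | Icond of string * reg list * label * label         (* condition, args, if-true, if-false *)
  | Ireturn of reg option

module LabelMap = Map.Make (Int)

(* An internal function is its CFG, entry point and parameter registers. *)
type rtl_function = {
  params : reg list;
  entry  : label;
  code   : instruction LabelMap.t;
}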

4.2 The register allocation algorithm

  • The goal of the register allocation pass is to replace the pseudo-registers r that appear in unbounded quantity in the original RTL code by locations ℓ, which are either hardware registers (available in small, fixed quantity) or abstract stack slots in the activation record (available in unbounded quantity).
  • The dataflow equations are solved iteratively using Kildall's worklist algorithm.
  • The central step of register allocation consists in coloring the interference graph, assigning to each node r a "color" ϕ(r) that is either a hardware register or a stack slot, under the constraint that two nodes connected by an interference edge are assigned different colors.
  • Since this heuristic is difficult to prove correct directly, the authors implement it as unverified Caml code, then validate its results a posteriori using a simple verifier written and proved correct in Coq (a sketch of such a check follows this list).
  • Additionally, coalescing and dead code elimination are performed.
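
A minimal sketch of this kind of a-posteriori check, with invented names, could look as follows; it is far simpler than the real verifier, which also checks, for instance, register-class constraints against the actual code. The point is that the coloring itself may come from arbitrary untrusted code, and only the check needs to be trusted.

(* A-posteriori validation of a graph coloring: every pair of interfering
   nodes must receive distinct locations.  Types are illustrative only. *)
type reg = int
type location = HWreg of int | StackSlot of int

let check_coloring
    (interferes : (reg * reg) list)      (* interference edges *)
    (coloring : reg -> location)         (* candidate assignment, untrusted *)
  : bool =
  List.for_all (fun (r1, r2) -> coloring r1 <> coloring r2) interferes

let () =
  let coloring = function 0 -> HWreg 3 | 1 -> HWreg 4 | _ -> StackSlot 0 in
  assert (check_coloring [ (0, 1) ] coloring)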

4.3 Proving semantic preservation

  • In the case of register allocation, each original transition corresponds to exactly one transformed transition, resulting in a "lock-step" simulation diagram: whenever the original code makes a transition from a state s1 to a state s2 and s1 is related to a state s1′ of the transformed code, the transformed code makes a matching transition from s1′ to some state s2′ related to s2.
  • The naive requirement that every location ϕ(r) hold the same value as the pseudo-register r it replaces is much too strong, as it essentially precludes any sharing of a location between two pseudo-registers whose live ranges are disjoint; the invariant therefore only constrains the pseudo-registers that are live at each program point.
  • Once the relation between states is set up, proving the simulation diagram above is a routine case inspection on the various transition rules of the RTL semantics.
  • In doing so, one comes to the pleasant realization that the dataflow inequations defining liveness, as well as Chaitin's rules for constructing the interference graph, are the minimal sufficient conditions for the invariant between register states R, R ′ to be preserved in all cases.

5. Conclusions and perspectives

  • The CompCert experiment described in this paper is still ongoing, and much work remains to be done: handle a larger subset of C (e.g. including goto); deploy and prove correct more optimizations; target other processors beyond PowerPC; extend the semantic preservation proofs to shared-memory concurrency; etc.
  • The preliminary results obtained so far provide strong evidence that the initial goal of formally verifying a realistic compiler can be achieved, within the limitations of today's proof assistants, and using only elementary semantic and algorithmic approaches.


To cite this version: Xavier Leroy. Formal verification of a realistic compiler. Communications of the ACM, Association for Computing Machinery, 2009, 52 (7), pp. 107-115. doi:10.1145/1538788.1538814. hal:inria-00415861 (https://hal.inria.fr/inria-00415861).

Formal verification of a realistic compiler
Xavier Leroy
INRIA Paris-Rocquencourt
Domaine de Voluceau, B.P. 105, 78153 Le Chesnay, France
xavier.leroy@inria.fr
Abstract
This paper reports on the development and formal verifica-
tion (proof of semantic preservation) of CompCert, a com-
piler from Clight (a large subset of the C programming lan-
guage) to PowerPC assembly code, using the Coq proof as-
sistant both for programming the compiler and for proving
its correctness. Such a verified compiler is useful in the con-
text of critical software and its formal verification: the veri-
fication of the compiler guarantees that the safety properties
proved on the source code hold for the executable compiled
code as well.
1. Introduction
Can you trust your compiler? Compilers are generally
assumed to be semantically transparent: the compiled
code should behave as prescribed by the semantics of the
source program. Yet, compilers—and especially optimizing
compilers—are complex programs that perform complicated
symbolic transformations. Despite intensive testing, bugs
in compilers do occur, causing the compilers to crash at
compile-time or—much worse—to silently generate an
incorrect executable for a correct source program.
For low-assurance software, validated only by testing,
the impact of compiler bugs is low: what is tested is the
executable code produced by the compiler; rigorous testing
should expose compiler-introduced errors along with errors
already present in the source program. Note, however,
that compiler-introduced bugs are notoriously difficult to
expose and track down. The picture changes dramatically
for safety-critical, high-assurance software. Here, validation
by testing reaches its limits and needs to be complemented
or even replaced by the use of formal methods such as
model checking, static analysis, and program proof. Almost
universally, these formal verification tools are applied to
the source code of a program. Bugs in the compiler used to
turn this formally verified source code into an executable
can potentially invalidate all the guarantees so painfully
obtained by the use of formal methods. In a future where
formal methods are routinely applied to source programs,
the compiler could appear as a weak link in the chain that
goes from specifications to executables. The safety-critical
software industry is aware of these issues and uses a variety
of techniques to alleviate them, such as conducting manual
code reviews of the generated assembly code after having
turned all compiler optimizations off. These techniques
do not fully address the issues, and are costly in terms of
development time and program performance.
An obviously better approach is to apply formal methods
to the compiler itself in order to gain assurance that it pre-
serves the semantics of the source programs. For the last
five years, we have been working on the development of a
realistic, verified compiler called CompCert. By verified, we
mean a compiler that is accompanied by a machine-checked
proof of a semantic preservation property: the generated
machine code behaves as prescribed by the semantics of the
source program. By realistic, we mean a compiler that could
realistically be used in the context of production of critical
software. Namely, it compiles a language commonly used
for critical embedded software: neither Java nor ML nor
assembly code, but a large subset of the C language. It
produces code for a processor commonly used in embedded
systems: we chose the PowerPC because it is popular in
avionics. Finally, the compiler must generate code that is
efficient enough and compact enough to fit the requirements
of critical embedded systems. This implies a multi-pass com-
piler that features good register allocation and some basic
optimizations.
Proving the correctness of a compiler is by no means a
new idea: the first such proof was published in 1967 [16]
(for the compilation of arithmetic expressions down to stack
machine code) and mechanically verified in 1972 [17]. Since
then, many other proofs have been conducted, ranging from
single-pass compilers for toy languages to sophisticated code
optimizations [8]. In the CompCert experiment, we carry
this line of work all the way to end-to-end verification of a
complete compilation chain from a structured imperative
language down to assembly code through 8 intermediate
languages. While conducting the verification of CompCert,
we found that many of the non-optimizing translations per-
formed, while often considered obvious in the compiler lit-
erature, are surprisingly tricky to formally prove correct.
This paper gives a high-level overview of the CompCert
compiler and its mechanized verification, which uses the Coq
proof assistant [7, 3]. This compiler, classically, consists of
two parts: a front-end translating the Clight subset of C to
a low-level, structured intermediate language called Cminor,
and a lightly-optimizing back-end generating PowerPC as-
sembly code from Cminor. A detailed description of Clight
can be found in [5]; of the compiler front-end in [4]; and of
the compiler back-end in [11, 13]. The complete source code

of the Coq development, extensively commented, is available
on the Web [12].
The remainder of this paper is organized as follows. Sec-
tion 2 compares and formalizes several approaches to estab-
lishing trust in the results of compilation. Section 3 de-
scribes the structure of the CompCert compiler, its perfor-
mance, and how the Coq proof assistant was used not only
to prove its correctness but also to program most of it. By
lack of space, we will not detail the formal verification of
every compilation pass. However, section 4 provides a tech-
nical overview of such a verification for one crucial pass of
the compiler: register allocation. Finally, section 5 presents
preliminary conclusions and directions for future work.
2. Approaches to trusted compilation
2.1 Notions of semantic preservation
Consider a source program S and a compiled program C
produced by a compiler. Our aim is to prove that the seman-
tics of S was preserved during compilation. To make this
notion of semantic preservation precise, we assume given se-
mantics for the source and target languages that associate
observable behaviors B to S and C. We write S ⇓ B to
mean that program S executes with observable behavior B.
The behaviors we observe in CompCert include termination,
divergence, and “going wrong” (invoking an undefined oper-
ation that could crash, such as accessing an array out of
bounds). In all cases, behaviors also include a trace of the
input-output operations (system calls) performed during the
execution of the program. Behaviors therefore reflect accu-
rately what the user of the program, or more generally the
outside world the program interacts with, can observe.
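
To make the notion of observable behavior concrete, here is a rough sketch of what such behaviors could look like as a datatype. The constructors and the notion of event are simplified assumptions for illustration; they do not reproduce CompCert's actual Coq definitions, which, for instance, also cover infinite traces of diverging reactive programs.

(* Sketch of observable behaviors, assuming a simple notion of I/O event. *)
type event = Syscall of string * int list * int   (* name, arguments, result *)
type trace = event list

type behavior =
  | Terminates of trace * int   (* finite trace, then exit code *)
  | Diverges of trace           (* runs forever after a finite trace *)
  | GoesWrong of trace          (* an undefined operation was reached *)

let goes_wrong = function GoesWrong _ -> true | _ -> false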
The strongest notion of semantic preservation during com-
pilation is that the source program S and the compiled
code C have exactly the same observable behaviors:
∀B,  S ⇓ B  ⟺  C ⇓ B    (1)
Notion (1) is too strong to be usable. If the source lan-
guage is not deterministic, compilers are allowed to select
one of the possible behaviors of the source program. (For
instance, C compilers choose one particular evaluation or-
der for expressions among the several orders allowed by the
C specifications.) In this case, C will have fewer behaviors
than S. Additionally, compiler optimizations can optimize
away “going wrong” behaviors. For example, if S can go
wrong on an integer division by zero but the compiler elim-
inated this computation because its result is unused, C will
not go wrong. To account for these degrees of freedom in
the compiler, we relax definition (1) as follows:
S safe  ⟹  (∀B, C ⇓ B ⟹ S ⇓ B)    (2)
(Here, S safe means that none of the possible behaviors of
S is a “going wrong” behavior.) In other words, if S does not
go wrong, then neither does C; moreover, all observable
behaviors of C are acceptable behaviors of S.
In the CompCert experiment and the remainder of this pa-
per, we focus on source and target languages that are deter-
ministic (programs change their behaviors only in response
to different inputs but not because of internal choices) and
on execution environments that are deterministic as well
(the inputs given to the programs are uniquely determined
by their previous outputs). Under these conditions, there
exists exactly one behavior B such that S ⇓ B, and simi-
larly for C. In this case, it is easy to prove that property (2)
is equivalent to:
∀B ∉ Wrong,  S ⇓ B  ⟹  C ⇓ B    (3)
(Here, Wrong is the set of “going wrong” behaviors.) Prop-
erty (3) is generally much easier to prove than property (2),
since the proof can proceed by induction on the execution
of S. This is the approach that we take in this work.
From a formal methods perspective, what we are really
interested in is whether the compiled code satisfies the func-
tional specifications of the application. Assume that these
specifications are given as a predicate Spec(B) of the observ-
able behavior. We say that C satisfies the specifications,
and write C |= Spec, if C cannot go wrong (C safe) and
all behaviors B of C satisfy Spec (∀B, C ⇓ B ⟹ Spec(B)).
The expected correctness property of the compiler is that it
preserves the fact that the source code S satisfies the specifi-
cation, a fact that has been established separately by formal
verification of S:
S |= Spec  ⟹  C |= Spec    (4)
It is easy to show that property (2) implies property (4) for
all specifications Spec. Therefore, establishing property (2)
once and for all spares us from establishing property (4) for
every specification of interest.
A special case of property (4), of considerable historical
importance, is the preservation of type and memory safety,
which we can summarize as “if S does not go wrong, neither
does C":
S safe  ⟹  C safe    (5)
Combined with a separate check that S is well-typed in a
sound type system, property (5) implies that C executes
without memory violations. Type-preserving compilation
[18] obtains this guarantee by different means: under the
assumption that S is well typed, C is proved to be well-
typed in a sound type system, ensuring that it cannot go
wrong. Having proved properties (2) or (3) provides the
same guarantee without having to equip the target and in-
termediate languages with sound type systems and to prove
type preservation for the compiler.
2.2 Verified, validated, certifying compilers
We now discuss several approaches to establishing that a
compiler preserves semantics of the compiled programs, in
the sense of section 2.1. In the following, we write S ≈ C,
where S is a source program and C is compiled code, to
denote one of the semantic preservation properties (1) to
(5) of section 2.1.
Verified compilers. We model the compiler as a total
function Comp from source programs to either compiled
code (written Comp(S) = OK(C)) or a compile-time error
(written Comp(S) = Error). Compile-time errors corre-
spond to cases where the compiler is unable to produce code,
for instance if the source program is incorrect (syntax error,
type error, etc.), but also if it exceeds the capacities of the
compiler. A compiler Comp is said to be verified if it is
accompanied with a formal proof of the following property:
∀S, ∀C,  Comp(S) = OK(C)  ⟹  S ≈ C    (6)
In other words, a verified compiler either reports an error or
produces code that satisfies the desired correctness property.

Notice that a compiler that always fails (Comp(S) = Error
for all S) is indeed verified, although useless. Whether the
compiler succeeds in compiling the source programs of interest
is not a correctness issue, but a quality of implementation
issue, which is addressed by non-formal methods such as
testing. The important feature, from a formal verification
standpoint, is that the compiler never silently produces in-
correct code.
Verifying a compiler in the sense of definition (6)
amounts to applying program proof technology to the
compiler sources, using one of the properties defined in
section 2.1 as the high-level specification of the compiler.
Translation validation with verified validators. In
the translation validation approach [22, 20] the compiler
does not need to be verified. Instead, the compiler is
complemented by a validator: a boolean-valued function
Validate(S, C) that verifies the property S ≈ C a posteriori.
If Comp(S) = OK(C) and Validate(S, C) = true, the
compiled code C is deemed trustworthy. Validation can
be performed in several ways, ranging from symbolic inter-
pretation and static analysis of S and C to the generation
of verification conditions followed by model checking or
automatic theorem proving. The property S ≈ C being
undecidable in general, validators are necessarily incomplete
and should reply false if they cannot establish S ≈ C.
Translation validation generates additional confidence in
the correctness of the compiled code, but by itself does not
provide formal guarantees as strong as those provided by
a verified compiler: the validator could itself be incorrect.
To rule out this possibility, we say that a validator Validate
is verified if it is accompanied with a formal proof of the
following property:
∀S, ∀C,  Validate(S, C) = true  ⟹  S ≈ C    (7)
The combination of a verified validator Validate with an
unverified compiler Comp does provide formal guarantees
as strong as those provided by a verified compiler. Indeed,
consider the following function:
Comp′(S) = match Comp(S) with
           | Error → Error
           | OK(C) → if Validate(S, C) then OK(C) else Error
This function is a verified compiler in the sense of defini-
tion (6). Verification of a translation validator is therefore
an attractive alternative to the verification of a compiler,
provided the validator is smaller and simpler than the com-
piler.
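
The construction above can be rendered directly as executable code. The OCaml sketch below is only an illustration of the idea: the result type and the two functions passed in (an untrusted compiler and a verified validator) are placeholders, not CompCert's definitions.

(* Wrapping an untrusted compiler with a (verified) validator yields a
   compiler that is verified in the sense of definition (6). *)
type 'c comp_result = OK of 'c | Error

let guarded_compile
    (compile : 's -> 'c comp_result)     (* untrusted compiler Comp *)
    (validate : 's -> 'c -> bool)        (* validator Validate *)
    (src : 's) : 'c comp_result =
  match compile src with
  | Error -> Error
  | OK code -> if validate src code then OK code else Error

Any correctness argument about guarded_compile only needs the correctness of validate; compile can remain a black box.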
Proof-carrying code and certifying compilers. The
proof-carrying code (PCC) approach [19, 1] does not at-
tempt to establish semantic preservation between a source
program and some compiled code. Instead, PCC focuses
on the generation of independently-checkable evidence that
the compiled code C satisfies a behavioral specification
Spec such as type and memory safety. PCC makes use of a
certifying compiler, which is a function CComp that either
fails or returns both a compiled code C and a proof π of the
property C |= Spec. The proof π, also called a certificate,
can be checked independently by the code user; there is no
need to trust the code producer, nor to formally verify the
compiler itself. The only part of the infrastructure that
needs to be trusted is the client-side checker: the program
that checks whether π entails the property C |= Spec.
As in the case of translation validation, it suffices to
formally verify the client-side checker to obtain guarantees
as strong as those obtained from compiler verification of
property (4). Symmetrically, a certifying compiler can be
constructed, at least theoretically, from a verified compiler,
provided that the verification was conducted in a logic
that follows the “propositions as types, proofs as programs”
paradigm. The construction is detailed in [11, section 2].
2.3 Composition of compilation passes
Compilers are naturally decomposed into several passes
that communicate through intermediate languages. It is
fortunate that verified compilers can also be decomposed
in this manner. Consider two verified compilers Comp1 and Comp2,
from languages L1 to L2 and from L2 to L3, respectively.
Assume that the semantic preservation property ≈ is tran-
sitive. (This is true for properties (1) to (5) of section 2.1.)
Consider the error-propagating composition of Comp1 and Comp2:
Comp(S) = match Comp1(S) with
          | Error → Error
          | OK(I) → Comp2(I)
It is trivial to show that this function is a verified compiler
from L1 to L3.
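
The same error-propagating composition can be written as a small OCaml combinator; as before, the names and the result type are illustrative only, not CompCert's code.

(* Error-propagating composition of two compiler passes. *)
type 'c comp_result = OK of 'c | Error

let compose
    (pass1 : 'l1 -> 'l2 comp_result)
    (pass2 : 'l2 -> 'l3 comp_result)
    (src : 'l1) : 'l3 comp_result =
  match pass1 src with
  | Error -> Error
  | OK intermediate -> pass2 intermediate

A multi-pass compiler is then a chain of such compositions, and its overall correctness follows from the per-pass proofs together with the transitivity of ≈.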
2.4 Summary
The conclusions of this discussion are simple and define
the methodology we have followed to verify the CompCert
compiler back-end. First, provided the target language of
the compiler has deterministic semantics, an appropriate
specification for the correctness proof of the compiler is the
combination of definitions (3) and (6):
∀S, ∀C, ∀B ∉ Wrong,  Comp(S) = OK(C) ∧ S ⇓ B  ⟹  C ⇓ B
Second, a verified compiler can be structured as a com-
position of compilation passes, following common practice.
However, all intermediate languages must be given appro-
priate formal semantics.
Finally, for each pass, we have a choice between prov-
ing the code that implements this pass or performing the
transformation via untrusted code, then verifying its results
using a verified validator. The latter approach can reduce
the amount of code that needs to be verified.
3. Overview of the CompCert compiler
3.1 The source language
The source language of the CompCert compiler, called
Clight [5], is a large subset of the C programming language,
comparable to the subsets commonly recommended for
writing critical embedded software. It supports almost all
C data types, including pointers, arrays, struct and union
types; all structured control (if/then, loops, break, con-
tinue, Java-style switch); and the full power of functions,
including recursive functions and function pointers. The
main omissions are extended-precision arithmetic (long
long and long double); the goto statement; non-structured
forms of switch such as Duff’s device; passing struct and
union parameters and results by value; and functions
with variable numbers of arguments.

[Figure 1: Compilation passes and intermediate languages. Languages: Clight, C#minor, Cminor, CminorSel, RTL, LTL, LTLin, Linear, Mach, PPC. Passes: simplifications, type elimination, stack pre-allocation, instruction selection, CFG construction, constant propagation, CSE, LCM, register allocation, branch tunneling, code linearization, spilling/reloading, calling conventions, layout of stack frames, instruction scheduling, PowerPC code generation. Parsing/elaboration (upstream) and assembling/linking (downstream) are not verified.]

Other features of
C are missing from Clight but are supported through
code expansion (“de-sugaring”) during parsing: side effects
within expressions (Clight expressions are side-effect free)
and block-scoped variables (Clight has only global and
function-local variables).
The semantics of Clight is formally defined in big-step op-
erational style. The semantics is deterministic and makes
precise a number of behaviors left unspecified or undefined
in the ISO C standard, such as the sizes of data types, the re-
sults of signed arithmetic operations in case of overflow, and
the evaluation order. Other undefined C behaviors are con-
sistently turned into “going wrong” behaviors, such as deref-
erencing the null pointer or accessing arrays out of bounds.
Memory is modeled as a collection of disjoint blocks, each
block being accessed through byte offsets; pointer values are
pairs of a block identifier and a byte offset. This way, pointer
arithmetic is modeled accurately, even in the presence of
casts between incompatible pointer types.
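
As a minimal sketch of this memory model, pointers can be represented as block/offset pairs and memory as a two-level map. The names and the byte-level representation are invented for the example and do not match CompCert's Coq definitions.

(* Block-based memory model: a pointer is a block identifier plus a byte
   offset; distinct blocks are disjoint by construction. *)
module IntMap = Map.Make (Int)

type pointer = { block : int; offset : int }

(* Each block maps byte offsets to byte values; memory maps block ids to blocks. *)
type memory = int IntMap.t IntMap.t

let store (m : memory) (p : pointer) (byte : int) : memory =
  let blk = Option.value ~default:IntMap.empty (IntMap.find_opt p.block m) in
  IntMap.add p.block (IntMap.add p.offset byte blk) m

let load (m : memory) (p : pointer) : int option =
  Option.bind (IntMap.find_opt p.block m) (IntMap.find_opt p.offset)

(* Pointer arithmetic stays within a block: only the offset changes, so
   arithmetic or casts never confuse two distinct blocks. *)
let ptr_add (p : pointer) (delta : int) : pointer =
  { p with offset = p.offset + delta }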
3.2 Compilation passes and intermediate languages
The formally verified part of the CompCert compiler
translates from Clight abstract syntax to PPC abstract
syntax, PPC being a subset of PowerPC assembly language.
As depicted in figure 1, the compiler is composed of
14 passes that go through 8 intermediate languages. Not
detailed in figure 1 are the parts of the compiler that are not
verified yet: upstream, a parser, type-checker and simplifier
that generates Clight abstract syntax from C source files
and is based on the CIL library [21]; downstream, a printer
for PPC abstract syntax trees in concrete assembly syntax,
followed by generation of executable binary using the
system’s assembler and linker.
The front-end of the compiler translates away C-specific
features in two passes, going through the C#minor and Cmi-
nor intermediate languages. C#minor is a simplified, type-
less variant of Clight where distinct arithmetic operators are
provided for integers, pointers and floats, and C loops are re-
placed by infinite loops plus blocks and multi-level exits from
enclosing blocks. The first pass translates C loops accord-
ingly and eliminates all type-dependent behaviors: operator
overloading is resolved; memory loads and stores, as well as
address computations, are made explicit. The next inter-
mediate language, Cminor, is similar to C#minor with the
omission of the & (address-of) operator. Cminor function-
local variables do not reside in memory, and their address
cannot be taken. However, Cminor supports explicit stack
allocation of data in the activation records of functions. The
translation from C#minor to Cminor therefore recognizes
scalar local variables whose addresses are never taken, as-
signing them to Cminor local variables and making them
candidates for register allocation later; other local variables
are stack-allocated in the activation record.
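
A minimal sketch of that decision, with invented names and a simplified notion of "scalar local", might look as follows; the real transformation of course works on C#minor abstract syntax rather than a flat list of variables.

(* Partition function-local variables: scalar locals whose address is never
   taken become register-allocatable Cminor variables; the rest are laid out
   in the activation record. *)
module StringSet = Set.Make (String)

type local = { name : string; is_scalar : bool }

let partition_locals (locals : local list) (address_taken : StringSet.t) =
  (* first component: register candidates; second: stack-allocated *)
  List.partition
    (fun l -> l.is_scalar && not (StringSet.mem l.name address_taken))
    locals

let () =
  let regs, stack =
    partition_locals
      [ { name = "i"; is_scalar = true }; { name = "buf"; is_scalar = false } ]
      (StringSet.singleton "buf")
  in
  assert (List.length regs = 1 && List.length stack = 1)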
The compiler back-end starts with an instruction se-
lection pass, which recognizes opportunities for using
combined arithmetic instructions (add-immediate, not-and,
rotate-and-mask, etc.) and addressing modes provided by
the target processor. This pass proceeds by bottom-up
rewriting of Cminor expressions. The target language is
CminorSel, a processor-dependent variant of Cminor that
offers additional operators, addressing modes, and a class of
condition expressions (expressions evaluated for their truth
value only).
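
To illustrate bottom-up rewriting on a deliberately tiny, made-up expression language (this is not CompCert's Cminor or CminorSel), the sketch below first rewrites the sub-expressions, then recognizes additions with a constant operand and maps them to an add-immediate operator.

(* Instruction selection by bottom-up rewriting, on a toy expression type. *)
type expr =
  | Var of string
  | Const of int
  | Add of expr * expr
  | AddImm of int * expr        (* combined add-immediate operator *)

let rec select (e : expr) : expr =
  match e with
  | Var _ | Const _ -> e
  | AddImm (n, e1) -> AddImm (n, select e1)
  | Add (e1, e2) ->
      (match select e1, select e2 with
       | Const n, e', | e', Const n -> AddImm (n, e')   (* use add-immediate *)
       | e1', e2' -> Add (e1', e2'))

let () =
  assert (select (Add (Var "x", Const 8)) = AddImm (8, Var "x"))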
The next pass translates CminorSel to RTL, a classic
register transfer language where control is represented as a
control-flow graph (CFG). Each node of the graph carries
a machine-level instruction operating over temporaries
(pseudo-registers). RTL is a convenient representation to
conduct optimizations based on dataflow analyses. Two
such optimizations are currently implemented: constant
propagation and common subexpression elimination, the
latter being performed via value numbering over extended
basic blocks. A third optimization, lazy code motion,
was developed separately and will be integrated soon.
Unlike the other two optimizations, lazy code motion is
implemented following the verified validator approach [24].
After these optimizations, register allocation is performed
via coloring of an interference graph [6]. The output of this
pass is LTL, a language similar to RTL where temporaries
are replaced by hardware registers or abstract stack loca-
tions. The control-flow graph is then “linearized”, producing
a list of instructions with explicit labels, conditional and un-
conditional branches. Next, spills and reloads are inserted
around instructions that reference temporaries that were al-
located to stack locations, and moves are inserted around
function calls, prologues and epilogues to enforce calling con-
ventions. Finally, the “stacking” pass lays out the activation
records of functions, assigning offsets within this record to abstract stack locations and to saved callee-save registers, and replacing references to abstract stack locations by explicit memory loads and stores relative to the stack pointer.

Citations
Journal ArticleDOI
04 Jun 2011
TL;DR: Csmith, a randomized test-case generation tool, was created and used for three years to find compiler bugs; a collection of qualitative and quantitative results about the bugs it found is presented.
Abstract: Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.

799 citations


Cites background from "Formal verification of a realistic ..."

  • ...Wolfe [30], talking about independent software vendors (ISVs) says: An ISV with a complex code can work around correctness, turn off the optimizer in one or two files, and usually they have to do that for any of the compilers they use (emphasis ours)....


Proceedings ArticleDOI
08 Jan 2014
TL;DR: This work has developed and mechanically verified an ML system called CakeML, which supports a substantial subset of Standard ML, and its formally verified compiler can bootstrap itself: it applies the verified compiler to itself to produce a verified machine-code implementation of the compiler.
Abstract: We have developed and mechanically verified an ML system called CakeML, which supports a substantial subset of Standard ML. CakeML is implemented as an interactive read-eval-print loop (REPL) in x86-64 machine code. Our correctness theorem ensures that this REPL implementation prints only those results permitted by the semantics of CakeML. Our verification effort touches on a breadth of topics including lexing, parsing, type checking, incremental and dynamic compilation, garbage collection, arbitrary-precision arithmetic, and compiler bootstrapping.Our contributions are twofold. The first is simply in building a system that is end-to-end verified, demonstrating that each piece of such a verification effort can in practice be composed with the others, and ensuring that none of the pieces rely on any over-simplifying assumptions. The second is developing novel approaches to some of the more challenging aspects of the verification. In particular, our formally verified compiler can bootstrap itself: we apply the verified compiler to itself to produce a verified machine-code implementation of the compiler. Additionally, our compiler proof handles diverging input programs with a lightweight approach based on logical timeout exceptions. The entire development was carried out in the HOL4 theorem prover.

315 citations


Cites background or methods from "Formal verification of a realistic ..."

  • ...The CompCert compiler [14] and projects based on it, including CompcertTSO [29] and the Princeton “verified software toolchain” [1], focus on a C-like source language in contrast to our ML-like language....


  • ...The last decade has seen a strong interest in verified compilation; and there have been significant, high-profile results, many based on the CompCert compiler for C [1, 14, 16, 29]....


Proceedings ArticleDOI
03 Jun 2015
TL;DR: Verdi, a framework for implementing and formally verifying distributed systems in Coq, formalizes various network semantics with different faults, and enables the developer to first verify their system under an idealized fault model then transfer the resulting correctness guarantees to a more realistic fault model without any additional proof burden.
Abstract: Distributed systems are difficult to implement correctly because they must handle both concurrency and failures: machines may crash at arbitrary points and networks may reorder, drop, or duplicate packets. Further, their behavior is often too complex to permit exhaustive testing. Bugs in these systems have led to the loss of critical data and unacceptable service outages. We present Verdi, a framework for implementing and formally verifying distributed systems in Coq. Verdi formalizes various network semantics with different faults, and the developer chooses the most appropriate fault model when verifying their implementation. Furthermore, Verdi eases the verification burden by enabling the developer to first verify their system under an idealized fault model, then transfer the resulting correctness guarantees to a more realistic fault model without any additional proof burden. To demonstrate Verdi's utility, we present the first mechanically checked proof of linearizability of the Raft state machine replication algorithm, as well as verified implementations of a primary-backup replication system and a key-value store. These verified systems provide similar performance to unverified equivalents.

275 citations


Cites methods from "Formal verification of a realistic ..."

  • ...The CompCert C compiler [22] was verified in Coq and repeatedly shown to be more reliable than traditionally developed compilers [21, 41]....


Book ChapterDOI
03 Apr 2017
TL;DR: This work defines the EVM in Lem, a language that can be compiled for several interactive theorem provers, yielding the first formal EVM definition for smart contract verification that implements all instructions.
Abstract: Smart contracts in Ethereum are executed by the Ethereum Virtual Machine (EVM). We defined EVM in Lem, a language that can be compiled for a few interactive theorem provers. We tested our definition against a standard test suite for Ethereum implementations. Using our definition, we proved some safety properties of Ethereum smart contracts in an interactive theorem prover Isabelle/HOL. To our knowledge, ours is the first formal EVM definition for smart contract verification that implements all instructions. Our definition can serve as a basis for further analysis and generation of Ethereum smart contracts.

265 citations

Proceedings ArticleDOI
06 Mar 2017
TL;DR: This work presents a series of algorithms and an accompanying system that enables robots to autonomously synthesize policy descriptions and respond to both general and targeted queries by human collaborators, demonstrating applicability to a variety of robot controller types.
Abstract: Shared expectations and mutual understanding are critical facets of teamwork. Achieving these in human-robot collaborative contexts can be especially challenging, as humans and robots are unlikely to share a common language to convey intentions, plans, or justifications. Even in cases where human co-workers can inspect a robot's control code, and particularly when statistical methods are used to encode control policies, there is no guarantee that meaningful insights into a robot's behavior can be derived or that a human will be able to efficiently isolate the behaviors relevant to the interaction. We present a series of algorithms and an accompanying system that enables robots to autonomously synthesize policy descriptions and respond to both general and targeted queries by human collaborators. We demonstrate applicability to a variety of robot controller types including those that utilize conditional logic, tabular reinforcement learning, and deep reinforcement learning, synthesizing informative policy descriptions for collaborators and facilitating fault diagnosis by non-experts.

222 citations


Cites methods from "Formal verification of a realistic ..."

  • ...Traditional methods for acquiring an understanding of control logic typically involve some form of source code inspection and documentation review [2, 13] or formal verification given proper specifications [36, 23]....


References
Proceedings ArticleDOI
01 Jan 1997
TL;DR: It is shown in this paper how proof-carrying code might be used to develop safe assembly-language extensions of ML programs and the adequacy of concrete representations for the safety policy, the safety proofs, and the proof validation is proved.
Abstract: This paper describes proof-carrying code (PCC), a mechanism by which a host system can determine with certainty that it is safe to execute a program supplied (possibly in binary form) by an untrusted source. For this to be possible, the untrusted code producer must supply with the code a safety proof that attests to the code's adherence to a previously defined safety policy. The host can then easily and quickly validate the proof without using cryptography and without consulting any external agents.In order to gain preliminary experience with PCC, we have performed several case studies. We show in this paper how proof-carrying code might be used to develop safe assembly-language extensions of ML programs. In the context of this case study, we present and prove the adequacy of concrete representations for the safety policy, the safety proofs, and the proof validation. Finally, we briefly discuss how we use proof-carrying code to develop network packet filters that are faster than similar filters developed using other techniques and are formally guaranteed to be safe with respect to a given operating system safety policy.

1,799 citations

Book
12 Mar 2014
TL;DR: A practical introduction to the development of proofs and certified programs using Coq; an invaluable tool for researchers, students, and engineers interested in formal methods and the development of zero-fault software.
Abstract: A practical introduction to the development of proofs and certified programs using Coq. An invaluable tool for researchers, students, and engineers interested in formal methods and the development of zero-fault software.

1,514 citations


"Formal verification of a realistic ..." refers methods in this paper

  • ...This paper gives a high-level overview of the CompCert compiler and its mechanized verification, which uses the Coq proof assistant [7, 3]....


Book ChapterDOI
08 Apr 2002
TL;DR: The structure of CIL is described, with a focus on how it disambiguates those features of C that were found to be most confusing for program analysis and transformation, allowing a complete project to be viewed as a single compilation unit.
Abstract: This paper describes the C Intermediate Language: a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.Compared to C, CIL has fewer constructs. It breaks down certain complicated constructs of C into simpler ones, and thus it works at a lower level than abstract-syntax trees. But CIL is also more high-level than typical intermediate languages (e.g., three-address code) designed for compilation. As a result, what we have is a representation that makes it easy to analyze and manipulate C programs, and emit them in a form that resembles the original source. Moreover, it comes with a front-end that translates to CIL not only ANSI C programs but also those using Microsoft C or GNU C extensions.We describe the structure of CIL with a focus on how it disambiguates those features of C that we found to be most confusing for program analysis and transformation. We also describe a whole-program merger based on structural type equality, allowing a complete project to be viewed as a single compilation unit. As a representative application of CIL, we show a transformation aimed at making code immune to stack-smashing attacks. We are currently using CIL as part of a system that analyzes and instruments C programs with run-time checks to ensure type safety. CIL has served us very well in this project, and we believe it can usefully be applied in other situations as well.

1,065 citations

Book
14 May 2004
TL;DR: The similarity between Fixpoint and fix makes it easier to understand the need for the various parts of this construct, and the construction of higher-order types and simple inductive types defined inside a section is helpful to understanding the form of the induction principle.
Abstract: Excerpt from a textbook chapter on inductive data types in Coq. Abstraction builds non-recursive functions directly inside Calculus of Constructions terms without naming them, whereas the Fixpoint command always names the recursive function it defines, combining two operations: describing a recursive function and binding a constant to that function. The fix construct performs only the first operation and is therefore the recursive counterpart of abstraction; the identifier it binds denotes the function being defined, is usable only inside the expression of the construct, and bears no relation to the name under which the function is later known. The excerpt then introduces polymorphic inductive types: the parametric type of lists, whose parameters must not appear in the left-hand sides of pattern-matching clauses and are usually passed as implicit arguments (as in the concatenation function app, also written with the infix ++); the option type, which describes a partial function from A to B as a total function from A to option B, returning None where the function is undefined and Some y where its value would be y (for example a pred_option function on natural numbers, or an nth_option function returning the n-th element of a list); and the type of pairs. It also discusses the induction principles generated for parameterized definitions, in which the quantification over the parameter precedes the quantification over the property proved by induction.
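The excerpt's running examples can be written out as self-contained Coq definitions (the names mult2' and pred_option are the excerpt's own): fix binds the identifier f only inside the construct, and option turns the partial predecessor function into a total one.

Definition mult2' : nat -> nat :=
  fix f (n : nat) : nat :=
    match n with
    | 0 => 0
    | S p => S (S (f p))
    end.

Definition pred_option (n : nat) : option nat :=
  match n with
  | 0 => None
  | S p => Some p
  end.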

1,025 citations


"Formal verification of a realistic ..." refers background in this paper

  • ...Having proved properties (2) or (3) provides the same guarantee without having to equip the target and intermediate languages with sound type systems and to prove type preservation for the compiler....

    [...]

  • ...To address concern (3), ongoing work within the...

    [...]

  • ...(Here, Wrong is the set of “going wrong” behaviors.) Property (3) is generally much easier to prove than property (2), since the proof can proceed by induction on the execution of S. This is the approach that we take in this work....

    [...]

  • ...First, provided the target language of the compiler has deterministic semantics, an appropriate specification for the correctness proof of the compiler is the combination of definitions (3) and (6):...

    [...]
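Read together, these quotes let one state property (3) in the paper's notation; the following is a reconstruction from the quoted fragments rather than a verbatim excerpt:

∀ B ∉ Wrong,  S ⇓ B =⇒ C ⇓ B   (3)

that is, every behavior of the source program S other than "going wrong" is also a behavior of the compiled code C. Because the implication goes from S to C, it can be proved by induction on the execution of S, which is what makes it easier to establish than property (2).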

Proceedings ArticleDOI
Register allocation & spilling via graph coloring
Gregory J. Chaitin
01 Jun 1982
TL;DR: This work extends the graph coloring approach so that it naturally solves the spilling problem, producing better object code and taking much less compile time.
Abstract: In a previous paper we reported the successful use of graph coloring techniques for doing global register allocation in an experimental PL/I optimizing compiler. When the compiler cannot color the register conflict graph with a number of colors equal to the number of available machine registers, it must add code to spill and reload registers to and from storage. Previously the compiler produced spill code whose quality sometimes left much to be desired, and the ad hoc techniques used took considerable amounts of compile time. We have now discovered how to extend the graph coloring approach so that it naturally solves the spilling problem. Spill decisions are now made on the basis of the register conflict graph and cost estimates of the value of keeping the result of a computation in a register rather than in storage. This new approach produces better object code and takes much less compile time.
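As a rough illustration of the spill-cost idea described in this abstract, here is a hypothetical sketch in Coq (not the paper's algorithm; the record fields and function names are invented for the example): among the nodes of the register conflict graph, pick as spill candidate the one whose estimated cost of spilling is lowest relative to its degree.

Require Import List Arith.
Import ListNotations.

(* A node of the register conflict graph: an identifier, an estimated cost of
   spilling it (the value of keeping it in a register), and its degree, i.e.,
   its number of conflicts. *)
Record node := { reg_id : nat; spill_cost : nat; degree : nat }.

(* Prefer the node with the smaller cost/degree ratio, compared without
   division: c1/d1 <= c2/d2  <->  c1 * d2 <= c2 * d1 (for positive degrees). *)
Definition better_spill (a b : node) : node :=
  if Nat.leb (spill_cost a * degree b) (spill_cost b * degree a) then a else b.

(* Choose a spill candidate from a list of nodes, if any. *)
Fixpoint choose_spill (l : list node) : option node :=
  match l with
  | [] => None
  | n :: tl =>
      match choose_spill tl with
      | None => Some n
      | Some m => Some (better_spill n m)
      end
  end.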

895 citations

Frequently Asked Questions (11)
Q1. What contributions have the authors mentioned in the paper "Formal verification of a realistic compiler"?

This paper reports on the development and formal verification (proof of semantic preservation) of CompCert, a compiler from Clight (a large subset of the C programming language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness.

The CompCert experiment described in this paper is still ongoing, and much work remains to be done: handle a larger subset of C (e.g., including goto); deploy and prove correct more optimizations; target other processors beyond PowerPC; extend the semantic preservation proofs to shared-memory concurrency; etc. However, the preliminary results obtained so far provide strong evidence that the initial goal of formally verifying a realistic compiler can be achieved, within the limitations of today's proof assistants, and using only elementary semantic and algorithmic approaches. The techniques and tools the authors used are very far from perfect—more proof automation, higher-level semantics and more modern intermediate representations all have the potential to significantly reduce the proof effort—but good enough to achieve the goal. Composed with the CompCert back-end, these efforts could eventually result in a trusted execution path for programs written and verified in Coq, like CompCert itself, therefore increasing confidence further through a form of bootstrapping.

The behaviors the authors observe in CompCert include termination, divergence, and “going wrong” (invoking an undefined operation that could crash, such as accessing an array out of bounds). 
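A minimal sketch of such a classification as a Coq datatype (hypothetical: this is not CompCert's actual definition, and the type of input-output traces is left abstract):

(* Observable outcomes of running a program, each carrying the trace of
   input-output events performed before the outcome was reached. *)
Parameter trace : Type.

Inductive behavior : Type :=
  | Terminates : trace -> nat -> behavior  (* normal termination with an exit value *)
  | Diverges : trace -> behavior           (* the program runs forever *)
  | GoesWrong : trace -> behavior.         (* an undefined operation was invoked *)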

In the CompCert experiment and the remainder of this paper, the authors focus on source and target languages that are deterministic (programs change their behaviors only in response to different inputs but not because of internal choices) and on execution environments that are deterministic as well (the inputs given to the programs are uniquely determined by their previous outputs). 

For low-assurance software, validated only by testing, the impact of compiler bugs is low: what is tested is the executable code produced by the compiler; rigorous testing should expose compiler-introduced errors along with errors already present in the source program. 

The expected correctness property of the compiler is that it preserves the fact that the source code S satisfies the specification, a fact that has been established separately by formal verification of S:

S |= Spec =⇒ C |= Spec   (4)

It is easy to show that property (2) implies property (4) for all specifications Spec.

a certifying compiler can be constructed, at least theoretically, from a verified compiler, provided that the verification was conducted in a logic that follows the “propositions as types, proofs as programs” paradigm. 

provided the target language of the compiler has deterministic semantics, an appropriate specification for the correctness proof of the compiler is the combination of definitions (3) and (6):

∀ S, C, B ∉ Wrong,  Comp(S) = OK(C) ∧ S ⇓ B =⇒ C ⇓ B

validation by testing reaches its limits and needs to be complemented or even replaced by the use of formal methods such as model checking, static analysis, and program proof. 

By verified, the authors mean a compiler that is accompanied by a machine-checked proof of a semantic preservation property: the generated machine code behaves as prescribed by the semantics of the source program. 

Bugs in the compiler used to turn this formally verified source code into an executable can potentially invalidate all the guarantees so painfully obtained by the use of formal methods.