Book ChapterDOI

A Concrete Memory Model for CompCert

TL;DR: This paper presents the proof of an enhanced and more concrete memory model for the CompCert C compiler which assigns a definite meaning to more C programs and proves formally the soundness of CompCert’s abstract semantics of pointers.
Abstract: Semantics preserving compilation of low-level C programs is challenging because their semantics is implementation defined according to the C standard. This paper presents the proof of an enhanced and more concrete memory model for the CompCert C compiler which assigns a definite meaning to more C programs. In our new formally verified memory model, pointers are still abstract but are nonetheless mapped to concrete 32-bit integers. Hence, the memory is finite and it is possible to reason about the binary encoding of pointers. We prove that the existing memory model is an abstraction of our more concrete model thus validating formally the soundness of CompCert’s abstract semantics of pointers. We also show how to adapt the front-end of CompCert thus demonstrating that it should be feasible to port the whole compiler to our novel memory model.

Summary (4 min read)

1 Introduction

  • Yet, a theorem about the source code of safety-critical software is not sufficient.
  • The CompCert compiler [17] fills this verification gap: its semantics preservation theorem ensures that when the source program has a defined semantics, program invariants proved at source level still hold for the compiled code.
  • Yet, these approaches are, by essence, limited by the formal semantics of CompCert C: programs exhibiting undefined behaviours cannot benefit from any semantic preservation guarantee.
  • The authors prove that the existing memory model of CompCert is an abstraction of their model thus validating the soundness of the existing semantics.
  • The authors adapt the proof of CompCert’s front-end passes, from CompCert C until Cminor, thus demonstrating the feasibility of their endeavour.

2 A More Concrete Memory Model for CompCert

  • In previous work [3], the authors propose an enhanced memory model (with symbolic expressions) for CompCert.
  • The authors empirically verify, using the reference interpreter of CompCert, that their extension is sound with respect to the existing semantics and that it captures low-level C idioms out of reach of the existing memory model.
  • This section first recalls the main features of the current CompCert memory model and then explains their extension to this memory model.

2.1 CompCert’s Memory Model

  • Leroy et al. [18] give a thorough presentation of the existing memory model of CompCert, that is shared by all the languages of the compiler.
  • The authors give a brief overview of its design in order to highlight the differences with their own model.
  • Pointer arithmetic modifies the offset part of a location, keeping its block identifier part unchanged.
  • The free operation may also fail (e.g. when the locations to be freed have been freed already).
  • In the memory model, the byte-level, in-memory representation of integers and floats is exposed, while pointers are kept abstract [18].

2.2 Motivation for an Enhanced Memory Model

  • The authors' memory model with symbolic expressions [3] gives a precise semantics to low-level C idioms which cannot be modelled by the existing memory model.
  • Other examples are robust implementations of malloc: for the sake of checking the integrity of pointers, their trailing bits store a checksum.
  • This is possible because those pointers are also aligned and therefore the trailing bits are necessarily 0s.
  • The expected semantics is therefore that the program returns −1.
  • The transformation is correct and the target code generated by CompCert correctly returns −1.

2.3 A Memory Model with Symbolic Expressions

  • This model lacks an essential property of CompCert’s semantics: determinism.
  • Determinism is instrumental for the simulation proofs of the compiler passes and its absence is a show stopper.
  • The authors define the evaluation of expressions as the function ⟦·⟧cm, parametrised by the concrete mapping cm.
  • Pointers are turned into their concrete value, as dictated by cm.
  • The value of the expression is −1 whatever the value of undef and therefore the normalisation succeeds and returns, as expected, the value −1.

3 Proving the Operations of the Memory Model

  • CompCert’s memory model exports an interface summarising all the properties of the memory operations necessary to prove the compiler passes.
  • This section details how the properties and the proofs need to be adapted to accommodate symbolic expressions.
  • The authors also introduce an equivalence relation between symbolic expressions.

3.1 Precise Handling of Undefined Values

  • Symbolic expressions (as presented in Section 2.3) feature a unique undef token.
  • This is a shortcoming that the authors have identified during the proof.
  • With a single undef, the authors do not capture the fact that different occurrences of undef may represent the same unknown value, or different ones.
  • To overcome this problem, each byte of a newly allocated memory chunk is initialised with a fresh undef value.
  • Hence, x − x constructs the symbolic expression undef(b, o)− undef(b, o) for some b and o which obviously normalises to 0, because undef(b, o) now represents a unique value rather than the set of all values.

3.2 Memory Allocation

  • CompCert’s alloc operation always allocates a memory chunk of the requested size and returns a fresh block to the newly allocated memory (i.e. it models an infinite memory).
  • The first guarantee is that for every memory m there exists at least one concrete memory compatible with the abstract CompCert block-based memory.
  • To get this property, the alloc function runs a greedy algorithm constructing a compatible cm mapping.
  • Given a memory m, size_mem(m) returns the size of the constructed memory (i.e. the first fresh address as computed by the allocation).
  • The algorithm makes the pessimistic assumption that the allocated blocks are maximally aligned – for CompCert, this maximum is 3 bits (addresses are divisible by 2³).

3.3 Good Variable Properties

  • In CompCert, the so-called good variable properties axiomatise the behaviour of the memory operations.
  • The reverse operation is the concatenation of a symbolic expression sv1 with a symbolic expression sv2 representing a byte.
  • The authors have generalised and proved the axioms of the memory model using the same principle.
  • Moreover, while the structure of the proofs is similar, the proofs are complicated by the fact that the authors reason modulo normalisation of expressions.

4 Cross-validation of Memory Models

  • The semantics of the CompCert C language is part of the trusted computing base of the compiler.
  • If the resulting offset is outside the bounds, their normalisation returns undef.
  • After the easy fix, the authors found two interesting semantic discrepancies with the current semantics of CompCert C. When running the compiled program, the pointer is a mere integer; the integer eventually overflows, wraps around and becomes 0.
  • After adjusting both memory models, the authors are able to prove that both semantics agree when the existing CompCert C semantics is defined thus cross-validating the semantics of operators.

5 Redesign of Memory Injections

  • Memory injections are instrumental for proving the correctness of several compiler passes of CompCert.
  • A memory injection defines a mapping between memories; it is a versatile tool to explain how passes reorganise the memory (e.g. construct an activation record from local variables).
  • This section explains how to generalise this concept for symbolic expressions.
  • It requires a careful handling of undefined values undef(l) which are absent from the existing memory model.

5.1 Memory Injections in CompCert

  • The injection relation is defined over values (and called val_inject) and then lifted to memories (and called inject).
  • The val_inject relation distinguishes three cases: 1. For concrete values (i.e. integers or floating-point numbers), the relation is reflexive: e.g. int(i) is in relation with int(i); 2. ptr(b, i) is in relation with ptr(b′, i + δ) when f(b) = ⌊(b′, δ)⌋; 3. undef is in relation with any value (including undef).
  • The purpose of the injection is twofold: it establishes a relation between pointers using the function f but it can also specialise undef by a defined value.
  • In CompCert, so-called generic memory injections state that every valid location in memory m1 is mapped by function f into a valid location in memory m2; the corresponding location in m2 must be properly aligned with respect to the size of the block; and the values stored at corresponding locations must be in injection.
  • Among other conditions, the authors have that if several blocks in m1 are mapped to the same block in m2, the mapping ensures the absence of overlapping.

5.2 Memory Injection with Symbolic Expressions

  • The function f is still present and serves the same purpose.
  • The authors' injection expr_inject is therefore defined as the composition of the function apply_spe spe, which specialises undef(l) into concrete bytes, and the function apply_inj f, which injects locations.
  • This model makes the implicit assumption that memory blocks are always sufficiently aligned.
  • The existing formalisation of inject has a property mi_representable which states that the offset o+ δ obtained after injection does not overflow.

5.3 Memory Injection and Normalisation

  • The authors' normalisation is defined w.r.t. all the concrete memories compatible with the CompCert block-based memory (see Section 2.3).
  • Theorem norm_inject shows that under the condition that all blocks are injected, if e and e′ are in injection, then their normalisations are in injection too.
  • Thus, the normalisation can only get more defined after injection.
  • This is expected as the injection can merge blocks and therefore makes pointer arithmetic more defined.
  • A consequence of this theorem is that the compiler is not allowed to reduce the memory usage.

6 Proving the Front-end of the CompCert Compiler

  • Later compiler passes are architecture dependent and are therefore part of the back-end.
  • This section explains how to adapt the semantics preservation proofs of the front-end to their memory model with symbolic expressions.

6.1 CompCert Front-end with Symbolic Expressions

  • The semantics of all intermediate languages need to be modified in order to account for symbolic expressions.
  • In reality, the transformation is more subtle because, for instance, certain intermediate semantic functions explicitly require locations represented as pairs (b, o).
  • This solution turns out to be wrong and breaks semantics preservation proofs because the introduced normalisations may be absent in subsequent intermediate languages.
  • This pass does not transform the memory and therefore the existing proof can be reused.
  • The pass also performs type-directed transformations and removes redundant casts.

Allocation of local variables

  • This relation is too weak and fails to pass the induction step.
  • The problem is related to the preservation of the memory injection when allocating and de-allocating the variables in C#minor and the stack frame in Cminor.
  • Once again, the authors adapt the two-step proof with a direct induction over the number of variables.
  • To carry out this proof and establish an injection the authors have to reason about the relative sizes of the memories.
  • Here, the authors have to deal with the opposite situation where the stack frame could use less memory than the variables.

8 Conclusion

  • This work is a milestone towards a CompCert compiler proved correct with respect to a more concrete memory model.
  • A by-product of their work is that the authors have uncovered and fixed a problem in the existing semantics of comparisons with the null pointer.
  • The authors are confident that program optimisations based on static analyses will not be problematic.
  • Notwithstanding the remaining difficulties, the authors believe that the full CompCert compiler can be ported to their novel memory model.
  • This would improve further the confidence in the generated code.


HAL Id: hal-01194549
https://hal.inria.fr/hal-01194549
Submitted on 7 Sep 2015
A Concrete Memory Model for CompCert
Frédéric Besson, Sandrine Blazy, Pierre Wilke
To cite this version:
Frédéric Besson, Sandrine Blazy, Pierre Wilke. A Concrete Memory Model for CompCert. ITP 2015:
6th International Conference on Interactive Theorem Proving, Aug 2015, Nanjing, China. pp. 67-83,
10.1007/978-3-319-22102-1_5. hal-01194549

A Concrete Memory Model for CompCert
Frédéric Besson¹, Sandrine Blazy², and Pierre Wilke²
¹ Inria, Rennes, France
² Université Rennes 1 - IRISA, Rennes, France
Abstract. Semantics preserving compilation of low-level C programs is
challenging because their semantics is implementation defined according
to the C standard. This paper presents the proof of an enhanced and
more concrete memory model for the CompCert C compiler which as-
signs a definite meaning to more C programs. In our new formally verified
memory model, pointers are still abstract but are nonetheless mapped
to concrete 32-bit integers. Hence, the memory is finite and it is possible
to reason about the binary encoding of pointers. We prove that the ex-
isting memory model is an abstraction of our more concrete model thus
validating formally the soundness of CompCert’s abstract semantics of
pointers. We also show how to adapt the front-end of CompCert thus
demonstrating that it should be feasible to port the whole compiler to
our novel memory model.
1 Introduction
Formal verification of programs is usually performed at source level. Yet, a the-
orem about the source code of a safety critical software is not sufficient. Even-
tually, what we really value is a guarantee about the run-time behaviour of the
compiled program running on a physical machine. The CompCert compiler [17]
fills this verification gap: its semantics preservation theorem ensures that when
the source program has a defined semantics, program invariants proved at source
level still hold for the compiled code. For the C language the rules governing so
called undefined behaviours are subtle and the absence of undefined behaviours
is in general undecidable. As a corollary, for a given C program, it is undecidable
whether the semantic preservation applies or not.
To alleviate the problem, the semantics of CompCert C is executable and
it is therefore possible to check that a given program execution has a defined
semantics. Jourdan et al. [12] propose a more comprehensive and ambitious
approach: they formalise and verify a precise C static analyser for CompCert
capable of ruling out undefined behaviours for a wide range of programs. Yet,
these approaches are, by essence, limited by the formal semantics of CompCert
C: programs exhibiting undefined behaviours cannot benefit from any semantic
preservation guarantee. This is unfortunate as real programs do have behaviours
that are undefined according to the formal semantics of CompCert C (the official
C standard is in general even stricter). [This work was partially supported by
the French ANR-14-CE28-0014 AnaStaSec.] This can
be a programming mistake but sometimes this is a design feature. In the past,
serious security flaws have been introduced by optimising compilers aggressively
exploiting the latitude provided by undefined behaviours [22,6]. The existing
workaround is not satisfactory and consists in disabling optimisations known to
be triggered by undefined behaviours.
In previous work [3], we proposed a more concrete and defined semantics
for CompCert C able to give a semantics to low-level C idioms. This semantics
relies on symbolic expressions stored in memory that are normalised into genuine
values when needed by the semantics. It handles low-level C idioms that exploit
the concrete encoding of pointers (e.g. alignment constraints) or access partially
undefined data structures (e.g. bit-fields). Such properties cannot be reasoned
about using the existing CompCert memory model [19,18].
The memory model of CompCert consists of two parts: standard operations
on memory (e.g. alloc, store) that are used in the semantics of the languages of
CompCert and their properties (that are required to prove the semantic preser-
vation of the compiler), together with generic transformations operating over
memory. Indeed, certain passes of the compiler perform non-trivial transforma-
tions on memory allocations and accesses: for instance, in the front-end, C local
variables initially mapped to individually-allocated memory blocks are later on
mapped to sub-blocks of a single stack-allocated activation record. Proving the
semantic preservation of these transformations requires extensive reasoning over
memory states, using memory invariants relating memory states during program
execution, that are also defined in the memory model.
In this paper, we extend the memory model of CompCert with symbolic ex-
pressions [3] and tackle the challenge of porting memory transformations and
CompCert’s proofs to our memory model with symbolic expressions. The com-
plete Coq development is available online [1]. Among others, a difficulty is that
we drop the implicit assumption of an infinite memory. This has the consequence
that allocation can fail. Hence, the compiler has to ensure that the compiled pro-
gram is using less memory than the source program.
This paper presents a milestone towards a CompCert compiler adapted with
our semantics; it makes the following contributions.
We present a formal verification of our memory model within CompCert.
We prove that the existing memory model of CompCert is an abstraction of
our model thus validating the soundness of the existing semantics.
We extend the notion of memory injection, the main generic notion of mem-
ory transformation.
We adapt the proof of CompCert’s front-end passes, from CompCert C until
Cminor, thus demonstrating the feasibility of our endeavour.
The paper is organised as follows. Section 2 recalls the main features of
the existing CompCert memory model and our proposed extension. Section 3
explains how to adapt the operations of the existing CompCert memory model
to comply with the new requirements of our memory model. Section 4 shows
that the existing memory model is, in a provable way, an abstraction of our
new memory model. Section 5 presents our re-design of the notion of memory

injection that is the cornerstone of compiler passes modifying the memory layout.
Section 6 details the modifications for the proofs for the compiler front-end
passes. Related work is presented in Section 7; Section 8 concludes.
2 A More Concrete Memory Model for CompCert
In previous work [3], we propose an enhanced memory model (with symbolic
expressions) for CompCert. The model is implemented and evaluated over a
representative set of C programs. We empirically verify, using the reference in-
terpreter of CompCert, that our extension is sound with respect to the existing
semantics and that it captures low-level C idioms out of reach of the existing
memory model. This section first recalls the main features of the current Comp-
Cert memory model and then explains our extension to this memory model.
2.1 CompCert’s Memory Model
Leroy et al. [18] give a thorough presentation of the existing memory model of
CompCert, that is shared by all the languages of the compiler. We give a brief
overview of its design in order to highlight the differences with our own model.
Abstract values used in the semantics of the CompCert languages (see [19])
are the disjoint union of 32-bit integers (written as int(i)), 32-bit floating-
point numbers (written as float(f)), locations (written as ptr(l)), and the
special value undef representing an arbitrary bit pattern, such as the value of an
uninitialised variable. The abstract memory is viewed as a collection of separated
blocks. A location l is a pair (b, i) where b is a block identifier (i.e. an abstract
address) and i is an integer offset within this block. Pointer arithmetic modifies
the offset part of a location, keeping its block identifier part unchanged. A pointer
ptr(b, i) is valid for a memory M (written valid_pointer(M, b, i)) if the offset i
is within the two bounds of the block b.
Abstract values are loaded from (resp. stored into) memory using the load
(resp. store) memory operation. Memory chunks appear in these operations, to
describe concisely the size, type and signedness of the value being stored. These
operations return option types: we write ∅ for failure and ⌊x⌋ for a successful
return of a value x. The free operation may also fail (e.g. when the locations to
be freed have been freed already). The memory operation alloc never fails, as
the size of the memory is unbounded.
In the memory model, the byte-level, in-memory representation of integers
and floats is exposed, while pointers are kept abstract [18]. The concrete memory
is modelled as a map associating to each location a concrete value cv that is a
byte-sized quantity describing the current content of a memory cell. It can be
either a concrete 8-bit integer (written as bytev(b)) representing a part of an
integer or a float, ptrv(l, i) to represent the i-th byte (i ∈ {1, 2, 3, 4}) of the
location l, or undefv to model uninitialised memory.

struct {
  int a0 : 1; int a1 : 1;
} bf;

int main() {
  bf.a1 = 1; return bf.a1;
}

(a) Bitfield in C

struct { unsigned char bf1; } bf;

int main() {
  bf.bf1 = (bf.bf1 & ~0x2U) |
           ((unsigned int) 1 << 1U & 0x2U);
  return (int) (bf.bf1 << 30) >> 31;
}

(b) Bitfield in CompCert C
Fig. 1: Emulation of bitfields in CompCert
2.2 Motivation for an Enhanced Memory Model
Our memory model with symbolic expressions [3] gives a precise semantics to
low-level C idioms which cannot be modelled by the existing memory model. The
reason is that those idioms either exploit the binary representation of pointers as
integers or reason about partially uninitialised data. For instance, it is common
for system calls, e.g. mmap or sbrk, to return −1 (instead of a pointer) to indicate
that there is no memory available. Intuitively, −1 refers to the last memory
address 0xFFFFFFFF and this cannot be a valid address because mmap returns
pointers that are aligned: their trailing bits are necessarily 0s. Other examples
are robust implementations of malloc: for the sake of checking the integrity of
pointers, their trailing bits store a checksum. This is possible because those
pointers are also aligned and therefore the trailing bits are necessarily 0s.
Another motivation is illustrated by the current handling of bitfields in
CompCert: they are emulated in terms of bit-level operations by an elabora-
tion pass preceding the formally verified front-end. Fig. 1 gives an example of
such a transformation. The program defines a bitfield bf such that a0 and a1 are
1 bit long. The main function sets the field a1 of bf to 1 and then returns this
value. The expected semantics is therefore that the program returns 1. The trans-
formed code (Fig. 1b) is not very readable but the gist of it is that field accesses
are encoded using bitwise and shift operators. The transformation is correct and
the target code generated by CompCert correctly returns 1. However, using the
existing memory model, the semantics is undefined. Indeed, the program starts
by reading the field __fd1 of the uninitialised structure bf. The value is therefore
undef. Moreover, shift and bitwise operators are strict in undef and therefore
return undef. As a result, the program returns undef. As we show in the next
section, our semantics is able to model partially undefined values and therefore
gives a semantics to bitfields. Even though this case could be easily solved by
modifying the pre-processing step, C programmers might themselves write such
low-level code with reads of undefined memory and expect it to behave correctly.
2.3 A Memory Model with Symbolic Expressions
To give a semantics to the previous idioms, a direct approach is to have a fully
concrete memory model where a pointer is a genuine integer and the memory is

Citations
Journal ArticleDOI
TL;DR: VIP, introduced in this paper together with the RefinedC-VIP verification tool, is a new memory object model aimed at supporting C verification; it sidesteps the complexities of PNVI with a simple but effective idea: a new construct that lets programmers express the intended provenances of integer-pointer casts explicitly.
Abstract: Systems code often requires fine-grained control over memory layout and pointers, expressed using low-level ( e.g. , bitwise) operations on pointer values. Since these operations go beyond what basic pointer arithmetic in C allows, they are performed with the help of integer-pointer casts . Prior work has explored increasingly realistic memory object models for C that account for the desired semantics of integer-pointer casts while also being sound w.r.t. compiler optimisations, culminating in PNVI, the preferred memory object model in ongoing discussions within the ISO WG14 C standards committee. However, its complexity makes it an unappealing target for verification, and no tools currently exist to verify C programs under PNVI. In this paper, we introduce VIP, a new memory object model aimed at supporting C verification. VIP sidesteps the complexities of PNVI with a simple but effective idea: a new construct that lets programmers express the intended provenances of integer-pointer casts explicitly. At the same time, we prove VIP compatible with PNVI, thus enabling verification on top of VIP to benefit from PNVI’s validation with respect to practice. In particular, we build a verification tool, RefinedC-VIP, for verifying programs under VIP semantics. As the name suggests, RefinedC-VIP extends the recently developed RefinedC tool, which is automated yet also produces foundational proofs in Coq. We evaluate RefinedC-VIP on a range of systems-code idioms, and validate VIP’s expressiveness via an implementation in the Cerberus C semantics.
Dissertation
11 Dec 2019
TL;DR: This thesis is based on a software (and not hardware) fault isolation technique; it proposes two semantics for it, single-threaded and multi-threaded, as well as a static analyzer based on abstract interpretation, and presents a proof of correctness for the analyzer.
Abstract: We are used to use computers on which programs from diverse origins are installed and running at the same time. Each of these programs need to access memory for proper operation, but none of them should access or modify the memory of another. If this happened, programs would not be able to trust their memory and could start behaving erratically. Still, programmers do not need to coordinate and agree in advance on what parts of the memory they are allowed to use or not. Hardware takes care of allocating distinct memory zones for each program. This is completely transparent to the programmer. A malware cannot access or modify the memory of another program to attack it directly either. However, there exists a category of programs that do not benefit from this protection: modules that extend the features of other programs, such as plugins in a web browser. This thesis is based on a software (and not hardware) fault isolation technique, and proposes two semantics for it, single-threaded and multi-threaded, as well as a static analyzer based on abstract interpretation. We also present a proof of correctness for the analyzer.
References
Journal ArticleDOI
TL;DR: This paper reports on the development and formal verification of CompCert, a compiler from Clight (a large subset of the C programming language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness.
Abstract: This paper reports on the development and formal verification (proof of semantic preservation) of CompCert, a compiler from Clight (a large subset of the C programming language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness. Such a verified compiler is useful in the context of critical software and its formal verification: the verification of the compiler guarantees that the safety properties proved on the source code hold for the executable compiled code as well.

1,124 citations


"A Concrete Memory Model for CompCer..." refers background or methods in this paper

  • ...The CompCert C semantics [5] provides the specification for the correctness of the CompCert compiler [17]....


  • ...[9,15,17])....


  • ...The CompCert compiler [17] fills this verification gap: its semantics preservation theorem ensures that when the source program has a defined semantics, program invariants proved at source level still hold for the compiled code....


Journal ArticleDOI
04 Jun 2011
TL;DR: To improve the quality of C compilers, the authors created Csmith, a randomized test-case generation tool, spent three years using it to find compiler bugs, and present a collection of qualitative and quantitative results about the bugs it found.
Abstract: Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.

799 citations


"A Concrete Memory Model for CompCer..." refers methods in this paper

  • ...With this respect, the CompCert C semantics successfully run hundreds of random test programs generated by CSmith [23]....


Book ChapterDOI
20 Aug 2009
TL;DR: This paper motivates VCC, describes its verification methodology and architecture, and reports on the experience of using VCC to verify the Microsoft Hyper-V hypervisor.
Abstract: VCC is an industrial-strength verification environment for low-level concurrent system code written in C. VCC takes a program (annotated with function contracts, state assertions, and type invariants) and attempts to prove the correctness of these annotations. It includes tools for monitoring proof attempts and constructing partial counterexample executions for failed proofs. This paper motivates VCC, describes our verification methodology, describes the architecture of VCC, and reports on our experience using VCC to verify the Microsoft Hyper-V hypervisor.

584 citations


"A Concrete Memory Model for CompCer..." refers methods in this paper

  • ...VCC [7] generates verification conditions using an abstract typed memory model [8] where the memory is a mapping from typed pointers to structured C values....


Journal ArticleDOI
25 Jan 2012
TL;DR: The semantics is shown capable of automatically finding program errors, both statically and at runtime, and it is also used to enumerate nondeterministic behavior.
Abstract: This paper describes an executable formal semantics of C. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 99.2% of 776 test programs. It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an interpreter, debugger, state space search tool, and model checker "for free". The semantics is shown capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate nondeterministic behavior.

209 citations


Additional excerpts

  • ...[9,15,17])....


17 Jul 2011
TL;DR: In this paper, the authors present an executable formal semantics of C. The semantics yields an interpreter, debugger, state space search tool, and model checker, which is shown capable of automatically finding program errors, both statically and at runtime.
Abstract: This paper describes an executable formal semantics of C. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 770 of 776 test programs. It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an interpreter, debugger, state space search tool, and model checker “for free”. The semantics is shown capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate nondeterministic behavior.

188 citations

Frequently Asked Questions (2)
Q1. What are the contributions in "A concrete memory model for compcert" ?

This paper presents the proof of an enhanced and more concrete memory model for the CompCert C compiler which assigns a definite meaning to more C programs. The authors prove that the existing memory model is an abstraction of their more concrete model, thus validating formally the soundness of CompCert's abstract semantics of pointers. The authors also show how to adapt the front-end of CompCert, thus demonstrating that it should be feasible to port the whole compiler to their novel memory model.

As future work, the authors shall study how to adapt the back-end of CompCert. Withstanding the remaining difficulties, the authors believe that the full CompCert compiler can be ported to their novel memory model. This would improve further the confidence in the generated code.