Open Access, Journal Article

A Verified CompCert Front-End for a Memory Model Supporting Pointer Arithmetic and Uninitialised Data

TLDR
This work proposes a novel memory model for CompCert which gives a defined semantics to challenging features such as bitwise pointer arithmetic and access to uninitialised data, and shows how to tame the expressive power of the normalisation so that the memory model fits the proof framework of CompCert.
Abstract
The CompCert C compiler guarantees that the target program behaves as the source program. Yet, source programs without a defined semantics do not benefit from this guarantee and could therefore be miscompiled. To reduce the possibility of a miscompilation, we propose a novel memory model for CompCert which gives a defined semantics to challenging features such as bitwise pointer arithmetic and access to uninitialised data. We evaluate our memory model both theoretically and experimentally. In our experiments, we identify pervasive low-level C idioms that require the additional expressiveness provided by our memory model. We also show that our memory model provably subsumes the existing CompCert memory model, thus cross-validating both semantics. Our memory model relies on the core concepts of symbolic value and normalisation. A symbolic value models a delayed computation and the normalisation turns, when possible, a symbolic value into a genuine value. We show how to tame the expressive power of the normalisation so that the memory model fits the proof framework of CompCert. We also adapt the proofs of correctness of the compiler passes performed by CompCert's front-end, thus demonstrating that our model is well-suited for proving compiler transformations.


Frequently Asked Questions (11)
Q1. What have the authors contributed in "A Verified CompCert Front-End for a Memory Model Supporting Pointer Arithmetic and Uninitialised Data"?

To reduce the possibility of a miscompilation, the authors propose a novel memory model for CompCert which gives a defined semantics to challenging features such as bitwise pointer arithmetic and access to uninitialised data. In their experiments, the authors identify pervasive low-level C idioms that require the additional expressiveness provided by their memory model. The authors also show that their memory model provably subsumes the existing CompCert memory model, thus cross-validating both semantics. The authors show how to tame the expressive power of the normalisation so that the memory model fits the proof framework of CompCert. The authors also adapt the proofs of correctness of the compiler passes performed by CompCert's front-end, thus demonstrating that their model is well-suited for proving compiler transformations.

As future work, the authors plan to study how to adapt the back-end of CompCert. In spite of the remaining difficulties, the authors believe that the full CompCert compiler can be ported to their novel memory model. This would further improve confidence in the generated code.

Because the memory is infinite in CClight, this program has defined semantics, and the simulation the authors are trying to prove requires that this program have defined semantics in SClight as well.

Note that, to accommodate alignment and padding, the stack frame might allocate more bytes than the size of the variables themselves.
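
The effect of alignment and padding on the frame size can be illustrated with a minimal sketch. The layout rule below (variables placed in declaration order, each rounded up to its own alignment, the total rounded up to the largest alignment) is an assumption made for illustration and is not taken from CompCert; `align_up` and `frame_size` are hypothetical names.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def frame_size(variables):
    """Size of a frame holding (size, alignment) pairs laid out in order.

    Assumed layout rule, for illustration only: each variable starts at the
    next offset compatible with its alignment, and the whole frame is padded
    to the largest alignment seen.
    """
    offset = 0
    max_align = 1
    for size, alignment in variables:
        offset = align_up(offset, alignment)  # padding before the variable
        offset += size
        max_align = max(max_align, alignment)
    return align_up(offset, max_align)        # padding at the end of the frame


# A char, an int, then another char: 6 bytes of actual data.
local_vars = [(1, 1), (4, 4), (1, 1)]
raw_bytes = sum(size for size, _ in local_vars)
padded_bytes = frame_size(local_vars)
print(raw_bytes, padded_bytes)
```

Under these assumptions the frame takes 12 bytes for 6 bytes of variables, which is the kind of gap the remark above refers to.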

The solution is to normalise symbolic values more eagerly, i.e. before any write into memory or into a register, and to keep symbolic values only when the normalisation fails.
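
A toy model of this eager strategy is sketched below. It is not the authors' Coq development: symbolic values are plain tuples, undefined data ranges over 8-bit values, and `normalise` is a brute-force stand-in that succeeds only when every instantiation of the undefined values yields the same result. All names (`eval_sv`, `normalise`, `store_eager`) are hypothetical.

```python
from itertools import product


def eval_sv(sv, env):
    """Evaluate a toy symbolic value under an assignment of undef names."""
    tag = sv[0]
    if tag == "int":
        return sv[1]
    if tag == "undef":
        return env[sv[1]]
    a, b = eval_sv(sv[1], env), eval_sv(sv[2], env)
    if tag == "add":
        return (a + b) & 0xFF
    if tag == "and":
        return a & b
    if tag == "or":
        return a | b
    raise ValueError(f"unknown operator: {tag}")


def undef_names(sv):
    """Collect the names of the undefined values occurring in sv."""
    if sv[0] == "undef":
        return {sv[1]}
    if sv[0] == "int":
        return set()
    return undef_names(sv[1]) | undef_names(sv[2])


def normalise(sv):
    """Return ("int", n) if sv has the same value for every instantiation
    of its undefined values (8-bit domain, an assumption), else None."""
    names = sorted(undef_names(sv))
    results = set()
    for values in product(range(256), repeat=len(names)):
        results.add(eval_sv(sv, dict(zip(names, values))))
        if len(results) > 1:
            return None
    return ("int", results.pop())


def store_eager(memory, addr, sv):
    """Normalise before writing; keep the symbolic value only on failure."""
    normal = normalise(sv)
    memory[addr] = normal if normal is not None else sv


memory = {}
store_eager(memory, 0, ("add", ("int", 1), ("int", 2)))
store_eager(memory, 1, ("and", ("undef", "u"), ("int", 0)))
store_eager(memory, 2, ("or", ("undef", "u"), ("int", 1)))
print(memory)
```

In this sketch the first two writes store genuine values, while the third stays symbolic because its result depends on the undefined byte.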

This is done by converting each smemval into a symbolic value, and then concatenating those symbolic values: the concat function recovers the 64-bit bitvector that represents the original symbolic value, and the decode function applies the from_bits function to the result of concat with the appropriate chunk. 
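
The decoding pipeline can be pictured with concrete integers standing in for symbolic values. This is a simplified sketch under stated assumptions: bytes are little-endian, the chunk names and the `CHUNKS` table are hypothetical, and `to_smemvals` is a helper added only to produce test data. Only the names `concat`, `decode` and `from_bits` come from the text above.

```python
# Hypothetical chunk table: name -> (width in bits, signed?)
CHUNKS = {
    "int8signed": (8, True),
    "int16unsigned": (16, False),
    "int32": (32, True),
    "int64": (64, True),
}


def to_smemvals(value, nbytes=8):
    """Split a bitvector into little-endian bytes (stand-ins for smemvals)."""
    return [(value >> (8 * i)) & 0xFF for i in range(nbytes)]


def concat(byte_values):
    """Reassemble the bytes into a single bitvector."""
    result = 0
    for i, byte in enumerate(byte_values):
        result |= (byte & 0xFF) << (8 * i)
    return result


def from_bits(bits, chunk):
    """Interpret the low bits of a bitvector according to a chunk."""
    width, signed = CHUNKS[chunk]
    value = bits & ((1 << width) - 1)
    if signed and value >> (width - 1):
        value -= 1 << width  # sign-extend
    return value


def decode(smemvals, chunk):
    """Concatenate the bytes, then apply from_bits with the given chunk."""
    return from_bits(concat(smemvals), chunk)


original = 0x1122334455667788
stored = to_smemvals(original)
print(hex(concat(stored)), hex(decode(stored, "int16unsigned")))
```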

Algorithm 2 explains how the authors normalise symbolic values into pointers, and is based on the fact that a symbolic value sv can only have ptr(b, o) as normalisation if b appears syntactically in sv. 
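
The syntactic restriction on candidate blocks can be sketched as follows. This is not Algorithm 2 itself: the check against a fixed list of sample concrete memories is a brute-force stand-in chosen for illustration, addresses are 32-bit, and the names `eval_sv`, `blocks_of` and `normalise_ptr` are hypothetical.

```python
MASK = 0xFFFFFFFF  # 32-bit addresses (assumption)


def eval_sv(sv, cm):
    """Evaluate a toy symbolic value in a concrete memory cm: block -> address."""
    tag = sv[0]
    if tag == "int":
        return sv[1]
    if tag == "ptr":
        return (cm[sv[1]] + sv[2]) & MASK
    a, b = eval_sv(sv[1], cm), eval_sv(sv[2], cm)
    if tag == "add":
        return (a + b) & MASK
    if tag == "and":
        return a & b
    if tag == "or":
        return a | b
    raise ValueError(f"unknown operator: {tag}")


def blocks_of(sv):
    """Blocks that appear syntactically in sv: the only pointer candidates."""
    if sv[0] == "ptr":
        return {sv[1]}
    if sv[0] == "int":
        return set()
    return blocks_of(sv[1]) | blocks_of(sv[2])


def normalise_ptr(sv, memories):
    """Try each syntactic block b: guess the offset from the first concrete
    memory, then check that ptr(b, o) agrees with sv in all sample memories."""
    first = memories[0]
    for b in sorted(blocks_of(sv)):
        o = (eval_sv(sv, first) - first[b]) & MASK
        if all(eval_sv(sv, cm) == (cm[b] + o) & MASK for cm in memories):
            return ("ptr", b, o)
    return None


# Sample concrete memories with 4-byte-aligned block addresses.
MEMORIES = [{"b1": 16, "b2": 32}, {"b1": 32, "b2": 16}, {"b1": 64, "b2": 128}]

# Pointer tagging idiom: set the low bit, then clear it again.
tagged = ("and", ("or", ("ptr", "b1", 0), ("int", 1)), ("int", 0xFFFFFFFE))
print(normalise_ptr(tagged, MEMORIES))
```

A symbolic value with no block, such as `("int", 20)`, has no candidate at all and therefore cannot normalise to a pointer in this sketch.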

The algorithm checks that all the blocks fit in memory by running the function fresh_addr, which constructs, as a witness, a valid concrete memory cm and returns the first fresh address addr.
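
One way to picture such a function is sketched below. The signature, the sequential layout, the reservation of address 0 for the null pointer and the `None` result on overflow are all assumptions made for illustration; only the name `fresh_addr` and its role (build a concrete memory, return the first fresh address) come from the text.

```python
def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def fresh_addr(blocks, memory_size=1 << 32):
    """Lay out (name, size, alignment) blocks one after another.

    Returns (cm, addr), where cm maps each block to its concrete address and
    addr is the first address past the last block, or None if the blocks do
    not fit in memory_size bytes.
    """
    cm = {}
    addr = 1  # assumption: address 0 is kept for the null pointer
    for name, size, alignment in blocks:
        addr = align_up(addr, alignment)
        cm[name] = addr
        addr += size
    if addr >= memory_size:
        return None  # no valid concrete memory with this layout
    return cm, addr


layout = fresh_addr([("b1", 4, 4), ("b2", 10, 8)])
too_small = fresh_addr([("b1", 4, 4), ("b2", 10, 8)], memory_size=16)
print(layout, too_small)
```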

Because of the alignment constraints on the block b, this symbolic value simplifies to 3 == 0, which in turn evaluates to int(0). 
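
The role of the alignment constraint can be checked by brute force on a hypothetical expression of the same shape, `((addr + 3) & 3) == 0`, which is not necessarily the one discussed in the paper. The sketch evaluates it over every candidate address of the block and reports a normalisation only when all results agree.

```python
def eval_at(addr: int) -> int:
    """Hypothetical comparison ((addr + 3) & 3) == 0, as a C-style 0/1 result."""
    return int(((addr + 3) & 3) == 0)


def normalise(candidate_addresses):
    """Return ("int", n) if the expression is the same for every address."""
    results = {eval_at(addr) for addr in candidate_addresses}
    return ("int", results.pop()) if len(results) == 1 else None


aligned = range(4, 256, 4)   # block addresses that respect 4-byte alignment
unaligned = range(1, 256)    # same range without the alignment constraint

with_alignment = normalise(aligned)
without_alignment = normalise(unaligned)
print(with_alignment, without_alignment)
```

With the alignment constraint the low two bits of `addr + 3` are always 3, so the comparison is `3 == 0` in every concrete memory and normalises to `int(0)`; without the constraint the result varies and no normalisation exists.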

Certain undefined behaviours of C were introduced on purpose to ease either the portability of the language across platforms or the development of efficient compilers.

Another motivation is illustrated by the current handling of bit-fields in CompCert: they are emulated in terms of bit-level operations by an elaboration pass preceding the formally-verified front-end.