scispace - formally typeset
Search or ask a question
Author

Daryl Zuniga

Bio: Daryl Zuniga is an academic researcher from University of Washington. The author has contributed to research in topics: Liveness & Compiler. The author has an hindex of 1, co-authored 1 publications receiving 31 citations.

Papers
More filters
Proceedings ArticleDOI
02 Jun 2016
TL;DR: Peek is presented, a framework for expressing, verifying, and running meaning-preserving assembly-level program trans- formations in CompCert, and a set of local properties are proved are sufficient to ensure global transformation correctness.
Abstract: Transformations over assembly code are common in many compilers. These transformations are also some of the most bug-dense compiler components. Such bugs could be elim- inated by formally verifying the compiler, but state-of-the- art formally verified compilers like CompCert do not sup- port assembly-level program transformations. This paper presents Peek, a framework for expressing, verifying, and running meaning-preserving assembly-level program trans- formations in CompCert. Peek contributes four new com- ponents: a lower level semantics for CompCert x86 syntax, a liveness analysis, a library for expressing and verifying peephole optimizations, and a verified peephole optimiza- tion pass built into CompCert. Each of these is accompanied by a correctness proof in Coq against realistic assumptions about the calling convention and the system memory alloca- tor. Verifying peephole optimizations in Peek requires prov- ing only a set of local properties, which we have proved are sufficient to ensure global transformation correctness. We have proven these local properties for 28 peephole transfor- mations from the literature. We discuss the development of our new assembly semantics, liveness analysis, representa- tion of program transformations, and execution engine; de- scribe the verification challenges of each component; and detail techniques we applied to mitigate the proof burden.

40 citations


Cited by
More filters
Proceedings ArticleDOI
04 Sep 2016
TL;DR: This paper presents the overall structure of the compiler, including its 12 intermediate languages, and explains how everything fits together, and focuses particularly on the interaction between the verification of the register allocator and the garbage collector, and memory representations.
Abstract: We have developed and mechanically verified a new compiler backend for CakeML. Our new compiler features a sequence of intermediate languages that allows it to incrementally compile away high-level features and enables verification at the right levels of semantic detail. In this way, it resembles mainstream (unverified) compilers for strict functional languages. The compiler supports efficient curried multi-argument functions, configurable data representations, exceptions that unwind the call stack, register allocation, and more. The compiler targets several architectures: x86-64, ARMv6, ARMv8, MIPS-64, and RISC-V. In this paper, we present the overall structure of the compiler, including its 12 intermediate languages, and explain how everything fits together. We focus particularly on the interaction between the verification of the register allocator and the garbage collector, and memory representations. The entire development has been carried out within the HOL4 theorem prover.

68 citations

Posted Content
TL;DR: Souper, a synthesizing superoptimizer, was developed to see how far these ideas might be pushed in the context of LLVM, and it was discovered that Souper's intermediate representation was sufficiently similar to the one in Microsoft Visual C++ that it was applied to that compiler as well.
Abstract: If we can automatically derive compiler optimizations, we might be able to sidestep some of the substantial engineering challenges involved in creating and maintaining a high-quality compiler. We developed Souper, a synthesizing superoptimizer, to see how far these ideas might be pushed in the context of LLVM. Along the way, we discovered that Souper's intermediate representation was sufficiently similar to the one in Microsoft Visual C++ that we applied Souper to that compiler as well. Shipping, or about-to-ship, versions of both compilers contain optimizations suggested by Souper but implemented by hand. Alternately, when Souper is used as a fully automated optimization pass it compiles a Clang compiler binary that is about 3 MB (4.4%) smaller than the one compiled by LLVM.

50 citations

Proceedings ArticleDOI
14 Jun 2017
TL;DR: The current semantics of LLVM's IR fails to justify some cases of loop unswitching, global value numbering, and other important "textbook" optimizations, causing long-standing bugs.
Abstract: A central concern for an optimizing compiler is the design of its intermediate representation (IR) for code. The IR should make it easy to perform transformations, and should also afford efficient and precise static analysis. In this paper we study an aspect of IR design that has received little attention: the role of undefined behavior. The IR for every optimizing compiler we have looked at, including GCC, LLVM, Intel's, and Microsoft's, supports one or more forms of undefined behavior (UB), not only to reflect the semantics of UB-heavy programming languages such as C and C++, but also to model inherently unsafe low-level operations such as memory stores and to avoid over-constraining IR semantics to the point that desirable transformations become illegal. The current semantics of LLVM's IR fails to justify some cases of loop unswitching, global value numbering, and other important "textbook" optimizations, causing long-standing bugs. We present solutions to the problems we have identified in LLVM's IR and show that most optimizations currently in LLVM remain sound, and that some desirable new transformations become permissible. Our solutions do not degrade compile time or performance of generated code.

43 citations

Journal ArticleDOI
TL;DR: The overall design of the compiler backend is presented, including its 12 intermediate languages, and how the semantics and proofs fit together are explained and detail on how the compiler has been bootstrapped inside the logic of a theorem prover is provided.
Abstract: The CakeML compiler is, to the best of our knowledge, the most realistic verified compiler for a functional programming language to date. The architecture of the compiler, a sequence of intermediate languages through which high-level features are compiled away incrementally, enables verification of each compilation pass at an appropriate level of semantic detail. Parts of the compiler's implementation resemble mainstream (unverified) compilers for strict functional languages, and it supports several important features and optimisations. These include efficient curried multi-argument functions, configurable data representations, efficient exceptions, register allocation, and more. The compiler produces machine code for five architectures: x86-64, ARMv6, ARMv8, MIPS-64, and RISC-V. The generated machine code contains the verified runtime system which includes a verified generational copying garbage collector and a verified arbitrary precision arithmetic (bignum) library. In this paper, we present the overall design of the compiler backend, including its 12 intermediate languages. We explain how the semantics and proofs fit together and provide detail on how the compiler has been bootstrapped inside the logic of a theorem prover. The entire development has been carried out within the HOL4 theorem prover.

39 citations

Proceedings ArticleDOI
08 Jan 2018
TL;DR: Œuf as mentioned in this paper is a verified compiler from a subset of Gallina to assembly, which preserves the semantics of the source Gallina program and maintains a small TCB for its front-end by reflecting Gallina programs to Œufsource.
Abstract: Verifying systems by implementing them in the programming language of a proof assistant (e.g., Gallina for Coq) lets us directly leverage the full power of the proof assistant for verifying the system. But, to execute such an implementation requires extraction, a large complicated process that is in the trusted computing base (TCB). This paper presents Œuf, a verified compiler from a subset of Gallina to assembly. Œuf’s correctness theorem ensures that compilation preserves the semantics of the source Gallina program. We describe how Œuf’s specification can be used as a foreign function interface to reason about the interaction between compiled Gallina programs and surrounding shim code. Additionally, Œufmaintains a small TCB for its front-end by reflecting Gallina programs to Œufsource and automatically ensuring equivalence using computational denotation. This design enabled us to implement some early compiler passes (e.g., lambda lifting) in the untrusted reflection and ensure their correctness via translation validation. To evaluate Œuf, we compile Appel’s SHA256 specification from Gallina to x86 and write a shim for the generated code, yielding a verified sha256sum implementation with a small TCB.

35 citations