scispace - formally typeset
Search or ask a question
Proceedings ArticleDOI

Into the depths of C: elaborating the de facto standards

02 Jun 2016-Vol. 51, Iss: 6, pp 1-15
TL;DR: An in-depth analysis of the design space for the semantics of pointers and memory in C as it is used in practice is described, a step towards clear, consistent, and accepted semantics for the various use-cases of C.
Abstract: C remains central to our computing infrastructure. It is notionally defined by ISO standards, but in reality the properties of C assumed by systems code and those implemented by compilers have diverged, both from the ISO standards and from each other, and none of these are clearly understood. We make two contributions to help improve this error-prone situation. First, we describe an in-depth analysis of the design space for the semantics of pointers and memory in C as it is used in practice. We articulate many specific questions, build a suite of semantic test cases, gather experimental data from multiple implementations, and survey what C experts believe about the de facto standards. We identify questions where there is a consensus (either following ISO or differing) and where there are conflicts. We apply all this to an experimental C implemented above capability hardware. Second, we describe a formal model, Cerberus, for large parts of C. Cerberus is parameterised on its memory model; it is linkable either with a candidate de facto memory object model, under construction, or with an operational C11 concurrency model; it is defined by elaboration to a much simpler Core language for accessibility, and it is executable as a test oracle on small examples. This should provide a solid basis for discussion of what mainstream C is now: what programmers and analysis tools can assume and what compilers aim to implement. Ultimately we hope it will be a step towards clear, consistent, and accepted semantics for the various use-cases of C.

Content maybe subject to copyright    Report

Citations
More filters
01 Jan 1978
TL;DR: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.), and is a "must-have" reference for every serious programmer's digital library.
Abstract: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.). One of the best-selling programming books published in the last fifty years, "K&R" has been called everything from the "bible" to "a landmark in computer science" and it has influenced generations of programmers. Available now for all leading ebook platforms, this concise and beautifully written text is a "must-have" reference for every serious programmers digital library. As modestly described by the authors in the Preface to the First Edition, this "is not an introductory programming manual; it assumes some familiarity with basic programming concepts like variables, assignment statements, loops, and functions. Nonetheless, a novice programmer should be able to read along and pick up the language, although access to a more knowledgeable colleague will help."

2,120 citations

Proceedings ArticleDOI
23 Apr 2017
TL;DR: SGXBounds is an efficient memory-safety approach for shielded execution exploiting the architectural features of Intel SGX based on the LLVM compiler framework targeting unmodified multithreaded applications and has performance and memory overheads similar to AddressSanitizer and Intel MPX.
Abstract: Shielded execution based on Intel SGX provides strong security guarantees for legacy applications running on untrusted platforms. However, memory safety attacks such as Heartbleed can render the confidentiality and integrity properties of shielded execution completely ineffective. To prevent these attacks, the state-of-the-art memory-safety approaches can be used in the context of shielded execution.In this work, we first showcase that two prominent software- and hardware-based defenses, AddressSanitizer and Intel MPX respectively, are impractical for shielded execution due to high performance and memory overheads. This motivated our design of SGXBounds---an efficient memory-safety approach for shielded execution exploiting the architectural features of Intel SGX. Our design is based on a simple combination of tagged pointers and compact memory layout.We implemented SGXBounds based on the LLVM compiler framework targeting unmodified multithreaded applications. Our evaluation using Phoenix, PARSEC, and RIPE benchmark suites shows that SGXBounds has performance and memory overheads of 17% and 0.1% respectively, while providing security guarantees similar to AddressSanitizer and Intel MPX. We have obtained similar results with SPEC CPU2006 and four real-world case studies: SQLite, Memcached, Apache, and Nginx.

129 citations

Proceedings ArticleDOI
19 May 2019
TL;DR: This work provides a systematic overview of sanitizers with an emphasis on their role in finding security issues, taxonomize the available tools and the security vulnerabilities they cover, describe their performance and compatibility properties, and highlight various trade-offs.
Abstract: The C and C++ programming languages are notoriously insecure yet remain indispensable. Developers therefore resort to a multi-pronged approach to find security issues before adversaries. These include manual, static, and dynamic program analysis. Dynamic bug finding tools—henceforth "sanitizers"—can find bugs that elude other types of analysis because they observe the actual execution of a program, and can therefore directly observe incorrect program behavior as it happens. A vast number of sanitizers have been prototyped by academics and refined by practitioners. We provide a systematic overview of sanitizers with an emphasis on their role in finding security issues. Specifically, we taxonomize the available tools and the security vulnerabilities they cover, describe their performance and compatibility properties, and highlight various trade-offs.

119 citations


Cites background from "Into the depths of C: elaborating t..."

  • ...The de facto standard includes widely-followed programming practices that do not necessarily comply with the language standard, even though they result in bug-free code [99]....

    [...]

Proceedings ArticleDOI
12 Jun 2018
TL;DR: This work presents the first detailed root cause analysis of problems in the Intel MPX architecture through a cross-layer dissection of the entire system stack, involving the hardware, operating system, compilers, and applications.
Abstract: Memory-safety violations are the primary cause of security and reliability issues in software systems written in unsafe languages. Given the limited adoption of decades-long research in software-based memory safety approaches, as an alternative, Intel released Memory Protection Extensions (MPX)---a hardware-assisted technique to achieve memory safety. In this work, we perform an exhaustive study of Intel MPX architecture along three dimensions: (a) performance overheads, (b) security guarantees, and (c) usability issues.We present the first detailed root cause analysis of problems in the Intel MPX architecture through a cross-layer dissection of the entire system stack, involving the hardware, operating system, compilers, and applications. To put our findings into perspective, we also present an in-depth comparison of Intel MPX with three prominent types of software-based memory safety approaches. Lastly, based on our investigation, we propose directions for potential changes to the Intel MPX architecture to aid the design space exploration of future hardware extensions for memory safety.A complete version of this work appears in the 2018 proceedings of the ACM on Measurement and Analysis of Computing Systems.

107 citations


Cites background from "Into the depths of C: elaborating t..."

  • ...MPX does not work correctly with several common C idioms (see Table 4), especially when narrowing of bounds is applied and applications deviate from the standard memory model [8, 31]....

    [...]

Proceedings ArticleDOI
14 Oct 2017
TL;DR: Experience shows that Hyperkernel can avoid bugs similar to those found in xv6, and that the verification of Hyper kernel can be achieved with a low proof burden.
Abstract: This paper describes an approach to designing, implementing, and formally verifying the functional correctness of an OS kernel, named Hyperkernel, with a high degree of proof automation and low proof burden. We base the design of Hyperkernel's interface on xv6, a Unix-like teaching operating system. Hyperkernel introduces three key ideas to achieve proof automation: it finitizes the kernel interface to avoid unbounded loops or recursion; it separates kernel and user address spaces to simplify reasoning about virtual memory; and it performs verification at the LLVM intermediate representation level to avoid modeling complicated C semantics. We have verified the implementation of Hyperkernel with the Z3 SMT solver, checking a total of 50 system calls and other trap handlers. Experience shows that Hyperkernel can avoid bugs similar to those found in xv6, and that the verification of Hyperkernel can be achieved with a low proof burden.

73 citations


Cites background from "Into the depths of C: elaborating t..."

  • ...A final challenge is that Hyperkernel, like many other OS kernels, is written in C, a programming language that is known to complicate formal reasoning [26, 39, 53]....

    [...]

References
More filters
Book
Bjarne Stroustrup1
01 Jan 1985
TL;DR: Bjarne Stroustrup makes C even more accessible to those new to the language, while adding advanced information and techniques that even expert C programmers will find invaluable.
Abstract: From the Publisher: Written by Bjarne Stroustrup, the creator of C, this is the world's most trusted and widely read book on C. For this special hardcover edition, two new appendixes on locales and standard library exception safety have been added. The result is complete, authoritative coverage of the C language, its standard library, and key design techniques. Based on the ANSI/ISO C standard, The C Programming Language provides current and comprehensive coverage of all C language features and standard library components. For example: abstract classes as interfaces class hierarchies for object-oriented programming templates as the basis for type-safe generic software exceptions for regular error handling namespaces for modularity in large-scale software run-time type identification for loosely coupled systems the C subset of C for C compatibility and system-level work standard containers and algorithms standard strings, I/O streams, and numerics C compatibility, internationalization, and exception safety Bjarne Stroustrup makes C even more accessible to those new to the language, while adding advanced information and techniques that even expert C programmers will find invaluable.

6,795 citations

Proceedings ArticleDOI
10 Jun 2007
TL;DR: Valgrind is described, a DBI framework designed for building heavyweight DBA tools that can be used to build more interesting, heavyweight tools that are difficult or impossible to build with other DBI frameworks such as Pin and DynamoRIO.
Abstract: Dynamic binary instrumentation (DBI) frameworks make it easy to build dynamic binary analysis (DBA) tools such as checkers and profilers. Much of the focus on DBI frameworks has been on performance; little attention has been paid to their capabilities. As a result, we believe the potential of DBI has not been fully exploited.In this paper we describe Valgrind, a DBI framework designed for building heavyweight DBA tools. We focus on its unique support for shadow values-a powerful but previously little-studied and difficult-to-implement DBA technique, which requires a tool to shadow every register and memory value with another value that describes it. This support accounts for several crucial design features that distinguish Valgrind from other DBI frameworks. Because of these features, lightweight tools built with Valgrind run comparatively slowly, but Valgrind can be used to build more interesting, heavyweight tools that are difficult or impossible to build with other DBI frameworks such as Pin and DynamoRIO.

2,540 citations


"Into the depths of C: elaborating t..." refers background in this paper

  • ...Then there is a very extensive literature on static and dynamic analysis for C, and systems-oriented work on bug-finding tools (including tools such as Valgrind [36], Stack [48], and the Clang sanitisers)....

    [...]

01 Jan 1978
TL;DR: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.), and is a "must-have" reference for every serious programmer's digital library.
Abstract: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.). One of the best-selling programming books published in the last fifty years, "K&R" has been called everything from the "bible" to "a landmark in computer science" and it has influenced generations of programmers. Available now for all leading ebook platforms, this concise and beautifully written text is a "must-have" reference for every serious programmers digital library. As modestly described by the authors in the Preface to the First Edition, this "is not an introductory programming manual; it assumes some familiarity with basic programming concepts like variables, assignment statements, loops, and functions. Nonetheless, a novice programmer should be able to read along and pick up the language, although access to a more knowledgeable colleague will help."

2,120 citations

Journal ArticleDOI
04 Jun 2011
TL;DR: Csmith, a randomized test-case generation tool, is created and spent three years using it to find compiler bugs, and a collection of qualitative and quantitative results about the bugs it found are presented.
Abstract: Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.

799 citations


"Into the depths of C: elaborating t..." refers methods in this paper

  • ...Of their 561 Csmith tests [55], Cerberus currently gives the same result as GCC for 556; the other 5 time-out after 5min....

    [...]

Proceedings Article
10 Jun 2002
TL;DR: This paper examines safety violations enabled by C’s design, and shows how Cyclone avoids them, without giving up C”s hallmark control over low-level details such as data representation and memory management.
Abstract: Cyclone is a safe dialect of C. It has been designed from the ground up to prevent the buffer overflows, format string attacks, and memory management errors that are common in C programs, while retaining C’s syntax and semantics. This paper examines safety violations enabled by C’s design, and shows how Cyclone avoids them, without giving up C’s hallmark control over low-level details such as data representation and memory management.

777 citations


Additional excerpts

  • ...[14, 15, 22, 34, 35], including commercial implementations aimed at mass use, such as Intel’s MPX [21]....

    [...]

Trending Questions (1)
What are the skills required for C Developer?

This should provide a solid basis for discussion of what mainstream C is now: what programmers and analysis tools can assume and what compilers aim to implement.