The SeaHorn Verification Framework*

Arie Gurfinkel¹, Temesghen Kahsai², Anvesh Komuravelli³, and Jorge A. Navas⁴

¹ Software Engineering Institute / Carnegie Mellon University
² NASA Ames / Carnegie Mellon University
³ Computer Science Department, Carnegie Mellon University
⁴ NASA Ames / SGT

Abstract. In this paper, we present SeaHorn, a software verification
framework. The key distinguishing feature of SeaHorn is its modular
design that separates the concerns of the syntax of the programming
language, its operational semantics, and the verification semantics. Sea-
Horn encompasses several novelties: it (a) encodes verification condi-
tions using an efficient yet precise inter-procedural technique, (b) pro-
vides flexibility in the verification semantics to allow different levels of
precision, (c) leverages the state-of-the-art in software model checking
and abstract interpretation for verification, and (d) uses Horn-clauses as
an intermediate language to represent verification conditions which sim-
plifies interfacing with multiple verification tools based on Horn-clauses.
SeaHorn provides users with a powerful verification tool and researchers
with an extensible and customizable framework for experimenting with
new software verification techniques. The effectiveness and scalability
of SeaHorn are demonstrated by an extensive experimental evaluation
using benchmarks from SV-COMP 2015 and real avionics code.
* This material is based upon work funded and supported by NASA Contract No.
NNX14AI09G, NSF Award No. 1422705 and by the Department of Defense under
Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation
of the Software Engineering Institute, a federally funded research and
development center. Any opinions, findings and conclusions or recommendations
expressed in this material are those of the author(s) and do not necessarily
reflect the views of the United States Department of Defense, NASA or NSF. This
material has been approved for public release and unlimited distribution.
DM-0002153

1 Introduction
In this paper, we present SeaHorn, an LLVM-based [38] framework for
verification of safety properties of programs. SeaHorn is a fully automated
verifier that verifies user-supplied assertions as well as a number of built-in
safety properties. For example, SeaHorn provides built-in checks for buffer and
signed integer overflows. More generally, SeaHorn is a framework that
simplifies development and integration of new verification techniques. Its main
features are:
1. It decouples a programming language syntax and semantics from the underlying
verification technique. Different programming languages include a diverse
assortment of features, many of which are purely syntactic. Handling them
fully is a major effort for new tool developers. We tackle this problem in
SeaHorn by separating the language syntax, its operational semantics, and
the underlying verification semantics (the semantics used by the verification
engine). Specifically, we use the LLVM front-end(s) to deal with the
idiosyncrasies of the syntax. We use the LLVM intermediate representation (IR),
called the bitcode, to deal with the operational semantics, and apply a variety
of transformations to simplify it further. In principle, since the bitcode has
been formalized [54], this provides us with a well-defined formal semantics.
Finally, we use Constrained Horn Clauses (CHC) to logically represent the
verification condition (VC).
2. It provides an efficient and precise analysis of programs with procedures
using new inter-procedural verification techniques. SeaHorn summarizes the
input-output behavior of procedures efficiently without inlining. The expres-
siveness of the summaries is not limited to linear arithmetic (as in our earlier
tools) but extends to richer logics, including, for instance, arrays. Moreover,
it includes a program transformation that lifts deep assertions closer to the
main procedure. This increases context-sensitivity of intra-procedural anal-
yses (used both in verification and compiler optimization), and has a signif-
icant impact on our inter-procedural verification algorithms.
3. It allows developers to customize the verification semantics and offers users
verification semantics of various degrees of precision. SeaHorn is fully
parametric in the (small-step operational) semantics used for the generation
of VCs. The level of abstraction in the built-in semantics varies from consid-
ering only LLVM numeric registers to considering the whole heap (modeled
as a collection of non-overlapping arrays). In addition to generating VCs
based on small-step semantics [48], it can also automatically lift small-step
semantics to large-step [7, 28] (a.k.a. Large Block Encoding, or LBE).
4. It uses Constrained Horn Clauses (CHC) as its intermediate verification
language. CHC provide a convenient and elegant way to formally represent
many encoding styles of verification conditions. The recent popularity of
CHC as an intermediate language for verification engines makes it possible
to interface SeaHorn with a variety of new and emerging tools.
5. It builds on the state-of-the-art in Software Model Checking (SMC) and Ab-
stract Interpretation (AI). SMC and AI have independently led over the
years to the production of analysis tools that have a substantial impact on
the development of real world software. Interestingly, the two exhibit
complementary strengths and weaknesses (see e.g., [1,10,24,27]). While SMC has
so far proved stronger on software that is mostly control driven, AI is
quite effective on data-dependent programs. SeaHorn combines SMT-based
model checking techniques with program invariants supplied by an abstract
interpretation-based tool.
6. Finally, it is implemented on top of the open-source LLVM compiler
infrastructure. The latter is a well-maintained, well-documented, and
continuously improving framework. It allows SeaHorn users to easily integrate
program analyses, transformations, and other tools that target LLVM. Moreover,
since SeaHorn analyses LLVM IR, it can exploit a rapidly-growing frontier of
LLVM front-ends, encompassing a diverse set of languages. SeaHorn itself is
released as open-source as well (source code can be downloaded from
http://seahorn.github.io).

Fig. 1: Overview of SeaHorn architecture. [Figure not reproduced. Labels in the
figure: Front End (Legacy, Inter-procedural); LLVM bitcode; Middle End
(Encoding = {Small, Large}, Precision = {Register, Pointer, Memory}); Horn
Clause; Back End (SPACER, Z3-PDR, IKOS); CEX.]
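
As a rough illustration of how CHC can represent a verification condition
(features 1 and 4 above), consider the loop `x = 0; while (x < n) x++;
assert(x == n);` under the assumption `n >= 0`. A small-step style encoding
introduces one unknown predicate Inv(x, n) for the loop head and three clauses:
initiation, consecution, and safety. The Python sketch below is neither
SeaHorn's encoder nor any solver's API; the example program, the predicate name
Inv, and all helper names are assumptions made for illustration. It only tests
a candidate interpretation of Inv against the three clauses on a small bounded
range, which is a sanity check rather than a proof.

```python
from itertools import product


def implies(a, b):
    """Logical implication on Python booleans."""
    return (not a) or b


def clauses(inv):
    """The three Horn clauses of the example, with `inv` interpreting Inv(x, n)."""
    return {
        # n >= 0 and x = 0  ->  Inv(x, n)
        "init": lambda x, n: implies(n >= 0 and x == 0, inv(x, n)),
        # Inv(x, n) and x < n  ->  Inv(x + 1, n)
        "step": lambda x, n: implies(inv(x, n) and x < n, inv(x + 1, n)),
        # Inv(x, n) and x >= n and x != n  ->  false
        "safe": lambda x, n: not (inv(x, n) and x >= n and x != n),
    }


def violated(inv, bound=20):
    """Names of the clauses that `inv` falsifies somewhere in [-bound, bound]^2."""
    values = range(-bound, bound + 1)
    return sorted(
        name
        for name, clause in clauses(inv).items()
        if not all(clause(x, n) for x, n in product(values, values))
    )


if __name__ == "__main__":
    # Candidate that satisfies all three clauses on the tested range.
    print(violated(lambda x, n: 0 <= x <= n))
    # Candidate that is too weak: the safety clause is falsified.
    print(violated(lambda x, n: x >= 0))
```

A CHC solver plays the role of `violated` in reverse: instead of checking a
given candidate, it searches for an interpretation of Inv that satisfies every
clause, or reports that none exists.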
The design of SeaHorn provides users, developers, and researchers with
an extensible and customizable environment for experimenting with and imple-
menting new software verification techniques. SeaHorn is implemented in C++
in the LLVM compiler infrastructure [38]. The overall approach is illustrated in
Figure 1. SeaHorn has been developed in a modular fashion; its architecture is
layered in three parts:
Front-End: Takes an input program in a language that has an LLVM front-end
(e.g., C) and generates LLVM IR bitcode. Specifically, it performs the
pre-processing and optimization of the bitcode for verification purposes. More
details are reported in Section 2.
Middle-End: Takes as input the optimized LLVM bitcode and emits verification
conditions as Constrained Horn Clauses (CHC). The middle-end is in
charge of selecting the encoding of the VCs and the degree of precision. More
details are reported in Section 3.
Back-End: Takes CHC as input and outputs the result of the analysis. In
principle, any verification engine that digests CHC could be used to discharge
the VCs. Currently, SeaHorn employs several SMT-based model checking engines
based on PDR/IC3 [13], including Spacer [35, 36] and GPDR [33]. As a
complement, SeaHorn uses the abstract interpretation-based analyzer IKOS
(Inference Kernel for Open Static Analyzers) [14] for providing numerical
invariants⁵. More details are reported in Section 4.
The effectiveness and scalability of SeaHorn are demonstrated by our ex-
tensive experimental evaluation in Section 5 and the results of SV-COMP 2015.
⁵ While conceptually IKOS should run on CHC, currently it uses its own custom IR.

Related work. Automated analysis of software is an active area of research.
There is a large number of tools with different capabilities and trade-offs [6, 8,
9, 15–18, 20, 42]. Our approach of separating the program semantics from the
verification engine has been previously proposed in numerous tools. Of those,
the tool SMACK [49] is the closest to SeaHorn. Like SeaHorn, SMACK targets
programs at the LLVM-IR level. However, SMACK targets the Boogie intermediate
verification language [22] and Boogie-based verifiers to construct and
discharge the proof obligations. SeaHorn differs from SMACK in several ways:
(i) SeaHorn uses CHC as its intermediate verification language, which allows
it to target different solvers and verification techniques; (ii) it tightly
integrates and combines both state-of-the-art software model checking
techniques and abstract interpretation; and (iii) it provides an automatic
inter-procedural analysis to reason modularly about programs with procedures.
Inter-procedural and modular analysis is critical for scaling verification tools
and has been addressed by many researchers (e.g., [2, 33, 35, 37, 40, 51]). Our
approach of using mixed-semantics [30] as a source-to-source transformation has
also been explored in [37]. While in [37] the mixed-semantics transformation is
applied at the level of the verification semantics (Boogie in this case), in
SeaHorn it is applied at the front-end level, allowing mixed-semantics to
interact with compiler optimizations.
Constrained Horn clauses have been recently proposed [11] as an intermediate
(or exchange) format for representing verification conditions. However, they have
long been used in the context of static analysis of imperative and object-oriented
languages (e.g., [41, 48]) and more recently adopted by an increasing number of
solvers (e.g., [12,23,33,36,40]) as well as other verifiers such as UFO [4], HSF [26],
VeriMAP [21], Eldarica [50], and TRACER [34].
2 Pre-processing for Verification
In our experience, performance of even the most advanced verification algo-
rithms is significantly impacted by the front-end transformations. In SeaHorn,
the front-end plays a very significant role in the overall architecture. SeaHorn
provides two front-ends: a legacy front-end and an inter-procedural front-end.
The legacy front-end. This front-end has been used by SeaHorn for the SV-
COMP 2015 competition [29] (for C programs). It was originally developed for
UFO [3]. First, the input C program is pre-processed with CIL [46] to insert line
markings for printing user-friendly counterexamples, define missing functions
that are implicitly defined (e.g., malloc-like functions), and initialize all local
variables. Moreover, it creates stubs for functions whose addresses can be taken
and replaces function pointers to those functions with function pointers to the
stubs. Second, the result is translated into LLVM-IR bitcode, using llvm-gcc.
After that, it performs compiler optimizations and preprocessing to simplify the
verification task. As a preprocessing step, we further initialize any uninitial-
ized registers using non-deterministic functions. This is used to bridge the gap
between the verification semantics (which assumes a non-deterministic assign-
ment) and the compiler semantics, which tries to take advantage of the undefined
behavior of uninitialized variables to perform code optimizations. We perform
a number of program transformations such as function inlining, conversion to
static single assignment (SSA) form, dead code elimination, peephole optimiza-
tions, CFG simplifications, etc. We also internalize all functions to enable global
optimizations such as replacement of global aggregates with scalars.
The legacy front-end has been very effective for solving SV-COMP (2013,
2014, and 2015) problems. However, it has its own limitations: its design is not
modular and it relies on multiple unsupported legacy tools (such as llvm-gcc
and LLVM versions 2.6 and 2.9). Thus, it is difficult to maintain and extend.
The inter-procedural front-end. In this new front-end, SeaHorn can take any
input program that can be translated into LLVM bitcode. For example, SeaHorn
uses clang and gcc via DragonEgg⁶. Our goal is for SeaHorn not to be limited to
C programs, but to be applicable (with various degrees of success) to a broader
set of languages based on LLVM (e.g., C++, Objective C, and Swift).
Once we have obtained LLVM bitcode, the front-end is split into two main
sub-components. The first one is a pre-processor that performs optimizations
and transformations similar to the ones performed by the legacy front-end. Such
pre-processing is optional as its only mission is to optimize the LLVM bitcode
to make the verification task ‘easier’. The second part is focused on a reduced
set of transformations mostly required to produce correct results even if the
pre-processor is disabled. It also performs SSA transformation and internalizes
functions, but in addition, lowers switch instructions into if-then-elses, en-
sures only one exit block per function, inlines global initializers into the main
procedure, and identifies assert-like functions.
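
To make the "lowers switch instructions into if-then-elses" step concrete, here
is a toy model in Python. It is not LLVM's or SeaHorn's pass: the tuple-based
node representation, the label names, and the function names are invented for
illustration, and the lowering produces a simple linear chain of equality
tests. The sketch also includes a reference interpreter for the original
switch so the two forms can be compared.

```python
def lower_switch(cases, default):
    """Lower `switch (v) { case c_i: goto l_i; default: goto default; }`
    into nested if-then-else nodes.

    cases   -- list of (constant, target_label) pairs, in source order
    default -- label taken when no constant matches
    """
    node = ("goto", default)
    # Build from the last case backwards so the default ends up innermost.
    for const, label in reversed(cases):
        node = ("if_eq", const, ("goto", label), node)
    return node


def run_switch(cases, default, v):
    """Reference semantics of the original switch: first matching case wins."""
    for const, label in cases:
        if v == const:
            return label
    return default


def run_lowered(node, v):
    """Evaluate the lowered if-then-else chain for the value v."""
    while node[0] == "if_eq":
        _, const, then_node, else_node = node
        node = then_node if v == const else else_node
    return node[1]


if __name__ == "__main__":
    cases = [(0, "bb_zero"), (1, "bb_one"), (7, "bb_seven")]
    lowered = lower_switch(cases, "bb_default")
    for v in (0, 1, 7, 42):
        print(v, run_switch(cases, "bb_default", v), run_lowered(lowered, v))
```

After such a lowering, every block ends in an unconditional or a two-way
branch, so later stages only need to handle those two kinds of control
transfer.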
Although this front-end can optionally inline functions similarly to the legacy
front-end, its major feature is a transformation that can significantly help the
verification engine to produce procedure summaries.
One typical problem in proving safety of large programs is that assertions
can be nested very deep inside the call graph. As a result, counterexamples are
longer and it is harder for the verification engine to decide what is relevant
for the property of interest. To mitigate this problem, the front-end provides a
transformation based on the concept of mixed semantics⁷ [30, 37]. It relies on
the simple observation that any call to a procedure P either fails inside the call
and therefore P does not return, or returns successfully from the call. Based on
this, any call to P can be instrumented as follows:
- if P may fail, then make a copy of P's body (in main) and jump to the copy.
- if P may succeed, then make the call to P as usual. Since P is known not to
  fail, each assertion in P can be safely replaced with an assume.
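
The two cases above can be sketched on a tiny example. The Python model below
is only an illustration under several assumptions: the procedure `p`, its
assertion, and all helper names are invented; the nondeterministic "may fail /
may succeed" guess is modeled by enumerating Boolean choices; a violated assume
discards the execution; and, since Python has no goto, the jump to the copy of
P's body is modeled as a call to a single shared helper that never returns to
the call site (if the copied body finishes without failing, that execution is
simply discarded).

```python
from itertools import product


class AssertionFailure(Exception):
    """An assert was violated: the error state is reached."""


class Infeasible(Exception):
    """An assume was violated, or a 'failing' copy did not fail: discard the run."""


def check(cond):
    if not cond:
        raise AssertionFailure()


def assume(cond):
    if not cond:
        raise Infeasible()


# --- Original program: the assertion sits inside the callee p. ---
def p(x):
    check(x != 3)
    return x + 1


def main_original(x):
    return p(p(x))


# --- Transformed program (hypothetical rendering of the two cases). ---
def p_succeeds(x):
    # Used when the call is guessed to succeed: the assert became an assume.
    assume(x != 3)
    return x + 1


def p_copy_in_main(x):
    # The single copy of p's body that conceptually lives in main.
    # Control never returns to the call site from here.
    check(x != 3)
    raise Infeasible()


def main_mixed(x, choices):
    may_fail_1, may_fail_2 = choices
    if may_fail_1:
        p_copy_in_main(x)
    y = p_succeeds(x)
    if may_fail_2:
        p_copy_in_main(y)
    return p_succeeds(y)


def original_fails(x):
    try:
        main_original(x)
    except AssertionFailure:
        return True
    return False


def mixed_fails(x):
    # The transformed program fails if some resolution of the guesses fails.
    for choices in product([False, True], repeat=2):
        try:
            main_mixed(x, choices)
        except AssertionFailure:
            return True
        except Infeasible:
            continue
    return False


if __name__ == "__main__":
    print([x for x in range(-5, 10) if original_fails(x)])
    print(all(original_fails(x) == mixed_fails(x) for x in range(-5, 10)))
```

In this model both call sites share the one copy `p_copy_in_main`, and
`p_succeeds` contains no assertion.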
Upon completion, only the main function has assertions and each procedure is
inlined at most once. The explanation for the latter is that a function call is
⁶ DragonEgg (http://dragonegg.llvm.org/) is a GCC plugin that replaces GCC's
optimizers and code generators with those from LLVM. As a result, the output can
be LLVM bitcode.
⁷ The term mixed semantics refers to a combination of small- with big-step
operational semantics.
