(Open Access) The SeaHorn Verification Framework (2015) | Arie Gurfinkel

The SeaHorn Veriﬁcation Framework

Arie Gurfinkel, Temesghen Kahsai, Anvesh Komuravelli, and Jorge A. Navas

Software Engineering Institute / Carnegie Mellon University

NASA Ames / Carnegie Mellon University

Computer Science Department, Carnegie Mellon University

NASA Ames / SGT

Abstract. In this paper, we present SeaHorn, a software veriﬁcation

framework. The key distinguishing feature of SeaHorn is its modular

design that separates the concerns of the syntax of the programming

language, its operational semantics, and the veriﬁcation semantics. Sea-

Horn encompasses several novelties: it (a) encodes veriﬁcation condi-

tions using an eﬃcient yet precise inter-procedural technique, (b) pro-

vides ﬂexibility in the veriﬁcation semantics to allow diﬀerent levels of

precision, (c) leverages the state-of-the-art in software model checking

and abstract interpretation for veriﬁcation, and (d) uses Horn-clauses as

an intermediate language to represent veriﬁcation conditions which sim-

pliﬁes interfacing with multiple veriﬁcation tools based on Horn-clauses.

SeaHorn provides users with a powerful veriﬁcation tool and researchers

with an extensible and customizable framework for experimenting with

new software veriﬁcation techniques. The eﬀectiveness and scalability

of SeaHorn are demonstrated by an extensive experimental evaluation

using benchmarks from SV-COMP 2015 and real avionics code.

1 Introduction

In this paper, we present SeaHorn, an LLVM-based [38] framework for veriﬁca-

tion of safety properties of programs. SeaHorn is a fully automated veriﬁer that

veriﬁes user-supplied assertions as well as a number of built-in safety properties.

For example, SeaHorn provides built-in checks for buﬀer and signed integer

overﬂows. More generally, SeaHorn is a framework that simpliﬁes development

and integration of new veriﬁcation techniques. Its main features are:

1. It decouples a programming language's syntax and semantics from the underlying verification technique. Different programming languages include a diverse assortment of features, many of which are purely syntactic. Handling them fully is a major effort for new tool developers. We tackle this problem in SeaHorn by separating the language syntax, its operational semantics, and the underlying verification semantics (the semantics used by the verification engine). Specifically, we use the LLVM front-end(s) to deal with the idiosyncrasies of the syntax. We use the LLVM intermediate representation (IR), called the bitcode, to deal with the operational semantics, and apply a variety of transformations to simplify it further. In principle, since the bitcode has been formalized [54], this provides us with a well-defined formal semantics. Finally, we use Constrained Horn Clauses (CHC) to logically represent the verification condition (VC).

This material is based upon work funded and supported by NASA Contract No. NNX14AI09G, NSF Award No. 1422705 and by the Department of Defense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Department of Defense, NASA or NSF. This material has been approved for public release and unlimited distribution. DM-0002153

2. It provides an efficient and precise analysis of programs with procedures

using new inter-procedural verification techniques. SeaHorn summarizes the

input-output behavior of procedures eﬃciently without inlining. The expres-

siveness of the summaries is not limited to linear arithmetic (as in our earlier

tools) but extends to richer logics, including, for instance, arrays. Moreover,

it includes a program transformation that lifts deep assertions closer to the

main procedure. This increases context-sensitivity of intra-procedural anal-

yses (used both in veriﬁcation and compiler optimization), and has a signif-

icant impact on our inter-procedural veriﬁcation algorithms.

3. It allows developers to customize the verification semantics and offers users

verification semantics with various degrees of precision. SeaHorn is fully

parametric in the (small-step operational) semantics used for the generation

of VCs. The level of abstraction in the built-in semantics varies from consid-

ering only LLVM numeric registers to considering the whole heap (modeled

as a collection of non-overlapping arrays). In addition to generating VCs

based on small-step semantics [48], it can also automatically lift small-step

semantics to large-step [7, 28] (a.k.a. Large Block Encoding, or LBE).

4. It uses Constrained Horn Clauses (CHC) as its intermediate veriﬁcation

language. CHC provide a convenient and elegant way to formally represent

many encoding styles of veriﬁcation conditions. The recent popularity of

CHC as an intermediate language for veriﬁcation engines makes it possible

to interface SeaHorn with a variety of new and emerging tools.

5. It builds on the state-of-the-art in Software Model Checking (SMC) and Ab-

stract Interpretation (AI). SMC and AI have independently led over the

years to the production of analysis tools that have a substantial impact on

the development of real world software. Interestingly, the two exhibit complementary

strengths and weaknesses (see, e.g., [1,10,24,27]). While SMC has so far

proven stronger on software that is mostly control driven, AI is

quite eﬀective on data-dependent programs. SeaHorn combines SMT-based

model checking techniques with program invariants supplied by an abstract

interpretation-based tool.

6. Finally, it is implemented on top of the open-source LLVM compiler infrastructure. The latter is a well-maintained, well-documented, and continuously improving framework. It allows SeaHorn users to easily integrate program analyses, transformations, and other tools that target LLVM. Moreover, since SeaHorn analyses LLVM IR, it can exploit a rapidly growing frontier of LLVM front-ends, encompassing a diverse set of languages. SeaHorn itself is released as open-source as well (source code can be downloaded from http://seahorn.github.io).

[Figure: three layers. Front End: Legacy Front-End and Inter-procedural Front-End, producing LLVM bitcode. Middle End: Encoding = {Small, Large}, Precision = {Register, Pointer, Memory}, producing Horn clauses. Back End: SPACER, Z3-PDR, IKOS; CEX.]

Fig. 1: Overview of SeaHorn architecture.
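To make the role of CHC as an intermediate language more concrete, the following is a minimal sketch, not taken from SeaHorn, of how a small loop could be represented by three Horn clauses and how a candidate solution for the unknown predicate can be sanity-checked. The example program, the clause names, and the predicate Inv are all hypothetical, and the bounded brute-force check merely stands in for a real CHC solver: it can expose a bad candidate, but it does not prove that a candidate is a solution.

```python
from itertools import product

# Hypothetical program:  assume(n >= 0); x = 0; while (x < n) x++; assert(x == n);
# Each clause returns True when its implication holds at the point (x, n).

def init_clause(inv, x, n):
    # n >= 0  ->  Inv(0, n)        (x is not used by this clause)
    return not (n >= 0) or inv(0, n)

def step_clause(inv, x, n):
    # Inv(x, n) and x < n  ->  Inv(x + 1, n)
    return not (inv(x, n) and x < n) or inv(x + 1, n)

def query_clause(inv, x, n):
    # Inv(x, n) and x >= n and x != n  ->  false
    return not (inv(x, n) and x >= n and x != n)

CLAUSES = (init_clause, step_clause, query_clause)

def violated_clauses(inv, bound=8):
    """Names of the clauses that `inv` falsifies somewhere in [-bound, bound]^2."""
    points = list(product(range(-bound, bound + 1), repeat=2))
    return [clause.__name__ for clause in CLAUSES
            if not all(clause(inv, x, n) for x, n in points)]

def good_inv(x, n):
    return 0 <= x <= n

def weak_inv(x, n):
    return x >= 0

print(violated_clauses(good_inv))  # []
print(violated_clauses(weak_inv))  # ['query_clause']
```

The candidate `weak_inv` is inductive but too weak to rule out the error, which is why only the query clause is reported. A back-end solver has to find an interpretation such as `good_inv` that satisfies all clauses at once.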

The design of SeaHorn provides users, developers, and researchers with

an extensible and customizable environment for experimenting with and imple-

menting new software veriﬁcation techniques. SeaHorn is implemented in C++

in the LLVM compiler infrastructure [38]. The overall approach is illustrated in

Figure 1. SeaHorn has been developed in a modular fashion; its architecture is

layered in three parts:

Front-End: Takes an input program in a language with an LLVM front-end (e.g., C) and generates LLVM IR bitcode. Specifically, it performs the pre-processing and optimization of the bitcode for verification purposes. More details are reported in Section 2.

Middle-End: Takes as input the optimized LLVM bitcode and emits verification conditions as Constrained Horn Clauses (CHC). The middle-end is in charge of selecting the encoding of the VCs and the degree of precision. More details are reported in Section 3.

Back-End: Takes CHC as input and outputs the result of the analysis. In principle, any verification engine that digests CHC could be used to discharge the VCs. Currently, SeaHorn employs several SMT-based model checking engines based on PDR/IC3 [13], including Spacer [35, 36] and GPDR [33]. As a complement, SeaHorn uses the abstract interpretation-based analyzer IKOS (Inference Kernel for Open Static Analyzers) [14] to provide numerical invariants. More details are reported in Section 4.
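The choice of encoding made by the middle-end can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not SeaHorn code: basic blocks are modeled as Python functions on a dictionary of registers, a small-step view applies one block at a time, and a large-step view fuses a loop-free sequence of blocks into a single transition. The block names and the register names are hypothetical.

```python
from functools import reduce

# A state maps register names to values; each basic block is one small step.
def bb_inc(s):
    return {**s, "t": s["x"] + 1}

def bb_acc(s):
    return {**s, "y": s["y"] + s["t"]}

def bb_store(s):
    return {**s, "x": s["t"]}

LOOP_BODY = [bb_inc, bb_acc, bb_store]  # a loop-free sequence of blocks

def small_step_trace(blocks, state):
    """Apply one block at a time and record every intermediate state."""
    trace = [state]
    for block in blocks:
        trace.append(block(trace[-1]))
    return trace

def large_step(blocks):
    """Fuse a loop-free block sequence into a single transition."""
    return lambda state: reduce(lambda s, block: block(s), blocks, state)

start = {"x": 0, "y": 10, "t": 0}
trace = small_step_trace(LOOP_BODY, start)
body = large_step(LOOP_BODY)

print(len(trace) - 1)            # 3
print(body(start) == trace[-1])  # True
```

Roughly speaking, the small-step view exposes every intermediate state to the verification engine, whereas the large-step view exposes only the states before and after the fused segment, so fewer intermediate program points need to be reasoned about.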

Note on IKOS: while conceptually IKOS should run on CHC, it currently uses its own custom IR.

The effectiveness and scalability of SeaHorn are demonstrated by our extensive experimental evaluation in Section 5 and the results of SV-COMP 2015.

Related work. Automated analysis of software is an active area of research. There is a large number of tools with different capabilities and trade-offs [6, 8, 9, 15–18, 20, 42]. Our approach of separating the program semantics from the

veriﬁcation engine has been previously proposed in numerous tools. From those,

the tool SMACK [49] is the closest to SeaHorn. Like SeaHorn, SMACK tar-

gets programs at the LLVM-IR level. However, SMACK targets Boogie inter-

mediate veriﬁcation language [22] and Boogie-based veriﬁers to construct and

discharge the proof obligations. SeaHorn differs from SMACK in several ways: (i) SeaHorn uses CHC as its intermediate verification language, which allows it to target different solvers and verification techniques; (ii) it tightly integrates and combines both state-of-the-art software model checking techniques and abstract interpretation; and (iii) it provides an automatic inter-procedural analysis to reason modularly about programs with procedures.

Inter-procedural and modular analysis is critical for scaling veriﬁcation tools

and has been addressed by many researchers (e.g., [2, 33, 35, 37, 40, 51]). Our

approach of using mixed-semantics [30] as a source-to-source transformation has also been explored in [37]. Whereas in [37] the mixed-semantics transformation is applied at the level of the verification semantics (Boogie in this case), in SeaHorn it is applied at the front-end level, allowing mixed-semantics to interact with compiler optimizations.

Constrained Horn clauses have been recently proposed [11] as an intermediate

(or exchange) format for representing veriﬁcation conditions. However, they have

long been used in the context of static analysis of imperative and object-oriented

languages (e.g., [41, 48]) and more recently adopted by an increasing number of

solvers (e.g., [12,23,33,36,40]) as well as other veriﬁers such as UFO [4], HSF [26],

VeriMAP [21], Eldarica [50], and TRACER [34].

2 Pre-processing for Veriﬁcation

In our experience, performance of even the most advanced veriﬁcation algo-

rithms is signiﬁcantly impacted by the front-end transformations. In SeaHorn,

the front-end plays a very signiﬁcant role in the overall architecture. SeaHorn

provides two front-ends: a legacy front-end and an inter-procedural front-end.

The legacy front-end. This front-end has been used by SeaHorn for the SV-

COMP 2015 competition [29] (for C programs). It was originally developed for

UFO [3]. First, the input C program is pre-processed with CIL [46] to insert line

markings for printing user-friendly counterexamples, deﬁne missing functions

that are implicitly deﬁned (e.g., malloc-like functions), and initialize all local

variables. Moreover, it creates stubs for functions whose addresses can be taken

and replaces function pointers to those functions with function pointers to the

stubs. Second, the result is translated into LLVM-IR bitcode, using llvm-gcc.

After that, it performs compiler optimizations and preprocessing to simplify the

veriﬁcation task. As a preprocessing step, we further initialize any uninitial-

ized registers using non-deterministic functions. This is used to bridge the gap

between the veriﬁcation semantics (which assumes a non-deterministic assign-

ment) and the compiler semantics, which tries to take advantage of the undeﬁned

behavior of uninitialized variables to perform code optimizations. We perform

a number of program transformations such as function inlining, conversion to

static single assignment (SSA) form, dead code elimination, peephole optimiza-

tions, CFG simpliﬁcations, etc. We also internalize all functions to enable global

optimizations such as replacement of global aggregates with scalars.

The legacy front-end has been very eﬀective for solving SV-COMP (2013,

2014, and 2015) problems. However, it has its own limitations: its design is not

modular and it relies on multiple unsupported legacy tools (such as llvm-gcc

and LLVM versions 2.6 and 2.9). Thus, it is diﬃcult to maintain and extend.

The inter-procedural front-end. In this new front-end, SeaHorn can take any input program that can be translated into LLVM bitcode. For example, SeaHorn uses clang and gcc via DragonEgg. Our goal is for SeaHorn not to be limited to C programs, but to be applicable (with various degrees of success) to a broader set of languages based on LLVM (e.g., C++, Objective C, and Swift).

Once we have obtained LLVM bitcode, the front-end is split into two main

sub-components. The ﬁrst one is a pre-processor that performs optimizations

and transformations similar to the ones performed by the legacy front-end. Such

pre-processing is optional as its only mission is to optimize the LLVM bitcode

to make the veriﬁcation task ‘easier’. The second part is focused on a reduced

set of transformations mostly required to produce correct results even if the

pre-processor is disabled. It also performs SSA transformation and internalizes

functions, but in addition, lowers switch instructions into if-then-elses, en-

sures only one exit block per function, inlines global initializers into the main

procedure, and identiﬁes assert-like functions.

Although this front-end can optionally inline functions similarly to the legacy

front-end, its major feature is a transformation that can signiﬁcantly help the

veriﬁcation engine to produce procedure summaries.

One typical problem in proving safety of large programs is that assertions can be nested very deep inside the call graph. As a result, counterexamples are longer and it is harder for the verification engine to decide what is relevant for the property of interest. To mitigate this problem, the front-end provides a transformation based on the concept of mixed semantics [30, 37]. It relies on the simple observation that any call to a procedure P either fails inside the call, and therefore P does not return, or returns successfully from the call. Based on this, any call to P can be instrumented as follows:

– if P may fail, then make a copy of P's body (in main) and jump to the copy.

– if P may succeed, then make the call to P as usual. Since P is known not to fail, each assertion in P can be safely replaced with an assume.
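The two cases above can be mimicked in a toy model. The following sketch is only an illustration under stated assumptions, not the SeaHorn transformation itself: exceptions stand in for assertion failures and for blocked assumptions, the procedure `p` and both versions of `main` are hypothetical, and the non-deterministic choice between the two cases is simulated by trying both.

```python
class AssertionFailed(Exception):
    """An assert was violated: the error location is reached."""

class Blocked(Exception):
    """An assume was violated or control cannot continue: the path is discarded."""

# Original program: the assertion lives inside procedure P.
def p(x):
    if not x > 0:            # assert(x > 0)
        raise AssertionFailed
    return x + 1

def main_original(x):
    return p(x) * 2

# Transformed program: inside P the assertion becomes an assume.
def p_assume(x):
    if not x > 0:            # assume(x > 0)
        raise Blocked
    return x + 1

# main receives one copy of P's body that keeps the assertion.
def main_mixed(x, p_may_fail):
    if p_may_fail:
        # Copy of P's body inside main; control never returns to the call site.
        if not x > 0:        # assert(x > 0), now located in main
            raise AssertionFailed
        raise Blocked        # a successful return is covered by the other case
    return p_assume(x) * 2   # ordinary call, known not to fail

def fails(program, *args):
    """True if running the program reaches an assertion failure."""
    try:
        program(*args)
    except AssertionFailed:
        return True
    except Blocked:
        return False
    return False

def mixed_fails(x):
    # The choice between the two cases is non-deterministic, so try both.
    return any(fails(main_mixed, x, choice) for choice in (True, False))

print(all(fails(main_original, x) == mixed_fails(x) for x in range(-5, 6)))  # True
```

In this model the transformed program reaches an assertion failure on exactly the same inputs as the original one, while the only remaining assertion sits in `main_mixed`.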

Upon completion, only the main function has assertions and each procedure is

inlined at most once. The explanation for the latter is that a function call is

DragonEgg (http://dragonegg.llvm.org/) is a GCC plugin that replaces GCC's optimizers and code generators with those from LLVM. As a result, the output can be LLVM bitcode.

The term mixed semantics refers to a combination of small-step and big-step operational semantics.
