Showing papers presented at the Static Analysis Symposium in 2005


Book ChapterDOI
07 Sep 2005
TL;DR: The termination-insensitive secure information flow problem can be reduced to solving a safety problem via a simple program transformation, and this paper generalizes the self-compositional approach with a form of information downgrading recently proposed by Li and Zdancewic.
Abstract: The termination-insensitive secure information flow problem can be reduced to solving a safety problem via a simple program transformation. Barthe, D'Argenio, and Rezk coined the term “self-composition” to describe this reduction. This paper generalizes the self-compositional approach with a form of information downgrading recently proposed by Li and Zdancewic. We also identify a problem with applying the self-compositional approach in practice, and we present a solution to this problem that makes use of more traditional type-based approaches. The result is a framework that combines the best of both worlds, i.e., better than traditional type-based approaches and better than the self-compositional approach.
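To make the reduction concrete, here is a minimal Python sketch of the self-composition idea: run the program against a renamed copy of itself on states that agree on low inputs, and check that the low outputs agree. The function names and the exhaustive check over a toy input space are illustrative stand-ins for the safety verification the paper performs symbolically.

```python
def program(low, high):
    # a tiny example program; secure because its public result
    # does not depend on the secret input `high`
    return low + 1

def self_composed(low, high1, high2):
    # self-composition: the program and a renamed copy share the low
    # input, while the high inputs are chosen independently
    return program(low, high1) == program(low, high2)

# Noninterference as safety: the assertion never fails. Exhaustive
# testing over a small input space stands in for a symbolic check.
assert all(self_composed(l, h1, h2)
           for l in range(4) for h1 in range(4) for h2 in range(4))
```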

312 citations


Journal ArticleDOI
01 Oct 2005
TL;DR: A novel program development framework which uses modular, incremental abstract interpretation as a fundamental tool to obtain information about the program and can reason with much richer information than, for example, traditional types.
Abstract: The technique of Abstract Interpretation has allowed the development of very sophisticated global program analyses which are at the same time provably correct and practical. We present in a tutorial fashion a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system libraries), to generate and simplify run-time tests, and to perform high-level program transformations such as multiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, nonfailure, and bounds on resource consumption (time or space cost). CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements the described functionality, will be used to illustrate the fundamental ideas.

201 citations


Book ChapterDOI
07 Sep 2005
TL;DR: This paper presents the first known automatic counterexample-guided abstraction refinement algorithm for termination proofs and identifies two reasons for spuriousness: abstractions that are too coarse, and candidate transition invariants that are too strong.
Abstract: Abstraction can often lead to spurious counterexamples. Counterexample-guided abstraction refinement is a method of strengthening abstractions based on the analysis of these spurious counterexamples. For invariance properties, a counterexample is a finite trace that violates the invariant; it is spurious if it is possible in the abstraction but not in the original system. When proving termination or other liveness properties of infinite-state systems, a useful notion of spurious counterexamples has remained an open problem. For this reason, no counterexample-guided abstraction refinement algorithm was known for termination. In this paper, we address this problem and present the first known automatic counterexample-guided abstraction refinement algorithm for termination proofs. We exploit recent results on transition invariants and transition predicate abstraction. We identify two reasons for spuriousness: abstractions that are too coarse, and candidate transition invariants that are too strong. Our counterexample-guided abstraction refinement algorithm successively weakens candidate transition invariants and refines the abstraction.
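As a rough illustration of the proof principle behind transition invariants (not the paper's refinement algorithm itself), the sketch below certifies one disjunct of a candidate transition invariant as well-founded via a ranking function; the names and the sampled state space are assumptions.

```python
def well_founded(predicate, rank, samples):
    # sufficient check on the samples: whenever predicate(s, t) holds,
    # the rank is nonnegative at s and strictly decreases at t
    return all(rank(s) >= 0 and rank(t) < rank(s)
               for s in samples for t in samples if predicate(s, t))

# disjunct "x' < x and x >= 0" of a candidate transition invariant,
# certified by the ranking function rank(x) = x
pred = lambda s, t: t < s and s >= 0
print(well_founded(pred, lambda x: x, range(-3, 10)))   # True
```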

138 citations


Book ChapterDOI
07 Sep 2005
TL;DR: This work proposes a framework for the investigation of the alarms produced by Astree, so as to help classify them as true errors or false alarms due to the approximation inherent in the static analysis.
Abstract: Static analyzers like Astree are incomplete and hence may produce false alarms. We propose a framework for the investigation of the alarms produced by Astree, so as to help classify them as true errors or false alarms due to the approximation inherent in the static analysis. Our approach is based on the computation of an approximation of a set of traces specified by an initial and a (set of) final state(s). Moreover, we allow for finer analyses that focus on some execution patterns or on some possible inputs. The underlying algorithms were implemented inside Astree and used successfully to track alarms in large, critical embedded applications.

83 citations


Book ChapterDOI
07 Sep 2005
TL;DR: A framework for interprocedural shape analysis which is context- and flow-sensitive and handles destructive pointer updates; the analysis is modular in the heap and thus allows reusing the effect of a procedure at different call-sites and even between different contexts occurring at the same call-site.
Abstract: We present a framework for interprocedural shape analysis, which is context- and flow-sensitive with the ability to perform destructive pointer updates. We limit our attention to cutpoint-free programs—programs in which reasoning on a procedure call only requires consideration of context reachable from the actual parameters. For such programs, we show that our framework is able to perform an efficient modular analysis. Technically, our analysis computes procedure summaries as transformers from inputs to outputs while ignoring parts of the heap not relevant to the procedure. This makes the analysis modular in the heap and thus allows reusing the effect of a procedure at different call-sites and even between different contexts occurring at the same call-site. We have implemented a prototype of our framework and used it to verify interesting properties of cutpoint-free programs, including partial correctness of a recursive quicksort implementation.

76 citations


Book ChapterDOI
07 Sep 2005
TL;DR: A new type system for an object-oriented (OO) language that characterizes the sizes of data structures and the amount of heap memory required to successfully execute methods that operate on these data structures is presented.
Abstract: We present a new type system for an object-oriented (OO) language that characterizes the sizes of data structures and the amount of heap memory required to successfully execute methods that operate on these data structures. Key components of this type system include type assertions that use symbolic Presburger arithmetic expressions to capture data structure sizes, the effect of methods on the data structures that they manipulate, and the amount of memory that methods allocate and deallocate. For each method, we conservatively capture the amount of memory required to execute the method as a function of the sizes of the method's inputs. The safety guarantee is that the method will never attempt to use more memory than its type expressions specify. We have implemented a type checker to verify memory usages of OO programs. Our experience is that the type system can precisely and effectively capture memory bounds for a wide range of programs.

74 citations


Book ChapterDOI
07 Sep 2005
TL;DR: The experience of combining, in a realistic setting, a static analyzer with a statistical analysis in order to reduce the inevitable false alarms from a domain-unaware static analyzer is presented.
Abstract: We present our experience of combining, in a realistic setting, a static analyzer with a statistical analysis. The goal of this combination is to reduce the inevitable false alarms from a domain-unaware static analyzer. Our analyzer, named Airac (Array Index Range Analyzer for C), collects all the true buffer-overrun points in ANSI C programs. Soundness is maintained, and the analysis' cost-accuracy improvement is achieved by techniques that the static analysis community has long accumulated. For the still inevitable false alarms, which are inherent to particular C programs (e.g., Airac raised 970 buffer-overrun alarms in commercial C programs of 5.3 million lines, of which 737 were false), we use a statistical post-analysis. The statistical analysis, given the analysis results (alarms), sifts out probable false alarms and prioritizes true alarms. It estimates the probability of each alarm being true. The probabilities are used in two ways: 1) only the alarms whose true-alarm probabilities exceed a threshold are reported to the user; 2) the alarms are sorted by probability before reporting, so that the user can check highly probable errors first. In our experiments with Linux kernel sources, when the risk of missing a true error was set at about 3 times that of raising a false alarm, 74.83% of the false alarms could be filtered out; only 15.17% of the false alarms were encountered before the user had observed 50% of the true alarms.
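A hedged sketch of the statistical post-analysis step: score each alarm with an estimated probability of being a true error, filter by a threshold, and sort the rest. The naive-Bayes feature model and all names below are illustrative assumptions, not the paper's exact formulation; the 0.24 prior is simply 233/970 from the numbers quoted above.

```python
from dataclasses import dataclass

@dataclass
class Alarm:
    location: str
    features: dict            # symptom name -> bool
    p_true: float = 0.0

def score(alarm, lik_true, lik_false, prior_true):
    # naive-Bayes estimate of P(true alarm | observed symptoms)
    pt, pf = prior_true, 1.0 - prior_true
    for name, present in alarm.features.items():
        lt, lf = lik_true.get(name, 0.5), lik_false.get(name, 0.5)
        pt *= lt if present else 1.0 - lt
        pf *= lf if present else 1.0 - lf
    return pt / (pt + pf)

def report(alarms, lik_true, lik_false, prior_true, threshold=0.3):
    # filter below-threshold alarms, then present most probable first
    for a in alarms:
        a.p_true = score(a, lik_true, lik_false, prior_true)
    ranked = sorted(alarms, key=lambda a: a.p_true, reverse=True)
    return [a for a in ranked if a.p_true >= threshold]

alarms = [Alarm("f.c:10", {"in_loop": True}),
          Alarm("g.c:42", {"in_loop": False})]
print(report(alarms, {"in_loop": 0.8}, {"in_loop": 0.3}, prior_true=0.24))
```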

73 citations


Book ChapterDOI
07 Sep 2005
TL;DR: A compositional abstract semantics for static analysis over the abstract domain Sh of pair-sharing is defined, and it is proved that Sh induces a Galois insertion w.r.t. the concrete domain of program states.
Abstract: Pair-sharing analysis of object-oriented programs determines those pairs of program variables bound at run-time to overlapping data structures. This information is useful for program parallelisation and analysis. We follow a similar construction for logic programming and formalise the property, or abstract domain, Sh of pair-sharing. We prove that Sh induces a Galois insertion w.r.t. the concrete domain of program states. We define a compositional abstract semantics for the static analysis over Sh, and prove it correct.
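A toy sketch of the pair-sharing domain: an abstract state is a set of unordered variable pairs that may be bound to overlapping structures. The transfer function for an aliasing assignment below is a plausible simplification (self-sharing pairs are omitted), not the paper's exact abstract semantics.

```python
def assign(sh, x, y):
    """Abstract transfer for `x = y`: kill x's old pairs, then make x
    share with y and with everything y may share with."""
    sh = {p for p in sh if x not in p}
    new = set() if x == y else {frozenset((x, y))}
    for p in sh:
        if y in p:
            (other,) = p - {y}
            new.add(frozenset((x, other)))
    return sh | new

# b and c may share; after a = b, a may share with both b and c
print(assign({frozenset(("b", "c"))}, "a", "b"))
```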

64 citations


Book ChapterDOI
07 Sep 2005
TL;DR: Banshee's novel features include a code generator for creating customized constraint resolution engines, incremental analysis based on backtracking, and fast persistence, features that make Banshee useful as a foundation for production program analyses.
Abstract: We introduce Banshee, a toolkit for constructing constraint-based analyses. Banshee's novel features include a code generator for creating customized constraint resolution engines, incremental analysis based on backtracking, and fast persistence. These features make Banshee useful as a foundation for production program analyses.

56 citations


Book ChapterDOI
07 Sep 2005
TL;DR: A technique for generating invariant polynomial inequalities of bounded degree is presented using the abstract interpretation framework. It is based on overapproximating basic semi-algebraic sets, i.e., sets defined by conjunctions of polynomial inequalities, by means of convex polyhedra.
Abstract: A technique for generating invariant polynomial inequalities of bounded degree is presented using the abstract interpretation framework. It is based on overapproximating basic semi-algebraic sets, i.e., sets defined by conjunctions of polynomial inequalities, by means of convex polyhedra. The approach improves on the existing methods for generating invariant polynomial equalities, since polynomial inequalities are allowed in the guards of the transition system, while not suffering from the prohibitive complexity of the methods based on quantifier elimination. The application of our implementation to benchmark programs shows that the method produces non-trivial invariants in reasonable time. In some cases the generated invariants are essential to verify safety properties that cannot be proved with classical linear invariants.
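The following sketch illustrates the general idea of overapproximating a nonlinear (semi-algebraic) constraint with convex linear inequalities, using the standard McCormick relaxation of w = x*y over a box. This is a familiar instance of the polyhedral-overapproximation idea, offered as an illustration rather than the paper's actual construction.

```python
def mccormick(xl, xu, yl, yu):
    """Linear bounds on w = x*y valid whenever xl <= x <= xu and
    yl <= y <= yu (McCormick relaxation)."""
    lowers = [lambda x, y: xl * y + x * yl - xl * yl,
              lambda x, y: xu * y + x * yu - xu * yu]
    uppers = [lambda x, y: xu * y + x * yl - xu * yl,
              lambda x, y: xl * y + x * yu - xl * yu]
    return lowers, uppers

# sanity check: the true product always lies inside the envelope
lowers, uppers = mccormick(0, 2, -1, 3)
for x in [0, 1, 2]:
    for y in [-1, 0, 3]:
        assert all(lo(x, y) <= x * y for lo in lowers)
        assert all(x * y <= up(x, y) for up in uppers)
```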

51 citations


Journal ArticleDOI
01 Oct 2005
TL;DR: A new client-driven pointer analysis algorithm that automatically adjusts its precision in response to the needs of client analyses is presented, often producing results as accurate as fixed-precision algorithms that are many times more costly.
Abstract: This paper presents a new client-driven pointer analysis algorithm that automatically adjusts its precision in response to the needs of client analyses. Using five significant error detection problems as clients, we evaluate our algorithm on 18 real C programs. We compare the accuracy and performance of our algorithm against several commonly used fixed-precision algorithms. We find that the client-driven approach effectively balances cost and precision, often producing results as accurate as fixed-precision algorithms that are many times more costly. Our algorithm works because many client problems only need a small amount of extra precision applied to selected portions of each input program.

Book ChapterDOI
07 Sep 2005
TL;DR: A projection algorithm that works directly on any sparse system of inequalities and which sacrifices precision only when necessary is presented, based on a novel combination of the Fourier-Motzkin algorithm and Simplex.
Abstract: The intrinsic cost of polyhedra has led to research on more tractable sub-classes of linear inequalities. Rather than committing to the precision of such a sub-class, this paper presents a projection algorithm that works directly on any sparse system of inequalities and which sacrifices precision only when necessary. The algorithm is based on a novel combination of the Fourier-Motzkin algorithm (for exact projection) and Simplex (for approximate projection). By reformulating the convex hull operation in terms of projection, conversion to the frame representation is avoided altogether. Experimental results conducted on logic programs demonstrate that the resulting analysis is efficient and precise.
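The exact half of the combination is classical Fourier-Motzkin elimination; a compact sketch follows, with constraints encoded as (coefficients, bound) meaning c·x <= bound. The Simplex-based approximate step the paper adds when the inequality count blows up is not shown.

```python
def fourier_motzkin(constraints, var):
    """Project out variable index `var` from constraints (c, b),
    each meaning sum(c[i] * x[i]) <= b."""
    pos = [(c, b) for c, b in constraints if c[var] > 0]
    neg = [(c, b) for c, b in constraints if c[var] < 0]
    rest = [(c, b) for c, b in constraints if c[var] == 0]
    for cp, bp in pos:
        for cn, bn in neg:
            ap, an = cp[var], -cn[var]   # positive multipliers
            coeffs = [an * p + ap * n for p, n in zip(cp, cn)]
            rest.append((coeffs, an * bp + ap * bn))
    return rest

# project x0 out of {x0 + x1 <= 4, -x0 <= 0, x1 <= 3}
print(fourier_motzkin([([1, 1], 4), ([-1, 0], 0), ([0, 1], 3)], 0))
# -> [([0, 1], 3), ([0, 1], 4)], i.e. x1 <= 3 (plus the weaker x1 <= 4)
```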

Book ChapterDOI
07 Sep 2005
TL;DR: The presentation focuses on an Algol-like programming language that incorporates data abstraction in its syntax, and an interaction-sequence-based semantics is used for interpreting potentially spurious counterexamples and computing refined abstractions for the next iteration.
Abstract: This paper presents a semantic framework for data abstraction and refinement for verifying safety properties of open programs. The presentation is focused on an Algol-like programming language that incorporates data abstraction in its syntax. The fully abstract game semantics of the language is used for model-checking safety properties, and an interaction-sequence-based semantics is used for interpreting potentially spurious counterexamples and computing refined abstractions for the next iteration.

Book ChapterDOI
07 Sep 2005
TL;DR: The construction of proper widening operators on several weakly-relational numeric abstractions is discussed; the proposal actually considers the semantic abstract domains, whose elements are geometric shapes, instead of the syntactic abstract domains of constraint networks and matrices.
Abstract: We discuss the construction of proper widening operators on several weakly-relational numeric abstractions. Our proposal differs from previous ones in that we actually consider the semantic abstract domains, whose elements are geometric shapes, instead of the (more concrete) syntactic abstract domains of constraint networks and matrices. Since the closure by entailment operator preserves geometric shapes, but not their syntactic expressions, our widenings are immune from the divergence issues that could be faced by the previous approaches when interleaving the applications of widening and closure. The new widenings, which are variations of the standard widening for convex polyhedra defined by Cousot and Halbwachs, can be made as precise as the previous proposals working on the syntactic domains. The implementation of each new widening relies on the availability of an effective reduction procedure for the considered constraint description: we provide such an algorithm for the domain of octagonal shapes.
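A much-simplified sketch of the semantic flavor of widening the paper advocates, on difference bounds m[i][j] constraining x_i - x_j: close the new argument first (so the comparison is against the geometric shape, not one syntactic description of it), then drop every bound of the old matrix that has not stabilized. Octagonal shapes additionally track signed variables; that refinement, and all names here, are simplifying assumptions.

```python
INF = float("inf")

def close(m):
    """Shortest-path closure of a difference-bound matrix."""
    n, m = len(m), [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m

def widen(old, new):
    """Keep a bound of `old` only if the closed `new` still entails it;
    otherwise relax it to +infinity to force convergence."""
    new = close(new)
    n = len(old)
    return [[old[i][j] if new[i][j] <= old[i][j] else INF
             for j in range(n)] for i in range(n)]
```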

Journal ArticleDOI
01 Oct 2005

Book ChapterDOI
07 Sep 2005
TL;DR: It is shown how abstract domain completeness can be used for enforcing the PER model of abstract non-interference; this model allows deriving unconstrained attacker models, which do not necessarily either observe all public information or ignore all private information.
Abstract: In this paper, we study the relationship between two models of secure information flow: the PER model (which uses equivalence relations) and the abstract non-interference model (which uses upper closure operators). We embed the lattice of equivalence relations into the lattice of closures, re-interpreting abstract non-interference over the lattice of equivalence relations. For narrow abstract non-interference, we show that the new definition is equivalent to the original, whereas for abstract non-interference it is strictly less general. The relational presentation of abstract non-interference leads to a simplified construction of the most concrete harmless attacker. Moreover, the PER model of abstract non-interference allows us to derive unconstrained attacker models, which do not necessarily either observe all public information or ignore all private information. Finally, we show how abstract domain completeness can be used for enforcing the PER model of abstract non-interference.

Book ChapterDOI
07 Sep 2005
TL;DR: A method is developed to infer a polymorphic well-typing for a logic program, and experiments so far show that the automatically inferred well-typings are close to the declared types and result in termination conditions that are as strong as those obtained with declared types.
Abstract: A method is developed to infer a polymorphic well-typing for a logic program. Our motivation is to improve the automation of termination analysis by deriving types from which norms can automatically be constructed. Previous work on type-based termination analysis used either types declared by the user, or automatically generated monomorphic types describing the success set of predicates. The latter types are less precise and result in weaker termination conditions than those obtained from declared types. Our type inference procedure involves solving set constraints generated from the program and derives a well-typing in contrast to a success-set approximation. Experiments so far show that our automatically inferred well-typings are close to the declared types and result in termination conditions that are as strong as those obtained with declared types. We describe the method, its implementation and experiments with termination analysis based on the inferred types.
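As a tiny illustration of how a type yields a norm for termination analysis: if a term is typed as a list, the associated norm counts its list constructors. The term encoding is an assumption, and only ground terms are handled; a real type-based norm treats variables symbolically.

```python
def list_length_norm(term):
    # terms: ("nil",) | ("cons", head, tail) | ("var", name)
    if term[0] == "cons":
        return 1 + list_length_norm(term[2])
    return 0   # nil and variables contribute nothing in this ground sketch

print(list_length_norm(("cons", 1, ("cons", 2, ("nil",)))))   # 2
```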

Book ChapterDOI
07 Sep 2005
TL;DR: This paper presents an infrastructure that can be used to check independently that the assembly output of source-level instrumentation tools has the desired safety properties, and can therefore check that the x86 assembly code resulting from compilation with CCured is in fact type-safe.
Abstract: There are many source-level analyses or instrumentation tools that enforce various safety properties. In this paper we present an infrastructure that can be used to check independently that the assembly output of such tools has the desired safety properties. By working at assembly level we avoid the complications with unavailability of source code, with source-level parsing, and we certify the code that is actually deployed. The novel feature of the framework is an extensible dependently-typed framework that supports type inference and mutation of dependent values in memory. The type system can be extended with new types as needed for the source-level tool that is certified. Using these dependent types, we are able to express the invariants enforced by CCured, a source-level instrumentation tool that guarantees type safety in legacy C programs. We can therefore check that the x86 assembly code resulting from compilation with CCured is in fact type-safe.

Book ChapterDOI
07 Sep 2005
TL;DR: An overview of existing methods is given, and a new family of relational abstract domains is formalized that allows sets of functions to be abstracted more precisely than with known approaches, while still being machine-representable.
Abstract: This paper concerns the abstraction of sets of functions for use in abstract interpretation. The paper gives an overview of existing methods, which are illustrated with applications to shape analysis, and formalizes a new family of relational abstract domains that allows sets of functions to be abstracted more precisely than with known approaches, while still being machine-representable.

Book ChapterDOI
07 Sep 2005
TL;DR: A new algorithm for solving the problem of finding basic block and variable correspondence between two (low-level) programs generated by a compiler from the same source using different optimizations is proposed.
Abstract: With the ultimate goal of translation validation for optimizing compilers in mind, we propose a new algorithm for solving the problem of finding basic block and variable correspondence between two (low-level) programs generated by a compiler from the same source using different optimizations. The essence of our technique is interpretation of the two programs on random inputs and comparison of the histories of value changes for variables. We describe an architecture of a system for finding basic block and variable correspondence and provide experimental evidence of its usefulness.
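An illustrative sketch of the matching idea, under assumptions about the interface (each `run_*` function returns, per variable, its sequence of values over time): interpret both versions on the same random inputs, collapse each variable's trace to its history of value changes, and pair variables whose histories coincide across trials.

```python
import random
from collections import defaultdict

def history(trace):
    """Collapse a sequence of values to its sequence of changes."""
    out = []
    for v in trace:
        if not out or out[-1] != v:
            out.append(v)
    return out

def match_variables(run1, run2, trials=10):
    sigs1, sigs2 = defaultdict(list), defaultdict(list)
    for _ in range(trials):
        inputs = [random.randint(-100, 100) for _ in range(3)]
        for var, tr in run1(inputs).items():
            sigs1[var].append(tuple(history(tr)))
        for var, tr in run2(inputs).items():
            sigs2[var].append(tuple(history(tr)))
    return {v1: v2 for v1, s1 in sigs1.items()
            for v2, s2 in sigs2.items() if s1 == s2}

def run_a(inputs):
    x, trace = inputs[0], {"t0": [], "t1": []}
    for i in range(3):
        trace["t0"].append(x + i)
        trace["t1"].append(2 * x)
    return trace

def run_b(inputs):   # same source compiled differently: renamed variables
    x, trace = inputs[0], {"r5": [], "r9": []}
    for i in range(3):
        trace["r9"].append(2 * x)
        trace["r5"].append(x + i)
    return trace

print(match_variables(run_a, run_b))   # {'t0': 'r5', 't1': 'r9'}
```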

Book ChapterDOI
07 Sep 2005
TL;DR: A general definition in the context of abstract interpretation is given; it is shown that arbitrary locality-based abstractions are hard to compute in general, and two solutions are provided.
Abstract: We present locality-based abstractions, in which a set of states of a distributed system is abstracted to the collection of views that some observers have of the states. Special cases of locality-based abstractions have been used in different contexts (planning, analysis of concurrent programs, concurrency theory). In this paper we give a general definition in the context of abstract interpretation, show that arbitrary locality-based abstractions are hard to compute in general, and provide two solutions to this problem. The solutions are evaluated in several case studies.
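A toy sketch of the abstraction itself: a global state is a tuple of per-process local states, each observer sees only some components, and a set of states is abstracted to each observer's set of views. The observer map below is an illustrative assumption.

```python
def alpha(states, observers):
    """Abstract a set of global states to the views of each observer,
    where an observer is a tuple of component indices it can see."""
    return {name: {tuple(s[i] for i in idxs) for s in states}
            for name, idxs in observers.items()}

states = {(0, 1, 0), (1, 1, 0), (0, 1, 1)}
observers = {"p01": (0, 1), "p12": (1, 2)}
print(alpha(states, observers))
# {'p01': {(0, 1), (1, 1)}, 'p12': {(1, 0), (1, 1)}}
```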

Book ChapterDOI
07 Sep 2005
TL;DR: It is proved that method inlining preserves typability, and the experimental results show that the new approach inlines considerably more call sites than Class Hierarchy Analysis.
Abstract: Programmers increasingly implement plugin architectures in type-safe object-oriented languages such as Java. A virtual machine can dynamically load class files containing plugins, and a JIT compiler can do optimisations such as method inlining. Until now, the best known approach to type-safe method inlining in the presence of dynamic class loading is based on Class Hierarchy Analysis. Flow analyses that are more powerful than Class Hierarchy Analysis lead to more inlining but are more time consuming and not known to be type safe. In this paper we present and justify a new approach to type-safe method inlining in the presence of dynamic class loading. First we present experimental results that show that there are major advantages to analysing all locally available plugins at start-up time. If we analyse the locally available plugins at start-up time, then flow analysis is only needed at start-up time and when downloading plugins from the Internet, that is, when long pauses are expected anyway. Second, inspired by the experimental results, we design a new framework for type-safe method inlining which is based on a new type system and an existing flow analysis. In the new type system, a type is a pair of Java types, one from the original program and one that reflects the flow analysis. We prove that method inlining preserves typability, and the experimental results show that the new approach inlines considerably more call sites than Class Hierarchy Analysis.
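A small sketch of the Class Hierarchy Analysis test that the paper's flow analysis improves on, phrased over Python's runtime class hierarchy for illustration: a call may be inlined while no loaded subclass overrides the method, and loading a new plugin class can invalidate the decision. The helper name is illustrative.

```python
def cha_monomorphic(cls, method):
    """True iff `cls` defines `method` and no loaded subclass overrides it."""
    def overriders(c):
        for sub in c.__subclasses__():
            if method in vars(sub):
                yield sub
            yield from overriders(sub)
    return method in vars(cls) and not any(overriders(cls))

class Plugin:
    def run(self): return "base"

class Fast(Plugin):
    pass                                    # inherits run unchanged

print(cha_monomorphic(Plugin, "run"))       # True: inlining run is safe

class Custom(Plugin):                       # a dynamically loaded plugin
    def run(self): return "custom"

print(cha_monomorphic(Plugin, "run"))       # False: an overrider appeared
```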

Journal ArticleDOI
01 Oct 2005
TL;DR: A static analysis that estimates reusable memory cells and a source-level transformation that adds explicit memory reuse commands into the program text are presented; the system achieves a memory reuse ratio between 5.2% and 91.3%.
Abstract: We present a static analysis that estimates reusable memory cells and a source-level transformation that adds explicit memory reuse commands into the program text. For benchmark ML programs, our analysis and transformation system achieves a memory reuse ratio from 5.2% to 91.3% and reduces the memory peak from 0.0% to 71.9%. The small-ratio cases are for programs that have a number of data structures that are shared. For other cases, our experimental results are encouraging in terms of accuracy and cost. Major features of our analysis and transformation are: (1) polyvariant analysis of functions by parameterization for the argument heap cells; (2) use of multiset formulas in expressing the sharings and partitionings of heap cells; (3) deallocations conditioned by dynamic flags that are passed as extra arguments to functions; (4) individual heap cells as the granularity of explicit memory reuse. Our analysis and transformation system is fully automatic.
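A toy sketch of feature (3), deallocation conditioned by a dynamic flag, transplanted from ML to Python for illustration: the caller passes a boolean saying whether the argument's cells are unshared and may be reclaimed.

```python
def sum_list(cell, may_free):
    """Sums a cons-list represented as nested dicts; reclaims each
    cell afterwards iff the caller's dynamic flag permits it."""
    total = 0
    while cell is not None:
        nxt = cell["tail"]
        total += cell["head"]
        if may_free:
            cell.clear()        # stand-in for an explicit free/reuse
        cell = nxt
    return total

lst = {"head": 1, "tail": {"head": 2, "tail": None}}
print(sum_list(lst, may_free=True))   # 3; the cells are now reusable
```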

Book ChapterDOI
07 Sep 2005
TL;DR: Game Semantics has a concrete aspect: programs are interpreted as strategies for two-player games, and these strategies can be represented by automata, which has led to a novel approach to compositional model-checking and static analysis.
Abstract: Game Semantics has been developed over the past 12 years or so as a distinctive approach to the semantics of programming languages. It is compositional in the tradition of denotational semantics, and has led to the construction of fully abstract models for programming languages incorporating a wide variety of features which have proved resistant to more traditional approaches, including (combinations of): higher-order procedures, locally scoped variables and references, non-local control operators, non-determinism, probability, concurrency and more. At the same time, game semantics has a concrete aspect: programs are interpreted as strategies for certain two-player games, and these strategies can be represented by automata. This algorithmic aspect of game semantics has been developed over the past few years, by Dan Ghica, Luke Ong, Andrzej Murawski and the present author. This has led to a novel approach to compositional model-checking and static analysis. We will survey some of the work which has been done, and discuss some directions for future research in this area.

Book ChapterDOI
07 Sep 2005
TL;DR: This paper presents a memory space conscious loop iteration duplication approach that can reduce memory requirements of full duplication (of array data), without decreasing the level of reliability the latter provides, and reuses the memory locations from the same array to store the duplicates of the elements of a given array.
Abstract: Soft errors, a form of transient errors that cause bit flips in memory and other hardware components, are a growing concern for embedded systems as technology scales down. While hardware-based approaches to detect/correct soft errors are important, software-based techniques can be much more flexible. One simple software-based strategy would be full duplication of computations and data, and comparing the results of the corresponding original and duplicate computations. However, while the performance overhead of this strategy can be hidden during execution if there are idle hardware resources, the memory demand increase due to data duplication can be dramatic, particularly for array-based applications that process large amounts of data. Focusing on array-based embedded computing, this paper presents a memory space conscious loop iteration duplication approach that can reduce memory requirements of full duplication (of array data), without decreasing the level of reliability the latter provides. Our “in-place duplication” approach reuses the memory locations from the same array to store the duplicates of the elements of a given array. Consequently, the memory overhead brought by the duplicates can be reduced. Further, we extend this approach to incorporate “global duplication”, which reuses memory locations from other arrays to store duplicates of the elements of a given array. This paper also discusses how our approach operates under a memory size constraint. The experimental results from our implementation show that the proposed approach is successful in reducing memory requirements of the full duplication scheme for twelve array-based applications.

Book ChapterDOI
Andrew D. Gordon
07 Sep 2005
TL;DR: This talk argues that recent developments in security types for process calculi can lead to better source-based security by typing; one such development removes a significant limitation of previous type systems.
Abstract: The source-based security problem is to build tools to check security properties of the actual source code of a system, as opposed to some abstract model. Static analysis of C for buffer overruns is one approach. Another is to introduce security types as a programming language feature so that the typechecker proves security properties; for example, languages like Jif and Flow Caml can check noninterference properties of application-level code. Independently, security types have arisen in the setting of process calculi, for checking secrecy and authentication properties of abstract models of low-level cryptographic protocols, for instance. My talk argues that recent developments in security types for process calculi can lead to better source-based security by typing. One development [2] removes a significant limitation of previous type systems and checks security in spite of the partial compromise of a dynamically-growing population of principals. Another [1] generalizes a type system for authentication to check authorization properties, by augmenting the typechecker with Datalog inference relative to a declarative authorization policy. Both developments rely on the idea of enriching process calculi with inert processes to represent both logical facts arising at runtime and also expected security invariants.