Proceedings ArticleDOI

Mix your contexts well: opportunities unleashed by recent advances in scaling context-sensitivity

TL;DR: This paper presents a detailed comparative study of the existing precise context-sensitive heap analyses, proposes novel context abstractions that lead to a new sweet spot in the arena, and shows that the newer proposals not only enhance the precision of both LSRV contexts and object-sensitive analyses, but also scale well to large programs.
Abstract: Existing precise context-sensitive heap analyses do not scale well for large OO programs. Further, identifying the right context abstraction becomes quite intriguing as two of the most popular categories of context abstractions (call-site- and object-sensitive) lead to theoretically incomparable precision. In this paper, we address this problem by first doing a detailed comparative study (in terms of precision and efficiency) of the existing approaches, both with and without heap cloning. In addition, we propose novel context abstractions that lead to a new sweet-spot in the arena. We first enhance the precision of level-summarized relevant value (LSRV) contexts (a highly scalable abstraction with precision matching that of call-site-sensitivity) using heap cloning. Then, motivated by the resultant scalability, we propose the idea of mixing various context abstractions, and add the advantages of k-object-sensitive analyses to LSRV contexts, in an efficient manner. The resultant context abstraction, which we call lsrvkobjH, also leads to a novel connection between the two broad variants of otherwise incomparable context-sensitive analyses. Our evaluation shows that the newer proposals not only enhance the precision of both LSRV contexts and object-sensitive analyses (to perform control-flow analysis of Java programs), but also scale well to large programs.
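
To make the mixing concrete, here is a minimal sketch (our illustration, with hypothetical lsrvSummary and allocSites components; not the paper's implementation) of a combined lsrv+kobjH-style context key: two calls share a method summary only when both the level summary and the k-limited allocation-site string agree.

```java
import java.util.*;

// Minimal sketch (illustrative names, not the paper's code): a mixed context
// that pairs an LSRV-style level summary of the relevant caller heap with a
// k-limited string of receiver allocation sites, so that method summaries are
// cached per combined abstraction.
final class MixedContext {
    final List<Integer> lsrvSummary;  // hypothetical level summary of the relevant heap
    final List<String> allocSites;    // receiver allocation sites, k-limited

    MixedContext(List<Integer> lsrvSummary, List<String> allocSites, int k) {
        this.lsrvSummary = List.copyOf(lsrvSummary);
        // keep only the k most recent allocation sites (k-object-sensitivity)
        int from = Math.max(0, allocSites.size() - k);
        this.allocSites = List.copyOf(allocSites.subList(from, allocSites.size()));
    }

    @Override public boolean equals(Object o) {
        return o instanceof MixedContext c
                && lsrvSummary.equals(c.lsrvSummary)
                && allocSites.equals(c.allocSites);
    }

    @Override public int hashCode() { return Objects.hash(lsrvSummary, allocSites); }
}
```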
Citations
Journal ArticleDOI
15 Oct 2021
TL;DR: The Unity-Relay framework is a one-two-punch approach for hard-to-analyze Java programs: it first combines and maximizes the precision of a set of selective context-sensitivity approaches (Unity) and, when that fails to scale, relays the accumulated precision from one approach to the next (Relay).
Abstract: Traditional context-sensitive pointer analysis is hard to scale for large and complex Java programs. To address this issue, a series of selective context-sensitivity approaches have been proposed and exhibit promising results. In this work, we move one step further towards producing highly-precise pointer analyses for hard-to-analyze Java programs by presenting the Unity-Relay framework, which takes selective context sensitivity to the next level. Briefly, Unity-Relay is a one-two punch: given a set of different selective context-sensitivity approaches, say S = S1, ..., Sn, Unity-Relay first provides a mechanism (called Unity) to combine and maximize the precision of all components of S. When Unity fails to scale, Unity-Relay offers a scheme (called Relay) to pass and accumulate the precision from one approach Si in S to the next, Si+1, leading to an analysis that is more precise than all approaches in S. As a proof-of-concept, we instantiate Unity-Relay into a tool called Baton and extensively evaluate it on a set of hard-to-analyze Java programs, using general precision metrics and popular clients. Compared with the state of the art, Baton achieves the best precision for all metrics and clients for all evaluated programs. The difference in precision is often dramatic — up to 71% of alias pairs reported by previously-best algorithms are found to be spurious and eliminated.
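
A rough sketch of the two modes, under assumed placeholder types Selector and Result (not Baton's API): Unity unions the selections of all approaches in one run; Relay chains the runs, feeding each approach the facts proved so far.

```java
import java.util.*;

// Sketch under assumed placeholder types (Selector, Result): Unity runs all
// selective approaches together and unions their selections; Relay runs them
// one after another, seeding each with the precision accumulated so far.
interface Selector { Set<String> selectMethods(Result factsSoFar); }
interface Result  { Result refineWith(Set<String> contextSensitiveMethods); }

final class UnityRelay {
    // Unity: combine and maximize the precision of all components of S at once
    static Result unity(List<Selector> S, Result base) {
        Set<String> combined = new HashSet<>();
        for (Selector s : S) combined.addAll(s.selectMethods(base));
        return base.refineWith(combined);
    }

    // Relay: pass and accumulate precision from one approach to the next
    static Result relay(List<Selector> S, Result base) {
        Result r = base;
        for (Selector s : S) r = r.refineWith(s.selectMethods(r));
        return r;
    }
}
```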

13 citations

Journal ArticleDOI
TL;DR: In this paper, the authors describe, implement, and evaluate an algorithm that performs efficient context-sensitive analysis incrementally on modular partitions of programs, reusing as much of the previous analysis results as possible when modifications are small and isolated within a few components.
Abstract: Context-sensitive global analysis of large code bases can be expensive, which can make its use impractical during software development. However, there are many situations in which modifications are small and isolated within a few components, and it is desirable to reuse as much as possible previous analysis results. This has been achieved to date through incremental global analysis fixpoint algorithms that achieve cost reductions at fine levels of granularity, such as changes in program lines. However, these fine-grained techniques are neither directly applicable to modular programs nor are they designed to take advantage of modular structures. This paper describes, implements, and evaluates an algorithm that performs efficient context-sensitive analysis incrementally on modular partitions of programs. The experimental results show that the proposed modular algorithm shows significant improvements, in both time and memory consumption, when compared to existing non-modular, fine-grain incremental analysis techniques. Furthermore, thanks to the proposed intermodular propagation of analysis information, our algorithm also outperforms traditional modular analysis even when analyzing from scratch.
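
A minimal sketch of the incremental modular fixpoint, with a hypothetical Analyzer interface standing in for the context-sensitive analysis: only edited modules are re-analyzed, and dependents are revisited only when a module's summary actually changes.

```java
import java.util.*;

// Minimal sketch (hypothetical Analyzer API, not the paper's system): modules
// whose sources changed are re-analyzed first; a module's callers are
// revisited only when its summary actually changes (intermodular propagation).
final class IncrementalModularAnalysis {
    final Map<String, String> summaries = new HashMap<>();        // module -> summary
    final Map<String, Set<String>> dependents = new HashMap<>();  // module -> callers

    interface Analyzer { String analyze(String module, Map<String, String> summaries); }

    void update(Set<String> editedModules, Analyzer analyzer) {
        Deque<String> worklist = new ArrayDeque<>(editedModules);
        while (!worklist.isEmpty()) {
            String m = worklist.pop();
            String fresh = analyzer.analyze(m, summaries);  // reuses other summaries
            String old = summaries.put(m, fresh);
            if (!fresh.equals(old))                          // summary changed: propagate
                worklist.addAll(dependents.getOrDefault(m, Set.of()));
        }
    }
}
```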

8 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a survey of techniques for translating verification problems for different programming languages, and in general software systems, into satisfiability problems for constrained Horn clauses (CHCs), a term that has become popular in the verification field to refer to CLP programs.
Abstract: This paper surveys recent work on applying analysis and transformation techniques that originate in the field of constraint logic programming (CLP) to the problem of verifying software systems. We present specialisation-based techniques for translating verification problems for different programming languages, and in general software systems, into satisfiability problems for constrained Horn clauses (CHCs), a term that has become popular in the verification field to refer to CLP programs. Then, we describe static analysis techniques for CHCs that may be used for inferring relevant program properties, such as loop invariants. We also give an overview of some transformation techniques based on specialisation and fold/unfold rules, which are useful for improving the effectiveness of CHC satisfiability tools. Finally, we discuss future developments in applying these techniques.
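
As a worked example (ours, not taken from the survey), the program `x = 0; while (x < n) x++; assert(x == n);` translates into the CHCs below over an unknown invariant Inv; the clauses are satisfiable with Inv(x, n) ≡ x ≤ n, which proves the assertion.

```latex
% CHC encoding of: x = 0; while (x < n) x++; assert(x == n);
\begin{align*}
x = 0                                            &\rightarrow \mathit{Inv}(x, n)     && \text{(initialization)}\\
\mathit{Inv}(x, n) \land x < n                   &\rightarrow \mathit{Inv}(x + 1, n) && \text{(loop body)}\\
\mathit{Inv}(x, n) \land x \ge n \land x \ne n   &\rightarrow \mathit{false}         && \text{(assertion)}
\end{align*}
```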

7 citations

Journal ArticleDOI
13 Nov 2020
TL;DR: This paper combines finite state machines and dynamic dispatching to allow fully context-sensitive specialization while cloning only functions that are effectively optimized, which makes it possible to apply very liberal optimizations, such as context-sensitive constant propagation, in large programs—something that could not have been easily done before.
Abstract: Academia has put much effort into making context-sensitive analyses practical, with great profit. However, the implementation of context-sensitive optimizations, in contrast to analyses, is still not practical, due to code-size explosion. This growth happens because current technology requires the cloning of full paths in the Calling Context Tree. In this paper, we present a solution to this problem. We combine finite state machines and dynamic dispatching to allow fully context-sensitive specialization while cloning only functions that are effectively optimized. This technique makes it possible to apply very liberal optimizations, such as context-sensitive constant propagation, in large programs—something that could not have been easily done before. We demonstrate the viability of our idea by formalizing it in Prolog, and implementing it in LLVM. As a proof of concept, we have used our state machines to implement context-sensitive constant propagation in LLVM. The binaries produced by traditional full cloning are 2.63 times larger than the binaries that we generate with our state machines. When applied on Mozilla Firefox, our optimization increases binary size from 7.2MB to 9.2MB. Full cloning, in contrast, yields a binary of 34MB.
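
A minimal sketch of the idea with an invented example (names like scale and hotLoop are ours, not the paper's LLVM implementation): the context automaton is an ordinary state field, instrumented call sites advance it, and dispatch selects the one clone that the context-sensitive optimization actually improved.

```java
// Illustrative sketch: a tiny context state machine plus dynamic dispatch
// replaces full cloning of calling-context-tree paths.
final class CtxSpecialization {
    enum Ctx { OTHER, FROM_HOT_LOOP }   // states of the context automaton
    private static Ctx ctx = Ctx.OTHER; // advanced at instrumented call sites

    static int scale(int x, int factor) {
        // only the profitable context has a clone; every other context shares
        // the generic body, so code size grows by one clone, not one per path
        if (ctx == Ctx.FROM_HOT_LOOP) return scaleFromHotLoop(x);
        return x * factor;
    }

    // clone specialized by context-sensitive constant propagation (factor == 16)
    private static int scaleFromHotLoop(int x) { return x << 4; }

    static int hotLoop(int[] a) {
        ctx = Ctx.FROM_HOT_LOOP;        // state transition before the calls
        int sum = 0;
        for (int v : a) sum += scale(v, 16);
        ctx = Ctx.OTHER;                // and back on return
        return sum;
    }
}
```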

3 citations


Additional excerpts

  • ...2014; Ghiya and Hendren 1996; Hind et al. 1999; Jeong et al. 2017; Li et al. 2020; Might et al. 2010; Milanova 2007; Milanova et al. 2014; Oh et al. 2014; Späth et al. 2019, 2016; Thakur and Nandivada 2019, 2020; Thiessen and Lhoták 2017; Wei and Ryder 2015; Wilson and Lam 1995; Yu et al. 2010]....

Proceedings ArticleDOI
18 Nov 2020
TL;DR: This paper summarizes lessons from the author's research, with the hope of helping readers beat the state-of-the-art in (Java) pointer analysis as they move into their research careers beyond 2020.
Abstract: Despite being a very old discipline, pointer analysis still attracts several research papers every year in premier programming language venues. While a major goal of contemporary pointer analysis research is to improve its efficiency without sacrificing precision, we also see works that introduce novel ways of solving the problem itself. What does this mean? Research in this area is not going to die soon. I too have been writing pointer analyses of various kinds, especially for object-oriented languages such as Java. While some standard ways of writing such analyses are clear, I have realized that there are umpteen nooks and pitfalls that make the task difficult and error-prone. In particular, there are several misconceptions and undocumented practices, being aware of which would save significant research time. On the other hand, there are lessons from my own research that might go a long way in writing correct, precise and efficient pointer analyses, faster. This paper summarizes some such learnings, with a hope to help readers beat the state-of-the-art in (Java) pointer analysis, as they move into their research careers beyond 2020.

1 citation


Cites background or methods from "Mix your contexts well: opportuniti..."

  • ...For example, Thakur and Nandivada [40] estimate the required amount of value contexts [14, 29] (points-to graphs reaching the entry points of methods) by computing the depth of the subgraphs reachable from each parameter of a method, in a pre-analysis; this information is independent of the flow and does not require performing an expensive iterative dataflow analysis....

  • ...Meanwhile, I encountered interesting challenges and insights related to context-sensitive pointer analyses, which led to the development of some novel abstractions [40, 42] for context-sensitivity....

  • ...In particular, for scaling context-sensitivity, Thakur and Nandivada [40, 42] propose several variants of novel analysis-specific context abstractions, and also use them as part of scaling precise analyses for JIT compilers [41]....

  • ...For a more comprehensive discussion on the relative precisions of various context abstractions from Java program-analysis literature, the reader is referred to a recent work by Thakur and Nandivada [42]....

  • ...Kanvar and Khedker [13] present a detailed study of the various choices available while writing such analyses in general, and Thakur and Nandivada [42] evaluate existing and novel choices of context abstractions for Java programs....
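The first excerpt above is concrete enough to sketch. Assuming a simple adjacency-map representation of the points-to graph (our invention, not the authors' code), the flow-independent depth bound for a parameter needs only one traversal:

```java
import java.util.*;

// Sketch of the depth pre-analysis quoted in the first excerpt: the depth of
// the subgraph reachable from a parameter is computed by one traversal, with
// no iterative dataflow fixpoint.
final class DepthPreAnalysis {
    // points-to graph: abstract object/variable -> successors via fields
    static int reachableDepth(String param, Map<String, List<String>> graph) {
        return dfs(param, graph, new HashSet<>());
    }

    private static int dfs(String n, Map<String, List<String>> g, Set<String> seen) {
        if (!seen.add(n)) return 0;  // cut cycles; yields a conservative bound
        int max = 0;
        for (String succ : g.getOrDefault(n, List.of()))
            max = Math.max(max, dfs(succ, g, seen));
        return 1 + max;
    }
}
```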

References
Posted Content
TL;DR: The authors describe a general-purpose interprocedural analysis framework for Soot that uses data-flow values for context-sensitivity, combining the tabulation method of the functional approach with value-based termination of call-string construction.
Abstract: An interprocedural analysis is precise if it is flow sensitive and fully context-sensitive even in the presence of recursion. Many methods of interprocedural analysis sacrifice precision for scalability while some are precise but limited to only a certain class of problems. Soot currently supports interprocedural analysis of Java programs using graph reachability. However, this approach is restricted to IFDS/IDE problems, and is not suitable for general data flow frameworks such as heap reference analysis and points-to analysis which have non-distributive flow functions. We describe a general-purpose interprocedural analysis framework for Soot using data flow values for context-sensitivity. This framework is not restricted to problems with distributive flow functions, although the lattice must be finite. It combines the key ideas of the tabulation method of the functional approach and the technique of value-based termination of call string construction. The efficiency and precision of interprocedural analyses is heavily affected by the precision of the underlying call graph. This is especially important for object-oriented languages like Java where virtual method invocations cause an explosion of spurious call edges if the call graph is constructed naively. We have instantiated our framework with a flow and context-sensitive points-to analysis in Soot, which enables the construction of call graphs that are far more precise than those constructed by Soot's SPARK engine.
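
A minimal sketch of value-context tabulation with assumed generic types (a real framework would iterate recursive cycles to a fixpoint): summaries are keyed by the pair (method, entry data-flow value), which is exactly what terminates call-string construction.

```java
import java.util.*;

// Minimal sketch (assumed types): a method is re-analyzed only when it is
// reached with a data-flow value not seen before; otherwise the cached exit
// value is reused.
final class ValueContexts<V> {
    record Ctx<V>(String method, V entryValue) {}

    private final Map<Ctx<V>, V> summaries = new HashMap<>();

    interface Body<V> { V analyze(String method, V entry, ValueContexts<V> vc); }

    V analyzeCall(String method, V entry, Body<V> body) {
        Ctx<V> ctx = new Ctx<>(method, entry);
        V cached = summaries.get(ctx);
        if (cached != null) return cached;   // same value context: reuse the summary
        summaries.put(ctx, entry);           // provisional entry so recursion terminates;
                                             // a real framework iterates to a fixpoint
        V exit = body.analyze(method, entry, this);
        summaries.put(ctx, exit);            // final exit value for this context
        return exit;
    }
}
```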

33 citations

Journal ArticleDOI
10 Oct 2019
TL;DR: The novelty of Eagle is to enable k-obj to analyze a method with partial context-sensitivity, i.e., context-sensitively for only some of its variables/allocation sites, selected during a lightweight pre-analysis that reasons about context-free-language (CFL) reachability at the level of variables/objects in the program, based on a new CFL-reachability formulation of k-obj.
Abstract: Object-sensitivity is widely used as a context abstraction for computing the points-to information context-sensitively for object-oriented languages like Java. Due to the combinatorial explosion of contexts in large programs, k-object-sensitive pointer analysis (under k-limiting), denoted k-obj, is scalable only for small values of k, where k⩽2 typically. A few recent solutions attempt to improve its efficiency by instructing k-obj to analyze only some methods in the program context-sensitively, determined heuristically by a pre-analysis. While already effective, these heuristics-based pre-analyses do not provide precision guarantees, and consequently, are limited in the efficiency gains achieved. We introduce a radically different approach, Eagle, that makes k-obj run significantly faster than the prior art while maintaining its precision. The novelty of Eagle is to enable k-obj to analyze a method with partial context-sensitivity, i.e., context-sensitively for only some of its selected variables/allocation sites. Eagle makes these selections during a lightweight pre-analysis by reasoning about context-free-language (CFL) reachability at the level of variables/objects in the program, based on a new CFL-reachability formulation of k-obj. We demonstrate the advances made by Eagle by comparing it with the prior art in terms of a set of popular Java benchmarks and applications.
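
Setting the CFL-reachability machinery aside, the effect of Eagle's pre-analysis can be sketched as a per-variable flag (illustrative representation, not Eagle's code): contexts are attached only where the pre-analysis says they matter.

```java
import java.util.*;

// Sketch of partial context-sensitivity: the pre-analysis marks which
// variables and allocation sites need contexts; everything else is merged
// across contexts.
final class SelectiveContexts {
    private final Set<String> needsContext;  // filled by the pre-analysis

    SelectiveContexts(Set<String> needsContext) {
        this.needsContext = Set.copyOf(needsContext);
    }

    // key used in the points-to map: context-qualified only where it pays off
    String qualify(String varOrSite, String context) {
        return needsContext.contains(varOrSite)
                ? varOrSite + "@" + context  // analyzed context-sensitively
                : varOrSite;                 // analyzed context-insensitively
    }
}
```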

24 citations


"Mix your contexts well: opportuniti..." refers methods in this paper

  • ...Some prior works [13, 15] use a pre-analysis to approximate the methods/objects that follow some insightful patterns, and apply context-sensitivity partially to the identified methods/objects....

Proceedings ArticleDOI
29 Oct 2013
TL;DR: The new elements of the approach are the ability to eliminate statements, and not just variables, as well as its modularity: set-based pre-analysis can be performed on the input just once, e.g., allowing the pre-optimization of libraries that are subsequently reused many times and for different analyses.
Abstract: We present set-based pre-analysis: a virtually universal optimization technique for flow-insensitive points-to analysis. Points-to analysis computes a static abstraction of how object values flow through a program's variables. Set-based pre-analysis relies on the observation that much of this reasoning can take place at the set level rather than the value level. Computing constraints at the set level results in significant optimization opportunities: we can rewrite the input program into a simplified form with the same essential points-to properties. This rewrite results in removing both local variables and instructions, thus simplifying the subsequent value-based points-to computation. Effectively, set-based pre-analysis puts the program in a normal form optimized for points-to analysis. Compared to other techniques for off-line optimization of points-to analyses in the literature, the new elements of our approach are the ability to eliminate statements, and not just variables, as well as its modularity: set-based pre-analysis can be performed on the input just once, e.g., allowing the pre-optimization of libraries that are subsequently reused many times and for different analyses. In experiments with Java programs, set-based pre-analysis eliminates 30% of the program's local variables and 30% or more of computed context-sensitive points-to facts, over a wide set of benchmarks and analyses, resulting in a ~20% average speedup (max: 110%, median: 18%).
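
The statement-elimination idea can be sketched for its simplest case, pure copies (our simplification, with an invented representation): since `x = y` gives x exactly y's value set, x can be replaced by y everywhere before the value-level analysis runs.

```java
import java.util.*;

// Sketch of one set-level simplification: a pure copy "x = y" lets x be
// rewritten to y, so both the variable and the copy statement vanish before
// the value-level points-to computation.
final class SetBasedPreAnalysis {
    // copies: variable -> the variable it is a pure copy of
    static Map<String, String> collapseCopies(Map<String, String> copies) {
        Map<String, String> representative = new HashMap<>();
        for (String v : copies.keySet())
            representative.put(v, find(v, copies, new HashSet<>()));
        return representative;  // rewrite the program using these representatives
    }

    private static String find(String v, Map<String, String> copies, Set<String> seen) {
        String src = copies.get(v);
        if (src == null || !seen.add(v)) return v;  // end of chain, or a copy cycle
        return find(src, copies, seen);
    }
}
```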

23 citations


"Mix your contexts well: opportuniti..." refers methods in this paper

  • ...Prior works [25, 27] use a pre-analysis to identify code portions that do not affect the analysis results or may degrade scalability, and analyze them conservatively....

Proceedings ArticleDOI
16 Feb 2019
TL;DR: This paper proposes a three-stage analysis approach that lets us scale complex whole-program value-contexts based heap analyses for large programs, without losing their precision, and demonstrates the usefulness of the approach by using it to perform whole-program context-, flow- and field-sensitive thread-escape analysis and control-flow analysis of Java programs.
Abstract: The precision of heap analyses determines the precision of several associated optimizations, and has been a prominent area in compiler research. It has been shown that context-sensitive heap analyses are more precise than the insensitive ones, but their scalability continues to be a cause of concern. Though the value-contexts approach improves the scalability of classical call-string based context-sensitive analyses, it still does not scale well for several popular whole-program heap analyses. In this paper, we propose a three-stage analysis approach that lets us scale complex whole-program value-contexts based heap analyses for large programs, without losing their precision. Our approach is based on a novel idea of level-summarized relevant value-contexts (LSRV-contexts), which take into account an important observation that we do not need to compare the complete value-contexts at each call-site. Our overall approach consists of three stages: (i) a fast pre-analysis stage that finds the portion of the caller-context which is actually needed in the callee; (ii) a main-analysis stage which uses LSRV-contexts to defer the analysis of methods that do not impact the callers' heap and analyze the rest efficiently; and (iii) a post-analysis stage that analyzes the deferred methods separately. We demonstrate the usefulness of our approach by using it to perform whole-program context-, flow- and field-sensitive thread-escape analysis and control-flow analysis of Java programs. Our evaluation of the two analyses against their traditional value-contexts based versions shows that we not only reduce the analysis time and memory consumption significantly, but also succeed in analyzing otherwise unanalyzable programs in less than 40 minutes.
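
Structurally, the three stages can be sketched as follows (hypothetical types and helpers; the actual analysis is context-, flow- and field-sensitive):

```java
import java.util.*;

// Structural sketch of the three stages: the pre-analysis decides which
// methods can affect their caller's heap; the main analysis defers the rest;
// the post-analysis then handles the deferred methods separately.
final class ThreeStageAnalysis {
    static void analyze(List<String> methods, Set<String> affectsCallerHeap) {
        List<String> deferred = new ArrayList<>();
        for (String m : methods) {                 // main-analysis stage
            if (affectsCallerHeap.contains(m))     // verdict from the pre-analysis stage
                analyzeWithLsrvContext(m);         // LSRV-keyed tabulation
            else
                deferred.add(m);                   // no impact on callers' heap: defer
        }
        for (String m : deferred)                  // post-analysis stage
            analyzeDeferredSeparately(m);
    }

    static void analyzeWithLsrvContext(String m)    { /* elided */ }
    static void analyzeDeferredSeparately(String m) { /* elided */ }
}
```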

11 citations


"Mix your contexts well: opportuniti..." refers background or methods in this paper

  • ...In order to find whether a method or its callees satisfy Insight 1, we modify the multi-stage analysis approach (consisting of a pre-, a main- and a post-analysis) already in place for LSRV contexts [30], as discussed next....

  • ...This is because of the identification of relevance and the notion of level-summarization, and the splitting of the overall approach into three stages: pre-analysis, main-analysis, and post-analysis [30]....

  • ...However, as shown by Thakur and Nandivada [30], valcs does not scale for popular whole-program heap analyses....

  • ...For a given analysis, Thakur and Nandivada [30] show that LSRV contexts (lsrv) only scale the corresponding value-contexts based analysis, without affecting its precision....

  • ...Thakur and Nandivada [30] identify the relevant portions of value contexts and summarize them to form the analysis-specific abstraction of LSRV contexts; their approach scales individual value-contexts based analyses, and hence call-strings based analyses, without compromising on the precision of the analysis....

Journal ArticleDOI
24 Oct 2018
TL;DR: This paper proposes a novel approach based on object sensitivity analysis that takes as input a set of client queries, and tries to answer them using an initial round of inexpensive object sensitivity analysis that uses a low object-name length bound at all allocation sites.
Abstract: Object sensitivity analysis is a well-known form of context-sensitive points-to analysis. This analysis is parameterized by a bound on the names of symbolic objects associated with each allocation site. In this paper, we propose a novel approach based on object sensitivity analysis that takes as input a set of client queries, and tries to answer them using an initial round of inexpensive object sensitivity analysis that uses a low object-name length bound at all allocation sites. For the queries that are answered unsatisfactorily, the approach then pinpoints "bad" points-to facts, which are the ones that are responsible for the imprecision. It then employs a form of program slicing to identify allocation sites that are potentially causing these bad points-to facts to be generated. The approach then runs object sensitivity analysis once again, this time using longer names for just these allocation sites, with the objective of resolving the imprecision in this round. We describe our approach formally, prove its completeness, and describe a Datalog-based implementation of it on top of the Petablox framework. Our evaluation of our approach on a set of large Java benchmarks, using two separate clients, reveals that our approach is more precise than the baseline object sensitivity approach, by around 29% for one of the clients and by around 19% for the other client. Our approach is also more precise on most large benchmarks than a recently proposed approach that uses SAT solvers to identify allocation sites to refine.
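
The refinement loop can be sketched as follows, with placeholder helpers runObjSens and sitesBehindBadFacts standing in for the object-sensitive analysis and the slicer (both invented here, not the paper's Datalog implementation):

```java
import java.util.*;

// Sketch of the query-directed refinement loop: start with short object names
// everywhere, then lengthen names only at the allocation sites that slicing
// blames for the "bad" points-to facts behind unresolved queries.
final class QueryDirectedRefinement {
    static Map<String, Integer> refine(Set<String> queries) {
        Map<String, Integer> kPerSite = new HashMap<>();   // default bound: 1
        Set<String> unresolved = runObjSens(queries, kPerSite);
        for (String q : unresolved)
            for (String site : sitesBehindBadFacts(q))     // program slicing
                kPerSite.merge(site, 2, Math::max);        // longer names here only
        runObjSens(unresolved, kPerSite);                  // second, targeted round
        return kPerSite;
    }

    // placeholders: the object-sensitive analysis and the slicer
    static Set<String> runObjSens(Set<String> qs, Map<String, Integer> kPerSite) { return Set.of(); }
    static Set<String> sitesBehindBadFacts(String query) { return Set.of(); }
}
```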

7 citations


"Mix your contexts well: opportuniti..." refers background or methods in this paper

  • ...[22] use slicing to incrementally enhance the precision of k-limited object-sensitive analysis, and increase the value of k for objects that are identified as precision-critical....

  • ...Over the last few years, limited-length versions of object-sensitivity (abbreviated as kobj) have been shown to offer reasonably well-balanced precision-scalability trade-offs for object-oriented programs [28, 29]; their variants have also been introduced with a great interest, especially in order to make the original approaches scale to large programs [13, 14, 22]....

  • ...Over the last few years, object sensitivity has become a much used context abstraction; further, there are several recent works that focus on scaling existing object-sensitive analyses, usually by compromising on their precision [13, 14, 22, 29]....

  • ...For our evaluation, we set k = 1, as it is well-known [13, 14, 22] that the next more precise object-sensitive analysis, 2objH, does not directly scale for many large benchmarks....

  • ...We focus only on the heap-cloning enabled variants, as they have been used more popularly in recent literature [13, 14, 22] due to their higher precision....