Proceedings ArticleDOI

Mix your contexts well: opportunities unleashed by recent advances in scaling context-sensitivity

TL;DR: This paper presents a detailed comparative study of existing precise context-sensitive heap analyses, proposes novel context abstractions that lead to a new sweet spot in the arena, and shows that the newer proposals not only enhance the precision of both LSRV contexts and object-sensitive analyses but also scale well to large programs.
Abstract: Existing precise context-sensitive heap analyses do not scale well for large OO programs. Further, identifying the right context abstraction becomes quite intriguing as two of the most popular categories of context abstractions (call-site- and object-sensitive) lead to theoretically incomparable precision. In this paper, we address this problem by first doing a detailed comparative study (in terms of precision and efficiency) of the existing approaches, both with and without heap cloning. In addition, we propose novel context abstractions that lead to a new sweet-spot in the arena. We first enhance the precision of level-summarized relevant value (LSRV) contexts (a highly scalable abstraction with precision matching that of call-site-sensitivity) using heap cloning. Then, motivated by the resultant scalability, we propose the idea of mixing various context abstractions, and add the advantages of k-object-sensitive analyses to LSRV contexts, in an efficient manner. The resultant context abstraction, which we call lsrvkobjH, also leads to a novel connection between the two broad variants of otherwise incomparable context-sensitive analyses. Our evaluation shows that the newer proposals not only enhance the precision of both LSRV contexts and object-sensitive analyses (to perform control-flow analysis of Java programs), but also scale well to large programs.
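The theoretical incomparability mentioned above can be seen in a small Java fragment; the following example is hypothetical (all names invented here) and merely sketches the standard intuition:

```java
// Why call-site- and object-sensitivity are incomparable: set() below has a
// single syntactic call site, so a 1-call-site-sensitive analysis merges the
// two calls, while a 1-object-sensitive analysis separates them by receiver.
class Box {
    Object f;
    void set(Object v) { this.f = v; }
}

public class Incomparable {
    static void init(Box b, Object v) { b.set(v); }  // the only call site of set()

    public static void main(String[] args) {
        Box b1 = new Box();  // allocation site A1
        Box b2 = new Box();  // allocation site A2
        init(b1, "name");
        init(b2, Integer.valueOf(42));
        // 1-call-site-sensitive: both calls to set() share the context of the
        // call site in init(), so b1.f and b2.f each appear to point to both
        // values. 1-object-sensitive: the contexts [A1] and [A2] stay apart,
        // proving b1.f points only to "name". The converse situation (two call
        // sites, one receiver object) favors call-site-sensitivity instead.
    }
}
```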
Citations
Journal ArticleDOI
15 Oct 2021
TL;DR: The Unity-Relay framework as discussed by the authors is a one-two-punch approach that combines and maximizes the precision of a set of selective context-sensitivity approaches for pointer analysis of hard-to-analyze Java programs.
Abstract: Traditional context-sensitive pointer analysis is hard to scale for large and complex Java programs. To address this issue, a series of selective context-sensitivity approaches have been proposed and exhibit promising results. In this work, we move one step further towards producing highly-precise pointer analyses for hard-to-analyze Java programs by presenting the Unity-Relay framework, which takes selective context sensitivity to the next level. Briefly, Unity-Relay is a one-two punch: given a set of different selective context-sensitivity approaches, say S = {S1, ..., Sn}, Unity-Relay first provides a mechanism (called Unity) to combine and maximize the precision of all components of S. When Unity fails to scale, Unity-Relay offers a scheme (called Relay) to pass and accumulate the precision from one approach Si in S to the next, Si+1, leading to an analysis that is more precise than all approaches in S. As a proof of concept, we instantiate Unity-Relay into a tool called Baton and extensively evaluate it on a set of hard-to-analyze Java programs, using general precision metrics and popular clients. Compared with the state of the art, Baton achieves the best precision for all metrics and clients for all evaluated programs. The difference in precision is often dramatic: up to 71% of alias pairs reported by previously-best algorithms are found to be spurious and eliminated.
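In rough terms, the two modes compose as follows; this is a minimal Java sketch under assumed names (SelectiveCS, PointerAnalysisResult, and runAnalysis are hypothetical stand-ins, not Baton's actual API):

```java
import java.util.*;

// Hypothetical stand-ins for illustration only.
class PointerAnalysisResult { /* points-to facts elided */ }

interface SelectiveCS {
    // Which methods this approach would analyze context-sensitively,
    // given the facts computed so far.
    Set<String> selectMethods(PointerAnalysisResult factsSoFar);
}

class UnityRelaySketch {
    // Unity: combine all approaches in a single run, maximizing precision.
    static PointerAnalysisResult unity(List<SelectiveCS> s, PointerAnalysisResult pre) {
        Set<String> union = new HashSet<>();
        for (SelectiveCS si : s) union.addAll(si.selectMethods(pre));
        return runAnalysis(union);  // most precise, but may fail to scale
    }

    // Relay: run the approaches in sequence, passing each run's result
    // (and hence its accumulated precision) to the next approach's selection.
    static PointerAnalysisResult relay(List<SelectiveCS> s, PointerAnalysisResult pre) {
        PointerAnalysisResult current = pre;
        for (SelectiveCS si : s) current = runAnalysis(si.selectMethods(current));
        return current;
    }

    static PointerAnalysisResult runAnalysis(Set<String> csMethods) {
        return new PointerAnalysisResult();  // stub: a real pointer analysis goes here
    }
}
```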

13 citations

Journal ArticleDOI
TL;DR: In this paper, the authors describe, implement, and evaluate an algorithm that performs efficient context-sensitive analysis incrementally on modular partitions of programs, reusing as much of the previous analysis results as possible when modifications are small and isolated within a few components.
Abstract: Context-sensitive global analysis of large code bases can be expensive, which can make its use impractical during software development. However, there are many situations in which modifications are small and isolated within a few components, and it is desirable to reuse as much of the previous analysis results as possible. This has been achieved to date through incremental global analysis fixpoint algorithms that achieve cost reductions at fine levels of granularity, such as changes in program lines. However, these fine-grained techniques are neither directly applicable to modular programs nor designed to take advantage of modular structures. This paper describes, implements, and evaluates an algorithm that performs efficient context-sensitive analysis incrementally on modular partitions of programs. The experimental results show that the proposed modular algorithm achieves significant improvements, in both time and memory consumption, when compared to existing non-modular, fine-grain incremental analysis techniques. Furthermore, thanks to the proposed intermodular propagation of analysis information, our algorithm also outperforms traditional modular analysis even when analyzing from scratch.
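The shape of such an algorithm can be sketched briefly; the structure below is assumed for illustration (the paper's actual algorithm is considerably more detailed):

```java
import java.util.*;

// Sketch: cache one summary per module, and on an edit re-analyze only the
// changed modules plus those transitively affected via inter-modular
// propagation of analysis information.
class IncrementalModular {
    record Summary(Set<String> facts) {}                    // abstract answers for a module

    Map<String, Summary> cache = new HashMap<>();           // module -> last summary
    Map<String, Set<String>> dependents = new HashMap<>();  // module -> its importers

    void onEdit(Set<String> changedModules) {
        Deque<String> worklist = new ArrayDeque<>(changedModules);
        while (!worklist.isEmpty()) {
            String m = worklist.pop();
            Summary fresh = analyzeModule(m);               // context-sensitive, module-local
            if (!fresh.equals(cache.get(m))) {
                cache.put(m, fresh);
                // Dependents may now see different answers for their calls into m.
                worklist.addAll(dependents.getOrDefault(m, Set.of()));
            }
        }
    }

    Summary analyzeModule(String module) {
        return new Summary(Set.of());                       // stub for the module-level fixpoint
    }
}
```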

8 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a survey of techniques for translating verification problems for different programming languages, and in general software systems, into satisfiability problems for constrained Horn clauses (CHCs), a term that has become popular in the verification field to refer to CLP programs.
Abstract: This paper surveys recent work on applying analysis and transformation techniques that originate in the field of constraint logic programming (CLP) to the problem of verifying software systems. We present specialisation-based techniques for translating verification problems for different programming languages, and in general software systems, into satisfiability problems for constrained Horn clauses (CHCs), a term that has become popular in the verification field to refer to CLP programs. Then, we describe static analysis techniques for CHCs that may be used for inferring relevant program properties, such as loop invariants. We also give an overview of some transformation techniques based on specialisation and fold/unfold rules, which are useful for improving the effectiveness of CHC satisfiability tools. Finally, we discuss future developments in applying these techniques.
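For a flavor of the translation (an illustrative instance, not an example taken from the survey), the program x = 0; while (x < n) x = x + 1; assert x == n, for n >= 0, becomes three CHCs over an unknown loop invariant inv; a static analysis of the clauses can infer the invariant x <= n that witnesses their satisfiability, and hence the assertion:

```latex
\begin{align*}
x = 0 \land n \geq 0 &\rightarrow \mathit{inv}(x, n) && \text{(loop entry)}\\
\mathit{inv}(x, n) \land x < n &\rightarrow \mathit{inv}(x + 1, n) && \text{(loop body)}\\
\mathit{inv}(x, n) \land x \geq n \land x \neq n &\rightarrow \mathit{false} && \text{(assertion check)}
\end{align*}
```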

7 citations

Journal ArticleDOI
13 Nov 2020
TL;DR: This paper combines finite state machines and dynamic dispatching to allow fully context-sensitive specialization while cloning only functions that are effectively optimized, which makes it possible to apply very liberal optimizations, such as context-sensitive constant propagation, in large programs, something that could not have been easily done before.
Abstract: Academia has put much effort into making context-sensitive analyses practical, with great profit. However, the implementation of context-sensitive optimizations, in contrast to analyses, is still not practical, due to code-size explosion. This growth happens because current technology requires the cloning of full paths in the Calling Context Tree. In this paper, we present a solution to this problem. We combine finite state machines and dynamic dispatching to allow fully context-sensitive specialization while cloning only functions that are effectively optimized. This technique makes it possible to apply very liberal optimizations, such as context-sensitive constant propagation, in large programs, something that could not have been easily done before. We demonstrate the viability of our idea by formalizing it in Prolog and implementing it in LLVM. As a proof of concept, we have used our state machines to implement context-sensitive constant propagation in LLVM. The binaries produced by traditional full cloning are 2.63 times larger than the binaries that we generate with our state machines. When applied to Mozilla Firefox, our optimization increases binary size from 7.2MB to 9.2MB. Full cloning, in contrast, yields a binary of 34MB.
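A tiny, hypothetical example (all names and constants invented here) of the optimization being scaled, context-sensitive constant propagation via cloning:

```java
// A context-sensitive analysis learns factor = 2 at one call and factor = 0
// at the other; exploiting that requires one clone of scale() per context.
public class CloneExample {
    static int scale(int x, int factor) { return x * factor; }

    public static void main(String[] args) {
        int a = scale(read(), 2);   // context 1: factor is the constant 2
        int b = scale(read(), 0);   // context 2: factor is the constant 0
        System.out.println(a + b);
    }

    // Specialized clones that the optimization would generate:
    static int scale_factor2(int x) { return x * 2; }   // strength-reduced
    static int scale_factor0(int x) { return 0; }       // folds to a constant

    // Cloning every distinct path in the calling context tree repeats this for
    // all transitive callees, which is what explodes code size; the paper's
    // state machines instead clone only functions that actually get optimized.

    static int read() { return new java.util.Scanner(System.in).nextInt(); }
}
```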

3 citations


Additional excerpts

  • ...2014; Ghiya and Hendren 1996; Hind et al. 1999; Jeong et al. 2017; Li et al. 2020; Might et al. 2010; Milanova 2007; Milanova et al. 2014; Oh et al. 2014; Späth et al. 2019, 2016; Thakur and Nandivada 2019, 2020; Thiessen and Lhoták 2017; Wei and Ryder 2015; Wilson and Lam 1995; Yu et al. 2010]....

Proceedings ArticleDOI
18 Nov 2020
TL;DR: This paper summarizes some such learnings from its author's research to help readers beat the state-of-the-art in (Java) pointer analysis, as they move into their research careers beyond 2020.
Abstract: Despite being a very old discipline, pointer analysis still attracts several research papers every year in premier programming language venues. While a major goal of contemporary pointer analysis research is to improve its efficiency without sacrificing precision, we also see works that introduce novel ways of solving the problem itself. What does this mean? Research in this area is not going to die soon. I too have been writing pointer analyses of various kinds, especially for object-oriented languages such as Java. While some standard ways of writing such analyses are clear, I have realized that there are umpteen nooks and pitfalls that make the task difficult and error prone. In particular, there are several misconceptions and undocumented practices, being aware of which would save significant research time. On the other hand, there are lessons from my own research that might go a long way in writing correct, precise and efficient pointer analyses, faster. This paper summarizes some such learnings, with a hope to help readers beat the state-of-the-art in (Java) pointer analysis, as they move into their research careers beyond 2020.

1 citation


Cites background or methods from "Mix your contexts well: opportunities unleashed by recent advances in scaling context-sensitivity"

  • ...For example, Thakur and Nandivada [40] estimate the required amount of value contexts [14, 29] (points-to graphs reaching the entry points of methods) by computing the depth of the subgraphs reachable from each parameter of a method, in a pre-analysis; this information is independent of the flow and does not require performing an expensive iterative dataflow analysis....

  • ...Meanwhile, I encountered interesting challenges and insights related to context-sensitive pointer analyses, which led to the development of some novel abstractions [40, 42] for context-sensitivity....

  • ...In particular, for scaling context-sensitivity, Thakur and Nandivada [40, 42] propose several variants of novel analysis-specific context abstractions, and also use them as part of scaling precise analyses for JIT compilers [41]....

  • ...For a more comprehensive discussion on the relative precisions of various context abstractions from Java program-analysis literature, the reader is referred to a recent work by Thakur and Nandivada [42]....

  • ...Kanvar and Khedker [13] present a detailed study of the various choices available while writing such analyses in general, and Thakur and Nandivada [42] evaluate existing and novel choices of context abstractions for Java programs....

References
Proceedings ArticleDOI
14 Jun 2017
TL;DR: MAHJONG is a novel heap abstraction that is specifically developed to address the needs of an important class of type-dependent clients, such as call graph construction, devirtualization and may-fail casting, and is expected to provide significant benefits for many program analyses where call graphs are required.
Abstract: Mainstream points-to analysis techniques for object-oriented languages rely predominantly on the allocation-site abstraction to model heap objects. We present MAHJONG, a novel heap abstraction that is specifically developed to address the needs of an important class of type-dependent clients, such as call graph construction, devirtualization and may-fail casting. By merging equivalent automata representing type-consistent objects that are created by the allocation-site abstraction, MAHJONG enables an allocation-site-based points-to analysis to run significantly faster while achieving nearly the same precision for type-dependent clients. MAHJONG is conceptually simple and efficient, and drops easily into any allocation-site-based points-to analysis. We demonstrate its effectiveness by discussing some insights on why it is a better alternative to the allocation-site abstraction for type-dependent clients, and by evaluating it extensively on 12 large real-world Java programs with five context-sensitive points-to analyses and three widely used type-dependent clients. MAHJONG is expected to provide significant benefits for many program analyses where call graphs are required.
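An illustrative fragment (not drawn from the paper) of what type-consistency buys a client such as devirtualization:

```java
import java.util.ArrayList;
import java.util.List;

public class TypeConsistent {
    static List<String> fresh(boolean big) {
        // Two allocation sites, A1 and A2, both creating objects of type ArrayList:
        return big ? new ArrayList<>(1024)   // A1
                   : new ArrayList<>();      // A2
    }

    public static void main(String[] args) {
        List<String> l = fresh(args.length > 0);
        // A devirtualization client only needs the type of what l points to.
        // Distinguishing A1 from A2 buys it nothing, so merging the two
        // type-consistent objects (as MAHJONG does) resolves this virtual call
        // equally precisely while tracking fewer abstract objects.
        l.add("hello");
    }
}
```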

63 citations


"Mix your contexts well: opportuniti..." refers background in this paper

  • ...Over the last few years, object sensitivity has become a much used context abstraction; further, there are several recent works that focus on scaling existing object-sensitive analyses, usually by compromising on their precision [13, 14, 22, 29]....

  • ...[29] scale context-sensitive analyses for call-graph construction by merging type-consistent objects identified using equivalent automata....

  • ...Over the last few years, limited-length versions of object-sensitivity (abbreviated as kobj) have been shown to offer reasonably well-balanced precision-scalability trade-offs for object-oriented programs [28, 29]; their variants have also been introduced with a great interest, specially in order to make the original approaches scale to large programs [13, 14, 22]....

Book ChapterDOI
08 Sep 2016
TL;DR: This paper introduces Bean, a general approach for improving the precision of any k-object-sensitive analysis (denoted k-obj) while still using a k-limiting context abstraction; Bean is implemented as an open-source tool and applied to refine two state-of-the-art whole-program pointer analyses in Doop.
Abstract: Object-sensitivity is regarded as arguably the best context abstraction for pointer analysis in object-oriented languages. However, a k-object-sensitive pointer analysis, which uses a sequence of k allocation sites (as k context elements) to represent a calling context of a method call, may end up using some context elements redundantly without inducing a finer partition of the space of (concrete) calling contexts for the method call. In this paper, we introduce Bean, a general approach for improving the precision of any k-object-sensitive analysis, denoted k-obj, by still using a k-limiting context abstraction. The novelty is to identify allocation sites that are redundant context elements in k-obj from an Object Allocation Graph (OAG), which is built based on a pre-analysis (e.g., a context-insensitive Andersen's analysis) performed initially on a program, and then avoid them in the subsequent k-object-sensitive analysis for the program. Bean is generally more precise than k-obj, with a precision that is guaranteed to be as good as k-obj in the worst case. We have implemented Bean as an open-source tool and applied it to refine two state-of-the-art whole-program pointer analyses in Doop. For two representative clients (may-alias and may-fail-cast) evaluated on a set of nine large Java programs from the DaCapo benchmark suite, Bean has succeeded in making both analyses more precise for all these benchmarks under each client at only small increases in analysis cost.
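A hypothetical illustration (names invented here) of a redundant context element that such an approach can elide:

```java
// Spending a context slot on allocation site L is wasted: every call of
// Logger.log has the same receiver object, so L induces no finer partition
// of log's concrete calling contexts.
public class RedundantElement {
    static class Logger {
        void log(String msg) { System.out.println(msg); }
    }
    static final Logger LOGGER = new Logger();   // allocation site L: the only
                                                 // Logger the program ever creates
    static class Worker {
        void run(String id) { LOGGER.log(id); }  // receiver is always the L object
    }

    public static void main(String[] args) {
        new Worker().run("w1");   // allocation site W1
        new Worker().run("w2");   // allocation site W2
        // Under k-object-sensitivity, every context of Logger.log starts with
        // L; an OAG-based pre-analysis can detect this and spend the k budget
        // on elements that actually distinguish contexts.
    }
}
```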

58 citations

01 Apr 1990
TL;DR: This paper presents in detail a fixpoint algorithm that has been developed for this purpose and the motivation behind it, along with an informal proof of correctness of the algorithm and some results obtained from an implementation.
Abstract: Bruynooghe described a framework for the top-down abstract interpretation of logic programs. In this framework, abstract interpretation is carried out by constructing an abstract and-or tree in a top-down fashion for a given query and program. Such an abstract interpreter requires fixpoint computation for programs which contain recursive predicates. This paper presents in detail a fixpoint algorithm that has been developed for this purpose and the motivation behind it. We start off by describing a simple-minded algorithm. After pointing out its shortcomings, we present a series of refinements to this algorithm, until we reach the final version. The aim is to give an intuitive grasp and provide justification for the relative complexity of the final algorithm. We also present an informal proof of correctness of the algorithm and some results obtained from an implementation.
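In spirit, the starting point is a memoizing fixpoint loop; the following is a bare-bones sketch under assumed names (the paper's algorithm adds the refinements it motivates):

```java
import java.util.*;

// Memoize one abstract answer per call pattern and re-evaluate until nothing
// changes, as recursive predicates require.
class FixpointSketch {
    Map<String, String> answers = new HashMap<>();   // call pattern -> abstract answer
    boolean changed;

    String solve(String callPattern) {
        String before = answers.getOrDefault(callPattern, "");  // "" plays bottom
        String after = evaluateBody(callPattern);   // may consult answers recursively
        if (!after.equals(before)) {
            answers.put(callPattern, after);        // a real domain would join here
            changed = true;
        }
        return answers.get(callPattern);
    }

    void run(String query) {
        do { changed = false; solve(query); } while (changed);
    }

    // Placeholder abstract semantics; a real analyzer interprets clause bodies
    // over an abstract domain with finite ascending chains.
    String evaluateBody(String callPattern) { return callPattern; }
}
```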

53 citations


"Mix your contexts well: opportuniti..." refers result in this paper

  • ...It would also be interesting to compare our approaches to scale context-sensitivity with approaches used in other domains such as logic programming [6, 17]....

Proceedings ArticleDOI
07 Jun 2004
TL;DR: From an empirical study, it is found that restriction based on escape information is often, but not always, sufficient to curb the explosive nature of specialization.
Abstract: Specialization of heap objects is critical for pointer analysis to effectively analyze complex memory activity. This paper discusses heap specialization with respect to call chains. Due to the sheer number of distinct call chains, exhaustive specialization can be cumbersome. On the other hand, insufficient specialization can miss valuable opportunities to prevent spurious data flow, which results in not only reduced accuracy but also increased overhead. In determining whether further specialization will be fruitful, an object's escape information can be exploited. From an empirical study, we found that restriction based on escape information is often, but not always, sufficient to curb the explosive nature of specialization. For an in-depth case study, four representative benchmarks are selected. For each benchmark, we vary the degree of heap specialization and examine its impact on analysis results and time. To provide better visibility into the impact, we present the points-to set and pointed-to-by set sizes in the form of histograms.
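An illustrative Java example (not from the paper) of the specialization in question:

```java
import java.util.ArrayList;
import java.util.List;

// Without specializing allocation site S per call chain, the two lists below
// collapse into one abstract object and spuriously alias.
public class HeapCloning {
    static List<Object> newList() {
        return new ArrayList<>();   // single allocation site S
    }

    public static void main(String[] args) {
        List<Object> names = newList();   // call chain: main -> newList
        List<Object> sizes = newList();   // a second, distinct call chain
        names.add("ada");
        sizes.add(42);
        // Unspecialized: both variables point to the one abstract object for S,
        // so the analysis reports that sizes may contain "ada". Specializing S
        // per (bounded) call chain yields two abstract objects and separates
        // the lists; the paper studies how far to push this, e.g., restricting
        // specialization using escape information.
    }
}
```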

49 citations


"Mix your contexts well: opportuniti..." refers methods in this paper

  • ...In this paper, we enhance the precision of LSRV contexts using heap cloning [18] (to obtain lsrvH ), mix lsrvH with object-sensitivity to obtain a newer, more precise context abstraction (called lsrvkobjH ), and use a pre-analysis to scale the proposed context abstraction....

  • ...[18] proposed heap cloning as yet another technique to improve the precision of existing context abstractions....

  • ...Heap cloning [18] is a technique to specialize the different instances of the objects allocated on the heap, based on the context in which they are created....

Journal ArticleDOI
TL;DR: This work describes various summarization techniques based on k-limiting, allocation sites, patterns, variables, other generic instrumentation predicates, and higher-order logics, and classifies the heap models as storeless, store-based, and hybrid.
Abstract: Heap data is potentially unbounded and seemingly arbitrary. Hence, unlike stack and static data, heap data cannot be abstracted in terms of a fixed set of program variables. This makes it an interesting topic of study, and there is an abundance of literature employing heap abstractions. Although most studies have addressed similar concerns, insights gained in one description of heap abstraction may not directly carry over to some other description. In our search for a unified theme, we view heap abstraction as consisting of two steps: (a) heap modelling, which is the process of representing a heap memory (i.e., an unbounded set of concrete locations) as a heap model (i.e., an unbounded set of abstract locations), and (b) summarization, which is the process of bounding the heap model by merging multiple abstract locations into summary locations. We classify the heap models as storeless, store-based, and hybrid. We describe various summarization techniques based on k-limiting, allocation sites, patterns, variables, other generic instrumentation predicates, and higher-order logics. This approach allows us to compare the insights of a large number of seemingly dissimilar heap abstractions and also paves the way for creating new abstractions by mixing and matching models and summarization techniques.
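A small illustration (not taken from the survey) of the most common summarization technique, allocation sites:

```java
import java.util.ArrayList;
import java.util.List;

// The loop creates unboundedly many concrete objects at one program point;
// allocation-site summarization folds them into a single summary location.
public class AllocSiteSummary {
    public static void main(String[] args) {
        List<int[]> rows = new ArrayList<>();
        int n = Integer.parseInt(args[0]);
        for (int i = 0; i < n; i++) {
            rows.add(new int[8]);   // allocation site S
        }
        // A store-based model with allocation-site summarization represents
        // every array born at S as one abstract location o_S, bounding the
        // heap model; k-limiting, patterns, or variable-based predicates are
        // alternative ways of drawing this boundary.
    }
}
```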

45 citations


"Mix your contexts well: opportuniti..." refers background in this paper

  • ...Kanvar and Khedker [7] present a survey of existing heap abstractions, including for contextsensitivity, and assert the importance of the abstraction used towards the precision and the scalability of a given analysis....

  • ...Thus, as also noted by prior works [7, 26], the choice of context abstraction plays a very important role in deciding whether a given context-sensitive analysis gives good enough precision for its associated cost....

  • ...[Section 4.1: kcs versus valcs] The value-contexts approach (valcs) simply scales call-string based analyses (kcs), and as shown by Padhye and Khedker [20], each value context for a method can be mapped back to a call-string based context....
