
Proceedings ArticleDOI

Mix your contexts well: opportunities unleashed by recent advances in scaling context-sensitivity

22 Feb 2020, pp. 27-38

TL;DR: This paper presents a detailed comparative study of the existing precise context-sensitive heap analyses, proposes novel context abstractions that lead to a new sweet spot in the arena, and shows that the newer proposals not only enhance the precision of both LSRV contexts and object-sensitive analyses, but also scale well to large programs.
Abstract: Existing precise context-sensitive heap analyses do not scale well for large OO programs. Further, identifying the right context abstraction becomes quite intriguing as two of the most popular categories of context abstractions (call-site- and object-sensitive) lead to theoretically incomparable precision. In this paper, we address this problem by first doing a detailed comparative study (in terms of precision and efficiency) of the existing approaches, both with and without heap cloning. In addition, we propose novel context abstractions that lead to a new sweet-spot in the arena. We first enhance the precision of level-summarized relevant value (LSRV) contexts (a highly scalable abstraction with precision matching that of call-site-sensitivity) using heap cloning. Then, motivated by the resultant scalability, we propose the idea of mixing various context abstractions, and add the advantages of k-object-sensitive analyses to LSRV contexts, in an efficient manner. The resultant context abstraction, which we call lsrvkobjH, also leads to a novel connection between the two broad variants of otherwise incomparable context-sensitive analyses. Our evaluation shows that the newer proposals not only enhance the precision of both LSRV contexts and object-sensitive analyses (to perform control-flow analysis of Java programs), but also scale well to large programs.
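
To make the idea of mixing concrete: a context in the paper's lsrvkobjH abstraction carries both an object-sensitive component and an LSRV (value-based) component. The sketch below is only illustrative and uses hypothetical names (MixedContext, valueSummary); the actual abstraction is defined over level-summarized points-to graphs, not strings.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical sketch of a "mixed" context: a k-limited string of receiver
    // allocation sites (the k-object-sensitive part) paired with a summary of
    // the values reaching the callee (standing in for the LSRV part).
    final class MixedContext {
        private final Deque<Integer> receiverSites = new ArrayDeque<>(); // newest first
        private final String valueSummary; // abstract stand-in for an LSRV summary
        private final int k;

        MixedContext(int k, String valueSummary) {
            this.k = k;
            this.valueSummary = valueSummary;
        }

        // Entering a callee whose receiver was allocated at `allocSite`:
        // prepend the site and keep only the k most recent ones.
        MixedContext push(int allocSite, String calleeValueSummary) {
            MixedContext c = new MixedContext(k, calleeValueSummary);
            c.receiverSites.addAll(this.receiverSites);
            c.receiverSites.addFirst(allocSite);
            while (c.receiverSites.size() > k) c.receiverSites.removeLast();
            return c;
        }

        @Override public String toString() { return receiverSites + " / " + valueSummary; }

        public static void main(String[] args) {
            MixedContext root = new MixedContext(2, "empty");
            System.out.println(root.push(7, "s1").push(12, "s2")); // prints [12, 7] / s2
        }
    }

Two calls are then analyzed under one summary exactly when both components coincide, which is how a mixed abstraction inherits the distinctions made by each ingredient.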
Citations

Journal ArticleDOI
Abstract: Context-sensitive global analysis of large code bases can be expensive, which can make its use impractical during software development. However, there are many situations in which modifications are small and isolated within a few components, and it is desirable to reuse as much of the previous analysis results as possible. This has been achieved to date through incremental global analysis fixpoint algorithms that deliver cost reductions at fine levels of granularity, such as changes in program lines. However, these fine-grained techniques are neither directly applicable to modular programs nor designed to take advantage of modular structures. This paper describes, implements, and evaluates an algorithm that performs efficient context-sensitive analysis incrementally on modular partitions of programs. The experimental results show that the proposed modular algorithm achieves significant improvements, in both time and memory consumption, over existing non-modular, fine-grained incremental analysis techniques. Furthermore, thanks to the proposed intermodular propagation of analysis information, our algorithm also outperforms traditional modular analysis even when analyzing from scratch.
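
A minimal sketch of the module-level incrementality the abstract describes, under assumed names (ModularIncremental, analyze): each module caches a summary, and an edit re-analyzes only the edited module and, when its summary changes, the modules that depend on it. This is a generic worklist scheme, not the paper's algorithm.

    import java.util.*;

    // Hypothetical sketch: incremental re-analysis at module granularity.
    // `summaries` caches per-module results; `dependents` maps a module to the
    // modules whose summaries were computed using it.
    final class ModularIncremental {
        final Map<String, String> summaries = new HashMap<>();
        final Map<String, Set<String>> dependents = new HashMap<>();

        String analyze(String module) {
            // Stand-in for a real context-sensitive analysis of one module.
            return "summary(" + module + ")";
        }

        void edit(String module) {
            // Re-analyze the edited module and, transitively, its dependents,
            // reusing every cached summary that is still valid.
            Deque<String> worklist = new ArrayDeque<>(List.of(module));
            Set<String> done = new HashSet<>();
            while (!worklist.isEmpty()) {
                String m = worklist.pop();
                if (!done.add(m)) continue;
                String fresh = analyze(m);
                // Only propagate if the summary actually changed (fixpoint check).
                if (!fresh.equals(summaries.put(m, fresh)))
                    worklist.addAll(dependents.getOrDefault(m, Set.of()));
            }
        }

        public static void main(String[] args) {
            ModularIncremental inc = new ModularIncremental();
            inc.dependents.put("util", Set.of("app"));
            inc.edit("util"); // re-analyzes util, then app; other modules untouched
            System.out.println(inc.summaries);
        }
    }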

4 citations


Journal ArticleDOI
13 Nov 2020
TL;DR: This paper combines finite state machines and dynamic dispatching to allow fully context-sensitive specialization while cloning only functions that are effectively optimized, which makes it possible to apply very liberal optimizations, such as context-sensitive constant propagation, in large programs, something that could not have been easily done before.
Abstract: Academia has put much effort into making context-sensitive analyses practical, with great profit. However, the implementation of context-sensitive optimizations, in contrast to analyses, is still not practical, due to code-size explosion. This growth happens because current technology requires the cloning of full paths in the Calling Context Tree. In this paper, we present a solution to this problem. We combine finite state machines and dynamic dispatching to allow fully context-sensitive specialization while cloning only functions that are effectively optimized. This technique makes it possible to apply very liberal optimizations, such as context-sensitive constant propagation, in large programs, something that could not have been easily done before. We demonstrate the viability of our idea by formalizing it in Prolog and implementing it in LLVM. As a proof of concept, we have used our state machines to implement context-sensitive constant propagation in LLVM. The binaries produced by traditional full cloning are 2.63 times larger than the binaries that we generate with our state machines. When applied to Mozilla Firefox, our optimization increases binary size from 7.2MB to 9.2MB. Full cloning, in contrast, yields a binary of 34MB.
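
The cloning-versus-dispatch trade-off is easy to picture. In the hypothetical sketch below, a caller passes a small context state instead of the callee being cloned once per calling-context path; only the state that enables an optimization (here, a known constant argument) gets a specialized body. The state constants are invented for illustration; in the paper the states come from a finite state machine over the program.

    // Hypothetical sketch of state-machine-based specialization: rather than
    // cloning `scale` for every path in the calling-context tree, the caller
    // passes a context state, and only states that are effectively optimized
    // get a specialized body (context-sensitive constant propagation of
    // factor == 2).
    final class ContextDispatch {
        static final int STATE_GENERIC = 0, STATE_FACTOR_IS_2 = 1;

        static int scale(int x, int factor, int state) {
            switch (state) {
                case STATE_FACTOR_IS_2:
                    return x << 1;          // specialized clone: factor known to be 2
                default:
                    return x * factor;      // single generic body for all other contexts
            }
        }

        public static void main(String[] args) {
            System.out.println(scale(21, 2, STATE_FACTOR_IS_2)); // 42 via the clone
            System.out.println(scale(21, 3, STATE_GENERIC));     // 63 via generic code
        }
    }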

2 citations


Additional excerpts

  • ...…2014; Ghiya and Hendren 1996; Hind et al. 1999; Jeong et al. 2017; Li et al. 2020; Might et al. 2010; Milanova 2007; Milanova et al. 2014; Oh et al. 2014; Späth et al. 2019, 2016; Thakur and Nandivada 2019, 2020; Thiessen and Lhoták 2017; Wei and Ryder 2015; Wilson and Lam 1995; Yu et al. 2010]....



Journal ArticleDOI
Abstract: This paper surveys recent work on applying analysis and transformation techniques that originate in the field of constraint logic programming (CLP) to the problem of verifying software systems. We present specialisation-based techniques for translating verification problems for different programming languages, and in general software systems, into satisfiability problems for constrained Horn clauses (CHCs), a term that has become popular in the verification field to refer to CLP programs. Then, we describe static analysis techniques for CHCs that may be used for inferring relevant program properties, such as loop invariants. We also give an overview of some transformation techniques based on specialisation and fold/unfold rules, which are useful for improving the effectiveness of CHC satisfiability tools. Finally, we discuss future developments in applying these techniques.
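
As a concrete instance of such an encoding (a standard one, not taken from this survey): the program i := 0; while (i < n) i := i + 1; assert(i = n), run under the precondition n >= 0, becomes the following CHCs over an unknown invariant I, written in LaTeX:

    I(i, n) \leftarrow i = 0 \land n \geq 0
    I(i', n) \leftarrow I(i, n) \land i < n \land i' = i + 1
    \mathit{false} \leftarrow I(i, n) \land i \geq n \land i \neq n

The clauses are satisfiable with I(i, n) defined as 0 <= i <= n, which is exactly the loop invariant a CHC solver needs to infer to verify the assertion.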

2 citations


Proceedings ArticleDOI
Manas Thakur
18 Nov 2020
TL;DR: This paper summarizes some such learnings from its author's research to help readers beat the state-of-the-art in (Java) pointer analysis, as they move into their research careers beyond 2020.
Abstract: Despite being a very old discipline, pointer analysis still attracts several research papers every year in premier programming language venues. While a major goal of contemporary pointer analysis research is to improve its efficiency without sacrificing precision, we also see works that introduce novel ways of solving the problem itself. What does this mean? Research in this area is not going to die soon. I too have been writing pointer analyses of various kinds, especially for object-oriented languages such as Java. While some standard ways of writing such analyses are clear, I have realized that there are umpteen nooks and pitfalls that make the task difficult and error prone. In particular, there are several misconceptions and undocumented practices, being aware of which would save significant research time. On the other hand, there are lessons from my own research that might go a long way in writing correct, precise and efficient pointer analyses, faster. This paper summarizes some such learnings, with a hope to help readers beat the state-of-the-art in (Java) pointer analysis, as they move into their research careers beyond 2020.

Cites background or methods from "Mix your contexts well: opportuniti..."

  • ...For example, Thakur and Nandivada [40] estimate the required amount of value contexts [14, 29] (points-to graphs reaching the entry points of methods) by computing the depth of the subgraphs reachable from each parameter of a method, in a pre-analysis; this information is independent of the flow and does not require performing an expensive iterative dataflow analysis....


  • ...Meanwhile, I encountered interesting challenges and insights related to context-sensitive pointer analyses, which led to the development of some novel abstractions [40, 42] for context-sensitivity....


  • ...In particular, for scaling context-sensitivity, Thakur and Nandivada [40, 42] propose several variants of novel analysis-specific context abstractions, and also use them as part of scaling precise analyses for JIT compilers [41]....


  • ...For a more comprehensive discussion on the relative precisions of various context abstractions from Java program-analysis literature, the reader is referred to a recent work by Thakur and Nandivada [42]....


  • ...Kanvar and Khedker [13] present a detailed study of the various choices available while writing such analyses in general, and Thakur and Nandivada [42] evaluate existing and novel choices of context abstractions for Java programs....

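
The pre-analysis mentioned in the first excerpt above can be pictured with a small sketch: treat the points-to relation as a graph and compute, for each parameter, the depth of the subgraph reachable from it. The code below is hypothetical (the names and the visited-set cycle handling are mine) and only conveys the flavor of the estimate; LSRV contexts define level summarization more carefully.

    import java.util.*;

    // Hypothetical sketch of a flow-insensitive pre-analysis: estimate, for a
    // parameter node, how deep the points-to subgraph reachable from it is.
    final class DepthPreAnalysis {
        static int depth(String node, Map<String, List<String>> pointsTo, Set<String> onPath) {
            if (!onPath.add(node)) return 0; // break cycles (a simplification)
            int max = 0;
            for (String succ : pointsTo.getOrDefault(node, List.of()))
                max = Math.max(max, 1 + depth(succ, pointsTo, onPath));
            onPath.remove(node);
            return max;
        }

        public static void main(String[] args) {
            // p -> o1 -> o2 -> o3 : a parameter pointing into a chain of depth 3.
            Map<String, List<String>> g = Map.of(
                "p", List.of("o1"), "o1", List.of("o2"), "o2", List.of("o3"));
            System.out.println(depth("p", g, new HashSet<>())); // prints 3
        }
    }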


Journal ArticleDOI
Tian Tan, Yue Li, Xiaoxing Ma, Chang Xu, +1 more
15 Oct 2021
Abstract: Traditional context-sensitive pointer analysis is hard to scale for large and complex Java programs. To address this issue, a series of selective context-sensitivity approaches have been proposed and exhibit promising results. In this work, we move one step further towards producing highly-precise pointer analyses for hard-to-analyze Java programs by presenting the Unity-Relay framework, which takes selective context sensitivity to the next level. Briefly, Unity-Relay is a one-two punch: given a set of different selective context-sensitivity approaches, say S = {S1, ..., Sn}, Unity-Relay first provides a mechanism (called Unity) to combine and maximize the precision of all components of S. When Unity fails to scale, Unity-Relay offers a scheme (called Relay) to pass and accumulate the precision from one approach Si in S to the next, Si+1, leading to an analysis that is more precise than all approaches in S. As a proof of concept, we instantiate Unity-Relay into a tool called Baton and extensively evaluate it on a set of hard-to-analyze Java programs, using general precision metrics and popular clients. Compared with the state of the art, Baton achieves the best precision for all metrics and clients for all evaluated programs. The difference in precision is often dramatic: up to 71% of alias pairs reported by previously-best algorithms are found to be spurious and eliminated.
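
A schematic reading of the two modes, with invented names (Selector, run) and strings standing in for analysis facts: Unity merges all the selectors' choices into one analysis run, while Relay runs one analysis per approach and seeds each run with the facts of the previous one. This is only a sketch of the control structure, not of Baton.

    import java.util.*;

    // Hypothetical sketch of Unity-Relay. A Selector decides which methods get
    // context-sensitivity; `run` stands in for one whole selective pointer analysis.
    final class UnityRelay {
        interface Selector { Set<String> sensitiveMethods(Set<String> previousFacts); }

        static Set<String> run(Set<String> sensitiveMethods, Set<String> previousFacts) {
            // Stand-in for a full analysis; returns accumulated "facts".
            Set<String> facts = new HashSet<>(previousFacts);
            for (String m : sensitiveMethods) facts.add("precise:" + m);
            return facts;
        }

        static Set<String> unity(List<Selector> selectors) {
            Set<String> merged = new HashSet<>();
            for (Selector s : selectors) merged.addAll(s.sensitiveMethods(Set.of()));
            return run(merged, Set.of()); // one analysis, all selections at once
        }

        static Set<String> relay(List<Selector> selectors) {
            Set<String> facts = Set.of();
            for (Selector s : selectors)                       // one analysis per approach,
                facts = run(s.sensitiveMethods(facts), facts); // seeded by the previous one
            return facts;
        }

        public static void main(String[] args) {
            Selector s1 = prev -> Set.of("m1");
            Selector s2 = prev -> prev.isEmpty() ? Set.of("m2") : Set.of("m2", "m3");
            System.out.println(unity(List.of(s1, s2))); // single combined run
            System.out.println(relay(List.of(s1, s2))); // precision accumulates stage by stage
        }
    }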

References

Proceedings ArticleDOI
Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khang, +16 more
16 Oct 2006
TL;DR: This paper recommends benchmarking selection and evaluation methodologies, and introduces the DaCapo benchmarks, a set of open source, client-side Java benchmarks that improve over SPEC Java in a variety of ways, including more complex code, richer object behaviors, and more demanding memory system requirements.
Abstract: Since benchmarks drive computer science research and industry product development, which ones we use and how we evaluate them are key questions for the community. Despite complex runtime tradeoffs due to dynamic compilation and garbage collection required for Java programs, many evaluations still use methodologies developed for C, C++, and Fortran. SPEC, the dominant purveyor of benchmarks, compounded this problem by institutionalizing these methodologies for their Java benchmark suite. This paper recommends benchmarking selection and evaluation methodologies, and introduces the DaCapo benchmarks, a set of open source, client-side Java benchmarks. We demonstrate that the complex interactions of (1) architecture, (2) compiler, (3) virtual machine, (4) memory management, and (5) application require more extensive evaluation than C, C++, and Fortran which stress (4) much less, and do not require (3). We use and introduce new value, time-series, and statistical metrics for static and dynamic properties such as code complexity, code size, heap composition, and pointer mutations. No benchmark suite is definitive, but these metrics show that DaCapo improves over SPEC Java in a variety of ways, including more complex code, richer object behaviors, and more demanding memory system requirements. This paper takes a step towards improving methodologies for choosing and evaluating benchmarks to foster innovation in system design and implementation for Java and other managed languages.

1,469 citations


"Mix your contexts well: opportuniti..." refers background in this paper

  • ...12 suite [2], and three benchmarks from Section C (with large applications) of the JGF suite [5]; these benchmarks are listed in Figure 8, along with some static characteristics....



Proceedings ArticleDOI
Raja Vallée-Rai, Phong Co, Etienne Gagnon, Laurie Hendren, +2 more
01 Nov 2010
TL;DR: Soot, a framework for optimizing Java bytecode, is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of bytecode which is simple to manipulate; Jimple, a typed 3-address intermediate representation suitable for optimization; and Grimp, an aggregated version of Jimple suitable for decompilation.
Abstract: This paper presents Soot, a framework for optimizing Java bytecode. The framework is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of bytecode which is simple to manipulate; Jimple, a typed 3-address intermediate representation suitable for optimization; and Grimp, an aggregated version of Jimple suitable for decompilation. We describe the motivation for each representation, and the salient points in translating from one representation to another. In order to demonstrate the usefulness of the framework, we have implemented intraprocedural and whole program optimizations. To show that whole program bytecode optimization can give performance improvements, we provide experimental results for 12 large benchmarks, including 8 SPECjvm98 benchmarks running on JDK 1.2 for GNU/Linux. These results show up to 8% improvement when the optimized bytecode is run using the interpreter and up to 21% when run using the JIT compiler.
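
To make the representations concrete, here is a tiny Java method together with an approximate rendering of its Jimple form (typed, three-address). The Jimple text below is written from memory of the documented syntax and may differ in detail from what Soot actually emits.

    // Java source:
    final class Example {
        int add(int x, int y) {
            return x + y + 1;
        }
    }

    // Approximate Jimple for Example.add:
    //
    //     int add(int, int)
    //     {
    //         Example this;
    //         int x, y, $t0;
    //         this := @this: Example;
    //         x := @parameter0: int;
    //         y := @parameter1: int;
    //         $t0 = x + y;
    //         $t0 = $t0 + 1;
    //         return $t0;
    //     }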

1,030 citations


"Mix your contexts well: opportuniti..." refers methods in this paper

  • ...We implement the proposed approaches to perform Java control-flow analysis, in the Soot framework [32]....


  • ...The benchmarks excluded from the DaCapo suite are the ones which either could not be translated by TamiFlex, or could not be analyzed by Soot (using OpenJDK8) after the TamiFlex pass....


  • ...We have implemented the various context abstractions to perform control-flow analysis [21, 24] of Java programs, in Soot [32] version 2.5.0....


  • ...We used the extremely helpful tool TamiFlex [3] to resolve reflective calls in the original DaCapo benchmarks, so that they could be analyzed by Soot....



Book
14 Sep 2011

634 citations


"Mix your contexts well: opportuniti..." refers methods in this paper

  • ...The call-string based approach [23, 24] identifies contexts based on the call-string formed by a method’s callers....


  • ...The classical call-strings approach [23, 24], which uses the string formed by the callers of a method as the context, statically models the run-time stack, and hence is arguably the most intuitive context abstraction....

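
The call-strings idea in these excerpts is easiest to see on a two-call-site example (the program below is illustrative, not from the cited book): with the most recent call site as the context, the two calls of id get separate summaries.

    // A 1-call-site-sensitive analysis uses the call site (c1 or c2) as the
    // context of id, so it concludes pts(a) = {new A()} and pts(b) = {new B()}.
    // A context-insensitive analysis merges both calls into one summary and
    // reports pts(a) = pts(b) = {new A(), new B()}.
    final class CallStrings {
        static class A {}
        static class B {}

        static Object id(Object o) { return o; }

        public static void main(String[] args) {
            Object a = id(new A()); // call site c1: context <c1>
            Object b = id(new B()); // call site c2: context <c2>
            System.out.println(a.getClass().getSimpleName() + " "
                               + b.getClass().getSimpleName());
        }
    }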


Book ChapterDOI
Ondřej Lhoták, Laurie Hendren
07 Apr 2003
TL;DR: SPARK is introduced, a flexible framework for experimenting with points-to analyses for Java that supports equality- and subset-based analyses, variations in field sensitivity, respect for declared types, variationsIn call graph construction, off-line simplification, and several solving algorithms.
Abstract: Most points-to analysis research has been done on different systems by different groups, making it difficult to compare results, and to understand interactions between individual factors each group studied. Furthermore, points-to analysis for Java has been studied much less thoroughly than for C, and the tradeoffs appear very different. We introduce SPARK, a flexible framework for experimenting with points-to analyses for Java. SPARK supports equality- and subset-based analyses, variations in field sensitivity, respect for declared types, variations in call graph construction, off-line simplification, and several solving algorithms. SPARK is composed of building blocks on which new analyses can be based. We demonstrate SPARK in a substantial study of factors affecting precision and efficiency of subset-based points-to analyses, including interactions between these factors. Our results show that SPARK is not only flexible and modular, but also offers superior time/space performance when compared to other points-to analysis implementations.
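
SPARK's subset-based analyses follow Andersen's scheme: each variable gets a points-to set, assignments induce subset constraints, and a worklist propagates sets to a fixpoint. The solver below is a generic sketch of that scheme (field-insensitive, with invented names), not SPARK's implementation.

    import java.util.*;

    // Generic sketch of a subset-based (Andersen-style) points-to solver:
    // "p = new A()" seeds pts(p); "q = p" adds the constraint pts(p) ⊆ pts(q).
    final class SubsetSolver {
        final Map<String, Set<String>> pts = new HashMap<>();
        final Map<String, Set<String>> subsetOf = new HashMap<>(); // p -> {q : pts(p) ⊆ pts(q)}

        void newObject(String p, String allocSite) {
            pts.computeIfAbsent(p, k -> new HashSet<>()).add(allocSite);
        }

        void assign(String q, String p) { // q = p
            subsetOf.computeIfAbsent(p, k -> new HashSet<>()).add(q);
        }

        void solve() {
            Deque<String> worklist = new ArrayDeque<>(pts.keySet());
            while (!worklist.isEmpty()) {
                String p = worklist.pop();
                for (String q : subsetOf.getOrDefault(p, Set.of()))
                    if (pts.computeIfAbsent(q, k -> new HashSet<>())
                            .addAll(pts.getOrDefault(p, Set.of())))
                        worklist.push(q); // pts(q) grew: revisit its subset edges
            }
        }

        public static void main(String[] args) {
            SubsetSolver s = new SubsetSolver();
            s.newObject("p", "o1"); // p = new A()
            s.assign("q", "p");     // q = p
            s.assign("r", "q");     // r = q
            s.solve();
            System.out.println(s.pts); // o1 flows to p, q and r
        }
    }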

409 citations


"Mix your contexts well: opportuniti..." refers methods in this paper

  • ...JDK classes computed using Spark’s [11] call graph....


  • ...Figure 8 also shows the number of JDK classes referred by each benchmark (gives the total number of analyzed classes), computed using the call-graph generated by our default call-graph tool Spark [11]....



Journal ArticleDOI
TL;DR: This work presents object sensitivity, a new form of context sensitivity for flow-insensitive points-to analysis for Java, and proposes a parameterization framework that allows analysis designers to control the tradeoffs between cost and precision in the object-sensitive analysis.
Abstract: The goal of points-to analysis for Java is to determine the set of objects pointed to by a reference variable or a reference object field. We present object sensitivity, a new form of context sensitivity for flow-insensitive points-to analysis for Java. The key idea of our approach is to analyze a method separately for each of the object names that represent run-time objects on which this method may be invoked. To ensure flexibility and practicality, we propose a parameterization framework that allows analysis designers to control the tradeoffs between cost and precision in the object-sensitive analysis.Side-effect analysis determines the memory locations that may be modified by the execution of a program statement. Def-use analysis identifies pairs of statements that set the value of a memory location and subsequently use that value. The information computed by such analyses has a wide variety of uses in compilers and software tools. This work proposes new versions of these analyses that are based on object-sensitive points-to analysis.We have implemented two instantiations of our parameterized object-sensitive points-to analysis. On a set of 23 Java programs, our experiments show that these analyses have comparable cost to a context-insensitive points-to analysis for Java which is based on Andersen's analysis for C. Our results also show that object sensitivity significantly improves the precision of side-effect analysis and call graph construction, compared to (1) context-insensitive analysis, and (2) context-sensitive points-to analysis that models context using the invoking call site. These experiments demonstrate that object-sensitive analyses can achieve substantial precision improvement, while at the same time remaining efficient and practical.
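
The classic container example shows what receiver-based contexts buy. In the illustrative program below, a 1-object-sensitive analysis analyzes set and get separately per receiver allocation site, so the two boxes' contents stay apart even though both calls target the same methods; a context-insensitive analysis would conflate them.

    // box1 and box2 are allocated at different sites (o1, o2), so an
    // object-sensitive analysis keeps their `value` fields separate:
    // pts(box1.get()) = {the String} only, never the int[].
    // Call-site sensitivity can also distinguish the direct calls here, but
    // loses precision when set/get are reached through shared helper methods.
    final class Box {
        private Object value;
        void set(Object v) { this.value = v; }
        Object get() { return value; }

        public static void main(String[] args) {
            Box box1 = new Box(); // allocation site o1: context for its methods
            Box box2 = new Box(); // allocation site o2
            box1.set("a string");
            box2.set(new int[0]);
            System.out.println(box1.get());
        }
    }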

408 citations


Performance Metrics
Citations received by the paper in previous years:

Year    Citations
2021    3
2020    2