
Showing papers in "ACM Transactions on Programming Languages and Systems in 1998"


Journal ArticleDOI
TL;DR: An algorithm is given that uses finite shape graphs to approximate conservatively the possible “shapes” that heap-allocated structures in a program can take on, and is sometimes able to show that when the input to the program is a circular list, the output is also a circular list.
Abstract: This article concerns the static analysis of programs that perform destructive updating on heap-allocated storage. We give an algorithm that uses finite shape graphs to approximate conservatively the possible “shapes” that heap-allocated structures in a program can take on. For certain programs, our technique is able to determine such properties as (1) when the input to the program is a list, the output is also a list and (2) when the input to the program is a tree, the output is also a tree. For example, the method can determine that “listness” is preserved by (1) a program that performs list reversal via destructive updating of the input list and (2) a program that searches a list and splices a new element into the list. None of the previously known methods that use graphs to model the program's store are capable of determining that “listness” is preserved on these examples (or examples of similar complexity). In contrast with most previous work, our shape analysis algorithm is even accurate for certain programs that update cyclic data structures; that is, it is sometimes able to show that when the input to the program is a circular list, the output is also a circular list. For example, the shape-analysis algorithm can determine that an insertion into a circular list preserves “circular listness.”
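To make the analyzed class of programs concrete, here is a minimal Python sketch (our illustration, not the article's formalism) of the first example it mentions: list reversal via destructive updating. The analysis can establish that if head is an acyclic list, so is the result.

    # Illustration only: the style of destructive heap update analyzed.
    class Node:
        def __init__(self, val, next=None):
            self.val = val
            self.next = next

    def reverse_in_place(head):
        prev = None
        while head is not None:
            nxt = head.next    # save the successor
            head.next = prev   # destructive update of a heap cell
            prev = head
            head = nxt
        return prev            # still an acyclic list: "listness" preserved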

306 citations


Journal ArticleDOI
TL;DR: The basic idea of preserving the serial semantics simplifies the program development process, and the concept of using data access specifications to guide the parallelization offers significant advantages over more traditional control-based approaches.
Abstract: Jade is a portable, implicitly parallel language designed for exploiting task-level concurrency. Jade programmers start with a program written in a standard serial, imperative language, then use Jade constructs to declare how parts of the program access data. The Jade implementation uses this data access information to automatically extract the concurrency and map the application onto the machine at hand. The resulting parallel execution preserves the semantics of the original serial program. We have implemented Jade as an extension to C, and Jade implementations exist for shared-memory multiprocessors, homogeneous message-passing machines, and heterogeneous networks of workstations. In this article we discuss the design goals and decisions that determined the final form of Jade and present an overview of the Jade implementation. We also present our experience using Jade to implement several complete scientific and engineering applications. We use this experience to evaluate how the different Jade language features were used in practice and how well Jade as a whole supports the process of developing parallel applications. We find that the basic idea of preserving the serial semantics simplifies the program development process, and that the concept of using data access specifications to guide the parallelization offers significant advantages over more traditional control-based approaches. We also find that the Jade data model can interact poorly with concurrency patterns that write disjoint pieces of a single aggregate data structure, although this problem arises in only one of the applications.
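The declaration-driven model can be suggested by a small Python mock-up (hypothetical structure, not Jade's actual C-based constructs): each task declares its read and write sets, and the runtime lets nonconflicting tasks run concurrently while serializing conflicting ones in the original order.

    # Sketch: extract concurrency from declared data access (names invented).
    def conflicts(t1, t2):
        # two tasks conflict if one writes what the other reads or writes
        return bool(t1["writes"] & (t2["reads"] | t2["writes"]) or
                    t2["writes"] & (t1["reads"] | t1["writes"]))

    tasks = [
        {"name": "init_a", "reads": set(),      "writes": {"a"}},
        {"name": "init_b", "reads": set(),      "writes": {"b"}},
        {"name": "use_ab", "reads": {"a", "b"}, "writes": {"c"}},
    ]

    # a task may run as soon as all earlier conflicting tasks are done;
    # init_a and init_b are independent, use_ab must wait for both
    for i, t in enumerate(tasks):
        deps = [u["name"] for u in tasks[:i] if conflicts(t, u)]
        print(t["name"], "depends on", deps)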

213 citations


Journal ArticleDOI
TL;DR: The proposed framework for automatic data layout selection builds and examines search spaces of candidate data layouts and capitalizes on state-of-the-art 0-1 integer programming technology to compute optimal solutions of these NP-complete problems.
Abstract: The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a simple yet efficient machine-independent parallel programming model. After the algorithm selection, the data layout choice is the key intellectual challenge in writing an efficient program in such languages. The performance of a data layout depends on the target compilation system, the target machine, the problem size, and the number of available processors. This makes the choice of a good layout extremely difficult for most users of such languages. If languages such as HPF are to find general acceptance, the need for data layout selection support has to be addressed. We believe that the appropriate way to provide the needed support is through a tool that generates data layout specifications automatically. This article discusses the design and implementation of a data layout selection tool that generates HPF-style data layout specifications automatically. Because layout is done in a tool that is not embedded in the target compiler and hence will be run only a few times during the tuning phase of an application, it can use techniques such as integer programming that may be considered too computationally expensive for inclusion in production compilers. The proposed framework for automatic data layout selection builds and examines search spaces of candidate data layouts. A candidate layout is an efficient layout for some part of the program. After the generation of search spaces, a single candidate layout is selected for each program part, resulting in a data layout for the entire program. A good overall data layout may require the remapping of arrays between program parts. A performance estimator based on a compiler model, an execution model, and a machine model is needed to predict the execution time of each candidate layout and the costs of possible remappings between candidate data layouts. In the proposed framework, instances of NP-complete problems are solved during the construction of candidate layout search spaces and the final selection of candidate layouts from each search space. Rather than resorting to heuristics, the framework capitalizes on state-of-the-art 0-1 integer programming technology to compute optimal solutions of these NP-complete problems. A prototype data layout assistant tool based on our framework has been implemented as part of the D system currently under development at Rice University. The article reports preliminary experimental results. The results indicate that the framework is efficient and allows the generation of data layouts of high quality.
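The selection step can be pictured with a toy enumeration in Python (phase names, layouts, and costs are all invented for illustration; the tool formulates this as a 0-1 integer program and solves it optimally rather than enumerating):

    from itertools import product

    candidates = {"phase1": ["block", "cyclic"], "phase2": ["block", "cyclic"]}
    exec_cost = {("phase1", "block"): 10, ("phase1", "cyclic"): 14,
                 ("phase2", "block"): 20, ("phase2", "cyclic"): 9}
    def remap_cost(a, b):        # cost of redistributing arrays between phases
        return 0 if a == b else 6

    best = min(product(candidates["phase1"], candidates["phase2"]),
               key=lambda s: exec_cost[("phase1", s[0])] +
                             exec_cost[("phase2", s[1])] +
                             remap_cost(s[0], s[1]))
    print("chosen layouts:", best)   # ('cyclic', 'cyclic'): avoids the remap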

150 citations


Journal ArticleDOI
TL;DR: This article describes the architecture of the SLG-WAM for a powerful class of programs, the class of fixed-order dynamically stratified programs, and offers a detailed description of the algorithms, data structures, and instructions that the SLG-WAM adds to the WAM, and a performance analysis of engine overhead due to the extensions.
Abstract: SLG resolution uses tabling to evaluate nonfloundering normal logic programs according to the well-founded semantics. The SLG-WAM, which forms the engine of the XSB system, can compute in-memory recursive queries an order of magnitude faster than current deductive databases. At the same time, the SLG-WAM tightly integrates Prolog code with tabled SLG code, and executes Prolog code with minimal overhead compared to the WAM. As a result, the SLG-WAM brings to logic programming important termination and complexity properties of deductive databases. This article describes the architecture of the SLG-WAM for a powerful class of programs, the class of fixed-order dynamically stratified programs. We offer a detailed description of the algorithms, data structures, and instructions that the SLG-WAM adds to the WAM, and a performance analysis of engine overhead due to the extensions.
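The tabling idea behind SLG resolution can be suggested by a small fixpoint computation in Python (a sketch of the concept, not the SLG-WAM machinery): answers to subgoals are recorded in a table, so a recursive query such as reachability terminates even on cyclic data, where plain SLD resolution would loop.

    edges = {("a", "b"), ("b", "c"), ("c", "a")}   # a cyclic edge relation

    def reachable(edges):
        table = set(edges)             # the answer table
        changed = True
        while changed:                 # iterate to a least fixpoint
            changed = False
            for (x, y) in list(table):
                for (y2, z) in edges:
                    if y == y2 and (x, z) not in table:
                        table.add((x, z))
                        changed = True
        return table

    print(sorted(reachable(edges)))    # all pairs: the graph is a cycle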

142 citations


Journal ArticleDOI
TL;DR: A novel static type system for a process calculus is proposed, which ensures both partial deadlock-freedom and partial confluence, based on classification of communication channels into reliable and unreliable channels according to their usage and a guarantee of that usage by the type system.
Abstract: We propose a novel static type system for a process calculus, which ensures both partial deadlock-freedom and partial confluence. The key novel ideas are (1) introduction of the order of channel use as type information, and (2) classification of communication channels into reliable and unreliable channels based on their usage and a guarantee of the usage by the type system. We can ensure that communication on reliable channels never causes deadlock and also that certain reliable channels never introduce nondeterminism. With the type system, for example, the simply typed λ-calculus can be encoded into the deadlock-free and confluent fragment of our process calculus; we can therefore recover behavior of the typed λ-calculus at the level of process calculi. We also show that typical concurrent objects can also be encoded into the deadlock-free fragment.

135 citations


Journal ArticleDOI
TL;DR: It is proved that the algorithm is sound with respect to the region inference rules and that it always terminates even though the region inference rules permit polymorphic recursion in regions.
Abstract: Region Inference is a program analysis which infers lifetimes of values. It is targeted at a runtime model in which the store consists of a stack of regions and memory management predominantly consists of pushing and popping regions, rather than performing garbage collection. Region Inference has previously been specified by a set of inference rules which formalize when regions may be allocated and deallocated. This article presents an algorithm which implements the specification. We prove that the algorithm is sound with respect to the region inference rules and that it always terminates even though the region inference rules permit polymorphic recursion in regions. The algorithm is the result of several years of experiments with region inference algorithms in the ML Kit, a compiler from Standard ML to assembly language. We report on practical experience with the algorithm and give hints on how to implement it.
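The runtime model the inference targets can be sketched in a few lines of Python (our mock-up, not the ML Kit's runtime): allocation goes into an explicitly chosen region, and popping a region frees all of its objects at once, with no garbage collector.

    class RegionStack:
        def __init__(self):
            self.stack = []                  # each region is a list of objects

        def push_region(self):
            self.stack.append([])
            return len(self.stack) - 1       # handle of the new region

        def alloc(self, region, obj):
            self.stack[region].append(obj)   # place obj in the given region
            return obj

        def pop_region(self):
            self.stack.pop()                 # frees everything in the region

    rs = RegionStack()
    r = rs.push_region()
    pair = rs.alloc(r, ("x", "y"))   # inference decides the pair lives in r
    rs.pop_region()                  # constant-time deallocation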

133 citations


Journal ArticleDOI
Oukseh Lee, Kwangkeun Yi
TL;DR: This article formally defines the context-sensitive, top-down type inference algorithm “M”, proves its soundness and completeness, and shows a distinguishing property that it always stops earlier than 𝒲 if the input program is ill typed.
Abstract: The Hindley/Milner let-polymorphic type inference system has two different algorithms: one is the de facto standard Algorithm 𝒲 that is bottom-up (or context-insensitive), and the other is a “folklore” algorithm that is top-down (or context-sensitive). Because the latter algorithm has not been formally presented with its soundness and completeness proofs, and its relation with the 𝒲 algorithm has not been rigorously investigated, its use in place of (or in combination with) 𝒲 is not well founded. In this article, we formally define the context-sensitive, top-down type inference algorithm (named “M”), prove its soundness and completeness, and show a distinguishing property that M always stops earlier than 𝒲 if the input program is ill typed. Our proofs can be seen as theoretical justifications for various type-checking strategies being used in practice.
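The top-down idea can be sketched for a tiny lambda calculus in Python (a toy reconstruction without let-polymorphism or an occurs check, not the article's full algorithm): M threads an expected type down into each subterm, so unification failures on ill-typed programs surface earlier than in the bottom-up 𝒲.

    import itertools
    fresh = (f"t{i}" for i in itertools.count())   # type variables t0, t1, ...
    INT = ("Int",)

    def resolve(t, s):
        while isinstance(t, str) and t in s:
            t = s[t]
        return t

    def unify(a, b, s):
        a, b = resolve(a, s), resolve(b, s)
        if a == b: return s
        if isinstance(a, str): return {**s, a: b}   # a is a type variable
        if isinstance(b, str): return {**s, b: a}
        if a[0] == b[0] == "fun":
            return unify(a[2], b[2], unify(a[1], b[1], s))
        raise TypeError(f"cannot unify {a} and {b}")

    def M(env, e, expected, s):
        # terms: ("int", n) | ("var", x) | ("lam", x, body) | ("app", f, a)
        if e[0] == "int": return unify(expected, INT, s)
        if e[0] == "var": return unify(expected, env[e[1]], s)
        if e[0] == "lam":
            a, r = next(fresh), next(fresh)
            s = unify(expected, ("fun", a, r), s)   # check the context first
            return M({**env, e[1]: a}, e[2], r, s)
        if e[0] == "app":
            a = next(fresh)
            s = M(env, e[1], ("fun", a, expected), s)
            return M(env, e[2], a, s)

    M({}, ("app", ("lam", "x", ("var", "x")), ("int", 3)), INT, {})
    print("well typed")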

121 citations


Journal ArticleDOI
TL;DR: Techniques are described for phrasing questions about the flow of values in arrays, to check the legality of array privatization, and about the conditions under which a dependence exists, in terms of systems of constraints.
Abstract: Traditional array dependence analysis, which detects potential memory aliasing of array references, is a key analysis technique for automatic parallelization. Recent studies of benchmark codes indicate that limitations of analysis cause many compilers to overlook large amounts of potential parallelism, and that exploiting this parallelism requires algorithms to answer new questions about array references, not just get better answers to the old questions of aliasing. We need to ask about the flow of values in arrays, to check the legality of array privatization, and about the conditions under which a dependence exists, to obtain information about conditional parallelism. In some cases, we must answer these questions about code containing nonlinear terms in loop bounds or subscripts. This article describes techniques for phrasing these questions in terms of systems of constraints. Conditional dependence analysis can be performed with a constraint operation we call the "gist" operation. When subscripts and loop bounds are affine, questions about the flow of values in array variables can be phrased in terms of Presburger Arithmetic. When the constraints describing a dependence are not affine, we introduce uninterpreted function symbols to represent the nonaffine terms. Our constraint language also provides a rich language for communication with the dependence analyzer, by either the programmer or other phases of the compiler. This article also documents our investigations of the practicality of our approach. The worst-case complexity of Presburger Arithmetic indicates that it might be unsuitable for any practical application. However, we have found that analysis of benchmark programs does not cause the exponential growth in the number of constraints that could occur in the worst case. We have studied the constraints produced during our analysis, and identified characteristics that keep our algorithms free of exponential behavior in practice.
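As a toy instance of phrasing a dependence question as integer constraints (the loop and subscripts are invented; the article's analyzer would hand such systems to a Presburger-style solver rather than enumerate): does any iteration writing A[2*i] conflict with any iteration reading A[2*j + 1]?

    N = 100   # hypothetical loop bound
    conflict = any(2 * i == 2 * j + 1          # the dependence equation
                   for i in range(N) for j in range(N))
    print("dependence exists:", conflict)      # False: even vs. odd subscripts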

120 citations


Journal ArticleDOI
TL;DR: This article elaborates global control for partial deduction, using the concept of a characteristic tree, encapsulating specialization behavior rather than syntactic structure, to guide generalization and polyvariance, and shows how this can be done in a correct and elegant way.
Abstract: Given a program and some input data, partial deduction computes a specialized program handling any remaining input more efficiently. However, controlling the process well is a rather difficult problem. In this article, we elaborate global control for partial deduction: for which atoms, among possibly infinitely many, should specialized relations be produced, while guaranteeing correctness as well as termination? Our work is based on two ingredients. First, we use the concept of a characteristic tree, encapsulating specialization behavior rather than syntactic structure, to guide generalization and polyvariance, and we show how this can be done in a correct and elegant way. Second, we structure combinations of atoms and associated characteristic trees in global trees registering “causal” relationships among such pairs. This allows us to spot looming nontermination and perform proper generalization in order to avert the danger, without having to impose a depth bound on characteristic trees. The practical relevance and benefits of the work are illustrated through extensive experiments. Finally, a similar approach may improve upon current (on-line) control strategies for program transformation in general such as (positive) supercompilation of functional programs. It also seems valuable in the context of abstract interpretation to handle infinite domains of infinite height with more precision.

111 citations


Journal ArticleDOI
TL;DR: This article presents a partial evaluation scheme for functional logic languages based on an automatic unfolding algorithm which builds narrowing trees and is believed to be the first generic algorithm for the specialization of functional logic programs.
Abstract: Languages that integrate functional and logic programming with a complete operational semantics are based on narrowing, a unification-based goal-solving mechanism which subsumes the reduction principle of functional languages and the resolution principle of logic languages. In this article, we present a partial evaluation scheme for functional logic languages based on an automatic unfolding algorithm which builds narrowing trees. The method is formalized within the theoretical framework established by Lloyd and Shepherdson for the partial deduction of logic programs, which we have generalized for dealing with functional computations. A generic specialization algorithm is proposed which does not depend on the eager or lazy nature of the narrower being used. To the best of our knowledge, this is the first generic algorithm for the specialization of functional logic programs. We also discuss the relation to work on partial evaluation in functional programming, term-rewriting systems, and logic programming. Finally, we present some experimental results with an implementation of the algorithm which show in practice that the narrowing-driven partial evaluator effectively combines the propagation of partial data structures (by means of logical variables and unification) with better opportunities for optimization (thanks to the functional dimension).

109 citations


Journal ArticleDOI
TL;DR: The cache-and-prune method presented in the article consists of three stages: the original program is extended to cache the results of all its intermediate subcomputations as well as the final result, the extended program is incrementalized so that computation on a new input can use all intermediate results on an old input.
Abstract: A systematic approach is given for deriving incremental programs that exploit caching. The cache-and-prune method presented in the article consists of three stages: (I) the original program is extended to cache the results of all its intermediate subcomputations as well as the final result, (II) the extended program is incrementalized so that computation on a new input can use all intermediate results on an old input, and (III) unused results cached by the extended program and maintained by the incremental program are pruned away, leaving a pruned extended program that caches only useful intermediate results and a pruned incremental program that uses and maintains only useful results. All three stages utilize static analyses and semantics-preserving transformations. Stages I and III are simple, clean, and fully automatable. The overall method has a kind of optimality with respect to the techniques used in Stage II. The method can be applied straightforwardly to provide a systematic approach to program improvement via caching.
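A miniature Python example of the three stages (ours, far simpler than the article's derivations): an average computation is extended to cache its intermediate sum and count, and the incremental version updates those caches in constant time when an element is appended.

    def average(xs):                     # original program
        return sum(xs) / len(xs)

    def average_ext(xs):                 # stage I: also cache intermediates
        s, n = sum(xs), len(xs)
        return s / n, (s, n)

    def average_inc(x, cache):           # stage II: use the old results
        s, n = cache
        s, n = s + x, n + 1
        return s / n, (s, n)             # stage III would prune unused caches

    avg, cache = average_ext([1, 2, 3])
    avg2, cache = average_inc(4, cache)  # no rescan of the old input
    print(avg, avg2)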

Journal ArticleDOI
TL;DR: This article shows how to synthesize concurrent systems consisting of many similar sequential processes, providing a simple template that can be instantiated for each process to yield a solution for any fixed K.
Abstract: Methods for synthesizing concurrent programs from temporal logic specifications based on the use of a decision procedure for testing temporal satisfiability have been proposed by Emerson and Clarke and by Manna and Wolper. An important advantage of these synthesis methods is that they obviate the need to manually compose a program and manually construct a proof of its correctness. One only has to formulate a precise problem specification; the synthesis method then mechanically constructs a correct solution. A serious drawback of these methods in practice, however, is that they suffer from the state explosion problem. To synthesize a concurrent system consisting of K sequential processes, each having N states in its local transition diagram, requires construction of the global product-machine having about N^K global states in general. This exponential growth in K makes it infeasible to synthesize systems composed of more than 2 or 3 processes. In this article, we show how to synthesize concurrent systems consisting of many (i.e., a finite but arbitrarily large number K of) similar sequential processes. Our approach avoids construction of the global product-machine for K processes; instead, it constructs a two-process product-machine for a single pair of generic sequential processes. The method is uniform in K, providing a simple template that can be instantiated for each process to yield a solution for any fixed K. The method is also illustrated on synchronization problems from the literature.

Journal ArticleDOI
TL;DR: A new linear-time algorithm to find the immediate dominators of all vertices in a flowgraph is presented, which combines the use of microtrees and memoization with new observations on a restricted class of path compressions.
Abstract: We present a new linear-time algorithm to find the immediate dominators of all vertices in a flowgraph. Our algorithm is simpler than previous linear-time algorithms: rather than employ complicated data structures, we combine the use of microtrees and memoization with new observations on a restricted class of path compressions. We have implemented our algorithm, and we report experimental results that show that the constant factors are low. Compared to the standard, slightly superlinear algorithm of Lengauer and Tarjan, which has much less overhead, our algorithm runs 10-20% slower on real flowgraphs of reasonable size and only a few percent slower on very large flowgraphs.
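For contrast, here is the textbook iterative fixpoint that such algorithms improve upon (this Python sketch is the baseline notion of dominators, not the article's linear-time algorithm):

    def dominators(succ, entry):
        nodes = set(succ)
        pred = {n: [m for m in succ if n in succ[m]] for n in nodes}
        dom = {n: set(nodes) for n in nodes}
        dom[entry] = {entry}
        changed = True
        while changed:
            changed = False
            for n in nodes - {entry}:
                new = {n} | set.intersection(*(dom[p] for p in pred[n]))
                if new != dom[n]:
                    dom[n], changed = new, True
        return dom

    flowgraph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
    print(dominators(flowgraph, "s"))   # idom of a, b, and c is s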

Journal ArticleDOI
TL;DR: A general-purpose program analysis that computes global control-flow and data-flow information for higher-order, call-by-value languages using a novel form of polyvariance called polymorphic splitting that uses let-expressions as syntactic clues to gain precision.
Abstract: This article describes a general-purpose program analysis that computes global control-flow and data-flow information for higher-order, call-by-value languages. The analysis employs a novel form of polyvariance called polymorphic splitting that uses let-expressions as syntactic clues to gain precision. The information derived from the analysis is used both to eliminate run-time checks and to inline procedures. The analysis and optimizations have been applied to a suite of Scheme programs. Experimental results obtained from the prototype implementation indicate that the analysis is extremely precise and has reasonable cost. Compared to monovariant flow analyses such as 0CFA, or analyses based on type inference such as soft typing, the analysis eliminates significantly more run-time checks. Run-time check elimination and inlining together typically yield a 20 to 40% performance improvement for the benchmark suite, with some programs running four times as fast.

Journal ArticleDOI
TL;DR: This article presents a programming language and system that integrates task and data parallelism using shared objects, and shows how to write task- and data-parallel programs with this shared object model.
Abstract: Many programming languages support either task parallelism or data parallelism, but few languages provide a uniform framework for writing applications that need both types of parallelism. We present a programming language and system that integrates task and data parallelism using shared objects. Shared objects may be stored on one processor or may be replicated. Objects may also be partitioned and distributed on several processors. Task parallelism is achieved by forking processes remotely and having them communicate and synchronize through objects. Data parallelism is achieved by executing operations on partitioned objects in parallel. Writing task- and data-parallel applications with shared objects has several advantages. Programmers use the objects as if they were stored in a memory common to all processors. On distributed-memory machines, if objects are remote, replicated, or partitioned, the system takes care of many low-level details such as data transfers and consistency semantics. In this article, we show how to write task- and data-parallel programs with our shared object model. We also describe a portable implementation of the model. To assess the performance of the system, we wrote several applications that use task and data parallelism and executed them on a collection of Pentium Pros connected by Myrinet. The performance of these applications is also discussed in this article.

Journal ArticleDOI
TL;DR: An implemented small programming language, called Alma-0, that augments the expressive power of imperative programming by a limited number of features inspired by the logic programming paradigm, to encourage declarative programming and make it a more attractive vehicle for problems that involve search.
Abstract: We describe here an implemented small programming language, called Alma-0, that augments the expressive power of imperative programming by a limited number of features inspired by the logic programming paradigm. These additions encourage declarative programming and make it a more attractive vehicle for problems that involve search. We illustrate the use of Alma-0 by presenting solutions to a number of classical problems, including α-β search, STRIPS planning, knapsack, and Eight Queens. These solutions are substantially simpler than their counterparts written in the imperative or in the logic programming style and can be used for different purposes without any modification. We also discuss here the implementation of Alma-0 and an operational, executable, semantics of a large subset of the language.

Journal ArticleDOI
TL;DR: It is proved that Heyting completion provides a model for Cousot's reduced cardinal power of abstract domains and that it supplies a logical basis to specify relational domains for program analysis and abstract interpretation.
Abstract: In this article we introduce the notion of Heyting completion in abstract interpretation. We prove that Heyting completion provides a model for Cousot's reduced cardinal power of abstract domains and that it supplies a logical basis to specify relational domains for program analysis and abstract interpretation. We study the algebraic properties of Heyting completion in relation with other well-known domain transformers, like reduced product and disjunctive completion. This provides a uniform algebraic setting where complex abstract domains can be specified by simple logic formulas, or as solutions of recursive abstract domain equations, involving few basic operations for domain construction, all characterized by a clean logical interpretation. We apply our framework to characterize directionality and condensing in downward closed analysis of (constraint) logic programs.

Journal ArticleDOI
TL;DR: The analysis described in this article unifies and extends previous work on flow analysis for higher-order languages supporting assignment and control operators and is parameterized over two polyvariance operators and a projection operator.
Abstract: A flow analysis collects data-flow and control-flow information about programs. A compiler can use this information to enable optimizations. The analysis described in this article unifies and extends previous work on flow analysis for higher-order languages supporting assignment and control operators. The analysis is abstract interpretation based and is parameterized over two polyvariance operators and a projection operator. These operators are used to regulate the speed and accuracy of the analysis. An implementation of the analysis is incorporated into and used in a production Scheme compiler. The analysis can process any legal Scheme program without modification. Others have demonstrated that a 0CFA analysis can enable these optimizations, but a 0CFA analysis is O(n³). An O(n) instantiation of our analysis successfully enables the optimization of closure representations and procedure calls. Experiments with the cheaper instantiation show that it is as effective as 0CFA for these optimizations.

Journal ArticleDOI
TL;DR: The approach extends the theory of sentential-form parsing to allow for ambiguity in the grammar, exploiting it for notational convenience, to denote sequences, and to construct compact (“abstract”) syntax trees directly.
Abstract: Previously published algorithms for LR(k) incremental parsing are inefficient, unnecessarily restrictive, and in some cases incorrect. We present a simple algorithm based on parsing LR(k) sentential forms that can incrementally parse an arbitrary number of textual and/or structural modifications in optimal time and with no storage overhead. The central role of balanced sequences in achieving truly incremental behavior from analysis algorithms is described, along with automated methods to support balancing during parse table generation and parsing. Our approach extends the theory of sentential-form parsing to allow for ambiguity in the grammar, exploiting it for notational convenience, to denote sequences, and to construct compact (“abstract”) syntax trees directly. Combined, these techniques make the use of automatically generated incremental parsers in interactive software development environments both practical and effective. In addition, we address information preservation in these environments: Optimal node reuse is defined; previous definitions are shown to be insufficient; and a method for detecting node reuse is provided that is both simpler and faster than existing techniques. A program representation based on self-versioning documents is used to detect changes in the program, generate efficient change reports for subsequent analyses, and allow the parsing transformation itself to be treated as a reversible modification in the edit log.

Journal ArticleDOI
TL;DR: This study focuses on the current state of error-handling technology and concludes with recommendations for further research in error handling for component-based real-time software.
Abstract: This study focuses on the current state of error-handling technology and concludes with recommendations for further research in error handling for component-based real-time software. With real-time programs growing in size and complexity, the quality and cost of developing and maintaining them are still deep concerns to embedded software industries. Component-based software is a promising approach in reducing development cost while increasing quality and reliability. As with any other real-time software, component-based software needs exception detection and handling mechanisms to satisfy reliability requirements. The current lack of suitable error-handling techniques can make an application composed of reusable software nondeterministic and difficult to understand in the presence of errors.

Journal ArticleDOI
TL;DR: It is argued that the standard definition of fairness often is unnecessarily weak and can be replaced by the stronger, yet still abstract, notion of finitary fairness, which solves the fundamental problem of consensus in a faulty asynchronous distributed environment.
Abstract: Fairness is a mathematical abstraction: in a multiprogramming environment, fairness abstracts the details of admissible (“fair”) schedulers; in a distributed environment, fairness abstracts the relative speeds of processors. We argue that the standard definition of fairness often is unnecessarily weak and can be replaced by the stronger, yet still abstract, notion of finitary fairness. While standard weak fairness requires that no enabled transition is postponed forever, finitary weak fairness requires that for every computation of a system there is an unknown bound k such that no enabled transition is postponed more than k consecutive times. In general, the finitary restriction fin(F) of any given fairness requirement F is the union of all ω-regular safety properties contained in F. The adequacy of the proposed abstraction is shown in two ways. Suppose we prove a program property under the assumption of finitary fairness. In a multiprogramming environment, the program then satisfies the property for all fair finite-state schedulers. In a distributed environment, the program then satisfies the property for all choices of lower and upper bounds on the speeds (or timings) of processors. The benefits of finitary fairness are twofold. First, the proof rules for verifying liveness properties of concurrent programs are simplified: well-founded induction over the natural numbers is adequate to prove termination under finitary fairness. Second, the fundamental problem of consensus in a faulty asynchronous distributed environment can be solved assuming finitary fairness.
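The definition can be checked mechanically on finite traces (a toy Python sketch under the simplifying assumption, ours, that every transition is always enabled):

    def finitary_weakly_fair(trace, transitions, k):
        # no transition may be postponed more than k consecutive steps
        wait = {t: 0 for t in transitions}
        for chosen in trace:
            for t in transitions:
                wait[t] = 0 if t == chosen else wait[t] + 1
                if wait[t] > k:
                    return False
        return True

    print(finitary_weakly_fair(["a", "b"] * 50, ["a", "b"], k=1))        # True
    print(finitary_weakly_fair(["a"] * 100 + ["b"], ["a", "b"], k=50))   # False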

Journal ArticleDOI
TL;DR: A set of new condensation theories is introduced: IOT-failure equivalence, IOT-state equivalence, and firing-dependence theory, to cope with the state-explosion problem in formal verification of large-scale software systems.
Abstract: The state-explosion problem of formal verification has obstructed its application to large-scale software systems. In this article, we introduce a set of new condensation theories: IOT-failure equivalence, IOT-state equivalence, and firing-dependence theory to cope with this problem. Our condensation theories are much weaker than current theories used for the compositional verification of Petri nets. More significantly, our new condensation theories can eliminate the interleaved behaviors caused by asynchronously sending actions. Therefore, our technique provides a much more powerful means for the compositional verification of asynchronous processes. Our technique can efficiently analyze several state-based properties: boundedness, reachable markings, reachable submarkings, and deadlock states. Based on the notion of our new theories, we develop a set of condensation rules for efficient verification of large-scale software systems. The experimental results show a significant improvement in the analysis of large-scale concurrent systems.

Journal ArticleDOI
TL;DR: This article proposes a simple multimethod dispatch scheme based on compressed dispatch tables that is applicable to any object-oriented language using a method precedence order and guarantees that dynamic dispatch is performed in constant time.
Abstract: The efficiency of dynamic dispatch is a major impediment to the adoption of multimethods in object-oriented languages. In this article, we propose a simple multimethod dispatch scheme based on compressed dispatch tables. This scheme is applicable to any object-oriented language using a method precedence order that satisfies a specific monotonicity property (e.g., Cecil and Dylan) and guarantees that dynamic dispatch is performed in constant time, the latter being a major requirement for some languages and applications. We provide efficient algorithms to build the dispatch tables, provide their worst-case complexity, and demonstrate the effectiveness of our scheme by real measurements performed on two large object-oriented applications. Finally, we provide a detailed comparison of our technique with other existing techniques.
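The table-driven mechanism can be suggested by a toy two-argument dispatcher in Python (hypothetical classes and methods; the article's contribution is compressing such tables and bounding their size, which this sketch omits):

    class A: pass
    class B(A): pass

    def collide_aa(x, y): return "A/A behavior"
    def collide_ab(x, y): return "A/B behavior"

    # built offline from the method precedence order; rows/columns are classes
    table = {(A, A): collide_aa, (A, B): collide_ab,
             (B, A): collide_ab, (B, B): collide_aa}

    def dispatch(x, y):
        return table[(type(x), type(y))](x, y)   # constant-time lookup

    print(dispatch(A(), B()))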

Journal ArticleDOI
TL;DR: A new method for finding relational models that exploits the permutation invariance of models by partitioning the space into equivalence classes of symmetrical interpretations is presented and makes automatic analysis of relational specifications possible for the first time.
Abstract: Software specifications often involve data structures with huge numbers of values, and consequently they cannot be checked using standard state exploration or model-checking techniques. Data structures can be expressed with binary relations, and operations over such structures can be expressed as formulae involving relational variables. Checking properties such as preservation of an invariant thus reduces to determining the validity of a formula or, equivalently, finding a model (of the formula's negation). A new method for finding relational models is presented. It exploits the permutation invariance of models—if two interpretations are isomorphic, then neither is a model, or both are—by partitioning the space into equivalence classes of symmetrical interpretations. Representatives of these classes are constructed incrementally by using the symmetry of the partial interpretation to limit the enumeration of new relation values. The notion of symmetry depends on the type structure of the formula; by picking the weakest typing, larger equivalence classes (and thus fewer representatives) are obtained. A more refined notion of symmetry that exploits the meaning of the relational operators is also described. The method typically leads to exponential reductions; in combination with other, simpler, reductions it makes automatic analysis of relational specifications possible for the first time.
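The symmetry-class reduction can be demonstrated on a tiny universe in Python (a brute-force sketch; the method in the article constructs class representatives incrementally instead of enumerating all interpretations):

    from itertools import product, permutations

    atoms = [0, 1, 2]
    pairs = [(a, b) for a in atoms for b in atoms]

    def canonical(rel):
        # least permuted image of the relation: one representative per class
        return min(tuple(sorted((p[a], p[b]) for (a, b) in rel))
                   for p in [dict(zip(atoms, q)) for q in permutations(atoms)])

    seen = set()
    for bits in product([0, 1], repeat=len(pairs)):
        rel = frozenset(pr for pr, bit in zip(pairs, bits) if bit)
        seen.add(canonical(rel))

    print(len(seen), "classes instead of", 2 ** len(pairs), "interpretations")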

Journal ArticleDOI
TL;DR: This work generalizes Knoop et al.'s Lazy Code Motion algorithm for partial redundancy elimination so that the generalized version also performs strength reduction, resulting in a broader class of candidate expressions, stronger safety guarantees, and the elimination of the potential for performance loss instead of gain.
Abstract: We generalize Knoop et al.'s Lazy Code Motion (LCM) algorithm for partial redundancy elimination so that the generalized version also performs strength reduction. Although Knoop et al. have themselves extended LCM to strength reduction with their Lazy Strength Reduction algorithm, our approach differs substantially from theirs and results in a broader class of candidate expressions, stronger safety guarantees, and the elimination of the potential for performance loss instead of gain. Also, our general framework is not limited to traditional strength reduction, but rather can also handle a wide variety of optimizations in which data-flow information enables the replacement of a computation with a less expensive one. As a simple example, computations can be hoisted to points where they are constant foldable. Another example we sketch is the hoisting of polymorphic operations to points where type analysis provides leverage for optimization. Our general approach consists of placing computations so as to minimize their cost, rather than merely their number. So long as the cost differences between flowgraph nodes obey a certain natural constraint, a cost-optimal code motion transformation that does not unnecessarily prolong the lifetime of temporary variables can be found using techniques completely analogous to LCM. Specifically, the cost differences can be discovered using a wide variety of forward data-flow analyses in a manner which we describe.
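Classic strength reduction, the special case being generalized, looks like this at the source level (illustration ours):

    def before(n):
        out = []
        for i in range(n):
            out.append(i * 8)      # a multiply on every iteration
        return out

    def after(n):
        out, t = [], 0
        for i in range(n):
            out.append(t)          # t tracks i * 8
            t += 8                 # a cheaper add replaces the multiply
        return out

    assert before(10) == after(10)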

Journal ArticleDOI
TL;DR: It is shown that maximal-munch tokenization can always be performed in time linear in the size of the input.
Abstract: The lexical-analysis (or scanning) phase of a compiler attempts to partition an input string into a sequence of tokens. The convention in most languages is that the input is scanned left to right, and each token identified is a “maximal munch” of the remaining input—the longest prefix of the remaining input that is a token of the language. Although most of the standard compiler textbooks present a way to perform maximal-munch tokenization, the algorithm they describe is one that, for certain sets of token definitions, can cause the scanner to exhibit quadratic behavior in the worst case. In this article, we show that maximal-munch tokenization can always be performed in time linear in the size of the input.
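A sketch of a linear-time maximal-munch scanner in this spirit (our reconstruction with a toy DFA, under simplifying assumptions): record the last accepting position, and memoize (state, position) pairs that are known not to extend into a token, so no position is rescanned unboundedly often.

    def tokenize(dfa, accepting, start, text):
        tokens, i, failed = [], 0, set()
        while i < len(text):
            state, j, last, trail = start, i, None, []
            while (j < len(text) and text[j] in dfa.get(state, {})
                   and (state, j) not in failed):
                trail.append((state, j))
                state = dfa[state][text[j]]
                j += 1
                if state in accepting:
                    last = j               # longest token seen so far
            if last is None:
                raise ValueError(f"no token at position {i}")
            # pairs visited beyond the last accept can never reach one:
            failed.update((q, p) for (q, p) in trail if p >= last)
            tokens.append(text[i:last])
            i = last
        return tokens

    # toy token language: 'ab' or 'a'; maximal munch prefers 'ab'
    dfa = {0: {"a": 1}, 1: {"b": 2}}
    print(tokenize(dfa, {1, 2}, 0, "abaab"))   # ['ab', 'a', 'ab']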

Journal ArticleDOI
TL;DR: It is proved that the minimal unrolled graph is an adequate basis for performing bit-vector data flow analyses at a breakpoint and is shown to help in recovery of a dynamically noncurrent variable.
Abstract: Compiler optimizations pose many problems to source-level debugging of an optimized program due to reordering, insertion, and deletion of code. One such problem is to determine whether the value of a variable is current at a breakpoint—that is, whether its actual value is the same as its expected value. We use the notion of dynamic currency of a variable in source-level debugging and propose the use of a minimal unrolled graph to reduce the run-time overhead of dynamic currency determination. We prove that the minimal unrolled graph is an adequate basis for performing bit-vector data flow analyses at a breakpoint. This property is used to perform dynamic currency determination. It is also shown to help in recovery of a dynamically noncurrent variable.

Journal ArticleDOI
Jens Palsberg
TL;DR: This article falsifies Heintze's assertion that a program can be safety checked with an equality-based control-flow analysis if and only if it can be typed with recursive types, and presents a type system equivalent to equality-based control-flow analysis.
Abstract: Equality-based control-flow analysis has been studied by Henglein, Bondorf and Jorgensen, DeFouw, Grove, and Chambers, and others. It is faster than the subset-based 0-CFA, but also more approximate. Heintze asserted in 1995 that a program can be safety checked with an equality-based control-flow analysis if and only if it can be typed with recursive types. In this article we falsify Heintze's assertion, and we present a type system equivalent to equality-based control-flow analysis. The new type system contains both recursive types and an unusual notion of subtyping. We have s ≤ t if s and t unfold to the same regular tree, and we have ⊥ ≤ t ≤ ⊤ where t is a function type. In particular, there is no nontrivial subtyping between function types.
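The equality-based style can be suggested with a union-find sketch in Python (node names are hypothetical; this illustrates unification-style flow constraints, not the article's type system):

    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # for an application (f e): f's domain unifies with e's node, and
    # f's codomain unifies with the node of the application itself
    union("dom(f)", "node(e)")
    union("cod(f)", "node(f e)")
    print(find("dom(f)") == find("node(e)"))   # True: merged, not directed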

Journal ArticleDOI
TL;DR: This article presents a new framework for elimination-based exhaustive and incremental data flow analysis using the DJ graph representation of a program that can handle arbitrary structural and nonstructural changes to program flowgraphs, including irreducibility.
Abstract: In this article, we present a new framework for elimination-based exhaustive and incremental data flow analysis using the DJ graph representation of a program. Unlike previous approaches to elimination-based incremental data flow analysis, our approach can handle arbitrary structural and nonstructural changes to program flowgraphs, including irreducibility. We show how our approach is related to dominance frontiers, and we exploit this relationship to establish the complexity of our exhaustive analysis and to aid the design of our incremental analysis.
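The dominance-frontier connection can be recalled with the standard computation from an immediate-dominator map (a textbook Python sketch, not the article's DJ-graph formulation):

    def dominance_frontiers(idom, succ):
        df = {n: set() for n in idom}
        for n in idom:
            preds = [m for m in succ if n in succ[m]]
            if len(preds) >= 2:              # only join nodes contribute
                for p in preds:
                    runner = p
                    while runner != idom[n]:
                        df[runner].add(n)
                        runner = idom[runner]
        return df

    succ = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
    idom = {"s": "s", "a": "s", "b": "s", "c": "s"}
    print(dominance_frontiers(idom, succ))   # c is in DF(a) and DF(b)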

Journal ArticleDOI
TL;DR: This study aims at covering the whole known design space of sequential functional language implementations, including environment- and graph-based implementations, and introduces a unified framework to describe, relate, compare, and classify functional language implementations.
Abstract: We introduce a unified framework to describe, relate, compare, and classify functional language implementations. The compilation process is expressed as a succession of program transformations in the common framework. At each step, different transformations model fundamental choices. A benefit of this approach is to structure and decompose the implementation process. The correctness proofs can be tackled independently for each step and amount to proving program transformations in the functional world. This approach also paves the way to formal comparisons by making it possible to estimate the complexity of individual transformations or compositions of them. Our study aims at covering the whole known design space of sequential functional language implementations. In particular, we consider call-by-value, call-by-name, call-by-need reduction strategies as well as environment- and graph-based implementations. We describe for each compilation step the diverse alternatives as program transformations. In some cases, we illustrate how to compare or relate compilation techniques, express global optimizations, or hybrid implementations. We also provide a classification of well-known abstract machines.