
Showing papers in "ACM Transactions on Programming Languages and Systems in 1995"


Journal ArticleDOI
TL;DR: It is shown how to specify components of concurrent systems, considering both the decomposition of a given system into parts and the composition of given parts to form a system.
Abstract: We show how to specify components of concurrent systems. The specification of a system is the conjunction of its components' specifications. Properties of the system are proved by reasoning about its components. We consider both the decomposition of a given system into parts, and the composition of given parts to form a system.

506 citations


Journal ArticleDOI
TL;DR: An execution model for supporting programs that use pointer-based dynamic data structures is described that uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation.
Abstract: Compiling for distributed-memory machines has been a very active research area in recent years. Much of this work has concentrated on programs that use arrays as their primary data structures. To date, little work has been done to address the problem of supporting programs that use pointer-based dynamic data structures. The techniques developed for supporting SPMD execution of array-based programs rely on the fact that arrays are statically defined and directly addressable. Recursive data structures do not have these properties, so new techniques must be developed. In this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation. We intend to exploit this execution model using compiler analyses and automatic parallelization techniques. We have implemented a prototype system, which we call Olden, that runs on the Intel iPSC/860 and the Thinking Machines CM-5. We discuss our implementation and report on experiments with five benchmarks.
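
The execution model above combines futures with a lazy-task-creation-style cutoff over pointer-based structures. As a rough illustration only (not Olden's iPSC/860 or CM-5 implementation, which migrates threads based on heap layout), the following Python sketch shows the futures-plus-cutoff flavor on a heap-allocated binary tree; TreeNode, par_sum, and CUTOFF are invented names for the example.

from concurrent.futures import ThreadPoolExecutor

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

CUTOFF = 2  # above this depth, work stays sequential (lazy-task-creation flavor)

def par_sum(node, pool, depth=0):
    if node is None:
        return 0
    if depth >= CUTOFF:                       # small subtasks are not spawned
        return node.value + par_sum(node.left, pool, depth + 1) \
                          + par_sum(node.right, pool, depth + 1)
    left_future = pool.submit(par_sum, node.left, pool, depth + 1)   # future for one subtree
    right = par_sum(node.right, pool, depth + 1)                     # keep working locally
    return node.value + right + left_future.result()                 # touch the future

def build(depth):
    return None if depth == 0 else TreeNode(1, build(depth - 1), build(depth - 1))

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(par_sum(build(10), pool))       # 1023 nodes, each contributing 1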

265 citations


Journal ArticleDOI
TL;DR: It is argued that covariance and contravariance appropriately characterize two distinct and independent mechanisms that each have their place in object-oriented systems and can (and should) be integrated in a type-safe manner inobject-oriented languages.
Abstract: In type-theoretic research on object-oriented programming, the issue of “covariance versus contravariance” is a topic of continuing debate. In this short note we argue that covariance and contravariance appropriately characterize two distinct and independent mechanisms. The so-called contravariance rule correctly captures the subtyping relation (that relation which establishes which sets of functions can replace another given set in every context). A covariant relation, instead, characterizes the specialization of code (i.e., the definition of new code which replaces old definitions in some particular cases). Therefore, covariance and contravariance are not opposing views, but distinct concepts that each have their place in object-oriented systems. Both can (and should) be integrated in a type-safe manner in object-oriented languages. We also show that the independence of the two mechanisms is not characteristic of a particular model but is valid in general, since covariant specialization is present in record-based models, although it is hidden by a deficiency of all existing calculi that realize this model. As an aside, we show that the λ&-calculus can be taken as the basic calculus for both an overloading-based and a record-based model. Using this approach, one not only obtains a more uniform vision of object-oriented type theories, but in the case of the record-based approach, one also gains multiple dispatching, a feature that existing record-based models do not capture.
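
As a rough, language-agnostic illustration of the two mechanisms the note distinguishes (not the λ&-calculus itself), the Python sketch below contrasts contravariant function subtyping with covariant specialization via run-time dispatch; Animal, Dog, and the dispatch table are invented for the example.

from typing import Callable

class Animal: ...
class Dog(Animal): ...

# Subtyping (contravariance): a consumer of Dogs can be replaced by a consumer
# of Animals, because every Dog is an Animal.
def groom_any_animal(a: Animal) -> str:
    return "groomed an animal"

def call_with_dog(f: Callable[[Dog], str]) -> str:
    return f(Dog())

print(call_with_dog(groom_any_animal))   # safe: Callable[[Animal], str] replaces Callable[[Dog], str]

# Specialization (covariance): branches are selected by the *run-time* class of
# the argument; the more specific branch overrides the general one for Dogs only.
branches = [(Dog, lambda a: "dog-specific code"),
            (Animal, lambda a: "generic animal code")]

def dispatch(a: Animal) -> str:
    for cls, code in branches:            # most specific branch listed first
        if isinstance(a, cls):
            return code(a)
    raise TypeError("no applicable branch")

print(dispatch(Dog()))      # dog-specific code
print(dispatch(Animal()))   # generic animal code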

188 citations


Journal ArticleDOI
TL;DR: This article presents a framework for combining constant propagation, value numbering, and unreachable-code elimination, and shows how to combine two such frameworks and how to reason about the properties of the resulting framework.
Abstract: Modern optimizing compilers use several passes over a program's intermediate representation to generate good code. Many of these optimizations exhibit a phase-ordering problem. Getting the best code may require iterating optimizations until a fixed point is reached. Combining these phases can lead to the discovery of more facts about the program, exposing more opportunities for optimization. This article presents a framework for describing optimizations. It shows how to combine two such frameworks and how to reason about the properties of the resulting framework. The structure of the framework provides insight into when a combination yields better results. To make the ideas more concrete, this article presents a framework for combining constant propagation, value numbering, and unreachable-code elimination. It is an open question as to what other frameworks can be combined in this way.
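
A minimal sketch of why combining the phases pays off, assuming a toy expression IR and a flat constant lattice (none of this is the article's formal framework): when the branch condition folds to a constant, the dead arm is skipped, which in turn keeps y constant, something separate constant-propagation and unreachable-code passes, each run once, would miss.

NAC = object()   # lattice value: "not a constant"

def eval_expr(e, env):
    kind = e[0]
    if kind == "const":
        return e[1]
    if kind == "var":
        return env.get(e[1], NAC)
    a, b = eval_expr(e[1], env), eval_expr(e[2], env)
    if a is NAC or b is NAC:
        return NAC
    return a + b if kind == "add" else int(a == b)   # kinds: "add" or "eq"

def meet(env1, env2):                 # join point: keep only agreeing constants
    return {v: (env1.get(v, NAC) if env1.get(v, NAC) == env2.get(v, NAC) else NAC)
            for v in set(env1) | set(env2)}

def analyze(stmts, env):
    for s in stmts:
        if s[0] == "assign":                          # ("assign", var, expr)
            env = dict(env, **{s[1]: eval_expr(s[2], env)})
        else:                                         # ("if", cond, then_stmts, else_stmts)
            c = eval_expr(s[1], env)
            if c is NAC:
                env = meet(analyze(s[2], env), analyze(s[3], env))
            else:                                     # condition folded: dead arm ignored
                env = analyze(s[2] if c else s[3], env)
    return env

prog = [("assign", "x", ("const", 1)),
        ("if", ("eq", ("var", "x"), ("const", 1)),
         [("assign", "y", ("const", 2))],
         [("assign", "y", ("const", 3))])]
print(analyze(prog, {}))    # {'x': 1, 'y': 2}: the combined analysis proves y = 2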

173 citations


Journal ArticleDOI
TL;DR: This article presents a practical technique for detecting a broader class of linear induction variables than is usually recognized, as well as several other sequence forms, including periodic, polynomial, geometric, monotonic, and wrap-around variables.
Abstract: Linear induction variable detection is usually associated with the strength reduction optimization. For restructuring compilers, effective data dependence analysis requires that the compiler detect and accurately describe linear and nonlinear induction variables as well as more general sequences. In this article we present a practical technique for detecting a broader class of linear induction variables than is usually recognized, as well as several other sequence forms, including periodic, polynomial, geometric, monotonic, and wrap-around variables. Our method is based on Factored Use-Def (FUD) chains, a demand-driven representation of the popular Static Single Assignment (SSA) form. In this form, strongly connected components of the associated SSA graph correspond to sequences in the source program: we describe a simple yet efficient algorithm for detecting and classifying these sequences. We have implemented this algorithm in Nascent, our restructuring Fortran 90+ compiler, and we present some results showing the effectiveness of our approach.
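
A much-simplified sketch of the SCC idea (not the FUD-chain machinery or the Nascent implementation): treat SSA definitions as a def-use graph, find its strongly connected components, and classify a component made of one φ plus constant-step additions as a basic linear induction variable. The defs table and classification rules below are illustrative only.

# i0 = 0; i1 = phi(i0, i2); i2 = i1 + 1   -- the canonical SSA loop counter
defs = {
    "i0": ("const", 0),
    "i1": ("phi", ["i0", "i2"]),
    "i2": ("add", "i1", 1),
}
uses = {v: (list(d[1]) if d[0] == "phi" else ([d[1]] if d[0] == "add" else []))
        for v, d in defs.items()}

def sccs(graph):
    """Tarjan's strongly-connected-components algorithm (recursive; fine for a sketch)."""
    index, low, stack, on_stack, comps, counter = {}, {}, [], set(), [], [0]
    def visit(v):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w); low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            comp = []
            while True:
                w = stack.pop(); on_stack.discard(w); comp.append(w)
                if w == v:
                    break
            comps.append(comp)
    for v in graph:
        if v not in index:
            visit(v)
    return comps

def classify(comp):
    if len(comp) == 1 and comp[0] not in uses[comp[0]]:
        return "not cyclic (invariant or straight-line value)"
    phis = [v for v in comp if defs[v][0] == "phi"]
    adds = [v for v in comp if defs[v][0] == "add"]
    if len(phis) == 1 and len(phis) + len(adds) == len(comp):
        return "linear induction variable, step %d" % sum(defs[v][2] for v in adds)
    return "other sequence (periodic, wrap-around, ...)"

for comp in sccs(uses):
    print(comp, "->", classify(comp))
# ['i0'] -> not cyclic ...;  ['i2', 'i1'] -> linear induction variable, step 1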

158 citations


Journal ArticleDOI
Atsushi Ohori
TL;DR: This work defines a second-order, polymorphic record calculus as an extension of Girard-Reynolds polymorphic lambda calculus, and develops an ML-style type inference algorithm for a predicative subset of the second- order record calculus.
Abstract: The motivation of this work is to provide a type-theoretical basis for developing a practical polymorphic programming language with labeled records and labeled variants. Our goal is to establish both a polymorphic type discipline and an efficient compilation method for a calculus with those labeled data structures. We define a second-order, polymorphic record calculus as an extension of Girard-Reynolds polymorphic lambda calculus. We then develop an ML-style type inference algorithm for a predicative subset of the second-order record calculus. The soundness of the type system and the completeness of the type inference algorithm are shown. These results extend Milner's type inference algorithm, Damas and Milner's account of ML's let polymorphism, and Harper and Mitchell's analysis on XML. To establish an efficient compilation method for the polymorphic record calculus, we first define an implementation calculus, where records are represented as vectors whose elements are accessed by direct indexing, and variants are represented as values tagged with a natural number indicating the position in the vector of functions in a switch statement. We then develop an algorithm to translate the polymorphic record calculus into the implementation calculus using type information obtained by the type inference algorithm. The correctness of the compilation algorithm is proved; that is, the compilation algorithm is shown to preserve both typing and the operational behavior of a program. Based on these results, Standard ML has been extended with labeled records, and its compiler has been implemented.
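
As a loose illustration of the index-passing flavor of such a compilation (this is not Ohori's translation or its typing; record_layout, compile_record, and get_name are invented for the example), records become vectors laid out in a canonical label order, and a polymorphic field selection receives its index as an extra argument resolved from static type information at the call site:

def record_layout(labels):
    """Canonical layout: labels in sorted order, each mapped to its vector slot."""
    return {lab: i for i, lab in enumerate(sorted(labels))}

def compile_record(fields):
    """A record {l1 = v1, ...} compiles to a plain vector in canonical label order."""
    return [fields[lab] for lab in sorted(fields)]

# The polymorphic function   fun name r = #name r   compiles to a function that
# takes the index of the "name" field as an extra argument.
def get_name(record_vector, name_index):
    return record_vector[name_index]

person_fields = {"name": "ada", "age": 36}
person = compile_record(person_fields)                  # -> [36, "ada"]
name_index = record_layout(person_fields)["name"]       # resolved from the static type
print(get_name(person, name_index))                     # -> "ada"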

108 citations


Journal ArticleDOI
TL;DR: A polynomial-time, static typechecking algorithm that checks the conformance, completeness, and consistency of a group of method implementations with respect to declared message signatures and a module system that enables independently developed code to be fully encapsulated and statically typechecked on a per-module basis.
Abstract: Two major obstacles that hinder the wider acceptance of multimethods are (1) concerns over the lack of encapsulation and modularity and (2) the absence of static typechecking in existing multimethod-based languages. This article addresses both of these problems. We present a polynomial-time, static typechecking algorithm that checks the conformance, completeness, and consistency of a group of method implementations with respect to declared message signatures. This algorithm improves on previous algorithms by handling separate type and inheritance hierarchies, abstract classes, and graph-based method lookup semantics. We also present a module system that enables independently developed code to be fully encapsulated and statically typechecked on a per-module basis. To guarantee that potential conflicts between independently developed modules have been resolved, a simple well-formedness condition on the modules comprising a program is checked at link-time. The typechecking algorithm and module system are applicable to a range of multimethod-based languages, but the article uses the Cecil language as a concrete example of how they can be applied.
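
The following toy sketch spells out the conditions being checked for a single binary message, assuming a tiny hand-written class hierarchy (it is not the article's polynomial-time algorithm, nor Cecil's lookup semantics): completeness means every combination of concrete argument classes has an applicable method, and consistency means a unique most-specific one exists among those applicable.

from itertools import product

subclasses = {"Shape": {"Circle", "Square"}, "Circle": set(), "Square": set()}
concrete = {"Circle", "Square"}

def is_subclass(c, d):
    """True if class c equals d or lies (transitively) below d."""
    return c == d or any(is_subclass(c, s) for s in subclasses[d])

# Each implementation of the message is identified by its tuple of argument specializers.
methods = [("Shape", "Shape"), ("Circle", "Circle"), ("Circle", "Square")]

def applicable(args):
    return [m for m in methods if all(is_subclass(a, s) for a, s in zip(args, m))]

def more_specific(m1, m2):
    return all(is_subclass(s1, s2) for s1, s2 in zip(m1, m2))

def check(arity=2):
    ok = True
    for args in product(sorted(concrete), repeat=arity):
        app = applicable(args)
        if not app:
            print("incomplete: no method applicable to", args); ok = False; continue
        best = [m for m in app if all(more_specific(m, n) for n in app)]
        if len(best) != 1:
            print("inconsistent: ambiguous methods", app, "for", args); ok = False
    return ok

print("message conforms to its signature:", check())   # -> True for this example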

90 citations


Journal ArticleDOI
Jens Palsberg
TL;DR: The core of the proof technique is to define closure analysis using a constraint system, which is equivalent to the closure analysis of Bondorf, which in turn is based on Sestoft's.
Abstract: Flow analyses of untyped higher-order functional programs have in the past decade been presented by Ayers, Bondorf, Consel, Jones, Heintze, Sestoft, Shivers, Steckler, Wand, and others. The analyses are usually defined as abstract interpretations and are used for rather different tasks such as type recovery, globalization, and binding-time analysis. The analyses all contain a global closure analysis that computes information about higher-order control-flow. Sestoft proved in 1989 and 1991 that closure analysis is correct with respect to call-by-name and call-by-value semantics, but it remained open if correctness holds for arbitrary beta-reduction. This article answers the question; both closure analysis and others are correct with respect to arbitrary beta-reduction. We also prove a subject-reduction result: closure information is still valid after beta-reduction. The core of our proof technique is to define closure analysis using a constraint system. The constraint system is equivalent to the closure analysis of Bondorf, which in turn is based on Sestoft's.
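
As a concrete, if drastically simplified, illustration of a constraint-style closure analysis (this mirrors the general idea only, not the article's constraint system or Bondorf's analysis), the sketch below computes, for a pure lambda term, which lambdas may flow to each subterm and variable by iterating simple subset constraints:

# Terms: ("var", x), ("lam", x, body), ("app", f, a)

def subterms(term, acc=None):
    if acc is None:
        acc = []
    acc.append(term)
    if term[0] == "lam":
        subterms(term[2], acc)
    elif term[0] == "app":
        subterms(term[1], acc); subterms(term[2], acc)
    return acc

def analyze(program):
    terms = subterms(program)
    key = lambda t: ("var", t[1]) if t[0] == "var" else id(t)
    flow = {key(t): set() for t in terms}              # term/variable -> set of lambdas
    changed = True
    while changed:
        changed = False
        for t in terms:
            if t[0] == "lam" and t not in flow[key(t)]:
                flow[key(t)].add(t); changed = True    # a lambda flows to itself
            elif t[0] == "app":
                for lam in list(flow[key(t[1])]):      # every lambda reaching the operator
                    _, x, body = lam
                    param = flow.setdefault(("var", x), set())
                    before = (len(param), len(flow[key(t)]))
                    param.update(flow[key(t[2])])          # argument flows into the parameter
                    flow[key(t)].update(flow[key(body)])   # the body's result flows out of the call
                    if (len(param), len(flow[key(t)])) != before:
                        changed = True
    return flow, key

# (lambda x. x) (lambda y. y): the application may only evaluate to lambda y. y
identity = ("lam", "x", ("var", "x"))
argument = ("lam", "y", ("var", "y"))
program = ("app", identity, argument)
flow, key = analyze(program)
print(flow[key(program)])    # -> {('lam', 'y', ('var', 'y'))}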

85 citations


Journal ArticleDOI
TL;DR: A distributed implementation of Scheme that permits efficient transmission of higher-order objects such as closures and continuations, and is the first distributed dialect of Scheme (or a related language) that addresses lightweight communication abstractions for higher- order objects.
Abstract: We describe a distributed implementation of Scheme that permits efficient transmission of higher-order objects such as closures and continuations. The integration of distributed communication facilities within a higher-order programming language engenders a number of new abstractions and paradigms for distributed computing. Among these are user-specified load-balancing and migration policies for threads, incrementally linked distributed computations, and parameterized client-server applications. To our knowledge, this is the first distributed dialect of Scheme (or a related language) that addresses lightweight communication abstractions for higher-order objects.

79 citations


Journal ArticleDOI
TL;DR: This work investigates the problem of evaluating Fortran 90-style array expressions on massively parallel distributed-memory machines and presents algorithms based on dynamic programming that solve the embedding problem optimally for several communication cost metrics: multidimensional grids and rings, hypercubes, fat-trees, and the discrete metric.
Abstract: We investigate the problem of evaluating Fortran 90-style array expressions on massively parallel distributed-memory machines. On such a machine, an elementwise operation can be performed in constant time for arrays whose corresponding elements are in the same processor. If the arrays are not aligned in this manner, the cost of aligning them is part of the cost of evaluating the expression tree. The choice of where to perform the operation then affects this cost. We describe the communication cost of the parallel machine theoretically as a metric space; we model the alignment problem as that of finding a minimum-cost embedding of the expression tree into this space. We present algorithms based on dynamic programming that solve the embedding problem optimally for several communication cost metrics: multidimensional grids and rings, hypercubes, fat-trees, and the discrete metric. We also extend our approach to handle operations that change the shape of the arrays.
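
A small sketch of the dynamic-programming shape of such a solution for the simplest metric mentioned, the discrete metric (realigning an operand costs 1, leaving it in place costs 0); the richer grid, hypercube, and fat-tree metrics would change only the move_cost function. The tree encoding and alignment set are invented for the example.

ALIGNMENTS = [0, 1, 2]            # candidate alignments (illustrative)

def move_cost(a, b):              # discrete metric: any realignment costs 1
    return 0 if a == b else 1

def best_costs(node):
    """Map each alignment to the cheapest cost of having node's value there.
    Nodes are ("leaf", alignment) or ("op", left, right)."""
    if node[0] == "leaf":
        return {a: move_cost(node[1], a) for a in ALIGNMENTS}
    left, right = best_costs(node[1]), best_costs(node[2])
    perform_at = {b: left[b] + right[b] for b in ALIGNMENTS}     # both operands brought to b
    return {a: min(perform_at[b] + move_cost(b, a) for b in ALIGNMENTS)
            for a in ALIGNMENTS}

# (A + B) + C with A, B aligned at 0 and C at 1:
tree = ("op", ("op", ("leaf", 0), ("leaf", 0)), ("leaf", 1))
costs = best_costs(tree)
print(costs, "-> cheapest overall:", min(costs.values()))   # one realignment suffices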

69 citations


Journal ArticleDOI
TL;DR: The approach is demonstrated by showing that a combined sharing analysis—constructed from “old” proposals—compares well with other “new” proposals suggested in recent literature both from the point of view of efficiency and accuracy.
Abstract: This article considers static analysis based on abstract interpretation of logic programs over combined domains. It is known that analyses over combined domains potentially provide more information than is obtained by the independent analyses. However, the construction of a combined analysis often requires redefining the basic operations for the combined domain. A practical approach to maintaining precision in combined analyses of logic programs which reuses the individual analyses and does not redefine the basic operations is illustrated. The advantages of the approach are that (1) proofs of correctness for the new domains are not required and (2) implementations can be reused. The approach is demonstrated by showing that a combined sharing analysis—constructed from “old” proposals—compares well with other “new” proposals suggested in recent literature from the point of view of both efficiency and accuracy.

Journal ArticleDOI
TL;DR: This article proves that Amadio and Cardelli's type system with subtyping and recursive types accepts the same programs as a certain safety analysis, and obtains an inference algorithm for the type system, thereby solving an open problem.
Abstract: Flow-based safety analysis of higher-order languages has been studied by Shivers, and Palsberg and Schwartzbach. Open until now is the problem of finding a type system that accepts exactly the same programs as safety analysis. In this article we prove that Amadio and Cardelli's type system with subtyping and recursive types accepts the same programs as a certain safety analysis. The proof involves mappings from types to flow information and back. As a result, we obtain an inference algorithm for the type system, thereby solving an open problem.

Journal ArticleDOI
TL;DR: Results from using SLICE, a dynamic program slicer for C programs, designed and implemented to experiment with several different kinds of program slices and to study them both qualitatively and quantitatively are presented.
Abstract: Program slicing is a program analysis technique that has been studied in the context of several different applications in the construction, optimization, maintenance, testing, and debugging of programs. Algorithms are available for constructing slices for a particular execution of a program (dynamic slices), as well as to approximate a subset of the behavior over all possible executions of a program (static slices). However, these algorithms have been studied only in the context of small abstract languages. Program slicing is bound to remain an academic exercise unless one can not only demonstrate the feasibility of building a slicer for nontrivial programs written in a real programming language, but also verify that a type of slice is sufficiently thin, on the average, for the application for which it is chosen. In this article, we present results from using SLICE, a dynamic program slicer for C programs, designed and implemented to experiment with several different kinds of program slices and to study them both qualitatively and quantitatively. Several application programs, ranging in size (i.e., number of lines of code) over two orders of magnitude, were sliced exhaustively to obtain average worst-case metrics for the size of program slices.
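
To make the notion of a dynamic slice concrete, here is a minimal sketch of one textbook formulation (backward traversal of an execution trace along dynamic def-use chains); it is not the SLICE tool itself, and control dependences are deliberately omitted. The trace encoding is invented for the example.

# Each trace entry: (statement_id, defined_vars, used_vars), in execution order.
trace = [
    (1, {"a"}, set()),        # a = 2
    (2, {"b"}, set()),        # b = 3
    (3, {"c"}, {"a"}),        # c = a * a
    (4, {"b"}, {"b"}),        # b = b + 1
    (5, {"d"}, {"c", "a"}),   # d = c + a
]

def dynamic_slice(trace, criterion_vars):
    needed = set(criterion_vars)
    slice_stmts = []
    for stmt, defs, uses in reversed(trace):
        if defs & needed:                     # this instance defined something we still need
            slice_stmts.append(stmt)
            needed = (needed - defs) | uses   # chase the definitions it used
    return sorted(set(slice_stmts))

print(dynamic_slice(trace, {"d"}))   # -> [1, 3, 5]; statements 2 and 4 are irrelevant to d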

Journal ArticleDOI
TL;DR: A simple and efficient algorithm for generating bottom-up rewrite system (BURS) tables is described, and previously published methods for on-the-fly elimination of states are generalized and simplified to create a new method, triangle trimming, that is employed in the algorithm.
Abstract: A simple and efficient algorithm for generating bottom-up rewrite system (BURS) tables is described. A small code-generator generator implementation produces BURS tables efficiently, even for complex instruction set descriptions. The algorithm does not require novel data structures or complicated algorithmic techniques. Previously published methods for on-the-fly elimination of states are generalized and simplified to create a new method, triangle trimming, that is employed in the algorithm. A prototype implementation, burg, generates BURS tables very efficiently.

Journal ArticleDOI
TL;DR: An efficient implementation of adaptive programs is presented: given an adaptive program and a class graph, an efficient object-oriented program is generated. The soundness of a proof system for conservatively checking consistency is proved, and it is shown how to implement it efficiently.
Abstract: Adaptive programs compute with objects, just like object-oriented programs. Each task to be accomplished is specified by a so-called propagation pattern which traverses the receiver object. The object traversal is a recursive descent via the instance variables where information is collected or propagated along the way. A propagation pattern consists of (1) a name for the task, (2) a succinct specification of the parts of the receiver object that should be traversed, and (3) code fragments to be executed when specific object types are encountered. The propagation patterns need to be complemented by a class graph which defines the detailed object structure. The separation of structure and behavior yields a degree of flexibility and understandability not present in traditional object-oriented languages. For example, the class graph can be changed without changing the adaptive program at all. We present an efficient implementation of adaptive programs. Given an adaptive program and a class graph, we generate an efficient object-oriented program, for example, in C++. Moreover, we prove the correctness of the core of this translation. A key assumption in the theorem is that the traversal specifications are consistent with the class graph. We prove the soundness of a proof system for conservatively checking consistency, and we show how to implement it efficiently.
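
A toy interpretation of a propagation pattern, assuming invented classes and an invented reaches_salary relation standing in for the class-graph information (the article instead compiles such patterns into efficient object-oriented code, e.g., in C++): the task "from Company to Salary, summing the values" never names the intermediate structure, so the class graph can change without touching the traversal code.

class Salary:
    def __init__(self, value): self.value = value

class Employee:
    def __init__(self, salary): self.salary = salary

class Department:
    def __init__(self, employees): self.employees = employees

class Company:
    def __init__(self, departments): self.departments = departments

# Stand-in for the class-graph information: which classes can reach Salary at all.
reaches_salary = {Company, Department, Employee, Salary}

def traverse(obj, at_salary):
    """Recursive descent over instance variables, pruned by the class graph."""
    if isinstance(obj, Salary):
        at_salary(obj)
        return
    for value in vars(obj).values():
        parts = value if isinstance(value, list) else [value]
        for part in parts:
            if type(part) in reaches_salary:
                traverse(part, at_salary)

collected = []                              # the code fragment attached to Salary
company = Company([Department([Employee(Salary(100)), Employee(Salary(200))]),
                   Department([Employee(Salary(50))])])
traverse(company, lambda s: collected.append(s.value))
print(sum(collected))                       # -> 350; the "program" never mentions Department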

Journal ArticleDOI
TL;DR: A temporal counterpart to the knowledge change theorem of Chandy and Misra is established which formally proves that the global view of a distributed system provided by its various observations does not differ too much from its truth behavior.
Abstract: The definitions of the predicates Possibly φ and Definitely φ, where φ is a global predicate of a distributed computation, lead to the definitions of two predicate transformers P and D. We show that P plays the same role with respect to time as the predicate transformers Ki in knowledge theory play with respect to space. Pursuing this analogy, we prove that local predicates are exactly the fixed points of the Ki's while the stable predicates are the fixed points of P. In terms of the predicate transformers P and D, we define a new class of predicates that we call observer-independent predicates and for which the detection of Possibly φ and Definitely φ is quite easy. Finally, we establish a temporal counterpart to the knowledge change theorem of Chandy and Misra which formally proves that the global view of a distributed system provided by its various observations does not differ too much from its truth behavior.

Journal ArticleDOI
TL;DR: An algorithm for the resource allocation problem is presented that achieves a constant failure locality of four along with a quadratic response time and a quadratic message complexity, and applications to other process synchronization problems in distributed systems are demonstrated.
Abstract: Solutions to resource allocation problems and other related synchronization problems in distributed systems are examined with respect to the measures of response time, message complexity, and failure locality. Response time measures the time it takes for an algorithm to respond to the requests of a process; message complexity measures the number of messages sent and received by a process; and failure locality characterizes the size of the network that is affected by the failure of a single process. An algorithm for the resource allocation problem that achieves a constant failure locality of four along with a quadratic response time and a quadratic message complexity is presented. Applications of the algorithm to other process synchronization problems in distributed systems are also demonstrated.

Journal ArticleDOI
TL;DR: This article studies the problem of detecting, expressing, and optimizing task-level parallelism, where “task” refers to a program statement of arbitrary granularity, and shows that there exists a unique minimum set of essential data dependences.
Abstract: Automatic detection of task-level parallelism (also referred to as functional, DAG, unstructured, or thread parallelism) at various levels of program granularity is becoming increasingly important for parallelizing and back-end compilers. Parallelizing compilers detect iteration-level or coarser granularity parallelism which is suitable for parallel computers; detection of parallelism at the statement- or operation-level is essential for most modern microprocessors, including superscalar and VLIW architectures. In this article we study the problem of detecting, expressing, and optimizing task-level parallelism, where “task” refers to a program statement of arbitrary granularity. Optimizing the amount of functional parallelism (by allowing synchronization between arbitrary nodes) in sequential programs requires the notion of precedence in terms of paths in graphs which incorporate control and data dependences. Precedences have been defined before in a different context; however, the definition was dependent on the ideas of parallel execution and time. We show that the problem of determining precedences statically is NP-complete. Determining precedence relationships is useful in finding the essential data dependences. We show that there exists a unique minimum set of essential data dependences; finding this minimum set is NP-hard and NP-easy. We also propose a heuristic algorithm for finding the set of essential data dependences. Static analysis of a program in the Perfect Benchmarks was done, and we present some experimental results.

Journal ArticleDOI
TL;DR: An almost-linear algorithm is presented for determining exactly the same set of flow graph nodes, the so-called φ-nodes, previously found by computing the iterated dominance frontiers of the initial set of nodes.
Abstract: Recently, Static Single-Assignment Form and Sparse Evaluation Graphs have been advanced for the efficient solution of program optimization problems. Each method is provided with an initial set of flow graph nodes that inherently affect a problem's solution. Other relevant nodes are those where potentially disparate solutions must combine. Previously, these so-called φ-nodes were found by computing the iterated dominance frontiers of the initial set of nodes, a process that could take worst-case quadratic time with respect to the input flow graph. In this article we present an almost-linear algorithm for determining exactly the same set of φ-nodes.
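
For reference, the set in question can be defined by the classical worklist computation over dominance frontiers sketched below (the quadratic baseline; the article computes exactly the same φ-node set with an almost-linear algorithm). The small dominance-frontier map is illustrative only.

def iterated_df(df, initial_nodes):
    """Return DF+(initial_nodes): the nodes needing phi-nodes for a variable
    assigned in `initial_nodes`, given a dominance-frontier map `df`."""
    phi_nodes = set()
    worklist = list(initial_nodes)
    on_list = set(initial_nodes)
    while worklist:
        n = worklist.pop()
        for m in df.get(n, set()):
            if m not in phi_nodes:
                phi_nodes.add(m)
                if m not in on_list:          # a phi-node is itself a new definition
                    on_list.add(m)
                    worklist.append(m)
    return phi_nodes

# Dominance frontiers of a small CFG: an if-then-else (nodes 2, 3) joining at
# node 4 inside a loop headed by node 1; the values are illustrative.
df = {2: {4}, 3: {4}, 4: {1}, 1: {1}}
print(iterated_df(df, {2}))   # -> {1, 4}: joins needing phi-nodes for x assigned in node 2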

Journal ArticleDOI
TL;DR: This communication sets the problem of incremental parsing in the context of a complete incremental compiling system and finds that, according to the incrementality paradigm of the attribute evaluator and data-flow analyzer to be used, two definitions of optimal incrementality in a parser are possible.
Abstract: This communication sets the problem of incremental parsing in the context of a complete incremental compiling system. It turns out that, according to the incrementality paradigm of the attribute evaluator and data-flow analyzer to be used, two definitions of optimal incrementality in a parser are possible. Algorithms for achieving both forms of optimality are given, both of them based on ordinary LALR(1) parse tables. Optimality and correctness proofs, which are merely outlined in this communication, are made intuitive thanks to the concept of a well-formed list of threaded trees, a natural extension of the concept of threaded tree found in earlier works on incremental parsing.

Journal ArticleDOI
TL;DR: The article presents the results of experiments on a version of the LALR(1)-based parser generator Bison to which the algorithm was added and shows how the method can be integrated with lookahead to avoid finding repairs that immediately result in further syntax errors.
Abstract: Local error repair of strings during CFG parsing requires the insertion and deletion of symbols in the region of a syntax error to produce a string that is error free. Rather than precalculating tables at parser generation time to assist in finding such repairs, this article shows how such repairs can be found during shift-reduce parsing by using the parsing tables themselves. This results in a substantial space saving over methods that require precalculated tables. Furthermore, the article shows how the method can be integrated with lookahead to avoid finding repairs that immediately result in further syntax errors. The article presents the results of experiments on a version of the LALR(1)-based parser generator Bison to which the algorithm was added.

Journal ArticleDOI
TL;DR: A specification language for nondeterministic operators and a multialgebraic semantics are defined, the first complete reasoning system for such specifications is introduced, and the calculus is shown to be sound and complete also with respect to the new semantics.
Abstract: The current algebraic models for nondeterminism focus on the notion of possibility rather than necessity and consequently equate (nondeterministic) terms that one would intuitively not consider equal. Furthermore, existing models for nondeterminism depart radically from the standard models for (equational) specifications of deterministic operators. One would prefer that a specification language for nondeterministic operators be based on an extension of the standard model concepts, preferably in such a way that the reasoning system for (possibly nondeterministic) operators becomes the standard equational one whenever restricted to the deterministic operators—the objective should be to minimize the departure from the standard frameworks. In this article we define a specification language for nondeterministic operators and multialgebraic semantics. The first complete reasoning system for such specifications is introduced. We also define a transformation of specifications of nondeterministic operators into derived specifications of deterministic ones, obtaining a “computational” semantics of nondeterministic specification by adopting the standard semantics of the derived specification as the semantics of the original one. This semantics turns out to be a refinement of multialgebra semantics. The calculus is shown to be sound and complete also with respect to the new semantics.

Journal ArticleDOI
TL;DR: Using the k-tuple representation, the general results of standard data flow frameworks concerning convergence time and solution precision become accessible for multisource problems, and results on function space properties for join-of-meets frameworks indicate that precise solutions for most of them will be difficult to obtain.
Abstract: Multisource data flow problems involve information which may enter nodes independently through different classes of edges. In some cases, dissimilar meet operations appear to be used for different types of nodes. These problems include bidirectional and flow-sensitive problems as well as many static analyses of concurrent programs with synchronization. K-tuple frameworks, a type of standard data flow framework, provide a natural encoding for multisource problems using a single meet operator. Previously, the solution of these problems has been described as the fixed point of a set of data flow equations. Using our k-tuple representation, we can access the general results of standard data flow frameworks concerning convergence time and solution precision for these problems. We demonstrate this for the bidirectional component of partial redundancy suppression and two problems on the program summary graph. An interesting subclass of k-tuple frameworks, the join-of-meets frameworks, is useful for reachability problems, especially those stemming from analyses of explicitly parallel programs. We give results on function space properties for join-of-meets frameworks that indicate precise solutions for most of them will be difficult to obtain.

Journal ArticleDOI
TL;DR: The issue of precisely evaluating cross-interferences in blocked loops with blocked matrix-vector multiply is illustrated and it is shown that a precise rather than an approximate evaluation of cache conflicts is sometimes necessary to obtain near-optimal performance.
Abstract: State-of-the-art data locality optimizing algorithms are targeted for local memories rather than for cache memories. Recent work on cache interferences seems to indicate that these phenomena can severely affect the cache performance of blocked algorithms. Because of cache conflicts, it is not possible to know the precise gain brought by blocking. It is even difficult to determine for which problem sizes blocking is useful. Computing the actual optimal block size is difficult because cache conflicts are highly irregular. In this article, we illustrate the issue of precisely evaluating cross-interferences in blocked loops with blocked matrix-vector multiply. Most significant interference phenomena are captured because unusual parameters such as array base addresses are being considered. The techniques used allow us to compute the precise improvement due to blocking and the threshold value of problem parameters for which the blocked loop should be preferred. It is also possible to derive an expression of the optimal block size as a function of problem parameters. Finally, it is shown that a precise rather than an approximate evaluation of cache conflicts is sometimes necessary to obtain near-optimal performance.
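
For concreteness, this is the general shape of the blocked loop under discussion (a generic blocked matrix-vector multiply, not the article's experimental code); B is the block size whose profitability and optimal value are what the cache-interference analysis above is meant to determine.

def blocked_matvec(A, x, B):
    n, m = len(A), len(A[0])
    y = [0.0] * n
    for jj in range(0, m, B):            # block over the vector x
        for i in range(n):               # reuse the x[jj:jj+B] block across all rows
            s = y[i]
            for j in range(jj, min(jj + B, m)):
                s += A[i][j] * x[j]
            y[i] = s
    return y

A = [[1, 2, 3], [4, 5, 6]]
x = [1, 1, 1]
print(blocked_matvec(A, x, B=2))   # -> [6.0, 15.0]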

Journal ArticleDOI
TL;DR: The proposed ee-analysis method addresses the primary aspect of the strictness analysis problem, namely nontermination of evaluation, and can yield simplicity both in formulation and computational algorithms, and leads to a simple and natural treatment of polymorphism.
Abstract: Strictness analysis is a well-known technique used in compilers for optimization of sequential and parallel evaluation of lazy functional programming languages. Ever since Mycroft's pioneering research in strictness analysis, there have been substantial research efforts in advancing the basic technique in several directions such as higher-order functions, nonflat domains, etc. While almost all of these methods define strictness based on denotational semantics, in 1990 we proposed an operational method called ee-analysis. Operational methods directly address the primary aspect of the strictness analysis problem, namely nontermination of evaluation, and can yield simplicity both in formulation and computational algorithms. Moreover, the use of operational approach has led us to a simple and natural treatment of polymorphism. While ee-analysis reasoned about normal-form evaluation, herein we extend its power substantially so as to deal with other degrees of evaluation that are intermediate between normal and head-normal form. An interesting aspect of our approach is our formulation of a strictness property as a constraint that relates the demand placed on the output of a function to the demands placed on its arguments. Strictness properties are then computed using symbolic constraint-solving techniques. An important advantage of constraint-driven analysis is that it captures interargument dependencies accurately. Moreover, the analysis performance is relatively insensitive to domain size. Based on our implementation of this method, we show that our analysis techniques are efficient, as well as effective.

Journal ArticleDOI
TL;DR: A heuristic that schedules DAGs and is based on the optimal expression-tree-scheduling algorithm is presented and compared with Goodman and Hsu's algorithm Integrated Prepass Scheduling (IPS), which outperforms IPS and is significantly faster.
Abstract: A fast, optimal code-scheduling algorithm for processors with a delayed load of one instruction cycle is described. The algorithm minimizes both execution time and register use and runs in time proportional to the size of the expression-tree. An extension that spills registers when too few registers are available is also presented. The algorithm also performs very well for delayed loads of greater than one instruction cycle. A heuristic that schedules DAGs and is based on our optimal expression-tree-scheduling algorithm is presented and compared with Goodman and Hsu's algorithm Integrated Prepass Scheduling (IPS). Both schedulers perform well on benchmarks with small basic blocks, but on large basic blocks our scheduler outperforms IPS and is significantly faster.
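
The constraint at stake can be seen in a tiny greedy list-scheduling sketch (this is not the article's optimal expression-tree algorithm, nor IPS): with a one-cycle delayed load, a dependent instruction may not issue in the cycle right after the load, so the scheduler fills the delay slot with other ready work and stalls only when nothing is available. The instruction table below is invented.

# instruction -> (latency, list of instructions it depends on)
instrs = {
    "load_a": (2, []),          # delayed load: result ready 2 cycles after issue
    "load_b": (2, []),
    "add":    (1, ["load_a", "load_b"]),
    "store":  (1, ["add"]),
}

def schedule(instrs):
    ready_at = {}               # instruction -> cycle its result becomes available
    done, order, cycle = set(), [], 0
    while len(done) < len(instrs):
        candidates = [i for i in instrs if i not in done
                      and all(d in done and ready_at[d] <= cycle
                              for d in instrs[i][1])]
        if candidates:
            i = candidates[0]                 # any ready instruction will do in this sketch
            order.append((cycle, i))
            done.add(i)
            ready_at[i] = cycle + instrs[i][0]
        else:
            order.append((cycle, "stall"))    # nothing ready: a delay slot is wasted
        cycle += 1
    return order

for cycle, i in schedule(instrs):
    print(cycle, i)
# load_b is issued in load_a's delay slot, so only one stall remains before add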

Journal ArticleDOI
TL;DR: The proposed enhanced method for optimizing array subscript range checks is, however, unsafe and may generate optimized programs whose behavior differs from that of the original program.
Abstract: Jonathan Asuru recently proposed an enhanced method for optimizing array subscript range checks. The proposed method is, however, unsafe and may generate optimized programs whose behavior differs from that of the original program. Two main flaws in Asuru's method are described, together with suggested remedies and improvements.

Journal ArticleDOI
TL;DR: A systematic comparison was conducted between a hand-coded translator for the Icon programming language and one generated by the Eli compiler construction system, showing that efficient compilers can be generated from specifications that are much smaller and more problem oriented than the equivalent source code.
Abstract: Compilers or language translators can be generated using a variety of formal specification techniques. Whether generation is worthwhile depends on the effort required to specify the translation task and the quality of the generated compiler. A systematic comparison was conducted between a hand-coded translator for the Icon programming language and one generated by the Eli compiler construction system. A direct comparison could be made since the generated program performs the same translation as the hand-coded program. The results of the comparison show that efficient compilers can be generated from specifications that are much smaller and more problem oriented than the equivalent source code. We also found that further work must be done to reduce the dynamic memory usage of the generated compilers.

Journal ArticleDOI
TL;DR: The technique proposed, namely the use of “exactness sets” to study relationships between complexity and precision of analyses, is not specific to logic programming in any way, and is equally applicable to flow analyses of other language families.
Abstract: It is widely held that there is a correlation between complexity and precision in dataflow analysis, in the sense that the more precise an analysis algorithm, the more computationally expensive it must be. The details of this relationship, however, appear to not have been explored extensively. This article reports some results on this correlation in the context of logic programs. A formal notion of the “precision” of an analysis algorithm is proposed, and this is used to characterize the worst-case computational complexity of a number of dataflow analyses with different degrees of precision. While this article considers the analysis of logic programs, the technique proposed, namely the use of “exactness sets” to study relationships between complexity and precision of analyses, is not specific to logic programming in any way, and is equally applicable to flow analyses of other language families.

Journal ArticleDOI
TL;DR: Experimental results on small and real-life problems indicate that semantic backtracking produces significant reduction in memory space, while keeping the time overhead reasonably small.
Abstract: Existing CLP languages support backtracking by generalizing traditional Prolog implementations: modifications to the constraint system are trailed and restored on backtracking. Although simple and efficient, trailing may be very demanding in memory space, since the constraint system may potentially be saved at each choice point. This article proposes a new implementation scheme for backtracking in CLP languages over linear (rational or real) arithmetic. The new scheme, called semantic backtracking, does not use trailing but rather exploits the semantics of the constraints to undo the effect of newly added constraints. Semantic backtracking reduces the space complexity compared to implementations based on trailing by making it essentially independent of the number of choice points. In addition, semantic backtracking introduces negligible space and time overhead on deterministic programs. The price for this improvement is an increase in backtracking time, although constraint-solving time may actually decrease. The scheme has been implemented as part of a complete CLP system CLP(ℜLin) and compared analytically and experimentally with optimized trailing implementations. Experimental results on small and real-life problems indicate that semantic backtracking produces significant reduction in memory space, while keeping the time overhead reasonably small.