
Showing papers on "Program transformation published in 2000"


Proceedings ArticleDOI
15 Oct 2000
TL;DR: This paper addresses the problem of how to select tile sizes and unroll factors simultaneously by means of iterative compilation and shows how to quantitatively trade-off the number of profiles needed and the level of optimization that can be reached.
Abstract: Loop tiling and unrolling are two important program transformations to exploit locality and expose instruction level parallelism, respectively. In this paper, we address the problem of how to select tile sizes and unroll factors simultaneously. We approach this problem in an architecturally adaptive manner by means of iterative compilation, where we generate many versions of a program and decide upon the best by actually executing them and measuring their execution time. We evaluate several iterative strategies. We compare the levels of optimization obtained by iterative compilation to several well-known static techniques and show that we outperform each of them on a range of benchmarks across a variety of architectures. Finally, we show how to quantitatively trade-off the number of profiles needed and the level of optimization that can be reached.
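To make the search concrete, here is a minimal Python sketch of iterative compilation (an illustration, not the paper's system; the make-based build with TILE and UNROLL parameters and the ./benchmark binary are hypothetical stand-ins):

```python
import itertools
import subprocess
import time

def compile_and_run(tile_size: int, unroll_factor: int) -> float:
    """Hypothetical stand-in: rebuild the benchmark with the given
    transformation parameters, execute it, and return its runtime."""
    subprocess.run(["make", f"TILE={tile_size}", f"UNROLL={unroll_factor}"],
                   check=True)
    start = time.perf_counter()
    subprocess.run(["./benchmark"], check=True)
    return time.perf_counter() - start

def iterative_search(tile_sizes, unroll_factors):
    """Profile every (tile, unroll) version and keep the fastest.
    The paper's iterative strategies prune this grid so that far
    fewer profiles are needed."""
    best = None
    for tile, unroll in itertools.product(tile_sizes, unroll_factors):
        runtime = compile_and_run(tile, unroll)
        if best is None or runtime < best[0]:
            best = (runtime, tile, unroll)
    return best

# e.g. iterative_search([16, 32, 64, 128], [1, 2, 4, 8])
```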

196 citations


Journal ArticleDOI
01 Aug 2000
TL;DR: The goal of this paper is to study, from a theoretical point of view, several variants of the loop fusion problem -- identifying polynomially solvable cases and NP-complete cases -- and to make the link between these problems and some scheduling problems that arise from completely different areas.
Abstract: Loop fusion is a program transformation that combines several loops into one. It is used in parallelizing compilers mainly for increasing the granularity of loops and for improving data reuse. The goal of this paper is to study, from a theoretical point of view, several variants of the loop fusion problem – identifying polynomially solvable cases and NP-complete cases – and to make the link between these problems and some scheduling problems that arise from completely different areas. We study, among others, the fusion of loops of different types, and the fusion of loops when combined with loop shifting.
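A minimal before/after illustration of the transformation itself (a hand-made example, not from the paper):

```python
# Before fusion: two loops, two passes over the data.
def separate(a):
    n = len(a)
    b, c = [0] * n, [0] * n
    for i in range(n):
        b[i] = a[i] * 2          # loop 1 writes b
    for i in range(n):
        c[i] = b[i] + 1          # loop 2 reads b
    return c

# After fusion: one loop, so b[i] is still in cache when read.
# Legal here because iteration i of loop 2 depends only on
# iteration i of loop 1; otherwise loop shifting may be needed.
def fused(a):
    n = len(a)
    b, c = [0] * n, [0] * n
    for i in range(n):
        b[i] = a[i] * 2
        c[i] = b[i] + 1
    return c
```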

129 citations


Book ChapterDOI
29 Jun 2000
TL;DR: This work lifts standard deterministic and nondeterministic semantics of imperative programs to probabilistic semantics, which allows for random external inputs of known or unknown probability and random number generators.
Abstract: Following earlier models, we lift standard deterministic and nondeterministic semantics of imperative programs to probabilistic semantics. This semantics allows for random external inputs of known or unknown probability and random number generators.

122 citations


Book ChapterDOI
15 Jul 2000
TL;DR: An algorithm that constructs a finite state “abstract” program from a given, possibly infinite state, “concrete’ program by means of a syntactic program transformation, which generalizes several known algorithms for analyzing syntactically restricted, data-insensitive programs.
Abstract: We present an algorithm that constructs a finite state “abstract” program from a given, possibly infinite state, “concrete” program by means of a syntactic program transformation. Starting with an initial set of predicates from a specification, the algorithm iteratively computes the predicates required for the abstraction relative to that specification. These predicates are represented by boolean variables in the abstract program. We show that the method is sound, in that the abstract program is always guaranteed to simulate the original. We also show that the method is complete, in that, if the concrete program has a finite abstraction with respect to simulation (bisimulation) equivalence, the algorithm can produce a finite simulation-equivalent (bisimulation-equivalent) abstract program. Syntactic abstraction has two key advantages: it can be applied to infinite state programs or programs with large data paths, and it permits the effective application of other reduction methods for model checking. We show that our method generalizes several known algorithms for analyzing syntactically restricted, data-insensitive programs.

113 citations


Journal Article
TL;DR: In this paper, the authors present an algorithm that constructs a finite state abstract program from a given, possibly infinite state, concrete program by means of a syntactic program transformation, starting with an initial set of predicates from a specification, iteratively computes the predicates required for the abstraction relative to that specification.
Abstract: We present an algorithm that constructs a finite state abstract program from a given, possibly infinite state, concrete program by means of a syntactic program transformation. Starting with an initial set of predicates from a specification, the algorithm iteratively computes the predicates required for the abstraction relative to that specification. These predicates are represented by boolean variables in the abstract program. We show that the method is sound, in that the abstract program is always guaranteed to simulate the original. We also show that the method is complete, in that, if the concrete program has a finite abstraction with respect to simulation (bisimulation) equivalence, the algorithm can produce a finite simulation-equivalent (bisimulation-equivalent) abstract program. Syntactic abstraction has two key advantages: it can be applied to infinite state programs or programs with large data paths, and it permits the effective application of other reduction methods for model checking. We show that our method generalizes several known algorithms for analyzing syntactically restricted, data-insensitive programs.
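A toy sketch of the predicate-abstraction idea (with hand-chosen predicates for a two-line loop, rather than the paper's iteratively computed ones):

```python
# Concrete, potentially infinite-state program over an integer x:
#     while x > 0: x = x - 2
# Abstract state: (p1, p2) with p1 = (x > 0), p2 = (x is even).

def abstract_step(p1, p2):
    """Successors of (p1, p2) under one loop iteration. Parity is
    preserved exactly; whether x stays positive is not determined
    by the predicates, so the abstraction includes both outcomes."""
    if not p1:
        return {(p1, p2)}            # loop exited; state unchanged
    return {(True, p2), (False, p2)}

def reachable(initial):
    """The reachable abstract states form a finite graph that
    simulates the concrete program and can be model-checked."""
    seen, frontier = set(), {initial}
    while frontier:
        s = frontier.pop()
        if s not in seen:
            seen.add(s)
            frontier |= abstract_step(*s) - seen
    return seen

print(reachable((True, True)))       # a finite abstraction of the loop
```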

112 citations


Proceedings ArticleDOI
05 Jan 2000
TL;DR: An automatic method to enforce trace properties on programs that integrates static analyses in order to avoid useless transformations and never rejects programs but adds dynamic checks when necessary.
Abstract: We propose an automatic method to enforce trace properties on programs. The programmer specifies the property separately from the program; a program transformer takes the program and the property and automatically produces another “equivalent” program satisfying the property. This separation of concerns makes the program easier to develop and maintain. Our approach is both static and dynamic. It integrates static analyses in order to avoid useless transformations. On the other hand, it never rejects programs but adds dynamic checks when necessary. An important challenge is to make this dynamic enforcement as inexpensive as possible. The most obvious application domain is the enforcement of security policies. In particular, a potential use of the method is securing mobile code upon receipt.
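The dynamic side of the approach can be pictured as an inserted finite-state monitor (an illustration, not the paper's transformer; the property and the monitor.step calls are invented for the example):

```python
class SecurityError(Exception):
    pass

class Monitor:
    """Finite-state monitor for the hypothetical trace property
    'no send() after reading a secret file'."""
    def __init__(self):
        self.read_secret = False

    def step(self, event):
        if event == "read_secret":
            self.read_secret = True
        elif event == "send" and self.read_secret:
            raise SecurityError("trace property violated: send after secret read")

monitor = Monitor()

def read_file(path, secret=False):
    if secret:
        monitor.step("read_secret")   # check inserted by the transformer
    return "contents of " + path

def send(data):
    # Check inserted by the transformer; omitted wherever static
    # analysis proves the property cannot be violated here.
    monitor.step("send")
    return len(data)
```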

108 citations


Book ChapterDOI
TL;DR: This paper proposes a Java bytecode transformation algorithm for realizing transparent thread migration in a portable and efficient manner that needs neither extended virtual machines nor the source code of target programs.
Abstract: This paper proposes a Java bytecode transformation algorithm for realizing transparent thread migration in a portable and efficient manner. In contrast to previous studies, our approach needs neither extended virtual machines nor the source code of target programs. The whole state of the stack frames is saved, and then restored at a remote site. To accomplish this goal, a type system for Java bytecode is used to correctly determine the valid frame variables and valid entries in the operand stack. Based on the type information, a target program is transformed into a form that can perform transparent thread migration. We have also measured the execution efficiency of transformed programs and the growth in bytecode size, and obtained better results than previous studies.
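A rough source-level picture of the transformation's effect, sketched in Python rather than Java bytecode (the Capture protocol here is invented for illustration):

```python
class Capture(Exception):
    """Raised to request migration; collects saved frame states."""
    def __init__(self):
        super().__init__()
        self.frames = []

def step(i):
    return i * i     # application work; may raise Capture to migrate

def worker(n, saved=None):
    # Re-entry point: restore frame variables saved at capture time.
    i, acc = saved if saved is not None else (0, 0)
    while i < n:
        try:
            acc += step(i)
        except Capture as c:
            c.frames.append(("worker", (i, acc)))  # save live frame state
            raise                                  # unwind the whole stack
        i += 1
    return acc

# The remote site resumes with worker(n, saved=frame_state); the type
# system in the real algorithm determines which frame slots are live.
```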

106 citations


Proceedings ArticleDOI
05 Jan 2000
TL;DR: This paper is believed to be the first to provide an algorithm for semantics-preserving procedure extraction given an arbitrary set of selected statements in an arbitrary control-flow graph.
Abstract: Procedure extraction is an important program transformation that can be used to make programs easier to understand and maintain, to facilitate code reuse, and to convert “monolithic” code to modular or object-oriented code. Procedure extraction involves the following steps: (1) the statements to be extracted are identified (by the programmer or by a programming tool); (2) if the statements are not contiguous, they are moved together so that they form a sequence that can be extracted into a procedure, and so that the semantics of the original code is preserved; (3) the statements are extracted into a new procedure and replaced with an appropriate call. This paper addresses step 2: in particular, the conditions under which it is possible to move a set of selected statements together so that they become “extractable” while preserving semantics. Since semantic equivalence is, in general, undecidable, we identify sufficient conditions based on control and data dependences, and define an algorithm that moves the selected statements together when the conditions hold. We also include an outline of a proof that our algorithm is semantics-preserving. While there has been considerable previous work on procedure extraction, we believe that this is the first paper to provide an algorithm for semantics-preserving procedure extraction given an arbitrary set of selected statements in an arbitrary control-flow graph.
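A small before/after picture of steps 2 and 3 (a hand-made example; open_log stands in for an arbitrary unrelated statement):

```python
def open_log():
    return open("/dev/null", "w")    # stand-in for unrelated work

# Before: the two selected statements are not contiguous. Moving them
# together is legal because open_log() neither reads nor writes
# total or count, so no control or data dependence is violated.
def before(items):
    total = 0              # selected
    log = open_log()       # unrelated statement in between
    count = len(items)     # selected
    return total, count, log

# After step 2 (selected statements moved together) and
# step 3 (extraction into a procedure plus a call):
def init_stats(items):
    total = 0
    count = len(items)
    return total, count

def after(items):
    total, count = init_stats(items)
    log = open_log()
    return total, count, log
```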

105 citations


Book ChapterDOI
20 Sep 2000
TL;DR: In this paper, the authors describe a technique for producing optimizing compilers for DSELs, based on Kamin's idea of DSELs for program generation, which uses a data type of syntax for basic types, a set of smart constructors that perform rewriting over those types, some code motion transformations, and a back-end code generator.
Abstract: Functional languages are particularly well-suited to the implementation of interpreters for domain-specific embedded languages (DSELs). We describe an implemented technique for producing optimizing compilers for DSELs, based on Kamin's idea of DSELs for program generation. The technique uses a data type of syntax for basic types, a set of smart constructors that perform rewriting over those types, some code motion transformations, and a back-end code generator. Domain-specific optimization results from chains of rewrites on basic types. New DSELs are defined directly in terms of the basic syntactic types, plus host language functions and tuples. This definition style makes compilers easy to write and, in fact, almost identical to the simplest embedded interpreters. We illustrate this technique with a language Pan for the computationally intensive domain of image synthesis and manipulation.
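The smart-constructor style can be given a toy flavor in Python (the paper's system is in Haskell and far richer):

```python
from dataclasses import dataclass

class Expr: pass

@dataclass(frozen=True)
class Lit(Expr):
    value: float

@dataclass(frozen=True)
class Add(Expr):
    left: Expr
    right: Expr

def lit(v):
    return Lit(v)

def add(a, b):
    """Smart constructor: rewrite while building syntax, so
    optimization falls out of chains of local rewrites."""
    if isinstance(a, Lit) and isinstance(b, Lit):
        return Lit(a.value + b.value)        # constant folding
    if isinstance(a, Lit) and a.value == 0:
        return b                             # 0 + e  ==>  e
    if isinstance(b, Lit) and b.value == 0:
        return a                             # e + 0  ==>  e
    return Add(a, b)

assert add(lit(1), add(lit(2), lit(0))) == Lit(3.0)
```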

87 citations


Journal ArticleDOI
TL;DR: Graph rewrite systems can be used to specify and generate program optimizations; parts of the lazy code motion optimization are specified as an illustration, and the resulting graph rewrite system classes form the basis for the optimizer generator OPTIMIX.
Abstract: Graph rewrite systems can be used to specify and generate program optimizations. For termination of the systems several rule-based criteria are developed, defining exhaustive graph rewrite systems. For nondeterministic systems stratification is introduced which automatically selects single normal forms. To illustrate how far the methodology reaches, parts of the lazy code motion optimization are specified. The resulting graph rewrite system classes can be evaluated by a uniform algorithm, which forms the basis for the optimizer generator OPTIMIX. With this tool several optimizer components have been generated, and some numbers on their speed are presented.
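As a tiny illustration of a rewrite rule driven to a normal form (a toy strength-reduction rule on expression trees, not the paper's lazy code motion on graphs):

```python
def rewrite(node):
    """Apply 'e * 2  ==>  e << 1' bottom-up until no redex remains.
    The system is exhaustive and terminating: each application
    strictly decreases the number of multiplications."""
    if node[0] in ("const", "var"):
        return node
    op, *children = node
    children = [rewrite(c) for c in children]
    if op == "*" and children[1] == ("const", 2):
        return ("<<", children[0], ("const", 1))
    return (op, *children)

assert rewrite(("*", ("var", "x"), ("const", 2))) == \
       ("<<", ("var", "x"), ("const", 1))
```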

73 citations


Proceedings ArticleDOI
21 Aug 2000
TL;DR: A novel framework for dynamic program optimization, ADAPT (Automated De-coupled Adaptive Program Transformation), that builds on the strengths of existing approaches and presents a compilation system, based on the Polaris optimizing compiler, that automatically applies this framework to general "plugged-in" optimization techniques.
Abstract: Dynamic program optimization offers performance improvements far beyond those possible with traditional compile-time optimization. These gains are due to the ability to exploit both architectural and input data set characteristics that are unknown prior to execution time. In this paper, we propose a novel framework for dynamic program optimization, ADAPT (Automated De-coupled Adaptive Program Transformation), that builds on the strengths of existing approaches. The key to our framework is the de-coupling of the dynamic compilation of new code variants from the dynamic selection of these variants at their points of use. This allows code generation to occur concurrently with program execution, removing dynamic compilation overheads from the critical path. We present a compilation system, based on the Polaris optimizing compiler, that automatically applies this framework to general "plugged-in" optimization techniques. We evaluate our system on three programs from the SPEC floating point benchmark suite by dynamically applying loop distribution, loop unrolling, loop tiling and automatic parallelization. We show that our techniques can improve performance by as much as 70% over statically optimized code.

Book ChapterDOI
29 Jun 2000
TL;DR: A static cost-benefit analysis is described that allows for the discovery of low-level code specialization based on value and expression profiles within a link-time code optimizer, and results are given to validate the approach.
Abstract: It is often the case at runtime that variables and registers in programs are “quasi-invariant,” i.e., the distribution of the values they take on is very skewed, with a small number of values occurring most of the time. Knowledge of such frequently occurring values can be exploited by a compiler to generate code that optimizes for the common cases without sacrificing the ability to handle the general case. The idea can be generalized to the notion of expression profiles, which profile the runtime values of arbitrary expressions and can permit optimizations that may not be possible using simple value profiles. Since this involves the introduction of runtime tests, a careful cost-benefit analysis is necessary to make sure that the benefits from executing the code specialized for the common values outweigh the cost of testing for these values. This paper describes a static cost-benefit analysis that allows us to discover when such specialization is profitable. Experimental results, using such an analysis and an implementation of low-level code specialization based on value and expression profiles within a link-time code optimizer, are given to validate our approach.
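The shape of the specialized output, and the break-even condition the static analysis must establish, might look as follows (an illustrative sketch):

```python
# General code:
def scale(x, factor):
    return x * factor

# Specialized output when value profiling shows factor == 1 almost
# always. Specialization pays off only when, with p the profiled
# frequency of the common value,
#     p * cost_fast + (1 - p) * (cost_test + cost_general)
# is below cost_general; the paper's analysis makes this comparison
# statically.
def scale_specialized(x, factor):
    if factor == 1:          # runtime test inserted by the optimizer
        return x             # fast path for the common value
    return x * factor        # general case preserved
```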

Book ChapterDOI
25 Mar 2000
TL;DR: The problem of verifying parameterized systems can be reduced to the problem of determining the equivalence of goals in a logic program, and a seamless integration of algorithmic and deductive verification at fine levels of granularity is provided.
Abstract: We show how the problem of verifying parameterized systems can be reduced to the problem of determining the equivalence of goals in a logic program. We further show how goal equivalences can be established using induction-based proofs. Such proofs rely on a powerful new theory of logic program transformations (encompassing unfold, fold and goal replacement over multiple recursive clauses), can be highly automated, and are applicable to a variety of network topologies, including uni- and bi-directional chains, rings, and trees of processes. Unfold transformations in our system correspond to algorithmic model-checking steps, fold and goal replacement correspond to deductive steps, and all three types of transformations can be arbitrarily interleaved within a proof. Our framework thus provides a seamless integration of algorithmic and deductive verification at fine levels of granularity.

Book ChapterDOI
06 Nov 2000
TL;DR: It is demonstrated that the class of functions computed by first order functional programs over lists which terminate by multiset path ordering and admit a polynomial quasi-interpretation is exactly the class of functions computable in polynomial time.
Abstract: We demonstrate that the class of functions computed by first order functional programs over lists which terminate by multiset path ordering and admit a polynomial quasi-interpretation is exactly the class of functions computable in polynomial time. The interest of this result lies in (i) the simplicity of the conditions on programs to certify their complexity, (ii) the fact that an important class of natural programs is captured, and (iii) potential applications for program optimisation.
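For a flavor of such a certificate, here is a quasi-interpretation one might assign to list append (a standard textbook-style example, not taken from the paper): each symbol receives a monotone polynomial bounding the size of its result.

```latex
\lceil \mathit{nil} \rceil = 0, \qquad
\lceil \mathit{cons} \rceil(x, \ell) = x + \ell + 1, \qquad
\lceil \mathit{append} \rceil(\ell_1, \ell_2) = \ell_1 + \ell_2
```

For the rule append(cons(x, xs), ys) → cons(x, append(xs, ys)) both sides evaluate to x + xs + ys + 1, so the interpretation never increases along a rewrite; together with termination by multiset path ordering, this is the kind of evidence that certifies polynomial-time computability.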

Proceedings ArticleDOI
01 Sep 2000
TL;DR: An optimization theorem is presented, a calculational strategy for applying the theorem is given, and the effectiveness of the approach is demonstrated through several nontrivial examples which would be difficult to deal with when using the methods previously available.
Abstract: In this paper we propose a new method for deriving a practical linear-time algorithm from the specification of a maximum weight-sum problem: From the elements of a data structure x, find a subset which satisfies a certain property p and whose weight sum is maximum. Previously proposed methods for automatically generating linear-time algorithms are theoretically appealing, but the algorithms generated are hardly useful in practice due to a huge constant factor for space and time. The key points of our approach are to express the property p by a recursive boolean function over the structure x rather than a usual logical predicate and to apply program transformation techniques to reduce the constant factor. We present an optimization theorem, give a calculational strategy for applying the theorem, and demonstrate the effectiveness of our approach through several nontrivial examples which would be difficult to deal with when using the methods previously available.
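For reference, the best-known instance of the problem is maximum segment sum, and the derived linear-time programs have this tupled shape (hand-written here; the paper obtains such programs calculationally from the specification):

```python
def max_segment_sum(xs):
    """Maximum weight sum over contiguous segments, in one pass.
    Instead of enumerating all segments, carry just enough tupled
    state: the best segment ending here and the best seen so far."""
    best_ending_here = 0
    best = 0
    for x in xs:
        best_ending_here = max(0, best_ending_here + x)
        best = max(best, best_ending_here)
    return best

assert max_segment_sum([3, -4, 2, 5, -1, 3]) == 9   # segment [2, 5, -1, 3]
```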

Book ChapterDOI
TL;DR: The Ciao module system, as presented in this paper, is an improved module system for Prolog designed to allow separate compilation, extensibility in features and in syntax, amenability to modular global analysis and transformation, enhanced error detection, support for meta-programming and higher-order programming, and compatibility to the extent possible with official and de-facto standards.
Abstract: It is now widely accepted that separating programs into modules is useful in program development and maintenance. While many Prolog implementations include useful module systems, we argue that these systems can be improved in a number of ways, such as, for example, being more amenable to effective global analysis and transformation and allowing separate compilation or sensible creation of standalone executables. We discuss a number of issues related to the design of such an improved module system for Prolog and propose some novel solutions. Based on this, we present the choices made in the Ciao module system, which has been designed to meet a number of objectives: allowing separate compilation, extensibility in features and in syntax, amenability to modular global analysis and transformation, enhanced error detection, support for meta-programming and higher-order, compatibility to the extent possible with official and de-facto standards, etc.

Journal ArticleDOI
TL;DR: The experiments on SGI Origin 2000 show that the MPI prototype called TMPI using the proposed techniques is competitive with SGI's native MPI implementation in a dedicated environment, and that it has significant performance advantages in a multiprogrammed environment.
Abstract: Parallel programs written in MPI have been widely used for developing high-performance applications on various platforms. Because of a restriction of the MPI computation model, conventional MPI implementations on shared-memory machines map each MPI node to an OS process, which can suffer serious performance degradation in the presence of multiprogramming. This paper studies compile-time and runtime techniques for enhancing performance portability of MPI code running on multiprogrammed shared-memory machines. The proposed techniques allow MPI nodes to be executed safely and efficiently as threads. Compile-time transformation eliminates global and static variables in C code using node-specific data. The runtime support includes an efficient and provably correct communication protocol that uses lock-free data structures and takes advantage of address space sharing among threads. The experiments on SGI Origin 2000 show that our MPI prototype called TMPI using the proposed techniques is competitive with SGI's native MPI implementation in a dedicated environment, and that it has significant performance advantages in a multiprogrammed environment.
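The compile-time elimination of globals can be pictured with thread-local storage standing in for node-specific data (a Python sketch of the kind of rewriting TMPI performs on C code):

```python
import threading

# Before (C-style): a global shared by all MPI nodes, unsafe once
# nodes run as threads in a single address space:
#     int counter = 0;
#     void tick() { counter++; }

# After: each MPI node, now a thread, gets its own copy.
node_data = threading.local()

def init_node():
    node_data.counter = 0     # was the global 'counter'

def tick():
    node_data.counter += 1    # every access rewritten accordingly
```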

Journal ArticleDOI
TL;DR: This work formalizes defunctionalization denotationally for a typed functional language, and proves that it preserves the meaning of any terminating program, using logical relations.
Abstract: Defunctionalization was introduced by John Reynolds in his 1972 article “Definitional Interpreters for Higher-Order Programming Languages”. Defunctionalization transforms a higher-order program into a first-order one, representing functional values as data structures. Since then it has been used quite widely, but we observe that it has never been proven correct. We formalize defunctionalization denotationally for a typed functional language, and we prove that it preserves the meaning of any terminating program. Our proof uses logical relations.
Keywords: defunctionalization, program transformation, denotational semantics, logical relations
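The transformation's classic shape, sketched in Python (the paper's formalization is denotational and for a typed functional language):

```python
from dataclasses import dataclass

# Higher-order original:
#     def add_n(n): return lambda x: x + n
#     def apply_twice(f, x): return f(f(x))

# Defunctionalized: each lambda becomes a first-order constructor
# recording its free variables, and calls go through one dispatcher.
@dataclass
class AddN:                  # represents 'lambda x: x + n'
    n: int

def apply(closure, x):
    if isinstance(closure, AddN):
        return x + closure.n
    raise ValueError("unknown function representation")

def apply_twice(f, x):
    return apply(f, apply(f, x))

assert apply_twice(AddN(3), 10) == 16
```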

01 Jan 2000
TL;DR: This paper discusses the design of a generic framework for expressing aspects as static source-to-source program transformations, together with a generic weaver, and sketches how to describe and implement an aspect dealing with program robustness and exceptions.
Abstract: What exactly are aspects? How to weave? What are the join points used to anchor aspects into the component program? Is there a general purpose aspect language? We address these questions for a particular but quite general class of aspects: aspects which can be described as static, source-to-source program transformations. We discuss the design of a generic framework to express aspects as syntactic transformations as well as a generic weaver. We also consider how to use semantic properties for the definition of aspects and how to implement these properties using static analysis techniques. As an application of the framework, we sketch how to describe and implement an aspect dealing with program robustness and exceptions.

Journal ArticleDOI
01 Dec 2000
TL;DR: This paper gives an overview of a general and systematic approach to incrementalization: given a program f and an operation ⊕, the approach yields an incremental program that computes f(x ⊕ y) efficiently by using the result of f(x), the intermediate results of f(x), and auxiliary information of f(x) that can be inexpensively maintained.
Abstract: Incremental computation takes advantage of repeated computations on inputs that differ slightly from one another, computing each output efficiently by exploiting the previous output. This paper gives an overview of a general and systematic approach to incrementalization: given a program f and an operation ⊕, the approach yields an incremental program that computes f(x ⊕ y) efficiently by using the result of f(x), the intermediate results of f(x), and auxiliary information of f(x) that can be inexpensively maintained. Since every non-trivial computation proceeds by iteration or recursion, the approach can be used for achieving efficient computation by computing each iteration incrementally using an appropriate incremental program. This approach has applications in interactive systems, optimizing compilers, transformational programming, and many other areas, where problems were previously solved in less general and systematic ways. This paper also describes the design and implementation of CACHET, a prototype system for incrementalization.
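A minimal instance of the scheme (illustrative; here f is the average of a list and x ⊕ y appends an element):

```python
def f(xs):
    """Recomputing from scratch costs O(|xs|)."""
    return sum(xs) / len(xs)

def f_incremental(prev_sum, prev_count, y):
    """Compute f(x ⊕ y) in O(1) from the result of f(x) via the
    auxiliary information (running sum and count) maintained for x."""
    new_sum, new_count = prev_sum + y, prev_count + 1
    return new_sum / new_count, new_sum, new_count

avg, s, n = 2.0, 6, 3                  # f([1, 2, 3]) plus auxiliary info
avg, s, n = f_incremental(s, n, 5)     # append 5
assert avg == f([1, 2, 3, 5])
```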

Journal ArticleDOI
TL;DR: It is noticed that lambda-dropping a program corresponds to transforming it into the functional representation of its optimal SSA form, an observation which led the authors to substantially improve their PEPM’97 presentation of lambda-dropping.

Book ChapterDOI
09 Oct 2000
TL;DR: It is shown that previous approaches to program transformation are not sufficient for large object-oriented systems, and two base transformations that fill the gap are outlined.
Abstract: Software evolution demands continuous adaptation of software systems to continuously changing requirements. Our goal is to cope with software evolution by automating program transformation and system reconfiguration. We show that this can be achieved with a static metaprogramming facility and a library of suitable metaprograms. We show that previous approaches to program transformation are not sufficient for large object-oriented systems and outline two base transformations that fill the gap.

Book ChapterDOI
12 Jun 2000
TL;DR: In this article, the authors present a framework that allows a compiler to relax the dependence constraints between potentially excepting instructions (PEIs) and writes into variables to eliminate spurious dependence constraints.
Abstract: The support for precise exceptions in Java, combined with frequent checks for runtime exceptions, leads to severe limitations on the compiler's ability to perform program optimizations that involve reordering of instructions. This paper presents a novel framework that allows a compiler to relax these constraints. We first present an algorithm using dynamic analysis, and a variant using static analysis, to identify the subset of program state that need not be preserved if an exception is thrown. This allows many spurious dependence constraints between potentially excepting instructions (PEIs) and writes into variables to be eliminated. Our dynamic algorithm is particularly suitable for dynamically dispatched methods in object-oriented languages, where static analysis may be quite conservative. We then present the first software-only solution that allows dependence constraints among PEIs to be completely ignored while applying program optimizations, with no need to execute any additional instructions if an exception is not thrown. With a preliminary implementation, we show that for many benchmark programs, a large percentage of methods can be optimized (while honoring the precise exception requirement) without any constraints imposed by frequent runtime exceptions. Finally, we show that relaxing these reordering constraints can lead to substantial improvements (up to a factor of 7 on small codes) in the performance of programs.
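The reordering constraint can be seen in a few lines (Java semantics modelled in Python for illustration): hoisting one potentially excepting instruction above a write changes the state a handler observes, which is exactly what the paper's analyses must prove unobservable before relaxing the dependence.

```python
def original(a, b, i, j, state):
    state["sum"] += a[i]     # write precedes the second PEI
    return b[j]              # PEI: may raise IndexError

def reordered(a, b, i, j, state):
    t = b[j]                 # PEI hoisted above the write
    state["sum"] += a[i]
    return t

for variant in (original, reordered):
    state = {"sum": 0}
    try:
        variant([1], [2], 0, 5, state)   # b[5] raises IndexError
    except IndexError:
        print(variant.__name__, state["sum"])
# original  1  -- precise exceptions require the handler to see this
# reordered 0  -- differs, so legal only if no handler reads 'sum'
```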

Book ChapterDOI
24 Jul 2000
TL;DR: This paper proposes a solution to the problem of specializing constraint logic programs w.r.t. constrained queries by adapting to the framework various techniques developed in the field of constraint programming, partial evaluation, and abstract interpretation.
Abstract: We consider the problem of specializing constraint logic programs w.r.t. constrained queries. We follow a transformational approach based on rules and strategies. The use of the rules ensures that the specialized program is equivalent to the initial program w.r.t. a given constrained query. The strategies guide the application of the rules so to derive an efficient specialized program. In this paper we address various issues concerning the development of an automated transformation strategy. In particular, we consider the problems of when and how we should unfold, replace constraints, introduce generalized clauses, and apply the contextual constraint replacement rule. We propose a solution to these problems by adapting to our framework various techniques developed in the field of constraint programming, partial evaluation, and abstract interpretation. In particular, we use: (i) suitable solvers for simplifying constraints, (ii) well-quasi-orders for ensuring the termination of the unfoldings and for activating clause generalizations, and (iii) widening operators for ensuring the termination of the generalization process.

Book ChapterDOI
17 Jun 2000
TL;DR: A flexible framework for cooperating decision procedures is presented and the properties needed to ensure correctness are described and applied to implement an efficient version of Nelson and Oppen's algorithm for combining decision procedures.
Abstract: We present a flexible framework for cooperating decision procedures. We describe the properties needed to ensure correctness and show how it can be applied to implement an efficient version of Nelson and Oppen’s algorithm for combining decision procedures. We also show how a Shostak style decision procedure can be implemented in the framework in such a way that it can be integrated with the Nelson–Oppen method.

Book ChapterDOI
TL;DR: In this paper, it is shown how unfold/fold program transformation techniques may be used for proving that a closed first order formula holds in the perfect model of a logic program with locally stratified negation.
Abstract: We show how unfold/fold program transformation techniques may be used for proving that a closed first order formula holds in the perfect model of a logic program with locally stratified negation. We present a program transformation strategy which is a decision procedure for some given classes of programs and formulas.

Book ChapterDOI
09 Jul 2000
TL;DR: This paper shows how concepts from the area of graph transformation can be applied to provide a conceptual and formal framework for describing the structural and behavioral aspects of distributed software systems.
Abstract: Distributed software systems are typically built according to a three layer conceptual structure: Objects on the lowest layer are clustered by components on the second layer, which themselves are located at nodes of a computer network on the third layer. Orthogonal to these three layers, an instance level and a type or schema level are distinguished when modeling these systems. Accordingly, the changes a system experiences during its lifetime can be classified as the system's dynamic behavior on the instance level and as the evolution of the system on the schema level. This paper shows how concepts from the area of graph transformation can be applied to provide a conceptual and formal framework for describing the structural and behavioral aspects of such systems.

Proceedings ArticleDOI
23 Nov 2000
TL;DR: This paper is a case study which uses automated plus manually-directed transformations and abstractions to convert an IBM 370 assembler code program into a very high-level abstract specification.
Abstract: The FermaT transformation system, based on research carried out over the last sixteen years at Durham University, De Montfort University and Software Migrations Ltd., is an industrial-strength formal transformation engine with many applications in program comprehension and language migration. This paper is a case study which uses automated plus manually-directed transformations and abstractions to convert an IBM 370 assembler code program into a very high-level abstract specification.

Dissertation
01 Jan 2000
TL;DR: This thesis investigates secure information flow in sequential programs, with the aim of completely eliminating covert timing channels, and presents a technique to describe register allocation for a functional language.
Abstract: As the title suggests, this thesis consists of two parts that address two rather different topics. The first part investigates secure information flow in sequential programs, with the aim of completely eliminating covert timing channels. The second part presents a technique to describe register allocation for a functional language. Common to both parts is that the techniques described make heavy use of types and type-based program analysis. Covert Channel Elimination refers to the work presented in papers I and II, which both deal with the removal of covert timing channels through program transformation. The setting and motivation is that of confidentiality in mobile code. Given a program from an untrusted source, the sensitive data it manipulates must not be leaked to unauthorised agents observing the program's execution through its network accesses. In paper I, a type system is developed for a small while-language, where well-typed programs obey a time-sensitive noninterference property and are secure in the sense that they do not leak confidential information directly, indirectly or through their temporal behaviour. A type-based transformation that eliminates covert timing channels is also presented. The soundness and correctness of the approach is proven formally. Paper II moves the context of timing leak elimination down to a more practical level. Experiences from the implementation of a timing leak eliminating transformation for a subset of Java byte code are presented. The problems involved in adapting the transformation formalised for a while-language in Paper I to a machine language are discussed, and the solutions chosen in our implementation are presented. Paper III discusses the construction of secure programs and the consequences of noninterference on algorithmic complexity. The paper argues that for algorithms that manipulate pointers to secret data, support from the runtime system (and/or compiler) is necessary to mask the execution time effects of cache behaviour. The paper also argues that even with such support, noninterfering algorithms for searching a collection of secret objects cannot be made faster than Ω(log n). In Part II of the thesis, Paper IV presents a typed functional language with explicit register usage. The language is intended as an intermediate representation for use in a compiler, and can be seen as a lambda-calculus with strong flavours of assembly language. A type and effect system is used to monitor the use of registers. The soundness property of the system is that well-typed terms will not overwrite registers containing live data.
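Paper I's transformation can be caricatured in a few lines (a sketch of the branch-padding idea only, not the thesis's type-directed algorithm; as Paper III stresses, real elimination must also account for cache effects):

```python
def expensive(x):
    total = x
    for i in range(10**6):
        total = (total * 31 + i) % 1000003
    return total

# Leaky: an observer timing the call learns 'secret'.
def leaky(secret, x):
    if secret:
        return expensive(x)
    return x

# Transformed: both arms perform the same work, so the timing
# behaviour no longer depends on the secret.
def padded(secret, x):
    if secret:
        result = expensive(x)
    else:
        _ = expensive(x)     # dummy computation inserted by the transformation
        result = x
    return result
```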

Book ChapterDOI
20 Sep 2000
TL;DR: AUTOBAYES, as discussed by the authors, is a high-level generator system for data analysis programs from statistical models, which generates optimized and fully commented C/C++ code that can be linked dynamically into the Matlab and Octave environments.
Abstract: Extracting information from data, often also called data analysis, is an important scientific task. Statistical approaches, which use methods from probability theory and numerical analysis, are well-founded but difficult to implement: the development of a statistical data analysis program for any given application is time-consuming and requires knowledge and experience in several areas. In this paper, we describe AUTOBAYES, a high-level generator system for data analysis programs from statistical models. A statistical model specifies the properties for each problem variable (i.e., observation or parameter) and its dependencies in the form of a probability distribution. It is thus a fully declarative problem description, similar in spirit to a set of differential equations. From this model, AUTOBAYES generates optimized and fully commented C/C++ code which can be linked dynamically into the Matlab and Octave environments. Code is generated by schema-guided deductive synthesis. A schema consists of a code template and applicability constraints which are checked against the model during synthesis using theorem proving technology. AUTOBAYES augments schema-guided synthesis by symbolic-algebraic computation and can thus derive closed-form solutions for many problems. In this paper, we outline the AUTOBAYES system, its theoretical foundations in Bayesian probability theory, and its application by means of a detailed example.