
Showing papers on "Program transformation published in 2007"


Journal ArticleDOI
TL;DR: This paper gives an introduction to Blast and demonstrates, through two case studies, how it can be applied to program verification and test-case generation.
Abstract: Blast is an automatic verification tool for checking temporal safety properties of C programs. Given a C program and a temporal safety property, Blast either statically proves that the program satisfies the safety property, or provides an execution path that exhibits a violation of the property (or, since the problem is undecidable, does not terminate). Blast constructs, explores, and refines abstractions of the program state space based on lazy predicate abstraction and interpolation-based predicate discovery. This paper gives an introduction to Blast and demonstrates, through two case studies, how it can be applied to program verification and test-case generation. In the first case study, we use Blast to statically prove memory safety for C programs. We use CCured, a type-based memory-safety analyzer, to annotate a program with run-time assertions that check for safe memory operations. Then, we use Blast to remove as many of the run-time checks as possible (by proving that these checks never fail), and to generate execution scenarios that violate the assertions for the remaining run-time checks. In our second case study, we use Blast to automatically generate test suites that guarantee full coverage with respect to a given predicate. Given a C program and a target predicate p, Blast determines the program locations q for which there exists a program execution that reaches q with p true, and automatically generates a set of test vectors that cause such executions. Our experiments show that Blast can provide automated, precise, and scalable analysis for C programs.
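The CCured case study revolves around run-time memory-safety checks that a static prover can discharge. A minimal, hypothetical sketch of the idea (illustrative code, not the tools' actual output): the loop invariant guarantees the inserted bounds check can never fail, which is exactly the kind of fact Blast proves in order to remove the check.

```python
def checked_read(buf, i):
    # Run-time memory-safety check of the kind CCured would insert.
    assert 0 <= i < len(buf), "memory-safety check"
    return buf[i]

def sum_buffer(buf):
    total = 0
    for i in range(len(buf)):          # invariant: 0 <= i < len(buf),
        total += checked_read(buf, i)  # so the check provably never fails
    return total

assert sum_buffer([1, 2, 3]) == 6
```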

617 citations


Proceedings ArticleDOI
01 Oct 2007
TL;DR: This paper proposes a new framework for bidirectionalization that can automatically generate a useful backward transformation from a view function while guaranteeing that the two transformations satisfy the bidirectional properties.
Abstract: Bidirectional transformation is a pair of transformations: a view function and a backward transformation. A view function maps one data structure, called the source, onto another, called the view. The corresponding backward transformation reflects changes in the view back to the source. Practically useful applications include replicated data synchronization, presentation-oriented editor development, tracing software development, and view updating in the database community. However, developing a bidirectional transformation is hard, because one has to give two mappings that satisfy the bidirectional properties for system consistency. In this paper, we propose a new framework for bidirectionalization that can automatically generate a useful backward transformation from a view function while guaranteeing that the two transformations satisfy the bidirectional properties. Our framework is based on two known approaches to bidirectionalization, namely the constant complement approach from the database community and the combinator approach from the programming language community, but it has three new features: (1) unlike the constant complement approach, it can deal with transformations between algebraic data structures rather than just tables; (2) unlike the combinator approach, in which primitive bidirectional transformations have to be explicitly given, it can derive them automatically; (3) it generates a view update checker to validate updates on views, which has not been well addressed so far. The new framework has been implemented and the experimental results show that our framework has promise.
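A minimal sketch of such a pair of transformations in Python (the names `get` and `put` are illustrative, not from the paper's framework). The hidden ages play the role of the constant complement, so this toy `put` only handles in-place edits of the view, not insertions or deletions:

```python
def get(source):
    """View function: project each record onto its name field."""
    return [name for (name, _age) in source]

def put(source, view):
    """Backward transformation: reflect edited names into the source,
    keeping the hidden ages (the "complement") unchanged."""
    return [(name, age) for name, (_old, age) in zip(view, source)]

source = [("ann", 30), ("bob", 25)]
view = get(source)                        # ["ann", "bob"]
updated = put(source, ["anne", "bob"])    # name edit propagated back

# Well-behavedness (the "bidirectional properties"):
assert put(source, get(source)) == source     # GetPut
assert get(put(source, view)) == view         # PutGet
```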

127 citations


Journal ArticleDOI
TL;DR: To provide a foundation for slicing reactive systems, the article proposes a notion of slicing correctness based on weak bisimulation, and proves that some of these new definitions of control dependence generate slices that conform to this notion of correctness.
Abstract: The notion of control dependence underlies many program analysis and transformation techniques. Despite being widely used, existing definitions and approaches to calculating control dependence are difficult to apply directly to modern program structures because these make substantial use of exception processing and increasingly support reactive systems designed to run indefinitely. This article revisits foundational issues surrounding control dependence, and develops definitions and algorithms for computing several variations of control dependence that can be directly applied to modern program structures. To provide a foundation for slicing reactive systems, the article proposes a notion of slicing correctness based on weak bisimulation, and proves that some of these new definitions of control dependence generate slices that conform to this notion of correctness. This new framework of control dependence definitions, with corresponding correctness results, is even able to support programs with irreducible control flow graphs. Finally, a variety of properties show that the new definitions conservatively extend classic definitions. These new definitions and algorithms form the basis of the Indus Java slicer, a publicly available program slicer that has been implemented for full Java.

114 citations


Proceedings ArticleDOI
10 Jun 2007
TL;DR: This paper presents a simple but novel Hoare-logic-like framework that supports modular verification of general von-Neumann machine code with runtime code manipulation and proves its soundness in the Coq proof assistant and its power by certifying a series of realistic examples and applications.
Abstract: Self-modifying code (SMC), in this paper, broadly refers to any program that loads, generates, or mutates code at runtime. It is widely used in many of the world's critical software systems to support runtime code generation and optimization, dynamic loading and linking, OS boot loaders, just-in-time compilation, binary translation, and dynamic code encryption and obfuscation. Unfortunately, SMC is also extremely difficult to reason about: existing formal verification techniques, including Hoare logic and type systems, consistently assume that program code stored in memory is fixed and immutable; this severely limits their applicability and power. This paper presents a simple but novel Hoare-logic-like framework that supports modular verification of general von Neumann machine code with runtime code manipulation. By dropping the assumption that code memory is fixed and immutable, we are forced to apply local reasoning and separation logic from the very beginning, and to treat program code uniformly as a regular data structure. We address the interaction between separation and code memory and show how to establish the frame rules for local reasoning even in the presence of SMC. Our framework is realistic, but designed to be highly generic, so that it can support assembly code under all modern CPUs (including both x86 and MIPS). Our system is expressive and fully mechanized. We prove its soundness in the Coq proof assistant and demonstrate its power by certifying a series of realistic examples and applications, all of which can directly run on the SPIM simulator or on stock x86 hardware.

100 citations


Journal ArticleDOI
TL;DR: Indus—a robust framework for analyzing and slicing concurrent Java programs, and Kaveri—a feature-rich Eclipse-based GUI front end for Indus slicing are presented.
Abstract: Program slicing is a program analysis and transformation technique that has been successfully used in a wide range of applications including program comprehension, debugging, maintenance, testing, and verification. However, there are only few fully featured implementations of program slicing that are available for industrial applications or academic research. In particular, very little tool support exists for slicing programs written in modern object-oriented languages such as Java, C#, or C++. In this paper, we present Indus—a robust framework for analyzing and slicing concurrent Java programs, and Kaveri—a feature-rich Eclipse-based GUI front end for Indus slicing. For Indus, we describe the underlying tool architecture, analysis components, and program dependence capabilities required for slicing. In addition, we present a collection of advanced features useful for effective slicing of Java programs including calling-context sensitive slicing, scoped slicing, control slicing, and chopping. For Kaveri, we discuss the design goals and basic capabilities of the graphical facilities integrated into a Java development environment to present the slicing information. This paper is an extended version of a tool demonstration paper presented at the International Conference on Fundamental Aspects of Software Engineering (FASE 2005). Thus, the paper highlights tool capabilities and engineering issues and refers the reader to other papers for technical details.

71 citations


Journal ArticleDOI
TL;DR: A technique for data flow analysis of functional programs is described: by safely approximating the behavior of a certain class of term rewriting systems, it obtains "safe" descriptions of program inputs, outputs, and intermediate results as regular sets of trees.

57 citations


Proceedings ArticleDOI
15 Jan 2007
TL;DR: A new transformation algorithm called distillation is presented which can automatically transform higher-order functional programs into equivalent tail-recursive programs and it is possible to produce superlinear improvement in the runtime of programs.
Abstract: In this paper, we present a new transformation algorithm called distillation which can automatically transform higher-order functional programs into equivalent tail-recursive programs. Using this algorithm, it is possible to produce superlinear improvement in the runtime of programs. This represents a significant advance over the supercompilation algorithm, which can only produce a linear improvement. Outline proofs are given that the distillation algorithm is correct and that it always terminates.
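Distillation itself is far beyond a short snippet, but the kind of superlinear improvement such transformations target can be illustrated with the classic accumulator rewrite of list reversal. In a functional language with O(1) cons, the second version is linear while the first is quadratic; the Python lists below only approximate that cost model, so this is purely an illustration, not the distillation algorithm:

```python
def reverse_naive(xs):
    # Non-tail-recursive: re-copies the partial result on every step,
    # giving quadratic work overall.
    if not xs:
        return []
    return reverse_naive(xs[1:]) + [xs[0]]

def reverse_acc(xs, acc=None):
    # Tail-recursive with an accumulator: one cons per element, so
    # linear work in a language where cons is O(1).
    if acc is None:
        acc = []
    if not xs:
        return acc
    return reverse_acc(xs[1:], [xs[0]] + acc)

assert reverse_naive([1, 2, 3]) == [3, 2, 1]
assert reverse_acc([1, 2, 3]) == [3, 2, 1]
```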

54 citations


Book ChapterDOI
14 Jan 2007
TL;DR: In this article, the authors propose to partially evaluate a jvml interpreter implemented in LP together with (an LP representation of) a JVM program and then analyze the residual program.
Abstract: State-of-the-art analyzers in the Logic Programming (LP) paradigm are nowadays mature and sophisticated. They allow inferring a wide variety of global properties including termination, bounds on resource consumption, etc. The aim of this work is to automatically transfer the power of such analysis tools for LP to the analysis and verification of Java bytecode (jvml). In order to achieve our goal, we rely on well-known techniques for meta-programming and program specialization. More precisely, we propose to partially evaluate a jvml interpreter implemented in LP together with (an LP representation of) a jvml program and then analyze the residual program. Interestingly, at least for the examples we have studied, our approach produces very simple LP representations of the original jvml programs. This can be seen as a decompilation from jvml to high-level LP source. By reasoning about such residual programs, we can automatically prove in the CiaoPP system some non-trivial properties of jvml programs such as termination and run-time error freeness, and infer bounds on their resource consumption. We are not aware of any other system which is able to verify such advanced properties of Java bytecode.

50 citations


Journal ArticleDOI
TL;DR: This paper compares and contrasts threads and interrupts from the point of view of verifying the absence of race conditions and presents examples of source-to-source transformations that turn interrupt-driven code into semantically equivalent thread-based code that can be checked by a thread verifier.

48 citations


Journal ArticleDOI
TL;DR: A novel technique of “encoding” operational semantics within a denotational semantics allows the framework to handle “operational slicing” and enables the concept of slicing to be applied to nondeterministic programs.
Abstract: The aim of this article is to provide a unified mathematical framework for program slicing which places all slicing work for sequential programs on a sound theoretical foundation. The main advantage to a mathematical approach is that it is not tied to a particular representation. In fact the mathematics provides a sound basis for any particular representation. We use the WSL (wide-spectrum language) program transformation theory as our framework. Within this framework we define a new semantic relation, semirefinement, which lies between semantic equivalence and semantic refinement. Combining this semantic relation, a syntactic relation (called reduction), and WSL's remove statement, we can give mathematical definitions for backwards slicing, conditioned slicing, static and dynamic slicing, and semantic slicing as program transformations in the WSL transformation theory. A novel technique of “encoding” operational semantics within a denotational semantics allows the framework to handle “operational slicing”. The theory also enables the concept of slicing to be applied to nondeterministic programs. These transformations are implemented in the industry-strength FermaT transformation system.
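A rough illustration of the backward slicing the article formalizes (a toy dependence-based sketch, not WSL's semirefinement machinery; the representation of statements as defined/used variable pairs is an assumption of this example):

```python
def backward_slice(stmts, criterion):
    """stmts: list of (defined_var, used_vars) pairs in program order.
    Returns the indices of statements kept in the slice for `criterion`."""
    needed = {criterion}
    kept = []
    # Walk backwards, keeping each statement that defines a needed
    # variable and adding its used variables to the needed set.
    for i in reversed(range(len(stmts))):
        defined, used = stmts[i]
        if defined in needed:
            needed.discard(defined)
            needed.update(used)
            kept.append(i)
    return sorted(kept)

# x = 1; y = 2; z = x + 1 -- slicing on z drops the irrelevant y = 2.
program = [("x", set()), ("y", set()), ("z", {"x"})]
assert backward_slice(program, "z") == [0, 2]
```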

46 citations


Proceedings ArticleDOI
14 Mar 2007
TL;DR: This work provides the extension to Z-polyhedra, which are the intersection of polyhedra and lattices, and proves closure in the Z-polyhedral model under images by dependence functions, thereby proving that unions of LBLs, widely assumed to be a richer class of sets, are equal to unions of Z-polyhedra.
Abstract: The polyhedral model is a well-developed formalism and has been extensively used in a variety of contexts, viz. the automatic parallelization of loop programs, program verification, locality, hardware generation and, more recently, the automatic reduction of asymptotic program complexity. Such analyses and transformations rely on certain closure properties. However, the model is limited in expressivity and the need for a more general class of programs is widely known. We provide the extension to Z-polyhedra, which are the intersection of polyhedra and lattices. We prove the required closure properties using a novel representation and interpretation of Z-polyhedra. In addition, we also prove closure in the Z-polyhedral model under images by dependence functions, thereby proving that unions of LBLs, widely assumed to be a richer class of sets, are equal to unions of Z-polyhedra. Another corollary of this result is the equivalence of unions of Z-polyhedra and Presburger sets. Our representation and closure properties constitute the foundations of the Z-polyhedral model. As an example, we present the transformation for automatic reduction of complexity in the Z-polyhedral model.

Proceedings ArticleDOI
14 Jul 2007
TL;DR: A denotational semantics is given to a region-based effect system tracking reading, writing and allocation in a higher-order language with dynamically allocated integer references that validates a number of effect-dependent program equivalences and can serve as a foundation for effect-based compiler transformations.
Abstract: We give a denotational semantics to a region-based effect system tracking reading, writing and allocation in a higher-order language with dynamically allocated integer references. Effects are interpreted in terms of the preservation of certain binary relations on the store, parameterized by region-indexed partial bijections on locations. The semantics validates a number of effect-dependent program equivalences and can thus serve as a foundation for effect-based compiler transformations.

Proceedings ArticleDOI
25 Jun 2007
TL;DR: In this paper, the authors present a symbolic method for automatic synthesis of fault-tolerant distributed programs, in which elements of a problem are represented by Boolean formulae.
Abstract: Automated formal analysis methods such as program verification and synthesis algorithms often suffer from the time complexity of their decision procedures and also from high space complexity, known as the state explosion problem. Symbolic techniques, in which elements of a problem are represented by Boolean formulae, are desirable in the sense that they often remedy the state explosion problem and the time complexity of decision procedures. Although symbolic techniques have successfully been used in program verification, their benefits have not yet been exploited extensively in the context of program synthesis and transformation. In this paper, we present a symbolic method for automatic synthesis of fault-tolerant distributed programs. Our experimental results on synthesis of classical fault-tolerant distributed problems such as Byzantine agreement and token ring show a significant performance improvement of several orders of magnitude in both time and space complexity. To the best of our knowledge, this is the first illustration in which programs with a large state space (beyond 2^100 states) are handled during synthesis.

Proceedings ArticleDOI
29 Oct 2007
TL;DR: This paper shows how it can utilise the information gained from slicing a program to aid in designing obfuscations that are more resistant to slicing, and extends a previously proposed technique and provides proofs of correctness for the authors' transforms.
Abstract: The goal of obfuscation is to transform a program, without affecting its functionality, such that some secret information within the program can be hidden for as long as possible from an adversary armed with reverse engineering tools. Slicing is a form of reverse engineering which aims to abstract away a subset of program code based on a particular program point and is considered to be a potent program comprehension technique. Thus, slicing could be used as a way of attacking obfuscated programs. It is challenging to manufacture obfuscating transforms that are provably resilient to slicing attacks. We show in this paper how we can utilise the information gained from slicing a program to aid us in designing obfuscations that are more resistant to slicing. We extend a previously proposed technique and provide proofs of correctness for our transforms. Finally, we illustrate our approach with a number of obfuscating transforms and provide empirical results using software engineering metrics.

Journal ArticleDOI
TL;DR: The key insight underlying the extension is that both data migration functions and data processors can be represented type-safely by a generalized abstract data type (GADT).

Book ChapterDOI
24 Mar 2007
TL;DR: Type-dependence analysis is presented, which performs a context- and field-sensitive interprocedural static analysis to identify program entities that may store symbolic values at run-time and a technique to transform real applications for efficient symbolic execution.
Abstract: Symbolic execution can be problematic when applied to real applications. This paper addresses two of these problems: (1) the constraints generated during symbolic execution may be of a type not handled by the underlying decision procedure, and (2) some parts of the application may be unsuitable for symbolic execution (e.g., third-party libraries). The paper presents type-dependence analysis, which performs a context- and field-sensitive interprocedural static analysis to identify program entities that may store symbolic values at run-time. This information is used to identify the above two problematic cases and assist the user in addressing them. The paper also presents a technique to transform real applications for efficient symbolic execution. Instead of transforming the entire application, which can be inefficient and infeasible (mostly for pragmatic reasons), our technique leverages the results of type-dependence analysis to transform only parts of the program that may interact with symbolic values. Finally, the paper discusses the implementation of our analysis and transformation technique in a tool, stinger, and an empirical evaluation performed on two real applications. The results of the evaluation show the effectiveness of our approach.

Proceedings ArticleDOI
Simon Jones
01 Oct 2007
TL;DR: This paper describes a simple, modular transformation that specialises recursive functions according to their argument "shapes", and describes the implementation in the Glasgow Haskell Compiler, and gives measurements that demonstrate substantial performance improvements.
Abstract: User-defined data types, pattern-matching, and recursion are ubiquitous features of Haskell programs. Sometimes a function is called with arguments that are statically known to be in constructor form, so that the work of pattern-matching is wasted. Even worse, the argument is sometimes freshly-allocated, only to be immediately decomposed by the function. In this paper we describe a simple, modular transformation that specialises recursive functions according to their argument "shapes". We describe our implementation of this transformation in the Glasgow Haskell Compiler, and give measurements that demonstrate substantial performance improvements: a worthwhile 10% on average, with a factor of 10 in particular cases.

Book ChapterDOI
22 Aug 2007
TL;DR: A new kind of program transformation is introduced in order to automatically improve the accuracy of floating-point computations and is implemented, and the first experimental results are presented.
Abstract: Floating-point arithmetic is an important source of errors in programs because of the loss of precision arising during a computation. Unfortunately, this arithmetic is not intuitive (e.g. many elementary operations are not associative, invertible, etc.), making the debugging phase very difficult and empirical. This article introduces a new kind of program transformation in order to automatically improve the accuracy of floating-point computations. We use P. Cousot and R. Cousot's framework for semantic program transformation and we propose an offline transformation. This technique has been implemented, and the first experimental results are presented.
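The non-associativity mentioned above is easy to observe, and a well-known accuracy-improving rewrite, Kahan's compensated summation (used here purely as an illustration, not as the paper's transformation), shows what such a rewrite can gain:

```python
def naive_sum(xs):
    total = 0.0
    for x in xs:
        total += x
    return total

def kahan_sum(xs):
    """Compensated summation: tracks the low-order bits lost at each step."""
    total, comp = 0.0, 0.0
    for x in xs:
        y = x - comp
        t = total + y
        comp = (t - total) - y     # the part of y lost in total + y
        total = t
    return total

# Each 1e-16 is below half an ulp of 1.0, so naive summation drops them all.
xs = [1.0] + [1e-16] * 10_000
true = 1.0 + 1e-12
assert naive_sum(xs) == 1.0                          # all tiny terms vanish
assert abs(kahan_sum(xs) - true) < abs(naive_sum(xs) - true)
```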

Journal ArticleDOI
TL;DR: This paper addresses the question of whether extra variables can be eliminated in such kind of functional logic programs, proving the soundness and completeness of an easy solution that takes advantage of the possibility of non-confluence.

Journal ArticleDOI
TL;DR: A novel algorithm for analysis of Java bytecode which includes a number of optimizations in order to reduce the number of iterations and is parametric -in the sense that it is independent of the abstract domain used and it can be applied to different domains as ''plug-ins''-, multivariant, and flow-sensitive.

Book ChapterDOI
21 Jul 2007
TL;DR: It is shown that a simple phase detection scheme can be sufficient for optimization space pruning and it is possible to search for complex optimizations at run-time without resorting to sophisticated dynamic compilation frameworks.
Abstract: This article aims at making iterative optimization practical and usable by speeding up the evaluation of a large range of optimizations. Instead of using a full run to evaluate a single program optimization, we take advantage of periods of stable performance, called phases. For that purpose, we propose a low-overhead phase detection scheme geared toward fast optimization space pruning, using code instrumentation and versioning implemented in a production compiler. Our approach is driven by simplicity and practicality. We show that a simple phase detection scheme can be sufficient for optimization space pruning. We also show it is possible to search for complex optimizations at run-time without resorting to sophisticated dynamic compilation frameworks. Beyond iterative optimization, our approach also enables one to quickly design self-tuned applications. Considering 5 representative SpecFP2000 benchmarks, our approach speeds up iterative search for the best program optimizations by a factor of 32 to 962. Phase prediction is 99.4% accurate on average, with an overhead of only 2.6%. The resulting self-tuned implementations bring an average speed-up of 1.4.
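The versioning idea can be caricatured in a few lines: keep several versions of a hot function, time them during a stable phase, and commit to the fastest. This is a schematic sketch under assumed names (`tune`, `make_versions`), not the paper's compiler instrumentation:

```python
import time

def make_versions():
    def v0(n):                      # baseline: repeated addition
        s = 0
        for i in range(n):
            s += i
        return s
    def v1(n):                      # candidate optimization: closed form
        return n * (n - 1) // 2
    return [v0, v1]

def tune(versions, sample_arg, trials=3):
    """During a stable phase, time each version and keep the fastest."""
    best, best_t = None, float("inf")
    for f in versions:
        t0 = time.perf_counter()
        for _ in range(trials):
            f(sample_arg)
        dt = time.perf_counter() - t0
        if dt < best_t:
            best, best_t = f, dt
    return best

hot = tune(make_versions(), 100_000)
assert hot(10) == 45                # whichever version wins, results agree
```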

Journal ArticleDOI
TL;DR: The design maintenance system is described, a practical, commercial program analysis and transformation system, and how it was employed to construct a custom modernization tool being applied to a large C++ industrial avionics system.
Abstract: Automated program transformation holds promise for a variety of software life cycle endeavors, particularly where the size of legacy systems makes manual code analysis, re-engineering, and evolution difficult and expensive. But constructing highly scalable transformation tools supporting modern languages in full generality is itself a painstaking and expensive process. This cost can be managed by developing a common transformation system infrastructure re-useable by derived tools that each address specific tasks, thus leveraging the infrastructure costs. This paper describes the Design Maintenance System (DMS), a practical, commercial program analysis and transformation system, and discusses how it was employed to construct a custom modernization tool being applied to a large C++ avionics system. The tool transforms components developed in a 1990s-era component style to a more modern CORBA-like component framework, preserving functionality.

01 Jan 2007
TL;DR: In this paper, a programming model for distributed embedded systems that uses discrete-event (DE) models as program specifications is presented, and two feasible implementations of PTIDES are presented.
Abstract: We build on PTIDES, a programming model for distributed embedded systems that uses discrete-event (DE) models as program specifications. PTIDES improves on distributed DE execution by allowing more concurrent event processing without backtracking. This paper discusses the general execution strategy for PTIDES, and provides two feasible implementations. This execution strategy is then extended with tolerance for hardware errors. We take a program transformation approach to automatically enhance DE models with incremental checkpointing and state recovery functionality. Our fault tolerance mechanism is lightweight and has low overhead. It requires very little human intervention. We incorporate this mechanism into PTIDES for efficient execution of fault-tolerant real-time distributed DE systems.

Journal ArticleDOI
TL;DR: This paper proposes a fresh approach by obfuscating abstract data-types allowing us to develop structure-dependent obfuscations that would otherwise (traditionally) not be available.

Proceedings ArticleDOI
11 Jul 2007
TL;DR: This paper presents a novel approach to creating obfuscating transforms which are designed to survive slicing attacks, and shows how the information gained from slicing a program can be utilised to aid in manufacturing obfuscations that are more resistant to slicing.
Abstract: An obfuscation aims to transform a program, without affecting its functionality, so that some secret information within the program can be hidden for as long as possible from an adversary armed with reverse engineering tools. Slicing is a reverse engineering technique which aims to produce a subset of a program which is dependent on a particular program point and is used to aid in program comprehension. Thus slicing could be used as a way of attacking obfuscated programs. Can we design obfuscations which are more resilient to slicing attacks? In this paper we present a novel approach to creating obfuscating transforms which are designed to survive slicing attacks. We show how we can utilise the information gained from slicing a program to aid us in manufacturing obfuscations that are more resistant to slicing. We give a definition for what it means for a transformation to be a slicing obfuscation and we illustrate our approach with a number of obfuscating transforms.
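A toy example of the underlying idea (hypothetical, not one of the paper's transforms): an opaque identity and a bogus computation tied to the secret mean that a backward slice on the secret retains more of the program, while the observable result is unchanged.

```python
def opaque_identity(x, salt):
    # Always returns x: salt*salt % 2 is 0 or 1, and either times 0 is 0.
    # A real obfuscator would use a far less obvious opaque expression.
    return x + (salt * salt % 2) * 0

def compute(key):
    noise = (key * 7 + 3) % 11          # bogus dependency on the secret key
    secret = opaque_identity(key * 2, noise)
    return secret                       # a slice on `secret` now drags in `noise`

assert compute(21) == 42                # functionality is preserved
```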

Proceedings ArticleDOI
25 Aug 2007
TL;DR: An eclipse based representation framework for ASTs is presented and it is shown that ASTs can be used for program analysis and for program transformation.
Abstract: Abstract syntax trees (ASTs) are known from compiler construction, where they form the intermediate data format which is passed from the analytic front-end to the synthetic back-end. In model-driven software development, ASTs are used as a model of the source code. The Object Management Group (OMG) has issued a request for proposals for AST models. Various levels of abstraction can be introduced. ASTs can be used for program analysis and for program transformation. In this paper we present an Eclipse-based representation framework for ASTs.

Proceedings ArticleDOI
13 Jun 2007
TL;DR: This paper has combined automated code transformation and ISE generators to explore the potential benefits of such a combination, and demonstrates that a combination of source-level transformations and instruction set extensions can yield average performance improvements of 47%.
Abstract: Industry's demand for flexible embedded solutions providing high performance and short time-to-market has led to the development of configurable and extensible processors. These pre-verified application-specific processors build on proven baseline cores while allowing for some degree of customization through user-defined instruction set extensions (ISE) implemented as functional units in an extended micro-architecture. The traditional design flow for ISE is based on plain C sources of the target application and, after some ISE identification and synthesis stages, a modified source file is produced with explicit handles to the new machine instructions. Further code optimization is left to the compiler. In this paper we develop a novel approach, namely the combined exploration of source-level transformations and ISE identification. We have combined automated code transformation and ISE generators to explore the potential benefits of such a combination. Our toolchain applies up to 50 transformations from a selection of 70 and synthesizes ISEs for the resulting code. The resulting performance has been measured on 26 applications from the SNU-RT and UTDSP benchmarks. We show that the instruction extensions generated by automated tools are heavily influenced by source code structure. Our results demonstrate that a combination of source-level transformations and instruction set extensions can yield average performance improvements of 47%. This outperforms instruction set extensions applied in isolation, and in extreme cases yields a speedup of 2.85.

Proceedings ArticleDOI
26 Mar 2007
TL;DR: This paper presents feasible solutions to such code transformations and identifies associated issues; a technique is presented that inserts Java bytecode with self-healing primitives and transforms it to become a self-healing component.
Abstract: Autonomic computing is a grand challenge in computing, which aims to produce software that has the properties of self-configuration, self-healing, self-optimization and self-protection. Adding such autonomic properties to existing applications is immensely useful for redeploying them in an environment other than the one they were developed for. Such transformed applications can be redeployed in different dynamic environments without the user making changes to the application. However, creating such autonomic software entities is a significant challenge because of the amount of code transformation required. This paper presents feasible solutions to such code transformations and identifies associated issues. To illustrate such code transformations, a technique is presented that inserts Java bytecode with self-healing primitives and transforms it to become a self-healing component. Experiments show that such code transformations are challenging; however, they are worthwhile because they provide transparent autonomic behavior.

Journal ArticleDOI
TL;DR: A type system for the update calculus that infers the possible type changes that can be caused by an update program and it is demonstrated that type-safe update programs that fulfill certain structural constraints preserve the type correctness of lambda terms.

Journal ArticleDOI
TL;DR: Using concepts from the theory of tree transducers and extending earlier work, automatic transformations from accumulative functional programs into non-accumulative ones are developed, which are much better suited for mechanized verification.