
Showing papers on "Static program analysis published in 2000"


Journal ArticleDOI
TL;DR: The article introduces the language Jif, an extension to Java that provides static checking of information flow using the decentralized label model, which improves on existing multilevel security models by allowing users to declassify information in a decentralized way, and by improving support for fine-grained data sharing.
Abstract: Stronger protection is needed for the confidentiality and integrity of data, because programs containing untrusted code are the rule rather than the exception. Information flow control allows the enforcement of end-to-end security policies, but has been difficult to put into practice. This article describes the decentralized label model, a new label model for control of information flow in systems with mutual distrust and decentralized authority. The model improves on existing multilevel security models by allowing users to declassify information in a decentralized way, and by improving support for fine-grained data sharing. It supports static program analysis of information flow, so that programs can be certified to permit only acceptable information flows, while largely avoiding the overhead of run-time checking. The article introduces the language Jif, an extension to Java that provides static checking of information flow using the decentralized label model.
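The relabeling rule at the heart of the decentralized label model can be sketched in a few lines. The encoding below is ours, not Jif's actual API: a label maps each owner to the set of readers that owner permits, and data may flow only to a label that is at least as restrictive (no owner dropped, no reader added).

```python
# Toy sketch of the decentralized label model (DLM). A label maps each
# owner to the readers that owner permits. Data labeled l1 may flow to
# l2 only if l2 is at least as restrictive: every owner in l1 still
# appears in l2, and no owner gains readers. (Illustrative encoding
# only; the dict representation and function name are not Jif's.)

def may_flow(l1: dict, l2: dict) -> bool:
    """Return True if data labeled l1 may be relabeled to l2."""
    for owner, readers in l1.items():
        if owner not in l2:
            return False          # an owner's policy was dropped
        if not l2[owner] <= readers:
            return False          # a reader was added for this owner
    return True

secret = {"alice": {"alice", "bob"}}                  # alice lets bob read
stricter = {"alice": {"alice"}, "carol": {"carol"}}   # fewer readers, more owners
assert may_flow(secret, stricter)       # restricting further is acceptable
assert not may_flow(stricter, secret)   # would leak past carol's policy
```

A static checker applies this test at every assignment and method call, which is what lets Jif certify programs without run-time flow checks.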

574 citations


Journal ArticleDOI
Ganesan Ramalingam
TL;DR: The article shows that an analysis that is simultaneously both context-sensitive and synchronization-sensitive (that is, a context-sensitive analysis that precisely takes into account the constraints on execution order imposed by the synchronization statements in the program) is impossible even for the simplest of analysis problems.
Abstract: Static program analysis is concerned with the computation of approximations of the runtime behavior of programs. Precise information about a program's runtime behavior is, in general, uncomputable for various different reasons, and each reason may necessitate making certain approximations in the information computed. This article illustrates one source of difficulty in static analysis of concurrent programs. Specifically, the article shows that an analysis that is simultaneously both context-sensitive and synchronization-sensitive (that is, a context-sensitive analysis that precisely takes into account the constraints on execution order imposed by the synchronization statements in the program) is impossible even for the simplest of analysis problems.

296 citations


Journal ArticleDOI
01 Jun 2000
TL;DR: A new XML application is described that provides an alternative representation of Java source code, called JavaML, that is more natural for tools and permits easy specification of numerous software-engineering analyses by leveraging the abundance of XML tools and techniques.
Abstract: The classical plain-text representation of source code is convenient for programmers but requires parsing to uncover the deep structure of the program. While sophisticated software tools parse source code to gain access to the program's structure, many lightweight programming aids such as grep rely instead on only the lexical structure of source code. I describe a new XML application that provides an alternative representation of Java source code. This XML-based representation, called JavaML, is more natural for tools and permits easy specification of numerous software-engineering analyses by leveraging the abundance of XML tools and techniques. A robust converter built with the Jikes Java compiler framework translates from the classical Java source code representation to JavaML, and an XSLT stylesheet converts from JavaML back into the classical textual form.
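The idea can be illustrated with the standard library's ElementTree: once source structure is explicit XML, structural queries that would require a parser over plain text become trivial tree traversals. The element and attribute names below only approximate Badros's actual JavaML DTD.

```python
# Sketch of a JavaML-style encoding of a trivial Java method
# `public int max(int a, int b)`. Element/attribute names approximate,
# but are not guaranteed to match, the real JavaML DTD.
import xml.etree.ElementTree as ET

method = ET.Element("method", name="max", visibility="public")
ET.SubElement(method, "type", name="int")              # return type
params = ET.SubElement(method, "formal-arguments")
for p in ("a", "b"):
    arg = ET.SubElement(params, "formal-argument", name=p)
    ET.SubElement(arg, "type", name="int")

xml = ET.tostring(method, encoding="unicode")

# A "find all parameter names" analysis is one traversal, not a parse:
arg_names = [a.get("name") for a in method.iter("formal-argument")]
assert arg_names == ["a", "b"]
```

Any generic XML tool (XPath, XSLT, DOM) can now perform analyses that a grep over plain text cannot express reliably.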

206 citations


12 May 2000
TL;DR: A theoretical result is presented which shows that a precise analysis of the transformed program, in the general case, is NP-hard and the applicability of the techniques is demonstrated with empirical results.
Abstract: Reliable execution of software on untrustworthy platforms is a difficult problem. On the one hand, the underlying system services cannot be relied upon to provide execution assurance, while on the other hand, the effect of a tampered execution can be disastrous -- consider intrusion detection programs. What is needed, in this case, is tamper resistant software. Code obfuscation has been an area of development, in part, to enhance software tamper resistance. However, most obfuscation techniques are ad hoc, without the support of sound theoretical basis or provable results. In this paper, we address one aspect of software protection by obstructing static analysis of programs. Our techniques are based, fundamentally, on the difficulty of resolving aliases in programs. The presence of aliases has been proven to greatly restrict the precision of static data-flow analysis. Meanwhile, effective alias detection has been shown to be NP-Hard. While this represents a significant hurdle for code optimization, it provides a theoretical basis for structuring tamper-resistant programs -- systematic introduction of nontrivial aliases transforms programs to a form that yields data flow information very slowly and/or with little precision. Precise alias analysis relies on the collection of static control flow information. We further hinder the analysis by a systematic "break-down" of the program control-flow; transforming high level control transfers to indirect addressing through aliased pointers. By doing so, we have made the basic control-flow analysis into a general alias analysis problem, and the data-flow analysis and control-flow analysis are made co-dependent. We present a theoretical result which shows that a precise analysis of the transformed program, in the general case, is NP-hard and demonstrate the applicability of our techniques with empirical results.
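The core transformation can be caricatured in a few lines: a direct branch is rewritten as an indirect transfer through an aliased data structure, so a static analyzer must solve an alias problem just to recover the control-flow graph. This is purely illustrative; the paper's transformations operate on real program code, not on a Python toy.

```python
# Toy illustration of replacing a direct control transfer with an
# indirect one routed through aliased state. Recovering the CFG of
# `indirect` statically now requires reasoning about what `handle`
# may point to. (Illustrative only; names are invented.)

def direct(x):
    if x > 0:
        return "positive"
    return "non-positive"

def indirect(x):
    targets = {0: lambda: "non-positive", 1: lambda: "positive"}
    handle = targets          # 'handle' aliases 'targets'
    # the jump target is data flowing through an aliased structure
    return handle[int(x > 0)]()

for v in (-3, 0, 7):
    assert direct(v) == indirect(v)   # semantics preserved
```

Systematically introducing such indirections is what makes the paper's data-flow and control-flow analyses co-dependent.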

200 citations


Journal ArticleDOI
01 Dec 2000
TL;DR: In this article, the authors apply program slicing techniques to remove irrelevant code and reduce the size of the corresponding transition system models, and prove its safety with respect to model checking of linear temporal logic (LTL) formulae.
Abstract: Applying finite-state verification techniques (e.g., model checking) to software requires that program source code be translated to a finite-state transition system that safely models program behavior. Automatically checking such a transition system for a correctness property is typically very costly, thus it is necessary to reduce the size of the transition system as much as possible. In fact, it is often the case that much of a program's source code is irrelevant for verifying a given correctness property. In this paper, we apply program slicing techniques to remove automatically such irrelevant code and thus reduce the size of the corresponding transition system models. We give a simple extension of the classical slicing definition, and prove its safety with respect to model checking of linear temporal logic (LTL) formulae. We discuss how this slicing strategy fits into a general methodology for deriving effective software models using abstraction-based program specialization.
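The basic slicing step is a reachability computation over a dependence graph: keep a statement only if the slicing criterion transitively depends on it. The toy below shows that step alone, not the paper's LTL-aware extension; statement names and dependences are invented.

```python
# Minimal backward-slice sketch over a hand-built dependence graph.
# A statement survives only if the criterion transitively depends on
# it; everything else is irrelevant to the property being checked.

deps = {                       # statement -> statements it depends on
    "s4_assert_y": {"s2_y"},
    "s2_y": {"s1_x"},
    "s3_logging": set(),       # irrelevant to the property
    "s1_x": set(),
}

def backward_slice(criterion):
    keep, work = set(), [criterion]
    while work:
        s = work.pop()
        if s not in keep:
            keep.add(s)
            work.extend(deps.get(s, ()))
    return keep

slice_ = backward_slice("s4_assert_y")
assert slice_ == {"s4_assert_y", "s2_y", "s1_x"}   # s3_logging removed
```

Dropping `s3_logging` before model construction is exactly the state-space reduction the paper is after.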

157 citations


Patent
08 Jun 2000
TL;DR: In this article, the authors proposed a method to increase the tamper-resistance and obscurity of computer software code by transforming the data flow of the computer software so that the observable operation is dissociated from the intent of the original software code.
Abstract: The present invention relates generally to computer software, and more specifically, to a method and system of making computer software resistant to tampering and reverse-engineering. 'Tampering' occurs when an attacker makes unauthorized changes to a computer software program such as overcoming password access, copy protection or timeout algorithms. Broadly speaking, the method of the invention is to increase the tamper-resistance and obscurity of computer software code by transforming the data flow of the computer software so that the observable operation is dissociated from the intent of the original software code. This way, the attacker cannot understand and decode the data flow by observing the execution of the code. A number of techniques for performing the invention are given, including encoding software arguments using polynomials, prime number residues, converting variables to new sets of boolean variables, and defining variables on a new n-dimensional vector space.

155 citations


Proceedings ArticleDOI
27 Nov 2000
TL;DR: This paper presents a method for representing program flow information that is compact while still being strong enough to handle the types of flow previously considered in WCET research, and extends the set of representable flows compared to previous research.
Abstract: Knowing the worst-case execution time (WCET) of a program is necessary when designing and verifying real-time systems. The WCET depends both on the program flow (like loop iterations and function calls), and on hardware factors like caches and pipelines. In this paper, we present a method for representing program flow information that is compact while still being strong enough to handle the types of flow previously considered in WCET research. We also extend the set of representable flows compared to previous research. We give an algorithm for converting the flow information to the linear constraints used in calculating a WCET estimate in our WCET analysis tool. We demonstrate the practicality of the representation by modeling the flow of a number of programs, and show that execution time estimates can be made tighter by using flow information.
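The calculation the flow constraints feed is implicit path enumeration (IPET): maximize the sum of per-block cost times execution count, subject to structural and flow constraints. A real tool hands this to an ILP solver; the toy below brute-forces the one free variable, and all numbers are made up.

```python
# IPET sketch: WCET estimate = max of sum(cost_i * count_i) subject to
# linear flow constraints. A real WCET tool calls an ILP solver; this
# toy brute-forces the single free count. Costs are invented.

cost = {"entry": 10, "loop_body": 25, "exit": 5}   # cycles per block

def wcet():
    best = 0
    for n_loop in range(0, 101):   # flow fact: loop bound is 100
        # structural constraints: entry and exit execute exactly once
        counts = {"entry": 1, "loop_body": n_loop, "exit": 1}
        best = max(best, sum(cost[b] * counts[b] for b in cost))
    return best

assert wcet() == 10 + 25 * 100 + 5   # = 2515
```

Tighter flow facts (e.g. infeasible-path constraints) shrink the feasible region and therefore the estimate, which is how the paper's richer flow information pays off.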

129 citations


Journal ArticleDOI
TL;DR: A new processor architecture poses significant financial risk to hardware and software developers alike, so both have a vested interest in easily porting code from one processor to another.
Abstract: A new processor architecture poses significant financial risk to hardware and software developers alike, so both have a vested interest in easily porting code from one processor to another. Binary translation offers solutions for automatically converting executable code to run on new architectures without recompiling the source code.

103 citations


Proceedings ArticleDOI
23 Nov 2000
TL;DR: This paper proposes a program representation technique that is based on language domain models and the XML markup language that offers a higher level of openness and portability than custom-made tool specific abstract syntax trees.
Abstract: One of the most important issues in source code analysis and software re-engineering is the representation of software code text at an abstraction level and form suitable for algorithmic processing. However, source code representation schemes must be compact, accessible by well-defined application programming interfaces (APIs) and above all portable to different operating platforms and various CASE tools. This paper proposes a program representation technique that is based on language domain models and the XML markup language. In this context, source code is represented as XML DOM trees that offer a higher level of openness and portability than custom-made tool-specific abstract syntax trees. The DOM trees can be exchanged between tools in textual or binary form. Similarly, the domain model allows for language entities to be associated with analysis services offered by various CASE tools, leading to an integrated software maintenance environment.

97 citations


Proceedings ArticleDOI
25 Jun 2000
TL;DR: The paper proposes a C/C++ source-to-source compiler able to increase the dependability properties of a given application through code re-ordering and variables duplication, and introduces code modifications that are transparent to the original program functionality.
Abstract: The paper proposes a C/C++ source-to-source compiler able to increase the dependability properties of a given application. The adopted strategies are based on two main techniques, code re-ordering and variables duplication. The proposed approach is portable to any platform, it can be applied to any C/C++ source code, and it introduces code modifications that are transparent to the original program functionality. The RECCO tool, which fully automates the process, allows the user to trade-off between the level of dependability improvement and the performance degradation due to the code modification.
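The variable-duplication half of the approach rests on a simple principle: keep two copies of each critical variable, update both, and compare before every read, so a transient fault that corrupts one copy is detected. RECCO itself rewrites C/C++ source; the sketch below (our own class names) just demonstrates the principle.

```python
# Sketch of the variable-duplication idea behind tools like RECCO:
# two copies of each critical variable, both updated on write and
# compared on read, so a single-copy corruption is detected.
# (Illustrative only; RECCO transforms C/C++ source code.)

class FaultDetected(Exception):
    pass

class Dup:
    def __init__(self, v):
        self.a = v
        self.b = v                 # duplicate copy

    def get(self):
        if self.a != self.b:       # consistency check before every read
            raise FaultDetected("copies diverged")
        return self.a

    def set(self, v):
        self.a = v
        self.b = v

x = Dup(41)
x.set(x.get() + 1)
assert x.get() == 42

x.a = 7                            # simulate a bit-flip in one copy
try:
    x.get()
    assert False, "fault not detected"
except FaultDetected:
    pass
```

The duplication roughly doubles the cost of each protected access, which is the dependability/performance trade-off the RECCO tool lets the user tune.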

90 citations


Journal ArticleDOI
TL;DR: The concepts of code delta and code churn are illustrated by measuring a real, industrial-sized software system and can be used to assess the amount of change in the complexity of the system across successive software builds.
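The distinction between the two measures is worth making concrete: delta is the net change between builds, while churn also counts changes that cancel out, so it never understates rework. The sketch below uses difflib line counts as a stand-in for the complexity metrics the paper actually measures.

```python
# Rough sketch of code delta vs. code churn between two builds, using
# line-level diffs as a stand-in for the paper's complexity metrics.
# Delta = added - removed (net change); churn = added + removed.
import difflib

def delta_and_churn(old_lines, new_lines):
    added = removed = 0
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("replace", "delete"):
            removed += i2 - i1
        if op in ("replace", "insert"):
            added += j2 - j1
    return added - removed, added + removed   # (delta, churn)

build1 = ["a = 1", "b = 2", "print(a + b)"]
build2 = ["a = 1", "b = 3", "c = 0", "print(a + b)"]
assert delta_and_churn(build1, build2) == (1, 3)
```

A build that rewrites a module without growing it has delta near zero but high churn, which is exactly the change activity delta alone would miss.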

Proceedings ArticleDOI
08 Oct 2000
TL;DR: PLCTOOLS is described, a toolbox that exploits all the aforementioned techniques to supply an integrated environment for the design, formal validation, and automatic code generation of software controllers.
Abstract: Strong timing requirements and complex interactions with controlled elements complicate the design and validation of software controllers. Different techniques have been proposed to cope with these problems during the different development steps: for example, differential equations for modeling controlled elements, the IEC 1131-3 notations for designing the software controller, and formal models for validating the design, but no definitive solutions have been proposed yet. The paper describes PLCTOOLS, a toolbox that exploits all the aforementioned techniques to supply an integrated environment for the design, formal validation, and automatic code generation of software controllers.


Journal ArticleDOI
TL;DR: In this article, the authors propose conceptual and software support for the design of abstract domains for logic programming analysis. Abstract-domain design is a key step in building static program analyses and has been studied extensively in the logic programming community, because of the potential for optimizations in logic programming compilers and because of the sophistication of the analyses, which require conceptual support.
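An abstract domain in this sense is a small lattice of properties together with sound abstract versions of the concrete operations. The classic sign domain makes a minimal example; it is ours for illustration, since the article targets logic-program analyses rather than this toy.

```python
# Minimal sketch of an abstract domain: the sign domain, with a join
# (least upper bound) and an abstract multiplication that soundly
# over-approximates concrete multiplication. (Illustrative only; the
# article concerns domains for logic programming analysis.)

TOP, POS, NEG, ZERO = "top", "+", "-", "0"

def alpha(n):                       # abstraction of a concrete integer
    return ZERO if n == 0 else (POS if n > 0 else NEG)

def join(a, b):                     # least upper bound in the lattice
    return a if a == b else TOP

def abs_mul(a, b):
    if ZERO in (a, b):  return ZERO
    if TOP in (a, b):   return TOP
    return POS if a == b else NEG

# soundness spot-check: the abstract result covers the concrete one
for x in (-2, 0, 3):
    for y in (-5, 0, 7):
        assert abs_mul(alpha(x), alpha(y)) in (alpha(x * y), TOP)
```

Designing such domains well (choosing the lattice, proving the operations sound) is the activity the article's conceptual and software support is aimed at.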

Proceedings ArticleDOI
01 Jan 2000
TL;DR: This research has developed tool support for design navigation that allows for flexible browsing of a system's design-level representation and for information exchange with a suite of program comprehension tools.
Abstract: Source code investigation is one of the most time-consuming activities during software maintenance and evolution, yet currently-available tool support suffers from several shortcomings. Browsing is typically limited to low-level elements, investigation is only supported as a one-way activity, and tools provide little help in getting an encompassing picture of the system under examination. In our research, we have developed tool support for design navigation that addresses these shortcomings. A design browser allows for flexible browsing of a system's design-level representation and for information exchange with a suite of program comprehension tools. The browser is complemented with a retriever supporting full-text and structural searching. In this paper, we detail these tools and their integration into a reverse engineering environment, present three case studies and put them into perspective.

Patent
04 Oct 2000
TL;DR: In this article, the developer tool detects a group of related elements in the code, and collapses a portion of the graphical representation of the code associated with the group of elements, which is then used for debugging and editing.
Abstract: Software development tool that simplifies a graphical representation of software code for a developer. The software developer tool provides the developer with a more coherent, manageable, and abstract graphical view of the project model (Figure 3, #302), and facilitates the developer in graphically debugging and editing the associated software code (Figure 3, #304). The improved software development tool detects a group of related elements in the code, and collapses a portion of the graphical representation of the code associated with the group of related elements (Figure 3, #312).

Patent
27 Sep 2000
Abstract: Computer program code which is a candidate for Web enablement or stored procedures is identified. Source code corresponding to computer program code is scanned and parsed to determine static information concerning the computer program code. The static information is stored in a database. Dynamic information concerning the computer program code during an execution of the computer program code is also collected and stored in the database. Responsive to the static information and dynamic information stored in the database, relationships and dependencies are then developed and stored in the database. The database may then be queried to produce a set of potential candidates of computer program code meeting a constraint of the query. If insufficient candidates are returned by the query, then the query constraint may be relaxed, and the query repeated.

01 Jan 2000
TL;DR: A prototype implementation demonstrates that the technology can be implemented using current graphical toolkits, can be made highly configurable using current language analysis tools, and that it can be encapsulated in a manner consistent with reuse in many software engineering contexts.
Abstract: Source code plays a major role in most software engineering environments. The interface of choice between source code and human users is a tool that displays source code textually and possibly permits its modification. Specializing this tool for the source code’s language promises enhanced services for programmers as well as better integration with other tools. However, these two goals, user services and tool integration, present conflicting design constraints that have previously prevented specialization. A new architecture, based on a lexical representation of source code, represents a compromise that satisfies constraints on both sides. A prototype implementation demonstrates that the technology can be implemented using current graphical toolkits, can be made highly configurable using current language analysis tools, and that it can be encapsulated in a manner consistent with reuse in many software engineering contexts.

Journal ArticleDOI
TL;DR: The ITC-Irst Reverse Engineering group was charged with analyzing a software application of approximately 4.7 million lines of C code, an old legacy system, maintained for a long time, on which several successive adaptive and corrective maintenance interventions had led to the degradation of the original structure.
Abstract: The ITC-Irst Reverse Engineering group was charged with analyzing a software application of approximately 4.7 million lines of C code. It was an old legacy system, maintained for a long time, on which several successive adaptive and corrective maintenance interventions had led to the degradation of the original structure. The company decided to re-engineer the software instead of replacing it, because the complexity and costs of re-implementing the application from scratch could not be afforded, and the associated risk could not be run. Several problems were encountered during re-engineering, including identifying dependencies and detecting redundant functions that were not used anymore. To accomplish these goals, we adopted a conservative approach. Before performing any kind of analysis on the whole code, we carefully evaluated the expected costs. To this aim, a small but representative sample of modules was preliminarily analyzed, and the costs and outcomes were extrapolated so as to obtain some indications on the analysis of the whole system. When the results of the sample modules were found to be useful as well as affordable for the entire system, the resources involved were carefully distributed among the different reverse engineering tasks to meet the customer's deadline. This paper summarizes that experience, discussing how we approached the problem, the way we managed the limited resources available to complete the task within the assigned deadlines, and the lessons we learned. Copyright © 2000 John Wiley & Sons, Ltd.

Proceedings ArticleDOI
Jeffrey J. Kotula
30 Jul 2000
TL;DR: This paper argues that source code documentation is an irreplaceable necessity, as well as an important discipline to increase development efficiency and quality.
Abstract: Source code documentation is a fundamental engineering practice critical to efficient software development. Regardless of the intent of its author, all source code is eventually reused, either directly, or just through the basic need to understand it. In either case, the source code documentation acts as a specification of behavior for other engineers. Without documentation, they are forced to get the information they need by making dangerous assumptions, scrutinizing the implementation, or interrogating the author. These alternatives are unacceptable. Although some developers believe that source code 'self-documents', there is a great deal of information about code behavior that simply cannot be expressed in source code, but requires the power and flexibility of natural language to state. Consequently, source code documentation is an irreplaceable necessity, as well as an important discipline to increase development efficiency and quality.

Proceedings ArticleDOI
Steven P. Reiss
04 Jan 2000
TL;DR: The paper describes the basis for a suite of tools that lets the programmer work in terms of design patterns and source code simultaneously, and introduces a language for defining design patterns.
Abstract: The paper describes the basis for a suite of tools that lets the programmer work in terms of design patterns and source code simultaneously. It first introduces a language for defining design patterns. This language breaks a pattern down into elements and constraints over a program database of structural and semantic information. The language supports both creation of new program elements when a pattern is created or modified and generating source code for these new elements. The paper next describes tools for working with patterns. These tools let the user identify and create instances of patterns in the source code. Once patterns are so identified, they can be saved in a library of patterns that accompanies the system and the patterns can be verified, maintained as the source evolves, and edited to modify the source.

Proceedings ArticleDOI
23 Nov 2000
TL;DR: The paper reports on the experiences of the users of the Rigi reverse engineering tool suite, an interactive, visual tool designed to help developers better understand and redocument their software.
Abstract: An experiment is conducted on how well expert users of program comprehension tools are able to perform specific program understanding and maintenance tasks on the xfig drawing program using these tools. The paper reports on the experiences of the users of the Rigi reverse engineering tool suite. Rigi is an interactive, visual tool designed to help developers better understand and redocument their software. Rigi includes parsers to read the source code of the subject software and produce a graph of extracted artifacts such as procedures, variables, calls, and data accesses. To manage the complexity of the graph, an editor allows the software engineer to automatically or manually collapse related artifacts into subsystems. These subsystems typically represent concepts such as abstract data types or personnel assignments. The created hierarchy can be navigated, analyzed, and presented using various automatic or user-guided graphical layouts.
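The graph operation at the heart of Rigi-style tools is straightforward to sketch: collapse a set of related artifact nodes into one subsystem node, rewiring edges so the reduced graph preserves connectivity to the outside. The data model below is invented for illustration.

```python
# Sketch of Rigi-style subsystem collapse: replace every node in a
# group with a single subsystem node, keeping edges to the rest of
# the graph and dropping edges internal to the group.
# (Edge-set data model is invented for illustration.)

def collapse(edges, group, name):
    """Replace every node in `group` with the single node `name`."""
    relabel = lambda n: name if n in group else n
    new = {(relabel(a), relabel(b)) for a, b in edges}
    return {(a, b) for a, b in new if a != b}   # drop self-loops

calls = {("main", "parse"), ("main", "draw"),
         ("parse", "lex"), ("draw", "render")}
reduced = collapse(calls, {"parse", "lex"}, "Parsing")
assert reduced == {("main", "Parsing"), ("main", "draw"),
                   ("draw", "render")}
```

Applying this repeatedly builds the navigable subsystem hierarchy the abstract describes.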

Patent
19 May 2000
TL;DR: In this paper, an end-user can attach to the object code specifying behavior that the user wishes the object to exhibit from that point forward, and the code defining the new or additional behavioral features will be interpreted by the applications software so that each time an event or operation is performed on the object, the system will recognize that the object has user-specified behavior associated therewith.
Abstract: A method, apparatus and system for allowing an end-user to define at run-time the way an object in the system will react to existing operations, or events, that are later performed on the object. In the system of the invention, the end-user can attach to the object code specifying behavior that the user wishes the object to exhibit from that point forward. The code defining the new or additional behavioral features will be interpreted by the applications software so that each time an event or operation is performed on the object, the system will recognize that the object has user-specified behavior associated therewith. The code remains associated with the object. In the system of the invention, there is no need to exit the application software. The new code specifying the desired behavior is immediately callable and executable by the system upon being input.

Journal ArticleDOI
TL;DR: The assessment method is based on source code abstraction, object-oriented metrics and graphical representation, and helps evaluators assess the quality and risks associated with software by identifying code fragments presenting unusual characteristics.
Abstract: This paper presents an assessment method to evaluate the quality of object oriented software systems. The assessment method is based on source code abstraction, object-oriented metrics and graphical representation. The metrics used and the underlying model representing the software are presented. The assessment method experiment is part of an industrial research effort with the Bell Canada Quality Engineering and Research Group. It helps evaluators assess the quality and risks associated with software by identifying code fragments presenting unusual characteristics. The assessment method evaluates object-oriented software systems at three levels of granularity: system level, class level and method level. One large C++ and eight Java software systems, for a total of over one million lines of code, are presented as case studies. A critical analysis of the results is presented comparing the systems and the two languages.
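The "unusual characteristics" step amounts to computing class-level metrics and flagging outliers relative to the rest of the system. The metric (methods per class) and the threshold below are simplistic stand-ins for the paper's actual metric suite.

```python
# Sketch of the outlier-flagging idea: compute a per-class metric and
# flag classes whose value is unusual relative to the system mean.
# (Metric and threshold are simplistic stand-ins for the paper's.)

def flag_outliers(metrics, factor=2.0):
    """metrics: {class_name: methods_count}; flag values > factor*mean."""
    mean = sum(metrics.values()) / len(metrics)
    return sorted(c for c, v in metrics.items() if v > factor * mean)

system = {"Order": 8, "Invoice": 6, "Customer": 7, "GodClass": 60}
assert flag_outliers(system) == ["GodClass"]
```

Flagged fragments are then examined at the finer class and method granularities the abstract mentions.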

Patent
25 Feb 2000
TL;DR: In this paper, a computer program product is provided for use with a computer system to execute a simulation, which includes a plurality of service computer-readable program code means, which are configured to collectively determine simulated attributes of objects of an environment under simulated operation.
Abstract: A computer program product is provided for use with a computer system to execute a simulation. The computer program product includes a plurality of service computer-readable program code means. The service program code means are configured to collectively determine simulated attributes of objects of an environment under simulated operation. Each service program code means is associated with at least a subset of object attributes in an object context. At least some of the service program code means include attribute accessing computer-readable program code means coupling the service program code means to the attributes in the object context for intercommunication therebetween and for operating upon the object attributes. The intercommunication is based on identifications of the attributes by the service programs that are recognizable by the object context. Mapping computer-readable program code means couple the at least some of the service programs to the object context, for mapping a user-expressed attribute name, not recognizable by the object context, to the identification of the attributes recognizable by the object context.

Journal ArticleDOI
TL;DR: The specification that previously would have been handed to a software engineer for hand coding is now used as an 'executable specification', which can be used to develop test procedures that can be applied both in simulation and on the real product.
Abstract: Automatic code generation already plays a valuable role in embedded development. Engineers are turning to advanced software tools that generate code automatically, both during the prototyping stage of the project and when production code is required. At the prototyping stage, automatic code generation can greatly accelerate the development process, allowing many different algorithms to be tried in a shorter period. Furthermore, because there is no significant time penalty for trying alternative solutions, automatic code generation tools positively encourage innovation, eliminating the temptation to re-use previously developed code in compromised solutions. The specification that previously would have been handed to a software engineer for hand coding is now used as an 'executable specification'. In addition to forming the basis for code generation, this specification can be used to develop test procedures that can be applied both in simulation and on the real product. A further advantage of this approach is that algorithm developers can test their ideas without having to wait until the associated code is ready for downloading to a target processor. This not only benefits the algorithm designer, but also frees the software engineer from the routine coding of algorithms, allowing greater effort to be devoted to more challenging issues.

01 Jan 2000
TL;DR: An integrated, traceable software development approach in the context of a use case design methodology that achieves several quality control properties is proposed, and a conversion algorithm that converts high-level design elements into code skeletons is developed.
Abstract: The challenge faced by software developers is how to capture dynamically changing requirements and how to maintain consistency between the graphical models and the code generated to support the evolution of complex software. The use case driven approach has become extremely popular as an effective software development technique used in leading methodologies such as Unified Modeling Language (UML) because of its capacity for capturing functional requirements. However, use case driven methodologies do not sufficiently provide the systematic development process supported by a manageable relationship between design and the actual implementation. We propose an integrated, traceable software development approach in the context of a use case design methodology that achieves several quality control properties. As part of this approach, a layered architecture and a code generation model are introduced. The foundation of our approach lies in partitioning the design schemata into a layered architecture of functional components called design units. Design units provide the basis for the generation of modular, well-structured source code, the traceability of requirements throughout the software development process, the integration of test stubs into a structured code environment and the framework for a systematic approach to change management. We also developed a conversion algorithm that converts high-level design elements into code skeletons. Our skeletal code is generated from the interaction diagrams and event state tables in which the design units have been identified. In addition, we propose a complex interaction diagram and extended event state table in order to describe various associations between use cases.


Proceedings ArticleDOI
25 Jan 2000
TL;DR: The decentralized label model is described, a new model for controlling information flow in systems with mutual distrust and decentralized authority that improves on existing multilevel security models by allowing users to declassify information in a decentralized way, and by improving support for fine-grained data sharing.
Abstract: This paper describes the decentralized label model, a new model for controlling information flow in systems with mutual distrust and decentralized authority. The model allows users to share information with distrusted code (e.g., downloaded applets), yet still control how that code disseminates the shared information to others. The model improves on existing multilevel security models by allowing users to declassify information in a decentralized way, and by improving support for fine-grained data sharing. It supports static program analysis of information flow so that programs can be certified to permit only acceptable information flows and to avoid most run-time information flow checks. In addition to presenting the model, the paper also discusses how the model can be supported in a distributed environment, including an introduction to Jif, an extension to Java that incorporates the model and permits static checking of information flow.

Proceedings ArticleDOI
01 Nov 2000
TL;DR: This paper proposes a new proof-based approach to safe evolution of distributed software systems by extending the simple certification mechanism of proof-carrying code to make it interactive and probabilistic, thereby devising code with interactive proof (CIP).
Abstract: This paper proposes a new proof-based approach to safe evolution of distributed software systems. Specifically it extends the simple certification mechanism of proof-carrying code (PCC) to make it interactive and probabilistic, thereby devising code with interactive proof (CIP). With CIP, a code consumer is convinced, with overwhelming probability, of the existence and validity of a safety proof of a transmitted code through interaction with a code producer. The class of safety properties that are provable by CIP is larger than the class provable by PCC, provided that each code consumer is allowed to spend a reasonable amount of time on verification. Moreover, CIP can be further extended to devise code with zero-knowledge interactive proof (CZKIP). This concept is useful, for example, when the code producer wants to use the safety proof as a kind of "copyright" of the code.