
Showing papers on "White-box testing published in 2006"


Book
11 Dec 2006
TL;DR: In this book, the authors give a practical introduction to model-based testing, showing how to write models for testing purposes and how to use model-based testing tools to generate test suites.
Abstract: This book gives a practical introduction to model-based testing, showing how to write models for testing purposes and how to use model-based testing tools to generate test suites. It is aimed at testers and software developers who wish to use model-based testing, rather than at tool-developers or academics. The book focuses on the mainstream practice of functional black-box testing and covers different styles of models, especially transition-based models (UML state machines) and pre/post models (UML/OCL specifications and B notation). The steps of applying model-based testing are demonstrated on examples and case studies from a variety of software domains, including embedded software and information systems. From this book you will learn: * The basic principles and terminology of model-based testing * How model-based testing differs from other testing processes * How model-based testing fits into typical software lifecycles such as agile methods and the Unified Process * The benefits and limitations of model-based testing, its cost effectiveness and how it can reduce time-to-market * A step-by-step process for applying model-based testing * How to write good models for model-based testing * How to use a variety of test selection criteria to control the tests that are generated from your models * How model-based testing can connect to existing automated test execution platforms such as Mercury Test Director, Java JUnit, and proprietary test execution environments * Presents the basic principles and terminology of model-based testing * Shows how model-based testing fits into the software lifecycle, its cost-effectiveness, and how it can reduce time to market * Offers guidance on how to use different kinds of modeling techniques, useful test generation strategies, how to apply model-based testing techniques to real applications using case studies
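
To make the transition-based style concrete, here is a minimal Java sketch with an invented coffee-machine model (not from the book): the behaviour model is an explicit state machine, and one abstract test case is generated per transition, i.e. an all-transitions selection criterion.

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Minimal transition-based model (an invented coffee machine, not from the
    // book): tests are event sequences generated to satisfy an all-transitions
    // selection criterion.
    public class FsmTestGeneration {
        record Transition(String from, String event, String to) {}

        public static void main(String[] args) {
            List<Transition> model = List.of(
                new Transition("IDLE", "insertCoin", "PAID"),
                new Transition("PAID", "selectDrink", "BREWING"),
                new Transition("PAID", "cancel", "IDLE"),
                new Transition("BREWING", "done", "IDLE"));

            // One abstract test case per transition: reach its source state,
            // then fire its event.
            for (Transition target : model) {
                List<String> test = pathTo(model, "IDLE", target.from());
                test.add(target.event());
                System.out.println("test: " + test);
            }
        }

        // Breadth-first search for a shortest event sequence reaching 'goal'.
        static List<String> pathTo(List<Transition> model, String start, String goal) {
            Map<String, List<String>> seen = new HashMap<>();
            seen.put(start, new ArrayList<>());
            Deque<String> queue = new ArrayDeque<>(List.of(start));
            while (!queue.isEmpty()) {
                String s = queue.poll();
                if (s.equals(goal)) return new ArrayList<>(seen.get(s));
                for (Transition t : model)
                    if (t.from().equals(s) && !seen.containsKey(t.to())) {
                        List<String> path = new ArrayList<>(seen.get(s));
                        path.add(t.event());
                        seen.put(t.to(), path);
                        queue.add(t.to());
                    }
            }
            throw new IllegalStateException("state not reachable: " + goal);
        }
    }

A model-based testing tool adds much more (data, oracles, concrete test scripts), but the core generation loop has this shape.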

890 citations


Proceedings ArticleDOI
21 Jul 2006
TL;DR: This paper focuses on structural coverage criteria on requirements formalized as LTL properties and discusses how they can be adapted to measure finite test cases and can be used to automatically generate a requirements-based test suite.
Abstract: In black-box testing, one is interested in creating a suite of tests from requirements that adequately exercise the behavior of a software system without regard to the internal structure of the implementation. In current practice, the adequacy of black-box test suites is inferred by examining coverage on an executable artifact, either source code or a software model. In this paper, we define structural coverage metrics directly on high-level formal software requirements. These metrics provide objective, implementation-independent measures of how well a black-box test suite exercises a set of requirements. We focus on structural coverage criteria on requirements formalized as LTL properties and discuss how they can be adapted to measure finite test cases. These criteria can also be used to automatically generate a requirements-based test suite. Unlike model- or code-derived test cases, these tests are immediately traceable to high-level requirements. To assess the practicality of our approach, we apply it on a realistic example from the avionics domain.
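
As a rough illustration of why finite test cases need adapted semantics (the paper's actual criteria are more elaborate), the following Java sketch evaluates the LTL-style requirement G(request -> F grant) over a finite trace; the property is invented, not one of the paper's case-study requirements.

    import java.util.List;

    // Finite-trace evaluation of G(request -> F grant): on a finite test case
    // the "eventually" obligation must be discharged before the trace ends,
    // which is why coverage criteria on LTL properties have to be adapted.
    public class FiniteTraceCheck {
        record Step(boolean request, boolean grant) {}

        // G(request -> F grant): every request is eventually followed by a
        // grant within the finite trace.
        static boolean holds(List<Step> trace) {
            for (int i = 0; i < trace.size(); i++) {
                if (!trace.get(i).request()) continue;
                boolean granted = false;
                for (int j = i; j < trace.size(); j++)
                    if (trace.get(j).grant()) { granted = true; break; }
                if (!granted) return false; // obligation still open at trace end
            }
            return true;
        }

        public static void main(String[] args) {
            List<Step> pass = List.of(new Step(true, false), new Step(false, true));
            List<Step> fail = List.of(new Step(false, false), new Step(true, false));
            System.out.println(holds(pass)); // true
            System.out.println(holds(fail)); // false: the request is never granted
        }
    }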

149 citations


Proceedings ArticleDOI
Qian Yang, J. Jenny Li, David M. Weiss
23 May 2006
TL;DR: This survey studies and compares 17 coverage-based testing tools and shows that each tool has unique features tailored to its application domains, so it can be used to pick the right coverage testing tool for a given set of requirements.
Abstract: Test coverage is sometimes used as a way to measure how thoroughly software is tested. Coverage is used by software developers and sometimes by vendors to indicate their confidence in the readiness of their software. This survey studies and compares 17 coverage-based testing tools, focusing on, but not restricted to, coverage measurement. We also survey additional features, including program prioritization for testing, assistance in debugging, automatic generation of test cases, and customization of test reports. Such features make tools more useful and practical, especially for large-scale, real-life commercial software applications. Our initial motivations were both to understand the available test coverage tools and to compare them to a tool that we have developed, called eXVantage (a tool suite that includes code coverage testing, debugging, performance profiling, and reporting). Our study shows that each tool has unique features tailored to its application domains. This study can therefore be used to pick the right coverage testing tool depending on various requirements.

135 citations


Proceedings ArticleDOI
08 Jul 2006
TL;DR: The approach presented in this paper relies on a tree-based representation of method call sequences by which sequence feasibility is preserved throughout the entire search process, and uses an extended distance-based fitness function to deal with runtime exceptions.
Abstract: Evolutionary algorithms have successfully been applied to software testing. Not only approaches that search for numeric test data for procedural test objects have been investigated, but also techniques for automatically generating test programs that represent object-oriented unit test cases. Compared to numeric test data, test programs optimized for object-oriented unit testing are more complex. Method call sequences that realize interesting test scenarios must be evolved. An arbitrary method call sequence is not necessarily feasible due to call dependences which exist among the methods that potentially appear in a method call sequence. The approach presented in this paper relies on a tree-based representation of method call sequences by which sequence feasibility is preserved throughout the entire search process. In contrast to other approaches in this area, neither repair of individuals nor penalty mechanisms are required. Strongly-typed genetic programming is employed to generate method call trees. In order to deal with runtime exceptions, we use an extended distance-based fitness function. We performed experiments with four test objects. The initial results are promising: high code coverages were achieved completely automatically for all of the test objects.
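
A minimal Java sketch of the core idea, with invented names rather than the paper's implementation: argument positions in a method call tree are filled by subtrees of matching type, so every tree linearises into a feasible call sequence by construction, and no repair of individuals or penalty mechanism is needed.

    import java.util.List;

    // A method call tree whose arguments are produced by subtrees of matching
    // type; emitting it bottom-up always yields a feasible call sequence.
    public class CallTree {
        static int fresh = 0;

        record Call(String resultType, String template, List<Call> args) {
            // Emits this subtree as test code, arguments first, and returns
            // the name of the variable holding the call's result.
            String emit(StringBuilder out) {
                Object[] names = args.stream().map(a -> a.emit(out)).toArray();
                String expr = String.format(template, names);
                if (resultType.equals("void")) {
                    out.append(expr).append(";\n");
                    return "";
                }
                String var = "v" + fresh++;
                out.append(resultType).append(' ').append(var)
                   .append(" = ").append(expr).append(";\n");
                return var;
            }
        }

        public static void main(String[] args) {
            Call stack = new Call("Stack", "new Stack()", List.of());
            Call item  = new Call("int", "7", List.of());
            Call push  = new Call("void", "%s.push(%s)", List.of(stack, item));
            StringBuilder test = new StringBuilder();
            push.emit(test);
            System.out.print(test);
            // Stack v0 = new Stack();
            // int v1 = 7;
            // v0.push(v1);
        }
    }

Strongly-typed genetic programming then mutates and recombines such trees while the type discipline keeps every individual executable.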

112 citations


Proceedings ArticleDOI
Jun Yan, Zhongjie Li, Yuan Yuan, Wei Sun, Jian Zhang
07 Nov 2006
TL;DR: This paper presents a novel method of BPEL test case generation, which is based on concurrent path analysis, and uses an extended control flow graph (XCFG) to represent a BPEL program, and generates all the sequential test paths from XCFG to form concurrent test paths.
Abstract: BPEL is a language that can express complex concurrent behaviors. This paper presents a novel method of BPEL test case generation based on concurrent path analysis. The method first uses an Extended Control Flow Graph (XCFG) to represent a BPEL program and generates all the sequential test paths from the XCFG. These sequential test paths are then combined to form concurrent test paths. Finally, a constraint solver, BoNuS, is used to solve the constraints of these test paths and generate feasible test cases. Techniques are proposed to reduce the number of combined concurrent test paths, and test criteria derived from traditional sequential program testing are presented to reduce the number of test cases further. The method is modularized, so that many test techniques, such as various test criteria and complex constraint solvers, can be applied. Experiments show the method to be sound and efficient, and it is also applicable to the testing of other business process languages with some extension and adaptation.
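
The combination step can be illustrated with a small Java sketch (the paper additionally prunes combinations and solves path constraints with BoNuS): all order-preserving interleavings of two sequential test paths are enumerated, which also shows why reduction techniques are needed.

    import java.util.ArrayList;
    import java.util.List;

    // All interleavings of two sequential test paths that preserve each
    // path's internal order; the count grows combinatorially with length.
    public class Interleavings {
        static List<List<String>> interleave(List<String> a, List<String> b) {
            List<List<String>> out = new ArrayList<>();
            if (a.isEmpty()) { out.add(new ArrayList<>(b)); return out; }
            if (b.isEmpty()) { out.add(new ArrayList<>(a)); return out; }
            for (List<String> rest : interleave(a.subList(1, a.size()), b))
                out.add(prepend(a.get(0), rest));
            for (List<String> rest : interleave(a, b.subList(1, b.size())))
                out.add(prepend(b.get(0), rest));
            return out;
        }

        static List<String> prepend(String head, List<String> tail) {
            List<String> l = new ArrayList<>();
            l.add(head);
            l.addAll(tail);
            return l;
        }

        public static void main(String[] args) {
            // Two sequential paths extracted from the XCFG of a BPEL <flow>
            // (activity names invented for the example).
            List<String> branch1 = List.of("receive", "invokeA");
            List<String> branch2 = List.of("invokeB");
            interleave(branch1, branch2).forEach(System.out::println);
            // Prints 3 concurrent test paths for these tiny branches.
        }
    }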

100 citations


Proceedings ArticleDOI
23 May 2006
TL;DR: In this article, the authors describe ongoing research on test case generation based on the Unified Modeling Language (UML) and an approach that builds on and combines existing techniques for data and graph coverage.
Abstract: This paper describes ongoing research on test case generation based on the Unified Modeling Language (UML). The described approach builds on and combines existing techniques for data and graph coverage. It first uses the Category-Partition method to introduce data into the UML model. UML Use Cases and Activity diagrams are used to describe, respectively, which functionalities should be tested and how to test them. This combination has the potential to create a very large number of test cases. The approach offers two ways to manage the number of tests. First, custom annotations and guards use the Category-Partition data, which gives the designer tight control over possible, or impossible, paths. Second, automation allows different configurations for both the data and the graph coverage. The process of modeling UML activity diagrams, annotating them with test data requirements, and generating test scripts from the models is described. The goal of this paper is to illustrate the benefits of our model-based approach for improving automation in software testing. The approach is demonstrated and evaluated based on use cases developed for testing a graphical user interface (GUI).
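
A minimal Java sketch of the Category-Partition half of the approach, using an invented checkout example: categories with their choices are crossed into test frames and filtered by a constraint, which is the kind of control the annotations and guards provide over the combinatorial blow-up.

    import java.util.List;

    // Category-Partition data for an invented checkout example: categories and
    // choices are crossed into test frames, and a constraint prunes impossible
    // combinations.
    public class CategoryPartition {
        public static void main(String[] args) {
            List<String> userRole = List.of("guest", "member", "admin");
            List<String> cartSize = List.of("empty", "one item", "many items");

            int frames = 0;
            for (String role : userRole)
                for (String cart : cartSize) {
                    // Constraint: guests cannot reach checkout with a filled cart.
                    if (role.equals("guest") && !cart.equals("empty")) continue;
                    System.out.println("frame: role=" + role + ", cart=" + cart);
                    frames++;
                }
            System.out.println(frames + " of 9 raw combinations survive");
        }
    }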

97 citations


Journal ArticleDOI
TL;DR: This work considers the case where inconsistencies are present between a system and its corresponding model, used for automatic verification, and presents an implementation of the proposed methodology called AMC (for Adaptive Model Checking), using techniques from black box testing and machine learning.
Abstract: We consider the case where inconsistencies are present between a system and its corresponding model, used for automatic verification. Such inconsistencies can be the result of modeling errors or recent modifications of the system. Despite such discrepancies, we can still attempt to perform automatic verification. In fact, as we show, we can sometimes exploit the verification results to assist in automatically learning the required updates to the model. In a related previous work, we have suggested the idea of black box checking, where verification starts without any model, and the model is obtained while repeated verification attempts are performed. Under the current assumptions, an existing inaccurate (but not completely obsolete) model is used to expedite the updates. We use techniques from black box testing and machine learning. We present an implementation of the proposed methodology called AMC (for Adaptive Model Checking). We discuss some experimental results, comparing various tactics of updating a model while trying to perform model checking.

93 citations


Journal ArticleDOI
01 May 2006
TL;DR: In this paper, a general framework for model-in-the-loop (MiL) system analysis is described and two detailed examples are given, which concern the integration of numerical aerodynamic and tyre models into car test rigs.
Abstract: Physical testing is used extensively to characterize mechanical systems. However, in many cases, mathematical models are now available that adequately describe the behaviour of part of the test specimen. Thus, test systems can be conceived which split the specimen into a physical part, and a virtual part, i.e. a real-time computer simulation. This has the potential to enhance convenience and reduce cost. The term ‘model-in-the-loop’ (MiL) has been used in the automotive industry to describe this concept. In this paper, a general framework for MiL system analysis is described. Two detailed examples are given. These concern the integration of numerical aerodynamic and tyre models into car test rigs. In the first case, the analytical results are validated by data from a real system. It is clear that particular demands are placed on the actuation and sensing systems if the numerical and real parts of the specimen are to interact correctly to give a realistic response for the complete system. Acceptabl...

93 citations


Journal ArticleDOI
TL;DR: A case study that investigates the effects of changing configurations on two types of test suites and shows that test coverage and fault detection effectiveness do not vary much across configurations for entire test suites; however, for individual test cases and certain types of faults, configurations matter.
Abstract: User-configurable software systems allow users to customize functionality at run time. In essence, each such system consists of a family of potentially thousands or millions of program instantiations. Testing methods cannot test all of these configurations, so some sampling mechanism must be applied. A common approach to providing such a mechanism has been to use combinatorial interaction testing. To date, however, little work has been done to quantify the effects of different configurations on a test suite's operation and effectiveness. In this paper, we present a case study that investigates the effects of changing configurations on two types of test suites. Our results show that test coverage and fault detection effectiveness do not vary much across configurations for entire test suites; however, for individual test cases and certain types of faults, configurations matter.

90 citations


Proceedings ArticleDOI
21 Sep 2006
TL;DR: The results suggest that TDD improves unit testing but slows down the overall process. TDD is an alternative to testing after coding (TAC), the usual approach of executing unit tests after the code has been written.
Abstract: Test-driven development (TDD) is gaining interest among practitioners and researchers: it promises to increase the quality of the code. Even though TDD is considered a development practice, it relies on the use of unit testing. For this reason, it could be an alternative to testing after coding (TAC), the usual approach of executing unit tests after the code has been written. We wondered what the differences between the two practices are, from the standpoint of quality and productivity. In order to answer our research question, we carried out an experiment in a Spanish software house. The results suggest that TDD improves unit testing but slows down the overall process.

85 citations


Patent
28 Dec 2006
TL;DR: An automated testing framework enables automated testing of complex software systems; it can be configured for test selection, flow definition, and automated scheduled testing of complex computer systems.
Abstract: An automated testing framework enables automated testing of complex software systems. The framework can be configured for test selection, flow definition, and automated scheduled testing of complex computer systems. The framework has facilities for result analysis, comparison of key performance indicators with predefined target values, and test management.

01 Jan 2006
TL;DR: This research indicates that an adapted form of mutation-based testing can be used for effective and automated testing of timeliness and, thus, for increasing the confidence level in real-time systems that are designed according to the event-triggered paradigm.
Abstract: A problem when testing timeliness of event-triggered real-time systems is that response times depend on the execution order of concurrent tasks. Conventional testing methods ignore task interleaving and timing and thus do not help determine which execution orders need to be exercised to gain confidence in temporal correctness. This thesis presents and evaluates a framework for testing of timeliness that is based on mutation testing theory. The framework includes two complementary approaches for mutation-based test case generation, testing criteria for timeliness, and tools for automating the test case generation process. A scheme for automated test case execution is also defined. The testing framework assumes that a structured notation is used to model the real-time applications and their execution environment. This real-time system model is subsequently mutated by operators that mimic potential errors that may lead to timeliness failures. Each mutated model is automatically analyzed to generate test cases that target execution orders that are likely to lead to timeliness failures. The validation of the theory and methods in the proposed testing framework is done iteratively through case-studies, experiments and proof-of-concept implementations. This research indicates that an adapted form of mutation-based testing can be used for effective and automated testing of timeliness and, thus, for increasing the confidence level in real-time systems that are designed according to the event-triggered paradigm.

Proceedings ArticleDOI
28 May 2006
TL;DR: This paper proposes a new approach in which the database states required for testing are specified intensionally, as constrained queries that can be used to prepare the database for testing automatically; the approach does not appear to impose significant performance overheads.
Abstract: When testing database applications, in addition to creating in-memory fixtures it is also necessary to create an initial database state that is appropriate for each test case. Current approaches either require exact database states to be specified in advance, or else generate a single initial state (under guidance from the user) that is intended to be suitable for execution of all test cases. The first method allows large test suites to be executed in batch, but requires considerable programmer effort to create the test cases (and to maintain them). The second method requires less programmer effort, but increases the likelihood that test cases will fail in non-fault situations, due to unexpected changes to the content of the database. In this paper, we propose a new approach in which the database states required for testing are specified intensionally, as constrained queries, that can be used to prepare the database for testing automatically. This technique overcomes the limitations of the other approaches, and does not appear to impose significant performance overheads.
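
A small sketch of the idea, with an invented schema and no claim to the paper's notation: the test declares the database state it needs as a constrained query, and a preparation step would make the database satisfy it before execution.

    // Invented schema and helper strings, not the paper's notation: a test
    // declares the database state it needs as a constrained query, and a
    // preparation step would make the database satisfy it before execution.
    public class IntensionalFixture {
        public static void main(String[] args) {
            // Declarative requirement: at least one overdue, unpaid invoice.
            String requirement =
                "SELECT * FROM invoice WHERE due_date < CURRENT_DATE AND paid = FALSE";

            // A fixture generator would check whether the query already has a
            // witness in the database and, only if not, insert satisfying rows:
            String repair =
                "INSERT INTO invoice (id, due_date, paid) " +
                "VALUES (1001, DATE '2006-01-01', FALSE)";

            System.out.println("require: " + requirement);
            System.out.println("repair:  " + repair);
            // The test then runs against a state known to satisfy its
            // constraint, independent of the rest of the database's content.
        }
    }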

Journal ArticleDOI
TL;DR: Recently, software development teams using agile processes have started widely adopting test-driven development, a development practice that forces the programmer to think about many aspects of the feature before coding it and provides a safety net of tests.
Abstract: Recently, software development teams using agile processes have started widely adopting test-driven development. Despite its name, "test driven" or "test first" development isn't really a testing technique. Also known as test-driven design, TDD works like this: For each small bit of functionality the programmers code, they first write unit tests. Then they write the code that makes those unit tests pass. This forces the programmer to think about many aspects of the feature before coding it. It also provides a safety net of tests that the programmers can run with each update to the code, ensuring that refactored, updated, or new code doesn't break existing functionality.
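
A minimal JUnit illustration of the cycle (the example is invented): the test is written first, against code that does not yet exist, and the functionality is then written to make it pass.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    // Test-first illustration: this unit test is written before ShoppingCart
    // exists; the class below is then written to make the test pass, and the
    // test remains in the suite as a safety net for later changes.
    public class ShoppingCartTest {
        @Test
        public void totalSumsItemPrices() {
            ShoppingCart cart = new ShoppingCart();
            cart.add("book", 20);
            cart.add("pen", 5);
            assertEquals(25, cart.total());
        }
    }

    // The small bit of functionality, written second to satisfy the test.
    class ShoppingCart {
        private int total = 0;
        void add(String name, int price) { total += price; }
        int total() { return total; }
    }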

Journal ArticleDOI
TL;DR: This work makes use of a special kind of situation, which it calls checkpoints, such that the middleware will not activate the functions under test, and shows how hidden failures can be detected.
Abstract: During the testing of context-sensitive middleware-based software, the middleware checks the current situation to invoke the appropriate functions of the applications. Since the middleware remains active and the situation may continue to evolve, however, the conclusion of some test cases may not easily be identified. Moreover, failures appearing in one situation may be superseded by subsequent correct outcomes and, therefore, be hidden. We alleviate the above problems by making use of a special kind of situation, which we call checkpoints, such that the middleware will not activate the functions under test. We recommend testers to generate test cases that start at a checkpoint and end at another. Testers may identify relations that associate different execution sequences of a test case. They then check the results of each test case to detect any contravention of such relations. We illustrate our technique with an example that shows how hidden failures can be detected. We also report the experimentation carried out on an RFID-based location-sensing application on top of a context-sensitive middleware.

Proceedings ArticleDOI
18 Sep 2006
TL;DR: A novel pair, model-based symbolic testing, is developed, and the results show that the pairs are complementary (i.e., reveal faults differently), with their respective strengths and weaknesses.
Abstract: Testing involves two major activities: generating test inputs and determining whether they reveal faults. Automated test generation techniques include random generation and symbolic execution. Automated test classification techniques include ones based on uncaught exceptions and violations of operational models inferred from manually provided tests. Previous research on unit testing for object-oriented programs developed three pairs of these techniques: model-based random testing, exception-based random testing, and exception-based symbolic testing. We develop a novel pair, model-based symbolic testing. We also empirically compare all four pairs of these generation and classification techniques. The results show that the pairs are complementary (i.e., reveal faults differently), with their respective strengths and weaknesses.

Proceedings ArticleDOI
18 Sep 2006
TL;DR: A new test adequacy criterion is introduced that is based on coverage of the database commands generated by an application and specifically focuses on the application-database interactions.
Abstract: The testing of database applications poses new challenges for software engineers. In particular, it is difficult to thoroughly test the interactions between an application and its underlying database, which typically occur through dynamically-generated database commands. Because traditional code-based coverage criteria focus only on the application code, they are often inadequate in exercising these commands. To address this problem, we introduce a new test adequacy criterion that is based on coverage of the database commands generated by an application and specifically focuses on the application-database interactions. We describe the criterion, an analysis that computes the corresponding testing requirements, and an efficient technique for measuring coverage of these requirements. We also present a tool that implements our approach and a preliminary study that shows the approach's potential usefulness and feasibility.
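
A rough Java sketch of the intuition only, not the paper's criterion or analysis: intercept the dynamically generated SQL at the application's database access point and record which command shapes the test suite exercised. The execute() method is a hypothetical stand-in for the data access layer.

    import java.util.LinkedHashSet;
    import java.util.Set;

    // Records which SQL command shapes the test suite causes the application
    // to issue, abstracting away concrete literal values.
    public class CommandCoverage {
        static final Set<String> exercised = new LinkedHashSet<>();

        static void execute(String sql) {
            exercised.add(sql.replaceAll("'[^']*'", "?")      // abstract string literals
                             .replaceAll("\\b\\d+\\b", "?")); // abstract numbers
            // ... a real wrapper would forward the command to JDBC here ...
        }

        public static void main(String[] args) {
            execute("SELECT * FROM users WHERE id = 42");
            execute("SELECT * FROM users WHERE id = 7");  // same command shape
            execute("UPDATE users SET name = 'Ann' WHERE id = 7");
            System.out.println(exercised.size() + " distinct commands covered:");
            exercised.forEach(System.out::println);
            // Statement coverage of the application code would not distinguish
            // these runs; here the database interaction itself is the test
            // requirement.
        }
    }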

Book
17 Nov 2006
TL;DR: In this book, the authors propose a paradigm shift from traditional software testing to risk-based security testing, prioritizing security testing with threat modeling and explaining how vulnerabilities get into all software.
Abstract (table of contents): Foreword. Preface. Acknowledgments. About the Authors.
Part I: Introduction
Chapter 1: Case Your Own Joint: A Paradigm Shift from Traditional Software Testing
Chapter 2: How Vulnerabilities Get Into All Software
Chapter 3: The Secure Software Development Lifecycle
Chapter 4: Risk-Based Security Testing: Prioritizing Security Testing with Threat Modeling
Chapter 5: Shades of Analysis: White, Gray, and Black Box Testing
Part II: Performing the Attacks
Chapter 6: Generic Network Fault Injection
Chapter 7: Web Applications: Session Attacks
Chapter 8: Web Applications: Common Issues
Chapter 9: Web Proxies: Using WebScarab
Chapter 10: Implementing a Custom Fuzz Utility
Chapter 11: Local Fault Injection
Part III: Analysis
Chapter 12: Determining Exploitability
Index

Journal ArticleDOI
TL;DR: It is argued that the perceived costs of unit testing may be exaggerated and that the likely benefits in terms of defect detection are quite high in relation to those costs.
Abstract: Conventional wisdom and anecdote suggest that testing takes between 30 and 50% of a project's effort. However, testing is not a monolithic activity, as it consists of a number of different phases, such as unit testing, integration testing, and finally system and acceptance testing. Unit testing has received a lot of criticism in terms of the amount of time it is perceived to take and its perceived costs. However, it remains an important verification activity, being an effective means to test individual software components for boundary-value behavior and to ensure that all code has been exercised adequately. We examine the available data from three safety-related industrial software projects that have made use of unit testing. Using this information, we argue that the perceived costs of unit testing may be exaggerated and that the likely benefits in terms of defect detection are quite high in relation to those costs. We also discuss the different issues that have been found applying the technique at different phases of development and using different methods to generate those tests. We also compare our results with empirical results from the literature and highlight some possible weaknesses of research in this area.

Proceedings ArticleDOI
17 Jul 2006
TL;DR: A notion of object distance is defined, with associated algorithms to compute distances between arbitrary objects, and used to generalize Adaptive Random Testing to such inputs, which opens the way for effective automated testing of large, realistic object-oriented programs.
Abstract: Testing with random inputs can give surprisingly good results if the distribution of inputs is spread out evenly over the input domain; this is the intuition behind Adaptive Random Testing, which relies on a notion of "distance" between test values. Such distances have so far been defined for integers and other elementary inputs; extending the idea to the testing of today's object-oriented programs requires a more general notion of distance, applicable to composite programmer-defined types. We define a notion of object distance, with associated algorithms to compute distances between arbitrary objects, and use it to generalize Adaptive Random Testing to such inputs. The resulting testing strategies open the way for effective automated testing of large, realistic object-oriented programs.
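
A minimal Java sketch of the shape of such a distance, with invented weights and combination rule (the paper defines object distance precisely): elementary distances on matching fields, plus a discounted recursive distance on object references.

    // Elementary distances on fields plus a depth-bounded recursive distance
    // on references; weights here are illustrative only.
    public class ObjectDistance {
        record Person(String name, int age, Person friend) {}

        static double distance(Person a, Person b, int depth) {
            if (a == null && b == null) return 0;
            if (a == null || b == null) return 10;           // presence mismatch
            double d = Math.abs(a.age() - b.age())           // numeric field
                     + (a.name().equals(b.name()) ? 0 : 5);  // string field
            if (depth > 0)                                   // recurse on reference
                d += 0.5 * distance(a.friend(), b.friend(), depth - 1);
            return d;
        }

        public static void main(String[] args) {
            Person ann = new Person("Ann", 30, null);
            Person bob = new Person("Bob", 32, ann);
            Person eve = new Person("Eve", 29, ann);
            // Adaptive Random Testing picks, among random candidates, the one
            // farthest (under this distance) from all previously used inputs.
            System.out.println(distance(bob, eve, 2)); // 8.0: ages +3, names +5
        }
    }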

Book ChapterDOI
29 Oct 2006
TL;DR: This paper motivates the need for unit testing of ontologies, describes the adaptation of the unit testing approach, and gives use cases and examples, noting that at first the idea of unit testing seems not applicable to ontologies.
Abstract: In software engineering, the notion of unit testing was successfully introduced and applied. Unit tests are easily manageable tests for small parts of a program – single units. They proved especially useful to capture unwanted changes and side effects during the maintenance of a program, and they grow with the evolution of the program. Ontologies behave quite differently than program units. As there is no information hiding in ontology engineering, and thus no black-box components, at first the idea of unit testing for ontologies seems not applicable. In this paper we motivate the need for unit testing, describe the adaptation of the unit testing approach, and give use cases and examples.

Proceedings ArticleDOI
28 May 2006
TL;DR: GridUnit, an extension of the widely adopted JUnit testing framework, is introduced; it automatically distributes the execution of software tests on a computational grid with minimum user intervention and improves the cost-effectiveness of software testing.
Abstract: Software testing is a fundamental part of system development. As software grows, its test suite becomes larger and its execution time may become a problem to software developers. This is especially the case for agile methodologies, which preach a short develop/test cycle. Moreover, due to the increasing complexity of systems, there is the need to test software in a variety of environments. In this paper, we introduce GridUnit, an extension of the widely adopted JUnit testing framework, able to automatically distribute the execution of software tests on a computational grid with minimum user intervention. Experiments conducted with this solution have shown a speed-up of almost 70x, reducing the duration of the test phase of a synthetic application from 24 hours to less than 30 minutes. The solution does not require any source-code modification, hides the grid complexity from the user, and provides a cost-effectiveness improvement to the software testing experience.
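
The idea can be sketched in Java with a local thread pool standing in for the grid (this is not GridUnit's API): each JUnit test class is dispatched to a worker and results are collected centrally.

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    import org.junit.Test;
    import org.junit.runner.JUnitCore;
    import org.junit.runner.Result;

    // Fan JUnit test classes out to workers and collect results; on a grid,
    // each submission would go to a remote node instead of a local thread.
    public class DistributedSuite {
        // Placeholder test classes standing in for a real, long-running suite.
        public static class ParserTest { @Test public void parses() {} }
        public static class CodecTest  { @Test public void roundTrips() {} }

        public static void main(String[] args) throws Exception {
            List<Class<?>> testClasses = List.of(ParserTest.class, CodecTest.class);
            ExecutorService workers = Executors.newFixedThreadPool(4);

            List<Future<Result>> results = testClasses.stream()
                .map(c -> workers.submit(() -> JUnitCore.runClasses(c)))
                .toList();

            for (Future<Result> f : results)
                System.out.println("failures: " + f.get().getFailureCount());
            workers.shutdown();
            // The speed-up then scales with the number of available nodes.
        }
    }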

Proceedings ArticleDOI
18 Sep 2006
TL;DR: A prototype tool is developed based on symbolic execution of .NET code that generates mock objects including their behavior by analyzing all uses of the mock object in a given unit test.
Abstract: Unit testing is a popular way to guide software development and testing. Each unit test should target a single feature, but in practice it is difficult to test features in isolation. Mock objects are a well-known technique to substitute parts of a program that are irrelevant for a particular unit test. Today, mock objects are usually written manually, supported by tools that generate method stubs or distill behavior from existing programs. We have developed a prototype tool based on symbolic execution of .NET code that generates mock objects, including their behavior, by analyzing all uses of the mock object in a given unit test. It is not required that an actual implementation of the mocked behavior exists. We are working towards an integration of our tool into Visual Studio Team System.
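
A hand-written Java illustration of what a generated mock amounts to (the paper's tool derives such behaviour automatically, by symbolically executing the unit test's uses of the mock): a dynamic proxy substitutes the dependency and replays exactly the behaviour the test needs.

    import java.lang.reflect.Proxy;
    import java.util.Map;

    // A dynamic proxy stands in for a dependency so the unit under test can
    // be exercised in isolation; the canned behaviour covers only what this
    // one test needs.
    public class MockSketch {
        interface PriceService { int priceOf(String item); } // dependency to isolate

        public static void main(String[] args) {
            Map<String, Integer> canned = Map.of("book", 20, "pen", 5);
            PriceService mock = (PriceService) Proxy.newProxyInstance(
                MockSketch.class.getClassLoader(),
                new Class<?>[] { PriceService.class },
                (proxy, method, a) -> canned.get((String) a[0]));

            // Unit under test uses the mock instead of a real price database.
            int total = mock.priceOf("book") + mock.priceOf("pen");
            System.out.println(total == 25 ? "PASS" : "FAIL");
        }
    }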

Proceedings ArticleDOI
17 Sep 2006
TL;DR: This paper presents a method of reducing overheads called Forgetting, in which the number of test cases used in the restriction algorithm is limited and thus the computational overheads reduced.
Abstract: Adaptive Random Testing (ART) methods are software testing methods based on Random Testing, but which use additional mechanisms to ensure more even and widespread distributions of test cases over an input domain. Restricted Random Testing (RRT) is a version of ART that uses exclusion regions and restricts test case generation to outside these regions. RRT has been found to perform very well, but incurs some additional computational cost in its restriction of the input domain. This paper presents a method of reducing overheads called Forgetting, where the number of test cases used in the restriction algorithm can be limited and thus the computational overheads reduced. The motivation for Forgetting comes from its importance as a human strategy for learning. Several implementations are presented and examined using simulations. The results are very encouraging.
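
A one-dimensional Java sketch with invented parameters: candidates falling inside an exclusion radius around remembered test cases are rejected, and forgetting bounds that memory to the last K cases so the rejection check stays cheap as the run grows.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.Random;

    // Restricted Random Testing with a bounded memory: only the last K test
    // cases contribute exclusion regions.
    public class RrtWithForgetting {
        public static void main(String[] args) {
            final int K = 10;            // remember at most K previous tests
            final double RADIUS = 0.02;  // exclusion radius around each test
            Deque<Double> memory = new ArrayDeque<>();
            Random rnd = new Random(42);

            for (int i = 0; i < 100; i++) {
                double candidate;
                do {
                    candidate = rnd.nextDouble();   // input domain [0, 1)
                } while (tooClose(candidate, memory, RADIUS));
                // ... run the program under test on 'candidate' here ...
                memory.addLast(candidate);
                if (memory.size() > K) memory.removeFirst(); // forget the oldest
            }
            System.out.println("generated 100 well-spread test inputs");
        }

        static boolean tooClose(double c, Deque<Double> memory, double r) {
            for (double prev : memory)
                if (Math.abs(c - prev) < r) return true;
            return false;
        }
    }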

Proceedings ArticleDOI
29 Aug 2006
TL;DR: The technique combines dependence analysis and symbolic evaluation and uses information about the changes between two versions of a program to identify parts of the program affected by the changes, compute the conditions under which the effects of the changes are propagated, and create a set of testing requirements based on the computed information.
Abstract: This paper presents a new test-suite augmentation technique for use in regression testing of software. Our technique combines dependence analysis and symbolic evaluation and uses information about the changes between two versions of a program to (1) identify parts of the program affected by the changes, (2) compute the conditions under which the effects of the changes are propagated to such parts, and (3) create a set of testing requirements based on the computed information. Testers can use these requirements to assess the effectiveness of the regression testing performed so far and to guide the selection of new test cases. The paper also presents MATRIX, a tool that partially implements our technique, and its integration into a regression-testing environment. Finally, the paper presents a preliminary empirical study performed on two small programs. The study provides initial evidence of both the effectiveness of our technique and the shortcomings of previous techniques in assessing the adequacy of a test suite with respect to exercising the effect of program changes.

Proceedings ArticleDOI
17 Sep 2006
TL;DR: The MORABIT project realizes such an infrastructure for built-in-test (BIT) and extends the BIT concepts to allow for a smooth integration of the testing process and the original business functionality execution.
Abstract: Runtime testing is important for improving the quality of software systems. This fact holds true especially for systems which cannot be completely assembled at development time, such as mobile or ad-hoc systems. The concepts of built-in-test (BIT) can be used to cope with runtime testing, but to our knowledge there does not exist an implemented infrastructure for BIT. The MORABIT project realizes such an infrastructure and extends the BIT concepts to allow for a smooth integration of the testing process and the original business functionality execution. In this paper, the requirements on the infrastructure and our solution are presented.

Journal ArticleDOI
TL;DR: In this article, the authors propose a data flow analysis approach for inter-class testing, which is based on finite state machines, database modeling and processing techniques, and algorithms for analysis and traversal of directed graphs.
Abstract: In object-oriented terms, one of the goals of integration testing is to ensure that messages from objects in one class or component are sent and received in the proper order and have the intended effect on the state of external objects that receive the messages. This research extends an existing single-class testing technique to integration testing. The previous method models the behavior of a single class as a finite state machine, transforms that representation into a data flow graph that explicitly identifies the definitions and uses of each state variable of the class, and then applies conventional data flow testing to produce test case specifications that can be used to test the class. This paper extends those ideas to inter-class testing by developing flow graphs and tests for an arbitrary number of classes and components. It introduces flexible representations for message sending and receiving among objects and allows concurrency among any or all classes and components. A second major result is the introduction of a novel approach to performing data flow analysis. Data flow graphs are stored in a relational database, and database queries are used to gather def-use information. This approach is conceptually simple, mathematically precise, quite powerful, and general enough to be used for traditional data flow analysis. This testing approach relies on finite state machines, database modeling and processing techniques, and algorithms for analysis and traversal of directed graphs. A proof-of-concept implementation is used to illustrate how the approach works on an extended example.
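
A sketch of what such a query might look like, with an invented relational schema rather than the paper's: once definitions, uses, and reachability are stored as tables, def-use pairs fall out of a join.

    // With the data flow graph in tables, gathering def-use information is a
    // join between definitions and uses of the same variable connected by a
    // reachability relation (schema invented for illustration).
    public class DefUseQuery {
        public static void main(String[] args) {
            String defUsePairs =
                "SELECT d.node AS def_node, u.node AS use_node, d.variable " +
                "FROM definitions d " +
                "JOIN uses u ON u.variable = d.variable " +
                "JOIN reaches r ON r.from_node = d.node AND r.to_node = u.node";
            System.out.println(defUsePairs);
            // Each returned row is a def-use pair, i.e. one data flow test
            // requirement for the inter-class test suite.
        }
    }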

Proceedings ArticleDOI
22 Oct 2006
TL;DR: This paper presents an approach that has been used to address security when running projects according to agile principles, and misuse stories have been added to user stories to capture malicious use of the application.
Abstract: In this paper, we present an approach that we have used to address security when running projects according to agile principles. Misuse stories have been added to user stories to capture malicious use of the application. Furthermore, misuse stories have been implemented as automated tests (unit tests, acceptance tests) in order to perform security regression testing. Penetration testing, system hardening and securing deployment have been started in early iterations of the project.
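
An invented example of a misuse story implemented as an automated unit test: kept in the suite, it runs in every iteration as a security regression test.

    import static org.junit.Assert.assertFalse;
    import org.junit.Test;

    // Misuse story as an automated test: "as an attacker, I inject SQL
    // through the login form".
    public class MisuseStoryTest {
        @Test
        public void loginRejectsSqlInjection() {
            AuthService auth = new AuthService();
            boolean loggedIn = auth.login("admin' --", "anything");
            assertFalse("injection string must not bypass authentication", loggedIn);
        }
    }

    // Hypothetical unit under test, reduced to the behaviour the story targets.
    class AuthService {
        boolean login(String user, String password) {
            // Safe behaviour: input is treated as data, never as SQL text.
            return "admin".equals(user) && "secret".equals(password);
        }
    }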

01 Jan 2006
TL;DR: This work proposes new methods for systematically and automatically testing sequential and concurrent programs, based on concolic testing, race-detection and flipping, and predictive monitoring, and describes tools developed for testing both C and Java programs.
Abstract: Testing using manually generated test cases is the primary technique used in industry to improve reliability of software---in fact, such ad hoc testing accounts for over half of the typical cost of software development. We propose new methods for systematically and automatically testing sequential and concurrent programs. The methods are based on three new techniques: concolic testing, race-detection and flipping, and predictive monitoring. Concolic testing combines concrete and symbolic testing to avoid redundant test cases as well as false warnings. Concolic testing can catch generic errors such as assertion violations, uncaught exceptions, and segmentation faults. Large real-world programs are almost always concurrent. Because of the inherent non-determinism of such programs, testing is notoriously hard. We extend concolic testing with a method called race-detection and flipping, which provides ways of reducing, often exponentially, the exploration space for concolic testing. This combined method provides the first technique to effectively test concurrent programs with complex data inputs. Concolic testing may also be combined with formal specifications by using runtime monitors. Runtime monitors are small software units which are synthesized automatically from the formal specification for the software and weaved into the code to dynamically check if the specification is violated. For multi-threaded concurrent programs, we developed a novel technique which allows efficient predictive monitoring to enable the detection of a violation by observing some related, but possibly bug-free execution of a concurrent program. Predictive monitoring dramatically improves the efficiency of testing. Based on the above methods we have developed tools for testing both C and Java programs. We have used the tools to find bugs in several real-world software systems including SGLIB, a popular C data structure library used in a commercial tool, implementations of the Needham-Schroeder protocol and the TMN protocol, the scheduler of Honeywell's DEOS real-time operating system, and Sun Microsystems' JDK 1.4 collection framework.
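
A Java sketch of concolic testing's core bookkeeping only, hand-instrumented and with the solver step omitted: execute on concrete inputs while recording symbolic path constraints, which a real tool would negate and solve to reach new paths.

    import java.util.ArrayList;
    import java.util.List;

    // Concrete execution with symbolic constraint recording at each branch;
    // the constraint-solving step of a real concolic tool is omitted.
    public class ConcolicSketch {
        static List<String> pathConstraints = new ArrayList<>();

        // Unit under test, instrumented by hand for this sketch.
        static void unit(int x, int y) {
            if (x > 10) {
                pathConstraints.add("x > 10");
                if (y == x * 2) {
                    pathConstraints.add("y == x * 2");
                    throw new AssertionError("bug reached");
                } else pathConstraints.add("y != x * 2");
            } else pathConstraints.add("x <= 10");
        }

        public static void main(String[] args) {
            unit(0, 0); // start from arbitrary concrete inputs
            System.out.println("path: " + pathConstraints);
            // A concolic tool would now negate the last recorded constraint
            // ("x <= 10" becomes "x > 10"), solve it (e.g. x = 11), and rerun,
            // systematically steering execution down unexplored paths until
            // the buggy branch y == 2*x is reached, without redundant tests.
        }
    }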

Proceedings ArticleDOI
29 Aug 2006
TL;DR: This paper proposes an experimental framework for comparison of test techniques with respect to efficiency, effectiveness and applicability, and plans to evaluate ease of automation, which has not been addressed by previous studies.
Abstract: Software testing is expensive for the industry and always constrained by time and effort. Although there is a multitude of test techniques, there are currently no scientifically based guidelines for selecting appropriate techniques for different domains and contexts. For large complex systems, some techniques are more efficient at finding failures than others, and some are easier to apply than others. From an industrial perspective, it is important to find the most effective and efficient test design technique that is possible to automate and apply. In this paper, we propose an experimental framework for comparison of test techniques with respect to efficiency, effectiveness, and applicability. We also plan to evaluate ease of automation, which has not been addressed by previous studies. We highlight some of the problems of evaluating or comparing test techniques in an objective manner. We describe our planned process for this multi-phase experimental study. This includes presentation of some of the important measurements to be collected, with the dual goals of analyzing the properties of the test technique and validating our experimental framework.