Book Chapter (DOI: 10.1007/978-3-319-17524-9_20)

Are We There Yet? Determining the Adequacy of Formalized Requirements and Test Suites

TL;DR: The results of the preliminary study show that even for systems with comprehensive test suites and good sets of requirements, the approach can identify cases where more tests or more requirements are needed to improve coverage numbers.
Abstract: Structural coverage metrics have traditionally categorized code as either covered or uncovered. Recent work presents a stronger notion of coverage, checked coverage, which counts only statements whose execution contributes to an outcome checked by an oracle. While this notion of coverage addresses the adequacy of the oracle, for Model-Based Development of safety critical systems, it is still not enough; we are also interested in how much of the oracle is covered, and whether the values of program variables are masked when the oracle is evaluated. Such information can help system engineers identify missing requirements as well as missing test cases. In this work, we combine results from checked coverage with results from requirements coverage to help provide insight to engineers as to whether the requirements or the test suite need to be improved. We implement a dynamic backward slicing technique and evaluate it on several systems developed in Simulink. The results of our preliminary study show that even for systems with comprehensive test suites and good sets of requirements, our approach can identify cases where more tests or more requirements are needed to improve coverage numbers.

Summary (4 min read)

1 Introduction

  • Model-Based Development (MBD) refers to the use of domain-specific modeling notations to create models of a desired system early in the development lifecycle.
  • Model-Based Development significantly reduces costs while also improving quality.
  • They note that this metric judges the quality of the test oracle — a program with no assertions will have no coverage.
  • The authors' hypothesis is that this metric can be leveraged to better assess the quality of an automated testing process in MBD where formalized requirements serve as oracles for auto-generated tests [28].
  • The authors combine the results of checked coverage with the results of requirements coverage to determine for a given model whether its requirements and test suite are adequate.

2 Motivation

  • Consider the control software for an infusion pump, a medical device that is typically used to infuse liquid drugs into a patient’s body in a controlled fashion.
  • The “ALARM” subsystem is responsible for monitoring hazards (CheckAlarm state machine) with different levels of severity in the system, and alerting the clinicians (Audio and Visual state machines) to take the appropriate action when such conditions occur.
  • The authors auto-generate the source code from the Simulink model, formalize the requirements as boolean expressions, and automatically generate the test cases from the model.
  • To motivate the utility of their proposed approach, the authors use a snippet of auto-generated code from the Audio state machine in Fig. 1.
  • This example demonstrates that the checked coverage is lower than the set of covered statements.

3 Methodology

  • There are three inputs to their technique: the model of the system being analyzed, a set of test cases (manual or auto-generated) that exercise the model, and a set of formalized requirements of the model as shown in Fig. 3.
  • A dynamic backward slice is used to extract the set of program statements that operate on variables whose values are checked in the assertions.
  • This is termed checked coverage, while all other executed statements are categorized as unchecked coverage.
  • The algorithm takes as input an auto-generated program M , the test suite T for exercising the behaviors of the program, and the set of assertions that encode the formalized requirements.
  • Dynamic slicing is used to compute the basic form of checked coverage; a small worked example follows.
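To make the basic notion concrete, here is a minimal, self-contained C fragment (hypothetical, not taken from the paper's case examples) annotated with what a dynamic backward slice on its assertion would retain:

#include <assert.h>

int main(void) {
    int in = 5;        /* executed; reaches the assertion through x and z -> checked   */
    int x  = in + 1;   /* executed; the assertion depends on x via z      -> checked   */
    int y  = in * 2;   /* executed; never read when the oracle is checked -> unchecked */
    int z  = x - 3;    /* executed; directly read by the assertion        -> checked   */
    assert(z >= 0);    /* formalized requirement acting as the test oracle             */
    (void)y;           /* keep the compiler quiet about the unused variable            */
    return 0;
}

Structural coverage would report every statement as covered, while checked coverage counts only the definitions of in, x, and z; the definition of y is executed but unchecked.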

3.1 Coverage of Requirements

  • In this work the authors use the Modified Condition/Decision Coverage (MC/DC) metric to evaluate the assertion coverage for a given test suite.
  • MC/DC coverage of a requirement encoded as an assertion requires that each condition in the assertion takes on all possible outcomes at least once and each condition is shown to independently affect the assertion’s outcome.
  • The authors use the masking form of MC/DC to determine the independence of the conditions in the assertion.
  • A condition is masked if changing its value does not affect the outcome of the assertion.
  • If, for example, only one of an assertion's three conditions is shown to independently affect its outcome by the test suite, the authors report 33% coverage of that assertion (a small sketch of the masking check follows).
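The masking check can be phrased directly from the definition quoted above: flip one condition of the assertion and see whether its outcome changes. The sketch below applies that check to a three-condition oracle of the shape used in Sect. 2; it is illustrative only, not the authors' implementation, and the condition values are assumed.

#include <stdio.h>
#include <stdbool.h>

/* Oracle of the shape (A && B) ==> C, encoded as !(A && B) || C. */
static bool oracle(const bool c[3]) {
    return !(c[0] && c[1]) || c[2];
}

int main(void) {
    bool conds[3]  = { true, false, false };  /* one test's condition values (assumed) */
    bool outcome   = oracle(conds);
    int  affecting = 0;

    for (int i = 0; i < 3; i++) {
        bool flipped[3] = { conds[0], conds[1], conds[2] };
        flipped[i] = !flipped[i];
        /* A condition is masked if changing its value leaves the outcome unchanged. */
        bool masked = (oracle(flipped) == outcome);
        printf("condition %d: %s\n", i, masked ? "masked" : "independently affects the outcome");
        if (!masked) affecting++;
    }
    printf("%d of 3 conditions affect the assertion for this test\n", affecting);
    return 0;
}

The per-assertion requirements coverage is then the fraction of the assertion's MC/DC obligations satisfied across the whole test suite; one out of three yields the 33% figure mentioned above.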

3.2 A More Precise Dynamic Backward Slice

  • The authors propose a more precise dynamic backward slice that takes into account which parts of the assertion are covered and whether certain values of program variables are not used when the assertion is evaluated.
  • The authors leverage the masking information within an assertion for a given test to generate a more precise dynamic backward slice.
  • They then collect all of the program statements in the execution trace that impact them.
  • Even though values of y are written in the execution trace, since they are not used in the evaluation of the assertion, the corresponding statements are not added to the checked set.
  • The authors believe this will reduce the size of the checked set and provide a more precise characterization of the parts of the program that are being checked in the assertions; a small sketch of this filtering follows.
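A minimal sketch of how that masking information might restrict the slicing criterion, using the motivating test from Sect. 2; the condition-to-variable mapping and the masking verdicts are assumptions made for illustration, not the authors' implementation.

#include <stdio.h>
#include <stdbool.h>

/* Hypothetical masking verdicts for the three conditions of the Sect. 2 oracle on
   test t (Hazard := 3, Disable_Audio := 2): the antecedent is false, so the
   Audio_Command condition is never evaluated and is therefore masked.            */
struct cond_info {
    const char *variable;  /* program variable read by the condition         */
    bool        masked;    /* verdict of the masking analysis for this test  */
};

int main(void) {
    struct cond_info conds[3] = {
        { "Hazard",        true  },  /* flipping it alone cannot change the outcome */
        { "Disable_Audio", false },
        { "Audio_Command", true  },  /* never evaluated: the antecedent is false    */
    };

    /* Only unmasked condition variables become slicing criteria, so statements that
       merely write Audio_Command (line 3 in Fig. 2) stay in the unchecked set.     */
    printf("slicing criteria for this test:\n");
    for (int i = 0; i < 3; i++)
        if (!conds[i].masked)
            printf("  %s\n", conds[i].variable);
    return 0;
}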

3.3 Mapping Back to the Model

  • In the final phase of their technique, for a given test suite, the authors report the following to the system engineers: (i) the precise checked coverage, (ii) the unchecked coverage, (iii) the uncovered code, and (iv) the coverage of the requirements.
  • Note that the authors map the coverage of the code onto the model.
  • The authors believe that these coverage measures help us bridge the gap between requirements, tests, and the model as discussed in [28].
  • The relationship between the various types of coverage can potentially help to determine the source of incompleteness in either tests, requirements, or the model.
  • Low coverage of the requirements coupled with low checked coverage could be indicative of missing tests and/or missing requirements; a simplified reading of this guidance is sketched below.
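The following is a deliberately simplified reading of that guidance as executable code; the thresholds and the wording of the hints are invented for illustration and are not prescribed by the paper.

#include <stdio.h>

/* Map the three coverage measures (as fractions) to a diagnostic hint. */
static const char *diagnose(double stmt_cov, double req_cov, double checked_cov) {
    if (stmt_cov < 0.7)
        return "low structural coverage: add tests that exercise more of the model";
    if (req_cov < 0.7 && checked_cov < 0.7)
        return "low requirements and checked coverage: tests and/or requirements are likely missing";
    if (checked_cov < 0.7)
        return "the structure is exercised but rarely observed by oracles: look for missing requirements";
    return "coverage measures look adequate for this test suite";
}

int main(void) {
    /* Example loosely resembling MCR 2 (87% statement, 80% requirements coverage);
       the checked-coverage value here is made up for illustration.                */
    printf("%s\n", diagnose(0.87, 0.80, 0.40));
    return 0;
}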

4.1 Case Examples

  • The authors consider three different systems: a medical device controller, an avionics system controller and a general appliance controller.
  • The second column gives the number of auto-generated source lines of code (LOC); column three presents the number of requirements available for each test suite; column four describes the source of the test suites.
  • This system was also developed using Mathworks Simulink/Stateflow tool and its source code was generated using Simulink Coder.
  • Their goal was to analyze the adequacy of the sparse requirements for the test cases.
  • The authors generated code for the microwave system using the Gryphon Tool Suite [34].

4.2 Tools and Experiment Set up

  • The authors use a combination of commercially available and free open source tools to implement their approach.
  • As previously mentioned, the test suites and the source code are generated using various sources and tools in order to generate a variety of artifacts and determine the efficacy of the different test suites based on their metrics.
  • The total number of obligations that are satisfied by the test suite is recorded and reported.
  • The Frama-C slicing plugin requires the slicing criterion to be expressed using ACSL [4], a formal specification language used for specifying behavioral properties of C source code (a small example follows this list).
  • Once all slices and execution traces are obtained, the slices are compared with the execution trace to identify the checked and unchecked covered lines of code.
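As a concrete illustration of such a criterion, the requirement from Sect. 2 could be written as an ACSL assertion at the point where the oracle is evaluated. The struct and function names below are placeholders rather than the actual Simulink Coder output, and the paper does not show the authors' annotations.

/* Placeholder type standing in for the generated block I/O structure. */
typedef struct {
    int ALARM_OUT_Hazard;
    int Disable_Audio;
    int ALARM_OUT_Audio_Command;
} BlockIO;

void ALARM_step(BlockIO *localB)
{
    /* ... auto-generated update logic as in Fig. 2 ... */

    /*@ assert (localB->ALARM_OUT_Hazard >= 3 && localB->Disable_Audio == 0)
          ==> localB->ALARM_OUT_Audio_Command == 1; */
}

The Frama-C slicing plugin can then take such an assertion as the criterion and emit the backward slice of the program with respect to it.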

4.3 Analysis of the Results

  • Table 2 shows the structural and requirements coverage metrics for the artifacts for a given test suite.
  • Similarly MCR 2 has statement and condition coverage of 87% and 100% respectively and requirements coverage of 80%.
  • The results demonstrate that, overall, the checked coverage in Table 3 is lower compared to the set of covered statements shown in Table 2.
  • Using the more precise dynamic slicing technique proposed in this work, the checked coverage decreases even further while the unchecked coverage increases.

5 Discussion

  • The authors summarize the results of the empirical evaluation and provide some recommendations for improvement based on the data.
  • This is not surprising since there are only three requirements for the model.
  • The ALM 2, DCK 2, MCR 2 examples have reasonable statement and requirements coverage but low precise checked coverage.
  • The variables used in these lines are then traced back to their source blocks in the model, as shown in Figure 5.
  • Using this information, a system engineer might want to add a requirement that would check if the system has been IDLE for more than a certain amount of time.

7 Conclusion

  • The two main techniques for test case generation are (i) manual and (ii) automated test case generation techniques.
  • Even for manually generated tests, defining a precise oracle for a given test is often a difficult endeavor.
  • Recent work presents a stronger notion of coverage, checked coverage, compared to the traditional structural categories of simply covered and uncovered [29,30].
  • The approach presented here allows us to connect the dots between test cases, requirements, and the model.


Are We There Yet? Determining the Adequacy
of Formalized Requirements and Test Suites
Anitha Murugesan¹, Michael W. Whalen¹, Neha Rungta², Oksana Tkachuk², Suzette Person³, Mats P.E. Heimdahl¹, and Dongjiang You¹

¹ Department of Computer Science and Engineering, University of Minnesota, 200 Union Street, Minneapolis, MN 55455, USA
{anitha,whalen,heimdahl,djyou}@cs.umn.edu
² NASA Ames Research Center, Mountain View, USA
{neha.s.rungta,oksana.tkachuk}@nasa.gov
³ NASA Langley Research Center, Hampton, USA
suzette.person@nasa.gov
Abstract. Structural coverage metrics have traditionally categorized
code as either covered or uncovered. Recent work presents a stronger
notion of coverage, checked coverage, which counts only statements whose
execution contributes to an outcome checked by an oracle. While this
notion of coverage addresses the adequacy of the oracle, for Model-Based
Development of safety critical systems, it is still not enough; we are also
interested in how much of the oracle is covered, and whether the values of program variables are masked when the oracle is evaluated. Such
information can help system engineers identify missing requirements as
well as missing test cases. In this work, we combine results from checked
coverage with results from requirements coverage to help provide insight
to engineers as to whether the requirements or the test suite need to
be improved. We implement a dynamic backward slicing technique and
evaluate it on several systems developed in Simulink. The results of our
preliminary study show that even for systems with comprehensive test
suites and good sets of requirements, our approach can identify cases
where more tests or more requirements are needed to improve coverage
numbers.
1 Introduction
Model-Based Development (MBD) refers to the use of domain-specific modeling
notations to create models of a desired system early in the development lifecycle.
These models can be executed on the desktop, analyzed for desired behaviors,
and then used to automatically generate code and test cases. Also known as
correct-by-construction development, the emphasis in model-based development
is on the engineering effort invested in the early lifecycle activities of modeling,
simulation, and analysis. This reduces development costs by finding defects early
in the lifecycle, avoiding rework that is necessary when errors are discovered dur-
ing integration testing, and by automating the late life-cycle activities of coding
and test case generation. In this way, Model-Based Development significantly
reduces costs while also improving quality. There are several commercial MBD
tools, including Simulink/Stateflow [19], SCADE [10], IBM Rhapsody [1], and IBM Rational Statemate [2].
An important part of MBD is automated test generation and execution.
Tools such as Reactis [26], the MathWorks Verification and Validation plug-in
for Simulink, and the IBM Rhapsody Automatic Test Generation add-on, as
well as other tools, support automated test generation from models. These tools
enable generation of structural coverage tests up to a high degree of rigor, e.g.,
tests satisfying the MC/DC coverage metric. In the domain of critical systems, particularly in avionics, demonstrating structural coverage is required for certification [27].
In principle, automated test generation represents a success for software engineering research: a mandatory and potentially arduous engineering task has been automated. However, several studies have raised questions about the effectiveness of automated test generation towards a specific structural coverage metric (e.g., [12,14,31]), in some cases finding these tests less effective than randomly generated tests of the same length in terms of fault-finding capabilities.
This often has to do with the observability capabilities of the test oracle, which
determines whether the test passes or fails. In many cases, the code structure
that was examined has no measurable effect on the test outcome.
In recent work, a metric proposed by Schuler and Zeller in [29,30] addresses
observability, but does so in a post-priori way: given a test suite and a set
of requirements specified as assertions, it uses dynamic backward slicing from
the requirements (assertions) to determine the set of program statements that
affect the evaluation of the requirement. They call this metric checked statement
coverage, because it only considers the statements that are checked (observed).
They note that this metric judges the quality of the test oracle: a program with no assertions will have no coverage. Therefore, given any test suite, it is
possible to increase coverage by adding additional oracles (requirements) to the
suite. Our hypothesis is that this metric can be leveraged to better assess the
quality of an automated testing process in MBD where formalized requirements
serve as oracles for auto-generated tests [28].
In this work, we combine the results of checked coverage with the results of
requirements coverage to determine for a given model whether its requirements
and test suite are adequate. While the work in [30] focuses on whether or not
the oracles (requirements) are adequate, we are interested in both the adequacy
of the test suite and the requirements encoded as oracles: if checked coverage is
low then either the requirements or the tests may be incomplete. Specifically, we
add to this notion of coverage by calculating checked coverage based on dynamic
backward slicing as well as MC/DC masking information. Finally, we map the
different forms of code coverage back to the model, and report the coverage of

Fig. 1. Hierarchical state machine model of the ALARM subsystem
the requirements, in order to provide information to the system engineers about
sources of incompleteness. Thus, the contributions of the paper are:
  • An approach using checked, unchecked, and requirements coverage information to assess the adequacy of both test suites and requirements.
  • An approach to calculate checked coverage based on backward dynamic slicing and MC/DC masking information, which leads to more precise checked coverage results than dynamic backward slicing alone.
  • A preliminary evaluation of our technique on a set of examples that use Simulink as part of the MBD approach. In addition to computing coverage for the auto-generated code, we also map the results back to the models.

Our experience shows that even for case studies with comprehensive test suites and good sets of requirements, our approach can identify cases where more tests or more requirements are needed to improve the coverage numbers.
2 Motivation
Consider the control software for an infusion pump, a medical device that is typically used to infuse liquid drugs into a patient's body in a controlled fashion. An important subsystem of the controller is the ALARM subsystem shown in Fig. 1. The model for the system [22] was developed using the MathWorks Simulink/Stateflow tool [19]. The "ALARM" subsystem is responsible for monitoring hazards (CheckAlarm state machine) with different levels of severity in the system, and alerting the clinicians (Audio and Visual state machines) to take the appropriate action when such conditions occur. We auto-generate the source code from the Simulink model, formalize the requirements as boolean expressions, and automatically generate the test cases from the model.

1: if (localB->ALARM_OUT_Hazard >= 3) {
2:   if (localB->Disable_Audio > 1) {
3:     localB->ALARM_OUT_Audio_Command = 0;
4:     localB->ALARM_OUT_Audio_Disabled = 1;
5:     if (localDW->time_minutes > 3) {
6:       localB->Disable_Audio = 0;
7:     }
8:   }
9: } else ...
Fig. 2. Code snippet from the ALARM system's audio notification functionality
To motivate the utility of our proposed approach we use a snippet of auto-generated code from the Audio state machine in Fig. 1. The code is shown in Fig. 2. It raises an aural alert when a certain level of hazard is detected and the audio has not been disabled by the user. Assume the following oracle encodes a requirement of the system:

Hazard >= 3 ∧ Disable_Audio = 0 ⇒ Audio_Command = 1
Suppose we execute a test case, t, that covers program statements one to seven in Fig. 2 and the values of the variables used in the oracle are: Hazard := 3 and Disable_Audio := 2. The corresponding checked coverage for the test does not contain the program statement at line 4 in Fig. 2; the Audio_Disabled variable defined at line 4 does not either directly or transitively impact the values used in the oracle. This example demonstrates that the checked coverage is lower than the set of covered statements.

The notion of checked coverage, however, does not take into account which parts of the oracle were covered and whether the values of certain program variables are masked when the oracle is evaluated. The values for variables Hazard := 3 and Disable_Audio := 2 cause the antecedent in the requirement (Hazard >= 3 ∧ Disable_Audio = 0) to be false; hence, the consequent of the requirement (Audio_Command = 1) is not evaluated. Even though the program statement at line 3 in Fig. 2 writes to the variable Audio_Command used in the oracle, the test, t, does not evaluate Audio_Command in the oracle. We can leverage this information to define a more precise checked coverage measure by marking line 3 in Fig. 2 as unchecked.
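The same observation can be reproduced with a few lines of C: encoding the oracle as a short-circuiting expression makes it visible that, for this test, Audio_Command is never read when the oracle is evaluated. This is only a sketch using the values above; the Audio_Command value of 0 is taken from line 3 of Fig. 2.

#include <assert.h>
#include <stdio.h>

int main(void) {
    int Hazard = 3, Disable_Audio = 2, Audio_Command = 0;  /* values for test t */

    /* (Hazard >= 3 && Disable_Audio == 0) ==> (Audio_Command == 1), written with ||.
       The antecedent is false, so || short-circuits and Audio_Command is never read:
       the oracle passes without observing the variable that line 3 defines.        */
    assert(!(Hazard >= 3 && Disable_Audio == 0) || Audio_Command == 1);

    printf("oracle holds for test t without evaluating Audio_Command\n");
    return 0;
}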
In the next section we present an overview of how we measure requirements coverage along with checked coverage to improve upon the checked coverage measure.
3 Methodology
There are three inputs to our technique: the model of the system being analyzed,
a set of test cases (manual or auto-generated) that exercise the model, and a set
of formalized requirements of the model as shown in Fig. 3. The requirements are

Fig. 3. Test Case Coverage Classification Approach Overview
transformed into assertions over program variables. We automatically generate
the code from the model and execute the tests on the auto-generated code. The
formalized requirements are used as slicing criteria for program execution traces
generated by the various tests as shown in Fig. 3. A dynamic backward slice is
used to extract the set of program statements that operate on variables whose
values are checked in the assertions. This is termed as checked coverage while all
other executed statements are categorized as unchecked coverage. In addition to
the code coverage we also measure the coverage of the requirements. Checked,
unchecked, and uncovered code coverage are mapped back to the model to help
the system engineers determine incompleteness in the requirements, tests, or the
model.
We present an overview of the algorithm to partition coverage into checked
coverage versus unchecked coverage in Fig. 4. The algorithm takes as input an
auto-generated program M, the test suite T for exercising the behaviors of the
program, and the set of assertions that encode the formalized requirements. The
sets checked and unchecked are initialized as empty. We run each test, t, in the test suite T on the program and generate the set of program statements l_0, ..., l_n executed by the test. Next, we generate a dynamic slice of the trace using each assertion a as the slicing criterion. In the case that a program statement l is in the dynamic slice then it is added to the checked set; otherwise it is added to the unchecked set.
Dynamic slicing is used to compute the basic form of checked coverage. A
dynamic slice of an execution trace with respect to an assertion extracts the set of
program statements in the trace that may impact the evaluation of the assertion.
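The pseudocode of Fig. 4 is not reproduced on this page, but the following self-contained sketch conveys the same partitioning idea for a single test over a hand-written trace (the trace mirrors the small in/x/y/z fragment used in the Summary above). It handles data dependences only; a full dynamic slicer would also track control dependences and variable redefinitions, and all identifiers here are hypothetical.

#include <stdio.h>
#include <stdbool.h>
#include <string.h>

#define MAX_VARS 8

struct event {                 /* one executed statement occurrence      */
    int         line;          /* source line the occurrence came from   */
    const char *def;           /* variable written (NULL if none)        */
    const char *use[2];        /* variables read (NULL-padded)           */
};

static bool in_set(const char *set[], int n, const char *v) {
    for (int i = 0; i < n; i++)
        if (strcmp(set[i], v) == 0)
            return true;
    return false;
}

int main(void) {
    /* Execution trace of one test: in -> x -> z feeds the assertion, y does not. */
    struct event trace[] = {
        { 10, "in", { NULL, NULL } },
        { 11, "x",  { "in", NULL } },
        { 12, "y",  { "in", NULL } },
        { 13, "z",  { "x",  NULL } },
    };
    enum { N = sizeof trace / sizeof trace[0] };

    /* Slicing criterion: the variables read by the assertion, here assert(z >= 0). */
    const char *relevant[MAX_VARS] = { "z" };
    int n_relevant = 1;

    bool checked[N] = { false };

    /* Dynamic backward slice: walk the trace backwards from the assertion.         */
    for (int i = N - 1; i >= 0; i--) {
        if (trace[i].def != NULL && in_set(relevant, n_relevant, trace[i].def)) {
            checked[i] = true;                      /* statement is checked           */
            for (int u = 0; u < 2; u++)             /* pull in the variables it reads */
                if (trace[i].use[u] != NULL && n_relevant < MAX_VARS &&
                    !in_set(relevant, n_relevant, trace[i].use[u]))
                    relevant[n_relevant++] = trace[i].use[u];
        }
    }

    /* Everything executed but outside the slice falls into the unchecked set.      */
    for (int i = 0; i < N; i++)
        printf("line %d: %s\n", trace[i].line,
               checked[i] ? "checked" : "unchecked (executed)");
    return 0;
}

Running the sketch reports lines 10, 11, and 13 as checked and line 12 as unchecked, which is exactly the kind of partition the approach reports back to the engineer.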

Citations
Proceedings ArticleDOI
01 Nov 2016
TL;DR: In this article, the authors present an algorithm to compute the inductive validity core (IVC) within a model necessary for inductive proofs of safety properties for sequential systems, based on the UNSAT core support built into current SMT solvers and a novel encoding of the inductive problem.
Abstract: Symbolic model checkers can construct proofs of properties over very complex models. However, the results reported by the tool when a proof succeeds do not generally provide much insight to the user. It is often useful for users to have traceability information related to the proof: which portions of the model were necessary to construct it. This traceability information can be used to diagnose a variety of modeling problems such as overconstrained axioms and underconstrained properties, and can also be used to measure completeness of a set of requirements over a model. In this paper, we present a new algorithm to efficiently compute the inductive validity core (IVC) within a model necessary for inductive proofs of safety properties for sequential systems. The algorithm is based on the UNSAT core support built into current SMT solvers and a novel encoding of the inductive problem to try to generate a minimal inductive validity core. We prove our algorithm is correct, and describe its implementation in the JKind model checker for Lustre models. We then present an experiment in which we benchmark the algorithm in terms of speed, diversity of produced cores, and minimality, with promising results.

36 citations

Proceedings ArticleDOI
09 Nov 2015
TL;DR: This work proposes a methodology to allow developers to determine (and correct) what it is that they have verified, and tools to support that methodology, based on a novel variation of mutation analysis and the idea of verification driven by falsification.
Abstract: Formal verification has advanced to the point that developers can verify the correctness of small, critical modules. Unfortunately, despite considerable efforts, determining if a "verification" verifies what the author intends is still difficult. Previous approaches are difficult to understand and often limited in applicability. Developers need verification coverage in terms of the software they are verifying, not model checking diagnostics. We propose a methodology to allow developers to determine (and correct) what it is that they have verified, and tools to support that methodology. Our basic approach is based on a novel variation of mutation analysis and the idea of verification driven by falsification. We use the CBMC model checker to show that this approach is applicable not only to simple data structures and sorting routines, and verification of a routine in Mozilla's JavaScript engine, but to understanding an ongoing effort to verify the Linux kernel Read-Copy-Update (RCU) mechanism.

22 citations


Cites background from "Are We There Yet? Determining the A..."

  • ...Recent efforts of most interest have focused on measuring checked coverage [48], [49], [50], where a metric tries to make sure the code under test potentially changes the value of an assert, using dynamic slicing [51], [52]....


Journal ArticleDOI
TL;DR: Adding observability tends to improve efficacy over satisfaction of the traditional criteria, with average improvements of 125.98 percent in mutation detection with the common output-only test oracle and per-model improvements of up to 1760.52 percent.
Abstract: Test adequacy criteria are widely used to guide test creation. However, many of these criteria are sensitive to statement structure or the choice of test oracle. This is because such criteria ensure that execution reaches the element of interest, but impose no constraints on the execution path after this point. We are not guaranteed to observe a failure just because a fault is triggered. To address this issue, we have proposed the concept of observability, an extension to coverage criteria based on Boolean expressions that combines the obligations of a host criterion with an additional path condition that increases the likelihood that a fault encountered will propagate to a monitored variable. Our study, conducted over five industrial systems and an additional forty open-source systems, has revealed that adding observability tends to improve efficacy over satisfaction of the traditional criteria, with average improvements of 125.98 percent in mutation detection with the common output-only test oracle and per-model improvements of up to 1760.52 percent. Ultimately, there is merit to our hypothesis: observability reduces sensitivity to the choice of oracle and to the program structure.

8 citations


Cites background from "Are We There Yet? Determining the A..."

  • ...Recent work presents a stronger notion of coverage, checked coverage, which counts only statements whose execution contributes to an outcome checked by an oracle [75], [76]....


Journal ArticleDOI
11 Jul 2018
TL;DR: This work proposes a methodology to allow developers to determine (and correct) what it is that they have verified, and tools to support that methodology, based on a novel variation of mutation analysis and the idea of verification driven by falsification.
Abstract: Formal verification has advanced to the point that developers can verify the correctness of small, critical modules. Unfortunately, despite considerable efforts, determining if a “verification” verifies what the author intends is still difficult. Previous approaches are difficult to understand and often limited in applicability. Developers need verification coverage in terms of the software they are verifying, not model checking diagnostics. We propose a methodology to allow developers to determine (and correct) what it is that they have verified, and tools to support that methodology. Our basic approach is based on a novel variation of mutation analysis and the idea of verification driven by falsification. We use the CBMC model checker to show that this approach is applicable not only to simple data structures and sorting routines, and verification of a routine in Mozilla’s JavaScript engine, but to understanding an ongoing effort to verify the Linux kernel read-copy-update mechanism. Moreover, we show that despite the probabilistic nature of random testing and the tendency to incompleteness of testing as opposed to verification, the same techniques, with suitable modifications, apply to automated test generation as well as to formal verification. In essence, it is the number of surviving mutants that drives the scalability of our methods, not the underlying method for detecting faults in a program. From the point of view of a Popperian analysis where an unkilled mutant is a weakness (in terms of its falsifiability) in a “scientific theory” of program behavior, it is only the number of weaknesses to be examined by a user that is important.

7 citations


Cites background from "Are We There Yet? Determining the A..."

  • ...Recent efforts of most interest have focused on measuring checked coverage (Schuler and Zeller 2011, 2013; Murugesan et al. 2015), where a metric tries to make sure the code under test potentially changes the value of an assert, using dynamic slicing (Zhang et al. 2003; Tip 1995)....


  • ...Recent efforts of most interest have focused on measuring checked coverage [82,83,77], where a metric tries to make sure the code under test potentially changes the value of an assert, using dynamic slicing [91,87]....


Journal ArticleDOI
TL;DR: A comprehensive treatment of a suite of algorithms to compute inductive validity cores (IVCs), minimal sets of model elements necessary to construct inductive proofs of safety properties for sequential systems, and a substantial experiment in which the efficiency and efficacy of the algorithms are benchmarked.
Abstract: Symbolic model checkers can construct proofs of properties over highly complex models. However, the results reported by the tool when a proof succeeds do not generally provide much insight to the user. It is often useful for users to have traceability information related to the proof: which portions of the model were necessary to construct it. This traceability information can be used to diagnose a variety of modeling problems such as overconstrained axioms and underconstrained properties, measure completeness of a set of requirements over a model, and assist with design optimization given a set of requirements for an existing or synthesized implementation. In this paper, we present a comprehensive treatment of a suite of algorithms to compute inductive validity cores (IVCs), minimal sets of model elements necessary to construct inductive proofs of safety properties for sequential systems. The algorithms are based on the UNSAT core support built into current SMT solvers and novel encodings of the inductive problem to generate approximate and guaranteed minimal inductive validity cores as well as all inductive validity cores. We demonstrate that our algorithms are correct, describe their implementation in the JKind model checker for Lustre models, and present several use cases for the algorithms. We then present a substantial experiment in which we benchmark the efficiency and efficacy of the algorithms.

4 citations


Cites background from "Are We There Yet? Determining the A..."

  • ...We would like to thank Mona Rahimi for discussions that led to the initial IVC idea, John Backes and Anitha Murugesan for discussions on various aspects of IVCs, and Lucas Wagner for his work integrating them into the Spear requirements tool....


  • ...Recent work by Murugesan [93] and Schuller [94] attempts to utilize properties when determining test cover-...


  • ...Recent work by Murugesan [93] and Schuller [94] attempts to utilize properties when determining test coverage towards ensuring adequate requirements....


References
Journal ArticleDOI
TL;DR: Program slicing, as described in this paper, is a method for automatically decomposing programs by analyzing their data flow and control flow. Finding statement-minimal slices is in general unsolvable, but data flow analysis is sufficient to find approximate slices.
Abstract: Program slicing is a method for automatically decomposing programs by analyzing their data flow and control flow. Starting from a subset of a program's behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a "slice," is an independent program guaranteed to represent faithfully the original program within the domain of the specified subset of behavior. Some properties of slices are presented. In particular, finding statement-minimal slices is in general unsolvable, but using data flow analysis is sufficient to find approximate slices. Potential applications include automatic slicing tools for debugging and parallel processing of slices.

3,163 citations

01 Sep 2008
TL;DR: Applications of program slicing are surveyed, ranging from its first use as a debugging technique to current applications in property verification using finite state models, and a summary of research challenges for the slicing community is discussed.
Abstract: Program slicing is a decomposition technique that elides program components not relevant to a chosen computation, referred to as a slicing criterion. The remaining components form an executable program called a slice that computes a projection of the original program's semantics. Using examples coupled with fundamental principles, a tutorial introduction to program slicing is presented. Then applications of program slicing are surveyed, ranging from its first use as a debugging technique to current applications in property verification using finite state models. Finally, a summary of research challenges for the slicing community is discussed.

2,595 citations

Journal ArticleDOI
TL;DR: In many cases tests of a program that uncover simple errors are also effective in uncovering much more complex errors; this so-called coupling effect can be used to save work during the testing process.
Abstract: In many cases tests of a program that uncover simple errors are also effective in uncovering much more complex errors. This so-called coupling effect can be used to save work during the testing process.

2,047 citations

Proceedings ArticleDOI
01 Jun 1990
TL;DR: This paper investigates the concept of the dynamic slice consisting of all statements that actually affect the value of a variable occurrence for a given program input, and introduces the economical concept of a Reduced Dynamic Dependence Graph, proportional in size to the number of dynamic slices arising during the program execution.
Abstract: Program slices are useful in debugging, testing, maintenance, and understanding of programs. The conventional notion of a program slice, the static slice, is the set of all statements that might affect the value of a given variable occurrence. In this paper, we investigate the concept of the dynamic slice consisting of all statements that actually affect the value of a variable occurrence for a given program input. The sensitivity of dynamic slicing to particular program inputs makes it more useful in program debugging and testing than static slicing. Several approaches for computing dynamic slices are examined. The notion of a Dynamic Dependence Graph and its use in computing dynamic slices is discussed. The Dynamic Dependence Graph may be unbounded in length; therefore, we introduce the economical concept of a Reduced Dynamic Dependence Graph, which is proportional in size to the number of dynamic slices arising during the program execution.

1,138 citations

Journal ArticleDOI
Eugene Miya1
TL;DR: The software engineering baccalaureate program consists of a rigorous curriculum of science, math, computer science, and software engineering courses.
Abstract: Software engineers work on multidisciplinary teams to identify and develop software solutions and to maintain software intensive systems of all sizes. The focus of this program is on the rigorous engineering practices necessary to build, maintain, and protect modern software intensive systems. Consistent with this focus, the software engineering baccalaureate program consists of a rigorous curriculum of science, math, computer science, and software engineering courses.

1,124 citations

Frequently Asked Questions (12)
Q1. What is the observability-based code coverage metric?

The observability-based code coverage metric (OCCOM) attaches tags to internal states in a circuit and the propagation of tags is used to predict the actual propagation of errors (corrupted state) [9,11]. 

The observability coverage can be used to determine whether erroneous effects that are activated by the inputs can be observed at the outputs. 

The model of the ALARM subsystem was developed as a multi-level hierarchical state machine using the Mathworks Simulink/Stateflow tool. 

Their recommendation is to first augment the test suite with tests that exercise additional parts of the code, then try to identify missing requirements, and finally measure the requirements coverage with the augmented test cases. 

Any program statements that read or write variables used in the assertion, as well as program statements computed by transitive closure of the reads and writes, are part of the dynamic slice. 

The third case example is a microwave’s controller system used in their previous work [28], that was also modeled as hierarchical state machines using the MathWorks Stateflow notation. 

In recent work, a metric proposed by Schuler and Zeller in [29,30] addresses observability, but does so in a post-priori way: given a test suite and a set of requirements specified as assertions, it uses dynamic backward slicing from the requirements (assertions) to determine the set of program statements that affect the evaluation of the requirement. 

More accurate techniques for information flow modeling, such as [35], define path conditions to prove noninterference, that is, the non-observability of a variable or expression on a particular output.

The authors consider three different systems: a medical device controller, an avionics system controller and a general appliance controller. 

For the Docking example, the authors generated a random test suite using the Reactis tool and another test suite with high structural coverage using MathWorks Simulink Design Verifier (SDV) [21] . 

For software, dynamic taint analysis, or dynamic information flow analysis, marks and tracks data in a program at runtime in order to determine observability. 

The values for variables Hazard := 3 and Disable_Audio := 2 cause the antecedent in the requirement (Hazard >= 3 ∧ Disable_Audio = 0) to be false; hence, the consequent of the requirement (Audio_Command = 1) is not evaluated.