On the Accuracy of Spectrum-based Fault Localization
Citations
A Survey on Software Fault Localization
A practical evaluation of spectrum-based fault localization
A model for spectra-based software diagnosis
Nopol: Automatic Repair of Conditional Statement Bugs in Java Programs
Spectrum-Based Multiple Fault Localization
References
Basic concepts and taxonomy of dependable and secure computing
Diagnosing multiple faults
Frequently Asked Questions (17)
Q2. What have the authors stated for future work in "On the accuracy of spectrum-based fault localization"?
In future work, the authors plan to study the influence of the granularity (statement, function level) of program spectra on the diagnostic accuracy of spectrum-based fault localization. Finally, their study was conducted using single-fault programs, and further investigation is required to be able to generalize their findings to multiple-fault programs.
Q3. Why is the quality of the error detection information important?
Because of its relation to test coverage, and to the error detection mechanism used to characterize runs as passed or failed, an important condition in this respect is the quality of the error detection information used in the analysis.
Q4. How many passed runs may improve diagnostic performance?
Including up to 20 passed runs may improve but also degrade diagnostic performance, depending on the program and/or input data.
Q5. How does qd decrease when a block is touched often?
qd decreases only if the faulty block is touched often in passed runs, as spectrum-based fault localization works under the assumption that if a block is touched often in passed runs, it should be exonerated.
Q6. What is the simplest way to determine the likelihood of a fault?
Under the assumption that a high similarity to the error vector indicates a high probability that the corresponding parts of the software cause the detected errors, the calculated similarity coefficients rank the parts of the program with respect to their likelihood of containing the faults.
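The ranking step described in this answer can be sketched as follows. The answer does not say which similarity coefficient is meant, so this minimal sketch assumes the Ochiai coefficient as one commonly used choice; the function name `ochiai_ranking`, the toy spectra, and the error vector are all hypothetical illustration data, not taken from the paper.

```python
from math import sqrt

def ochiai_ranking(spectra, errors):
    """Rank program blocks by similarity to the error vector.

    spectra[i][j] is 1 if run i touched block j, else 0.
    errors[i] is 1 if run i failed, else 0.
    Assumption: Ochiai is used as the similarity coefficient.
    """
    n_blocks = len(spectra[0])
    scores = []
    for j in range(n_blocks):
        # a11: block touched and run failed; a10: touched and run passed;
        # a01: not touched and run failed.
        a11 = sum(1 for row, e in zip(spectra, errors) if row[j] and e)
        a10 = sum(1 for row, e in zip(spectra, errors) if row[j] and not e)
        a01 = sum(1 for row, e in zip(spectra, errors) if not row[j] and e)
        denom = sqrt((a11 + a01) * (a11 + a10))
        scores.append(a11 / denom if denom else 0.0)
    # Most suspicious block first.
    ranking = sorted(range(n_blocks), key=lambda j: scores[j], reverse=True)
    return scores, ranking

# Hypothetical data: 4 runs over 3 blocks; runs 0 and 2 failed.
spectra = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
]
errors = [1, 0, 1, 0]
scores, ranking = ochiai_ranking(spectra, errors)
print(ranking)  # block indices, most suspicious first
```

In this toy input, block 1 is touched in exactly the failed runs, so its column matches the error vector and it is ranked first; block 0 is touched in every run and is therefore less suspicious.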
Q7. Why is the quality of the observations important?
Since most faults lead to errors only under specific input conditions, and as not all errors propagate to system failures, this parameter is relevant because error detection mechanisms are usually not ideal.
Q8. How many runs are available in the Siemens set?
Although the number of available runs in the Siemens set ranges from 1052 (tot_info) to 5542 (replace), the number of runs that fail is comparatively small, ranging from a single run for tcas version 8, to 518 for print_tokens version 2.
Q9. What is the definition of error in the if statement?
There is a fault (bug) in the swapping code within the body of the if statement: only the numerators of the rational numbers are swapped while the denominators are left in their original order.
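A fault of this kind can be illustrated with a short Python sketch. This is not the paper's code: the function name `rational_sort_buggy`, the bubble-sort structure, and the parallel-list representation of numerators and denominators are assumptions made for illustration, and positive denominators are assumed so that cross-multiplication is a valid comparison.

```python
def rational_sort_buggy(nums, dens):
    """Sort the fractions nums[i]/dens[i] in ascending order, in place.

    Hypothetical sketch containing the described fault: the swap inside
    the if statement exchanges numerators only.
    """
    n = len(nums)
    for i in range(n - 1, -1, -1):
        for j in range(i):
            # Compare nums[j]/dens[j] > nums[j+1]/dens[j+1] by
            # cross-multiplication (assumes positive denominators).
            if nums[j] * dens[j + 1] > nums[j + 1] * dens[j]:
                # Fault: the denominators should be swapped as well.
                nums[j], nums[j + 1] = nums[j + 1], nums[j]
    return nums, dens

# Equal denominators: the fault is exercised but causes no visible error.
print(rational_sort_buggy([3, 1], [2, 2]))  # ([1, 3], [2, 2]), i.e. 1/2, 3/2

# Different denominators: 4/1 and 1/2 become 1/1 and 4/2, which is wrong.
print(rational_sort_buggy([4, 1], [1, 2]))  # ([1, 4], [1, 2])
```

The first call passes even though the faulty swap runs, which is the kind of passed run that touches the faulty code; only inputs whose swapped fractions have different denominators produce a detectable error.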
Q10. How many runs are required to achieve near-optimum diagnostic accuracy?
The authors show that near-optimum diagnostic accuracy (exonerating around 80% of all code on average) is already obtained for low-quality (ambiguous) error observations, while, in addition, only a few runs are required.
Q11. How many observations can provide a near-optimal diagnosis?
The fact that a few observations can already provide a near-optimal diagnosis enables the application of spectrum-based fault localization methods within continuous (embedded) processing, where only limited observation horizons can be maintained.
Q12. Why did the authors not include its weight coefficient in their experiments?
Although the authors have recognized that it uses hit spectra of method call sequences, the authors didn’t include its weight coefficient in their experiments because the calculated values are only used to collect evidence about classes, not to identify suspicious method call sequences.
Q13. What is the effect of adding more failed runs on the diagnostic accuracy?
The authors conclude that while accumulating more failed runs only improves the accuracy of the diagnosis, the effect of including more passed runs is unpredictable.
Q14. How many runs can compensate for weak error detection quality?
A large number of runs can apparently compensate for weak error detection quality: even for small qe, a large number of runs provides sufficient information for good diagnostic accuracy, as shown in Figure 4.
Q15. What is the effect of the parameter on the diagnostic accuracy?
Investigating the influence of this parameter will also help us to assess the potential gain of more powerful error detection mechanisms and better test coverage on diagnostic accuracy.
Q16. Why were the versions of schedule2 and replace not considered in their experiments?
Version 9 of schedule2 and version 32 of replace were not considered in their experiments because no test case fails and therefore the existence of a fault was never revealed.
Q17. How much of a program needs to be inspected after diagnosis?
As an indication, in one of the experiments described in the present paper, on average 20% of a program still needs to be inspected after the diagnosis.