
Showing papers by "James A. Jones" published in 2011


Proceedings Article
17 Jul 2011
TL;DR: It is found that the influence of multiple faults was not as great as expected, had a negligible effect on the effectiveness of fault localization, and was often even complementary to fault-localization effectiveness.
Abstract: This paper presents an empirical study on the effects of the quantity of faults on statistical, coverage-based fault-localization techniques. The former belief was that the effectiveness of fault-localization techniques was inversely proportional to the quantity of faults. In an attempt to verify this belief, we conducted a study of three programs of varying size, using more than 13,000 multiple-fault versions. We found that the influence of multiple faults (1) was not as great as expected, (2) created a negligible effect on the effectiveness of the fault localization, and (3) was often even complementary to the fault-localization effectiveness. In general, even in the presence of many faults, at least one fault was found by the fault-localization technique with high effectiveness. We also found that some faults were localizable regardless of the presence of other faults, whereas other faults' ability to be found by these techniques varied greatly in the presence of other faults. Because almost all real-world software contains multiple faults, these results impact the use of statistical fault-localization techniques and provide a greater understanding of their potential in practice.
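
For readers unfamiliar with statistical, coverage-based fault localization, the following is a minimal sketch of how such a technique can rank statements. The abstract does not name the formula used in the study; the Tarantula formula is used here purely as an assumed, well-known example, and the function name and input layout are hypothetical.

```python
def tarantula_scores(coverage, outcomes):
    """Score statements by how strongly their execution correlates with failure.

    coverage: list of sets, one per test, holding the statement ids it executed.
    outcomes: list of bools, one per test (True = passed, False = failed).
    Assumption: Tarantula is used only as an example formula; the abstract
    above does not say which technique the study evaluated.
    """
    total_passed = sum(outcomes)
    total_failed = len(outcomes) - total_passed
    scores = {}
    for stmt in set().union(*coverage):
        passed = sum(1 for cov, ok in zip(coverage, outcomes) if ok and stmt in cov)
        failed = sum(1 for cov, ok in zip(coverage, outcomes) if not ok and stmt in cov)
        pass_ratio = passed / total_passed if total_passed else 0.0
        fail_ratio = failed / total_failed if total_failed else 0.0
        denom = pass_ratio + fail_ratio
        # A statement executed by no test at all gets the lowest score.
        scores[stmt] = fail_ratio / denom if denom else 0.0
    return scores


# Two passing tests cover statements 1 and 2; one failing test covers 1 and 3.
scores = tarantula_scores([{1, 2}, {1, 2}, {1, 3}], [True, True, False])
```

Statements executed mostly by failing tests receive scores near 1. With several faults present, each fault contributes its own failing tests, which is the setting the study examines.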

86 citations


Proceedings Article
25 Sep 2011
TL;DR: An in-depth study of the effects of the interaction of faults within a program, which shows four significant types of interaction, with one type — faults obscuring the results of other faults — as the most prevalent type.
Abstract: Multiple faults in a program can interact to form new behaviors in a program that would not be realized if the program were to contain the individual faults. This paper presents an in-depth study of the effects of the interaction of faults within a program. Many researchers attempt to ameliorate the effects of faulty programs. Unfortunately, such researchers are left to rely upon intuition about fault behavior due to the paucity of formalized studies of faults and their behavior. In an attempt to advance the understanding of faults and their behavior, we conducted a study of fault interaction across six subjects with more than 65,000 multiple-fault versions. The results of our study show four significant types of interaction, with one type — faults obscuring the effects of other faults — as the most prevalent type. The prevalence of obscuring faults' effects has an adverse effect on many automated software-engineering techniques, such as regression-testing, fault-localization, and fault-clustering techniques. Given that software commonly contains more than a single fault, these results have implications for developers and researchers alike by informing them of expected complications, which in many instances are opposite to intuition.

35 citations


Proceedings Article
06 Nov 2011
TL;DR: A new fault-localization technique designed for applications that interact with a relational database that uses dynamic information specific to the application's database, such as Structured Query Language (SQL) commands, to provide a fault-location diagnosis.
Abstract: This paper presents a new fault-localization technique designed for applications that interact with a relational database. The technique uses dynamic information specific to the application's database, such as Structured Query Language (SQL) commands, to provide a fault-location diagnosis. By creating statement-SQL tuples and calculating their suspiciousness, the presented method lets the developer identify the database commands and the program statements likely to cause the failures. The technique also calculates suspiciousness for statement-attribute tuples and uses this information to identify SQL fragments that are statistically likely to be responsible for the suspiciousness of that SQL command. The paper reports the results of two empirical studies. The first study compares existing and database-aware fault-localization methods, and reveals the strengths and limitations of prior techniques, while also highlighting the effectiveness of the new approach. The second study demonstrates the benefits of using database information to improve understanding and reduce manual debugging effort.
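
The idea of scoring statement-SQL tuples can be illustrated with a short sketch. Everything below is an assumption made for illustration: the input layout, the function name, and the choice of the Ochiai formula as the suspiciousness metric (the abstract does not state which metric the paper uses).

```python
import math
from collections import Counter


def rank_statement_sql_tuples(runs):
    """Rank (statement, sql) tuples by suspiciousness, most suspicious first.

    runs: list of (tuples, passed) pairs, where `tuples` is the set of
    (statement, sql_command) pairs observed in one test execution.
    Assumption: Ochiai is a stand-in metric, not necessarily the paper's.
    """
    failed_with, passed_with = Counter(), Counter()
    total_failed = 0
    for tuples, passed in runs:
        if passed:
            passed_with.update(tuples)
        else:
            total_failed += 1
            failed_with.update(tuples)
    scores = {}
    for t in set(failed_with) | set(passed_with):
        executed = failed_with[t] + passed_with[t]
        denom = math.sqrt(total_failed * executed)
        scores[t] = failed_with[t] / denom if denom else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical observations: the UPDATE is only ever seen in the failing run.
select = ("Dao.java:10", "SELECT * FROM users")
update = ("Dao.java:20", "UPDATE users SET age = ?")
ranking = rank_statement_sql_tuples([({select}, True), ({select, update}, False)])
```

Pairing each statement with the SQL it issued lets the ranking separate a program statement that is harmless with one query from the same statement issuing a faulty one.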

32 citations


Proceedings Article
18 Jul 2011
TL;DR: The method defines a mapping between relational database elements and Daikon's notion of program points and variable observations, thus enabling row-level and column-level invariant detection; an implementation of the approach that leverages the Daikon dynamic-invariant engine is also presented.
Abstract: Despite the many automated techniques that benefit from dynamic invariant detection, to date, none are able to capture and detect dynamic invariants at the interface of a program and its databases. This paper presents a dynamic invariant detection method for relational databases and for programs that use relational databases and an implementation of the approach that leverages the Daikon dynamic-invariant engine. The method defines a mapping between relational database elements and Daikon's notion of program points and variable observations, thus enabling row-level and column-level invariant detection. The paper also presents the results of two empirical evaluations on four fixed data sets and three subject programs. The first study shows that dynamically detecting and inferring invariants in a relational database is feasible and 55% of the invariants produced for each subject are meaningful. The second study reveals that all of these meaningful invariants are schema-enforceable using standards-compliant databases and many can be checked by databases with only limited schema constructs.
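
To make the mapping concrete, the sketch below treats a table as a program point and each column as a variable, then infers a few simple invariants from observed rows. It does not use Daikon and does not reproduce Daikon's output format; the function name, the row layout, and the three invariant kinds are assumptions chosen for illustration.

```python
def infer_column_invariants(rows):
    """Infer simple per-column invariants from observed rows.

    rows: list of dicts mapping column name -> value (None stands for NULL).
    Assumption: a hand-rolled stand-in for an invariant engine, checking
    only not-null, uniqueness, and numeric range.
    """
    invariants = []
    if not rows:
        return invariants
    for col in rows[0]:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        if len(present) == len(values):
            invariants.append(f"{col} IS NOT NULL")
        if present and len(set(present)) == len(present) == len(values):
            invariants.append(f"{col} is unique")
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        )
        if present and numeric:
            invariants.append(f"{min(present)} <= {col} <= {max(present)}")
    return invariants


# Hypothetical table contents observed at run time.
rows = [{"id": 1, "age": 30}, {"id": 2, "age": 30}, {"id": 3, "age": None}]
found = infer_column_invariants(rows)
```

Invariants of this shape correspond to schema constructs such as NOT NULL, UNIQUE, and CHECK, which is the sense in which detected invariants can be schema-enforceable.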

24 citations


Proceedings Article
06 Nov 2011
TL;DR: A model of the process and options for generating a graph that links every line of code with its corresponding previous revision through the history of the software project is presented, along with an approach for automating it.
Abstract: To perform a number of tasks such as inferring design rationale from past code changes or assessing developer expertise for a software feature or bug, the evolution of a set of lines of code can be assessed by mining software histories. However, determining the evolution of a set of lines of code is a manual and time-consuming process. This paper presents a model of this process and an approach for automating it. We call this process History Slicing. We describe the process and options for generating a graph that links every line of code with its corresponding previous revision through the history of the software project. We then explain the method and options for utilizing this graph to determine the exact revisions that contain changes for the lines of interest and their exact position in each revision. Finally, we present some preliminary results which show initial evidence that our automated technique can be several orders of magnitude faster than the manual approach and require that developers examine up to two orders of magnitude less code in extracting such histories.
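
The graph traversal described above can be sketched as follows. The graph encoding, the function name, and the rule that a line with no predecessor counts as introduced in its revision are all assumptions made for this illustration; the paper's actual model and options may differ.

```python
def history_slice(graph, lines_of_interest):
    """Collect, per revision, the lines that changed in the selected lines' history.

    graph: dict mapping (revision, line) -> list of ((prev_revision, prev_line),
           changed) edges, where `changed` says whether the line's text differs
           from that predecessor. Hypothetical encoding, not the paper's.
    lines_of_interest: iterable of (revision, line) nodes to start from.
    """
    changed_at = {}
    stack = list(lines_of_interest)
    seen = set(stack)
    while stack:
        rev, line = stack.pop()
        edges = graph.get((rev, line), [])
        # No predecessor means the line was introduced in this revision.
        if not edges or any(changed for _, changed in edges):
            changed_at.setdefault(rev, set()).add(line)
        for prev, _ in edges:
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return changed_at


# Line 10 of revision 3 was line 8 in revisions 1 and 2; it was edited in revision 2.
graph = {
    (3, 10): [((2, 8), False)],
    (2, 8): [((1, 8), True)],
}
result = history_slice(graph, [(3, 10)])
```

The result maps each revision that touched the selected lines to the positions of those lines in that revision, so unrelated revisions and unrelated lines never need to be inspected.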

22 citations


Proceedings Article
03 Nov 2011
TL;DR: A scalable, statement-level visualization is presented that shows related code in a way that supports human interpretation of clustering and context, and that is applicable to many software-engineering tasks through the utilization and visualization of problem-specific meta-data.
Abstract: This paper presents a scalable, statement-level visualization that shows related code in a way that supports human interpretation of clustering and context. The visualization is applicable to many software-engineering tasks through the utilization and visualization of problem-specific meta-data. The visualization models statement-level code relations from a system-dependence-graph model of the program being visualized. Dynamic, run-time information is used to augment the static program model to further enable visual cluster identification and interpretation. In addition, we performed a user study of our visualization on an example program domain. The results of the study show that our new visualization successfully revealed relevant context to the programmer participants.

11 citations


Proceedings Article
06 Nov 2011
TL;DR: Techniques for providing static and dynamic relations among program elements that can be used as the basis for the exploration of a program when attempting to understand the nature of faults are presented.
Abstract: This paper provides techniques for aiding developers' task of familiarizing themselves with the context of a fault. Many fault-localization techniques present the software developer with a subset of the program to inspect in order to aid in the search for faults that cause failures. However, typically, these techniques do not describe how the components of the subset relate to each other in a way that enables the developer to understand how these components interact to cause failures. These techniques also do not describe how the subset relates to the rest of the program in a way that enables the developer to understand the context of the subset. In this paper, we present techniques for providing static and dynamic relations among program elements that can be used as the basis for the exploration of a program when attempting to understand the nature of faults.

4 citations