
Showing papers by "Thomas J. Ostrand" published in 2004


Proceedings Article
01 Jul 2004
TL;DR: A negative binomial regression model using information from previous releases was developed to predict the numbers of faults in the files of a large industrial inventory system. Across fifteen releases, the 20% of files with the highest predicted fault counts contained 83% of the faults on average.
Abstract: The ability to predict which files in a large software system are most likely to contain the largest numbers of faults in the next release can be a very valuable asset. To accomplish this, a negative binomial regression model using information from previous releases has been developed and used to predict the numbers of faults for a large industrial inventory system. The files of each release were sorted in descending order based on the predicted number of faults and then the first 20% of the files were selected. This was done for each of fifteen consecutive releases, representing more than four years of field usage. The predictions were extremely accurate, correctly selecting files that contained between 71% and 92% of the faults, with the overall average being 83%. In addition, the same model was used on data for the same system's releases, but with all fault data prior to integration testing removed. The prediction was again very accurate, ranging from 71% to 93%, with the average being 84%. Predictions were made for a second system, and again the first 20% of files accounted for 83% of the identified faults. Finally, a highly simplified predictor was considered which correctly predicted 73% and 74% of the faults for the two systems.

303 citations
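
The evaluation described in the abstract above (sort the files of a release in descending order of predicted faults, select the first 20%, and measure the share of actual faults those files contain) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `top_fraction_fault_share`, the rounding rule for the cutoff, and the sample numbers are all assumptions, and the predicted counts are taken as given rather than fitted with a negative binomial regression.

```python
def top_fraction_fault_share(predicted, actual, fraction=0.20):
    """Return the share of actual faults found in the top `fraction`
    of files when files are ranked by predicted fault count.

    predicted: predicted number of faults per file (any model's output)
    actual:    observed number of faults per file, in the same file order
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    n = len(predicted)
    total_faults = sum(actual)
    if n == 0 or total_faults == 0:
        return 0.0

    # Rank file indices by predicted faults, highest first.
    ranked = sorted(range(n), key=lambda i: predicted[i], reverse=True)

    # Assumption: round the cutoff to the nearest whole file, at least one.
    cutoff = max(1, round(n * fraction))

    captured = sum(actual[i] for i in ranked[:cutoff])
    return captured / total_faults


# Hypothetical data for ten files (not taken from the paper).
predicted = [9.1, 0.2, 4.5, 0.0, 1.3, 0.7, 2.2, 0.1, 0.4, 0.3]
actual = [7, 0, 5, 0, 1, 0, 2, 0, 0, 0]
share = top_fraction_fault_share(predicted, actual)
print(f"Top 20% of files contain {share:.0%} of the faults")
```

Applying a function like this to each release in turn would yield per-release percentages of the kind the abstract reports.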


Proceedings Article
01 Jan 2004
TL;DR: The paper proposes an automated tool that mines a project's defect tracking system to predict which files are likely to contain the most faults, and that testers can use without any particular statistical expertise or subjective judgements.
Abstract: In earlier research we identified characteristics of files in large software systems that tend to make them particularly likely to contain faults. We then developed a statistical model that uses historical fault information and file characteristics to predict which files of a system are likely to contain the largest numbers of faults. Testers can use that information to prioritize their testing and focus their efforts to make the testing process more efficient and the resulting software more dependable. In this paper we describe a proposed new tool to automate this prediction process, and discuss issues involved in its design and implementation. The goal is to produce an automated tool that mines the project defect tracking system and that can be used by testers without requiring any particular statistical expertise or subjective judgements.

39 citations