Proceedings Article

Identifying reasons for software changes using historic databases

Mockus, +1 more
- pp 120-130
Abstract
Large scale software products must constantly change in order to adapt to a changing environment. Studies of historic data from legacy software systems have identified three specific causes of this change: adding new features; correcting faults; and restructuring code to accommodate future changes. Our hypothesis is that a textual description field of a change is essential to understanding why that change was performed. Also, we expect that difficulty, size, and interval would vary strongly across different types of changes. To test these hypotheses we have designed a program which automatically classifies maintenance activity based on a textual description of changes. Developer surveys showed that the automatic classification was in agreement with developer opinions. Tests of the classifier on a different product found that size and interval for different types of changes did not vary across two products. We have found strong relationships between the type and size of a change and the time required to carry it out. We also discovered a relatively large amount of perfective changes in the system we examined. From this study we have arrived at several suggestions on how to make version control data useful in diagnosing the state of a software project, without significantly increasing the overhead for the developer using the change management system.
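
The abstract does not say how the classifier works internally. As a rough illustration of classifying a change from its textual description, here is a minimal sketch that assumes a simple keyword-voting scheme. The keyword lists, the label names, and the `classify_change` function are hypothetical and are not taken from the study.

```python
import re

# Hypothetical keyword lists, one per cause of change named in the abstract.
# These words are illustrative only; they are NOT the study's vocabulary.
KEYWORDS = {
    "fault_fix": {"fix", "bug", "error", "fail", "crash"},
    "new_feature": {"add", "new", "feature", "support", "implement"},
    "restructuring": {"cleanup", "refactor", "restructure", "simplify", "rename"},
}


def classify_change(description: str) -> str:
    """Return the label whose keywords occur most often in the description.

    Falls back to "unclassified" when no keyword matches. Ties go to the
    label listed first in KEYWORDS.
    """
    words = re.findall(r"[a-z]+", description.lower())
    votes = {
        label: sum(1 for word in words if word in keywords)
        for label, keywords in KEYWORDS.items()
    }
    best = max(votes, key=votes.get)
    return best if votes[best] > 0 else "unclassified"


if __name__ == "__main__":
    for text in (
        "Fix crash in parser when input is empty",
        "Add support for new protocol",
        "Refactor and simplify the scheduler",
        "Update copyright year",
    ):
        print(f"{classify_change(text):14s} {text}")
```

A scheme like this needs nothing beyond the description field developers already fill in, which fits the stated goal of not adding overhead for users of the change management system. A real classifier would need a vocabulary validated against developer judgment, as the survey described in the abstract did.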
