What measures are commonly used in the software effort prediction literature?

The Bayesian network model’s prediction accuracy is evaluated using some accuracy measures, which are commonly found in the software effort prediction literature [16,24].

How does the network predict the posterior probability distribution of CHANGE?

After the batch learning, the network predicts the posterior probability distribution of CHANGE for each case in the corresponding test subset, by computing the joint probability distribution.
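The step of obtaining a posterior from a joint distribution can be sketched as follows. This is a minimal illustration, assuming a hypothetical two-node network in which a discretized CHANGE is the parent of one discretized metric; the structure, the state names and every probability below are invented for the example and are not taken from the paper.

```python
def posterior(prior, likelihood, observed):
    """Return P(CHANGE | metric = observed) by normalizing the joint distribution."""
    # Joint probability P(CHANGE = c, metric = observed) for every state c.
    joint = {c: prior[c] * likelihood[c][observed] for c in prior}
    # The normalizing constant is P(metric = observed).
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}

# Hypothetical prior over CHANGE and conditional table P(metric | CHANGE).
prior = {"low": 0.7, "high": 0.3}
likelihood = {
    "low": {"small": 0.8, "large": 0.2},
    "high": {"small": 0.25, "large": 0.75},
}

post = posterior(prior, likelihood, "large")
print(post)
```

Observing a "large" metric value shifts probability mass toward the "high" state of CHANGE, even though the prior favoured "low".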

Why is the Med.Ab.Res. chosen as a measure of the central tendency?

The Med.Ab.Res. is chosen to be a measure of the central tendency because the residual distribution is usually skewed in software datasets.
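A small sketch of why a median is preferred for skewed residuals. The actual and predicted values are made up for illustration, and `absolute_residuals` is a hypothetical helper name, not something defined in the paper.

```python
from statistics import mean, median

def absolute_residuals(actual, predicted):
    """Absolute difference between each actual and predicted value."""
    return [abs(a - p) for a, p in zip(actual, predicted)]

# Made-up data with one badly predicted case, giving a right-skewed distribution.
actual = [10, 12, 15, 20, 200]
predicted = [12, 11, 18, 16, 120]

residuals = absolute_residuals(actual, predicted)
med_ab_res = median(residuals)   # robust to the single large residual
mean_ab_res = mean(residuals)    # pulled upward by the single large residual
print(residuals, med_ab_res, mean_ab_res)
```

One extreme residual dominates the mean, while the median still reflects the typical case.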

How many classes were chosen by random sampling without replacement?

Approximately two-thirds of the cases in each dataset are chosen by random sampling without replacement, using a function provided in a statistical software package, SPSS 11.0.
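The same kind of split can be sketched in Python. This is not the SPSS 11.0 procedure the authors used; it swaps in the standard library's `random.sample`, and the function name, the fixed seed and the use of the remaining third as a test subset are assumptions made for the example. The 39 placeholder cases match the UIMS class count reported in the paper.

```python
import random

def split_train_test(cases, train_fraction=2 / 3, seed=0):
    """Draw about train_fraction of the cases without replacement; the rest is the test subset."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    n_train = round(len(cases) * train_fraction)
    # random.sample never picks the same index twice (sampling without replacement).
    train_idx = set(rng.sample(range(len(cases)), n_train))
    train = [c for i, c in enumerate(cases) if i in train_idx]
    test = [c for i, c in enumerate(cases) if i not in train_idx]
    return train, test

# Placeholder case identifiers standing in for the 39 UIMS classes.
uims_train, uims_test = split_train_test(list(range(39)))
print(len(uims_train), len(uims_test))
```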

What is the prediction accuracy for the UIMS dataset?

For the UIMS dataset, the Bayesian network model has achieved significantly better prediction accuracy than the regression tree model and the multiple linear regression models.

What are the results of the Wilcoxon signed-rank tests of the MRE values?

The Wilcoxon signed-rank tests of the MRE values have also confirmed strong evidence that the Bayesian network model’s MMRE value is significantly lower and thus, better than those of the other models.

What is the definition of a Bayesian network?

From this point of view, Bayesian networks can be considered as a network of events connected by the probabilistic dependencies between them.

(Open Access) An application of Bayesian network for predicting object-oriented software maintainability (2006) | C. van Koten

Q: What are the future works mentioned in the paper "An application of bayesian network for predicting object-oriented software maintainability" ?

Those findings have also confirmed that Bayesian network is indeed a useful modelling technique for software maintainability prediction, although further studies are required to realize the full potential as well as the limitation. This provides an interesting direction for future studies. The results in this paper also suggest that the prediction accuracy of the Bayesian network model may vary depending on the characteristics of dataset and/or the prediction accuracy measure used.

An Application of Bayesian Network for Predicting Object-Oriented Software Maintainability

Chikako van Koten

Andrew Gray

The Information Science Discussion Paper Series

Number 2005/02

March 2005

ISSN 1172-6024

University of Otago

Department of Information Science

The Department of Information Science is one of six departments that make up the School of Business at the University of Otago. The department offers courses of study leading to a major in Information Science within the BCom, BA and BSc degrees. In addition to undergraduate teaching, the department is also strongly involved in postgraduate research programmes leading to MCom, MA, MSc and PhD degrees. Research projects in spatial information processing, connectionist-based information systems, software engineering and software development, information engineering and database, software metrics, distributed information systems, multimedia information systems and information systems security are particularly well supported.

The views expressed in this paper are not necessarily those of the department as a whole. The accuracy of the information presented in this paper is the sole responsibility of the authors.

[...] purposes is granted on the condition that the authors and the Series are given due acknowledgment. Reproduction in any form for purposes other than research or teaching is forbidden unless prior written permission has been obtained from the authors.

Correspondence

This paper represents work to date and may not necessarily form the basis for the authors’ final conclusions relating to this topic. It is likely, however, that the paper will appear in some form in a journal or in conference proceedings in the near future. The authors would be pleased to receive correspondence in connection with any of the issues raised in this paper, or for subsequent publication details. Please write directly to the authors at the address provided below. (Details of final journal/conference publication venues for these papers are also provided on the Department’s publications web pages: http://www.otago.ac.nz/informationscience/pubs/). Any other correspondence concerning the Series should be sent to the DPS Coordinator.

Department of Information Science

University of Otago

P O Box 56

Dunedin

NEW ZEALAND

Fax: +64 3 479 8311

email: dps@infoscience.otago.ac.nz

www: http://www.otago.ac.nz/informationscience/

An application of Bayesian network for predicting object-oriented software maintainability

C. van Koten and A.R. Gray

Department of Information Science, University of Otago, P.O. Box 56, Dunedin, New Zealand

Abstract

As the number of object-oriented software systems increases, it becomes more important for organizations to maintain those systems effectively. However, currently only a small number of maintainability prediction models are available for object-oriented systems. This paper presents a Bayesian network maintainability prediction model for an object-oriented software system. The model is constructed using object-oriented metric data in Li and Henry’s datasets, which were collected from two different object-oriented systems. Prediction accuracy of the model is evaluated and compared with commonly used regression-based models. The results suggest that the Bayesian network model can predict maintainability more accurately than the regression-based models for one system, and almost as accurately as the best regression-based model for the other system.

Key words: Object-oriented systems, Maintainability, Bayesian network, Regression tree, Regression

1 Introduction

It is arguable that many object-oriented (OO) software systems are currently in use. It is also arguable that the growing popularity of OO programming languages, such as Java, as well as the increasing number of software development tools supporting the Unified Modelling Language (UML), encourages more OO systems to be developed at present and in the future. Hence it is important that those systems are maintained effectively and efficiently. A software maintainability prediction model enables organizations to predict maintainability of a software system and assists them with managing maintenance resources. In addition, if an accurate maintainability prediction model is available for a software system, a defensive design can be adopted. This would minimize, or at least reduce, future maintenance effort of the system. Maintainability of a software system can be measured in different ways. In this paper, maintainability is measured as the number of changes made to the code during a maintenance period. Alternatively, maintainability may be measured as effort to make those changes. When maintainability is measured as effort, the predictive model is called a maintenance effort prediction model. It is unfortunate that the number of software maintainability prediction models, including maintenance effort prediction models, is currently very small in the literature.

Corresponding author. Tel.: +64-3-479-8142; fax: +64-3-479-8311. E-mail address: ckoten@infoscience.otago.ac.nz

Preprint submitted to Elsevier Science 27 February 2005

Programming an OO software system is different from programming a non-OO system due to the concepts that are specific to the OO paradigm, for example, objects, inheritance and encapsulation. This difference limits the applicability of well-known non-OO software effort prediction models, such as COCOMO [3], to OO software effort prediction, as well as non-OO software metrics, such as Function Points [1], to measuring the characteristics of OO software systems [23]. Hence a number of new software metrics were proposed specifically for OO systems. Some of those OO metrics were used to predict maintainability of OO systems. Examples of the OO metrics are Chidamber and Kemerer (C&K) metrics and Li and Henry (L&H) metrics [10,25]. It was shown that the L&H metrics had a correlation with the number of changes made to the code of the OO software system [25]. It was also shown that multiple linear regression models consisting of the C&K, L&H and other OO metrics were able to predict software maintenance effort for some OO systems [17].

This paper constructs an OO software maintainability prediction model using a technique known as Bayesian network [14,20,22]. This technique allows a user to construct a predictive model based on Bayesian probability theory [12]. An application of Bayesian network to Software Engineering is currently limited to a small number of studies of development effort prediction [2,11,31,34] and defect prediction [15,28]. However, Bayesian network can also be a promising new technique for OO software maintainability prediction. This is due to the ability to explicitly represent uncertainty using probabilities, the ability to incorporate existing human expert’s knowledge into empirical data, and the ability to update the model when new information becomes available. Hence this paper investigates a research problem of what prediction accuracy a Bayesian network OO software maintainability prediction model can achieve. The term prediction accuracy in this paper means how well a predictive model constructed using known data can predict the outcomes of unknown data. The Bayesian network model’s prediction accuracy is evaluated using some accuracy measures, which are commonly found in the software effort prediction literature [16,24]. Those measures are absolute residuals, the magnitude of relative error (MRE) and pred measures. Then, the Bayesian network model’s prediction accuracy is compared with regression-based models, namely, a regression tree [4] model and two different types of multiple linear regression models.
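The relative-error measures named above can be sketched in a few lines. The paper’s own formal definitions belong to its Section 5, which is outside this excerpt, so the formulas below are the definitions conventionally used in the effort prediction literature and should be read as an assumption; the function names and the sample values are hypothetical.

```python
from statistics import mean

def mre(actual, predicted):
    """Magnitude of relative error for one case; actual must be non-zero."""
    return abs(actual - predicted) / actual

def mmre(actuals, predictions):
    """Mean MRE over all cases."""
    return mean([mre(a, p) for a, p in zip(actuals, predictions)])

def pred(actuals, predictions, q):
    """Proportion of cases whose MRE is at most q, e.g. q = 0.25 or 0.30."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= q)
    return hits / len(actuals)

# Made-up actual and predicted CHANGE values for four classes.
actuals = [100, 50, 20, 10]
predictions = [110, 40, 25, 10]

print(mmre(actuals, predictions))
print(pred(actuals, predictions, 0.15), pred(actuals, predictions, 0.30))
```

A lower MMRE and a higher pred value both indicate better prediction accuracy, which is why the two are usually reported together.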

The structure of the remainder of this paper is as follows. Section 2 describes the OO software datasets and the sampling method used. Section 3 describes the Bayesian network OO software maintainability prediction model. This is followed by Section 4, which describes the regression tree model and the multiple linear regression models. Section 5 describes the prediction accuracy measures used. Section 6 evaluates the Bayesian network model’s prediction accuracy using those accuracy measures and compares it with the regression tree model and multiple linear regression models. Finally, Section 7 presents conclusions and discussions about a direction of future studies.

2 OO software datasets

2.1 Characteristics of datasets

This paper uses OO software datasets published by Li and Henry [25]. The datasets consist of five C&K metrics: DIT, NOC, RFC, LCOM and WMC, and four L&H metrics: MPC, DAC, NOM and SIZE2, as well as SIZE1, which is a traditional lines of code size metric. Those metric data were collected from a total of 110 classes in two OO software systems: User Interface Management System (UIMS) and Quality Evaluation System (QUES). The code was written in Classical-Ada™. The UIMS and QUES datasets contain 39 classes and 71 classes, respectively. Maintainability was measured by the CHANGE metric, which counts the number of lines in the code that were changed during a three-year maintenance period. Neither the UIMS nor the QUES dataset contains actual maintenance effort data. The description of each metric is given in Table 1.

The descriptive statistics of the UIMS and QUES datasets are shown in Table 2.

The Pearson’s correlation coefficients between CHANGE and each of the OO metrics are shown in Table 3.
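For readers who want to reproduce such coefficients, Pearson’s r can be computed as in the sketch below. The metric and CHANGE values are invented for illustration and are not taken from the UIMS or QUES datasets; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Invented values of one OO metric and of CHANGE for five classes.
metric = [3, 7, 12, 20, 31]
change = [10, 18, 35, 42, 90]

r = pearson_r(metric, change)
print(round(r, 3))
```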

Table 3 shows that there is a significant correlation between CHANGE and the OO metrics. However, Table 3 also shows that the correlations in the UIMS


