Towards identifying software project clusters with regard
to defect prediction
Marian Jureczko
Institute of Computer Engineering, Control and Robotics
Wrocław University of Technology
Wybrzeże Wyspiańskiego 27
50-370, Wrocław - Poland
+48 71 320 27 45
marian.jureczko@pwr.wroc.pl
Lech Madeyski
Institute of Informatics
Wrocław University of Technology
Wybrzeże Wyspiańskiego 27
50-370, Wrocław - Poland
lech.madeyski@pwr.wroc.pl
http://madeyski.e-informatyka.pl/
ABSTRACT
Background: This paper describes an analysis that was conducted
on a newly collected repository with 92 versions of 38 proprietary,
open-source and academic projects. A preliminary study performed
before showed the need for a further in-depth analysis in order to
identify project clusters.
Aims: The goal of this research is to perform clustering on
software projects in order to identify groups of software projects
with similar characteristics from the defect prediction point of
view. One defect prediction model should work well for all
projects that belong to such a group. The existence of those groups
was investigated with statistical tests and by comparing the mean
value of prediction efficiency.
Method: Hierarchical and k-means clustering, as well as
Kohonen's neural network, were used to find groups of similar
projects. The obtained clusters were investigated with
discriminant analysis. For each of the identified groups a statistical
analysis was conducted in order to determine whether this
group really exists. Two defect prediction models were created for
each of the identified groups. The first one was based on the
projects that belong to a given group, and the second one on all
the projects. Then, both models were applied to all versions of
projects from the investigated group. If the predictions from the
model based on projects that belong to the identified group are
significantly better than those of the all-projects model (the mean
values were compared and statistical tests were used), we conclude
that the group really exists.
Results: Six different clusters were identified and the existence of
two of them was statistically proven: 1) cluster proprietary B –
T=19, p=0.035, r=0.40; 2) cluster proprietary/open – t(17)=3.18,
p=0.05, r=0.59. The obtained effect sizes (r) represent large
effects according to Cohen’s benchmark, which is a substantial
finding.
Conclusions: The two identified clusters were described and
compared with results obtained by other researchers. The results
of this work make a next step towards defining formal methods of
reusing defect prediction models by identifying groups of projects
within which the same defect prediction model may be used.
Furthermore, a method of clustering was suggested and applied.
Categories and Subject Descriptors
D.2.8 [Software Engineering]: Metrics – complexity measures,
product metrics, software science.
General Terms
Measurement
Keywords
Defect Prediction, Design Metrics, Size Metrics, Clustering.
1. INTRODUCTION
Testing of software systems is an activity that consumes time and
resources. Applying the same testing effort to all modules of a
system is not the optimal approach, because the distribution of
defects among individual parts of a software system is not
uniform. Therefore, testers should be able to identify fault-prone
classes. With such knowledge they would be able to prioritize the
tests and, therefore, work more efficiently. According to Weyuker
et al. [24,25], typically 20% of modules contain upwards of 80%
of defects. Testers with a good defect predictor may be able to
save a lot of test effort by testing only 20% of system modules
and still finding up to 80% of the software defects. Defect
prediction studies usually use historical data of previous versions
of software to build the defect prediction models. Such an approach
can be applied neither to the first release of a software system, nor
by companies that do not collect historical data. Therefore, it is
vital to identify methods of constructing models that do not
require historical data.
Considerable research has been performed on defect
prediction methods; see the surveys by Purao and Vaishnavi [19]
or by Wahyudin et al. [23], but methods of reusing
defect prediction models have not been established yet. There are
only works where the same model has been used in similar
projects (Watanabe et al. [22], Bell, Ostrand and Weyuker
[2,18,24] or Nagappan et al. [16]), but without identifying the
borders of similarity. To the authors' knowledge, there
are only two studies where cross-project validation of defect
prediction models was performed [21,26]; both are described in
the next section. The goal of this research is to fill that gap by
identifying clusters of software projects. It should be possible to
make defect predictions for all projects that belong to one cluster
using only one defect prediction model. A preliminary study was
already conducted [11], where the existence of three clusters was
investigated: proprietary projects, open-source projects and
academic projects. Only the defect prediction model created for
the open-source cluster was statistically better. Therefore, only
one cluster was proved to exist, although it is extremely unlikely
that the other clusters do not exist. Further studies could reveal
other clusters, and it is also possible that the identified cluster
may be successfully split into several smaller clusters.
The paper is organized as follows: in Section 2 related works are
described. Section 3 presents the suite of OO metrics that were
used, the investigated projects, and the definition of the study, and
discusses threats to the validity of the study. The obtained results are
shown in Section 4. Conclusions are given in Section 6 and the
prospects for future research in Section 7.
2. RELATED WORKS
A typical approach in studies connected with defect prediction
models is to build a model according to data from an old version
of a project and then validate or use this model on a new version
of the same project. Such an approach was used [2,8,17,18,24,25] as
well as advocated [5,23] by many researchers. Some experiments
were also reported where the cross-project reusability of defect
prediction models was investigated.
Koru and Liu [12] came to interesting conclusions: “Normally,
defect prediction models will change from one development
environment to another according to specific defect patterns.” But
in their opinion, it does not mean that building a generalizable
defect prediction model is not possible. In fact, such models may
be extremely useful and may serve as a starting point in
development environments that have no historical data.
Nagappan et al. [16] extended the state of the art by
analyzing whether predictors obtained from one project history
are applicable to other projects. The authors investigated five
proprietary software projects. The performed analysis showed that
there is no single set of metrics that fits all five projects, but the
defect prediction models may be accurate when obtained from
similar projects (the similarity was not precisely defined). The
authors evaluated this problem by building one predictor for each
project and applying it to the entities of each of the other four
projects. Then the correlations between the actual and predicted
rankings were compared. It turned out that the project histories
cannot serve as predictors for other projects in most cases. The
study was extended in [26], where 622 cross-project predictions
were performed for 12 real-world applications. A project was
considered a strong predictor for another project when
precision, recall, and accuracy were all greater than 0.75. Only 21
cross-project validations satisfied this criterion, a success rate of
3.4%. Subsequently, guidelines that enable assessing the chance
of success of a cross-project prediction were given. The
guidelines were summarized in a decision tree. The authors
constructed separate trees for assessing prediction precision,
recall, and accuracy, but only the tree for precision was given in
the paper.
Watanabe et al. [22] tried to apply to a C++ project a defect
prediction model that had been constructed according to the data
from a Java project. The reusability study in the opposite
direction was conducted as well. Sakura Editor and JEdit were
used as the investigated projects. Metrics from only one release
were collected, so the authors used stratified 10-fold cross-validation
in order to compute two measures of model accuracy: precision
and recall. In intra-project prediction they obtained precisions of
0.828 and 0.733 and recalls of 0.897 and 0.702. In inter-project
prediction they obtained precisions of 0.872 and 0.622 and recalls of
0.596 and 0.402. According to the obtained results, the authors concluded
that in the case of a similar domain and a similar size, it is
possible to reuse the prediction model between languages, despite
the fact that the precision/recall is not very high. The authors admitted
that their results were based on only two projects, so the
generality is not clear, and in order to increase the level of
generalization they were going to evaluate the reusability on other
projects whose domain is text editors.
Relevant to this study are the experiments conducted by Ostrand et al.
[18], where two large industrial systems with seventeen
and nine releases, respectively, were investigated. A negative binomial
regression model was used. The predictions were based on the
source code of the current release, and the fault and modification history
from the previous release. The study was extended in [24] by
analyzing a third project (this increased the number of
programming languages used to ten). Applying the defect prediction
model to the third project gave good results: the 20% of files
predicted to contain the largest number of faults contained, on average,
83% of the faults. Further findings were presented in [25], where
the number of investigated projects was increased to four.
According to the obtained results, the authors said: “Our
prediction methodology is designed for large industrial systems
with a succession of releases over years of development” but later
it “was successfully adapted to a system without release”.
However, it must be mentioned that Weyuker et al. used a different
approach than the one presented in this paper. They had no
fixed model structure; the model equation was adjusted according
to data from the history of the analyzed system. Only the model
building procedure was fixed.
A comprehensive study of cross-company defect prediction was
conducted by Turhan et al. [21]. Ten different software projects
were investigated. Turhan et al. concluded that there is no single
set of static code features (metrics) that may serve as a defect
predictor for all software projects. The effectiveness of the defect
prediction models was measured using the probability of detection (pd)
and the probability of false alarm (pf). Cross-company defect
prediction dramatically increased the pd as well as the pf. The
authors were also able to decrease the pf by applying nearest
neighbor filtering. The similarity measure was the Euclidean
distance between the static code features. The project features that
may influence the effectiveness of cross-company predictions
were not identified.
Wahyudin et al. [23] suggested a framework for defect prediction.
In the context of their framework they discussed the possibility of
reusing historical data in defect prediction for other projects. They
concluded that: “A prediction model models the context of a
particular project. As a consequence, predictors obtained from one
project are usually not applicable to other projects”. When the
predictors are applicable, or whether there exist groups of
projects within which one predictor may be applied to all
projects, was not discussed.
3. STUDY DESIGN
3.1 Metrics and Tools
There are a number of size and complexity metrics that may be
used in defect prediction models. All metrics that are calculated
by the Ckjm tool (http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm) were
used in this study. The version of Ckjm reported in [8] was used;
this is the version that calculates 19 metrics that have been
reported as good quality indicators. Those metrics were selected
according to some reported experiments [3,17] and our own
research [9,10]. The utilized metrics come from several metrics
suites.
The metrics suite suggested by Chidamber and Kemerer [4]:
Weighted methods per class (WMC). The value of the WMC is
equal to the number of methods in the class (assuming unity
weights for all methods).
Depth of Inheritance Tree (DIT). The DIT metric provides for
each class a measure of the inheritance levels from the object
hierarchy top.
Number of Children (NOC). The NOC metric simply measures
the number of immediate descendants of the class.
Coupling between object classes (CBO). The CBO metric
represents the number of classes coupled to a given class
(efferent couplings and afferent couplings). These couplings can
occur through method calls, field accesses, inheritance, method
arguments, return types, and exceptions.
Response for a Class (RFC). The RFC metric measures the
number of different methods that can be executed when an object
of that class receives a message. Ideally, we would want to find,
for each method of the class, the methods that class will call, and
repeat this for each called method, calculating what is called the
transitive closure of the method call graph. This process can
however be both expensive and quite inaccurate. Ckjm calculates
a rough approximation to the response set by simply inspecting
method calls within the class method bodies. The value of RFC is
the sum of number of methods called within the class method
bodies and the number of class methods. This simplification was
also used in the original description of the metric.
Lack of cohesion in methods (LCOM). The LCOM metric
counts the sets of methods in a class that are not related through
the sharing of some of the class fields. The original definition of
this metric (which is the one used in Ckjm) considers all pairs of
class methods. In some of these pairs both methods access at least
one common field of the class, while in other pairs the two
methods do not share any common field accesses. The lack of
cohesion in methods is then calculated by subtracting from the
number of method pairs that do not share a field access the
number of method pairs that do.
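To make the pair-counting concrete, here is a minimal sketch; it assumes a class has already been summarized as a mapping from method names to the sets of fields each method accesses (a hypothetical representation chosen for illustration, not Ckjm's internal one):

```python
from itertools import combinations

def lcom(field_access: dict) -> int:
    """CK LCOM: (method pairs sharing no field) - (pairs sharing a field),
    floored at zero as in the original Chidamber-Kemerer definition."""
    sharing = non_sharing = 0
    for m1, m2 in combinations(field_access, 2):
        if field_access[m1] & field_access[m2]:
            sharing += 1
        else:
            non_sharing += 1
    return max(non_sharing - sharing, 0)

# Two unrelated method groups -> 4 non-sharing pairs, 2 sharing pairs -> LCOM = 2
print(lcom({"getX": {"x"}, "setX": {"x"}, "getY": {"y"}, "setY": {"y"}}))
```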
One metric suggested by Henderson-Sellers [6]:
Lack of cohesion in methods (LCOM3). The metric is defined as
LCOM3 = ((1/a) * Σ_{j=1..a} μ(A_j) − m) / (1 − m),
where: m - number of methods in a class;
a - number of attributes in a class;
μ(A_j) - number of methods that access the attribute A_j.
The metrics suite suggested by Bansiya and Davis [1]:
Number of Public Methods (NPM). The NPM metric simply
counts all the methods in a class that are declared as public. The
metric is also known as Class Interface Size (CIS).
Data Access Metric (DAM). This metric is the ratio of the
number of private (protected) attributes to the total number of
attributes declared in the class.
Measure of Aggregation (MOA). The metric measures the
extent of the part-whole relationship, realized by using attributes.
The metric is a count of the number of class fields whose types
are user defined classes.
Measure of Functional Abstraction (MFA). This metric is the
ratio of the number of methods inherited by a class to the total
number of methods accessible by the member methods of the
class. The constructors and the java.lang.Object (as parent) are
ignored.
Cohesion Among Methods of Class (CAM). This metric
computes the relatedness among methods of a class based upon
the parameter lists of the methods. The metric is computed as the
sum of the number of different types of method parameters in
every method, divided by the product of the number of different
method parameter types in the whole class and the number of
methods.
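A small sketch of that ratio, assuming each method is represented simply by the set of its distinct parameter types (a simplified illustration of the description above, not the full QMOOD definition):

```python
def cam(param_types_per_method: list) -> float:
    """Cohesion Among Methods: sum over methods of the count of distinct
    parameter types, divided by (distinct parameter types in the whole
    class * number of methods)."""
    if not param_types_per_method:
        return 0.0
    class_types = set().union(*param_types_per_method)
    if not class_types:
        return 0.0
    return (sum(len(t) for t in param_types_per_method)
            / (len(class_types) * len(param_types_per_method)))

# move(int, int), scale(float), reset() -> (1 + 1 + 0) / (2 types * 3 methods) = 0.33
print(round(cam([{"int"}, {"float"}, set()]), 2))
```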
The quality oriented extension to Chidamber & Kemerer metrics
suite suggested by Tang et al. [20]:
Inheritance Coupling (IC). This metric provides the number of
parent classes to which a given class is coupled. A class is
coupled to its parent class if one of its inherited methods is
functionally dependent on the new or redefined methods in the
class. A class is coupled to its parent class if one of the following
conditions is satisfied:
- One of its inherited methods uses an attribute that is defined in
a new/redefined method.
- One of its inherited methods calls a redefined method.
- One of its inherited methods is called by a redefined method
and uses a parameter that is defined in the redefined method.
Coupling Between Methods (CBM). The metric measures the
total number of new/redefined methods to which all the inherited
methods are coupled. There is a coupling when at least one of the
conditions given for the IC metric is satisfied.
Average Method Complexity (AMC). This metric measures the
average method size for each class. The size of a method is equal
to the number of Java bytecodes in the method.
Two metrics suggested by Martin [15]:
Afferent couplings (Ca). The Ca metric represents the number
of classes that depend upon the measured class.
Efferent couplings (Ce). The Ce metric represents the number
of classes that the measured class depends upon.
One of McCabe's metrics [14]:
McCabe's cyclomatic complexity (CC). CC is equal to the
number of different paths in a method (function) plus one. The
cyclomatic complexity is defined as CC = E−N+P, where E is the
number of edges of the graph, N is the number of nodes of the
graph, and P is the number of connected components. CC is the only
method-level size metric. The constructed models make class-level
predictions; therefore, the metric had to be converted to a class-level
metric. Two metrics have been derived:

- Max(CC) - the greatest value of CC among methods of the
investigated class.
- Avg(CC) - the arithmetic mean of the CC value in the
investigated class.
Those metrics were complemented with one more, very popular
metric:
Lines of Code (LOC). The LOC metric calculates the number
of lines of code in the Java binary code of the class under
investigation.
The information about defect occurrences was collected with a
tool called BugInfo. BugInfo analyses the logs from the source code
repository (SVN or CVS) and, according to the log content, decides
whether a commit is a bugfix. A commit is interpreted as a bugfix
when it solves an issue reported in the bug tracking system. Each
of the projects had been investigated in order to identify the bugfix
commenting guidelines that were used in the source code
repository. The guidelines were formalized as regular expressions.
BugInfo compares the regular expressions with the comments of the
commits. When a comment matches the regular expression,
BugInfo increments the defect count for all classes that have been
modified in the commit. The BugInfo tool has had no official
release yet, but we are going to implement some improvements,
especially in the user interface, and then make an official release.
Its current version is available at: http://kenai.com/projects/buginfo.
There is no formal evaluation regarding the efficiency of this tool
in mapping defects yet, but comprehensive functional tests were
conducted and many of the tests are available as JUnit tests in the
source code package. All collected data is available online at:
http://purl.org/MarianJureczko/MetricsRepo.
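The matching step can be illustrated with a short sketch; the log format, the example regular expression, and the function below are invented for illustration and do not reflect BugInfo's actual code:

```python
import re
from collections import defaultdict

# Hypothetical bugfix-commenting guideline of one project, e.g. "Fixed bug #123: ..."
BUGFIX_RE = re.compile(r"fix(ed)?\s+bug\s+#\d+", re.IGNORECASE)

def defect_counts(commits):
    """commits: iterable of (comment, modified_classes) pairs taken from the
    SVN/CVS log. A commit whose comment matches the project's bugfix pattern
    increments the defect count of every class modified in that commit."""
    counts = defaultdict(int)
    for comment, classes in commits:
        if BUGFIX_RE.search(comment):
            for cls in classes:
                counts[cls] += 1
    return dict(counts)

log = [("Fixed bug #101: NPE in the parser", ["org.example.Parser"]),
       ("code cleanup", ["org.example.Parser", "org.example.Lexer"])]
print(defect_counts(log))  # {'org.example.Parser': 1}
```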
3.2 Investigated projects
48 releases of 15 open source projects were investigated: Apache
Ant (1.3 – 1.7), Apache Camel (1.0 – 1.6), Ckjm (1.8), Apache
Forrest (0.6 – 0.8), Apache Ivy (1.1 – 2.0), JEdit (3.2.1 – 4.3),
Apache Log4j (1.0 – 1.2), Apache Lucene (2.0 – 2.2), PBeans (1.0
and 2.0), Apache POI (1.5 – 3.0), Apache Synapse (1.0 – 1.2),
Apache Tomcat (6.0), Apache Velocity (1.4 – 1.6.1), Apache
Xalan-Java (2.4.0 – 2.7.0), Apache Xerces (1.1.0 – 1.4.4). A more
comprehensive discussion of most of those projects was given in
[8].
27 releases of 6 proprietary software projects were investigated.
Five of them are custom-built solutions that had already been
successfully installed in the customer environment. Those five
projects belong to the same domain: insurance. The 6th
proprietary project is a standard tool that supports quality
assurance in software development. All six projects were
developed by the same company.
Moreover, 17 academic software projects were investigated. Each
of them had exactly one release. Those projects were
implemented by 8th or 9th semester computer science students.
The students worked in groups of 3 to 6 persons during one year.
A highly iterative software development process was used.
UML documentation was prepared and a high level of test code
coverage was obtained for each of those projects. JUnit and
FitNesse were used as test tools. Some of those projects had
already been investigated in [9,10].
All of the investigated projects were written in Java.
3.3 Analysis method employed
It had been assumed that the character of a defect predictor strongly
depends on the correlation between the metrics and the number of
defects in a class. A correlation vector was calculated for each of the
investigated releases of the projects. The correlations between each
of the metrics (the metrics are given in 3.1) and the number of defects
were calculated. The vectors were then extended by adding the
ratio of defects per class.
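As a sketch of this step, assuming the data of one release is available as a pandas DataFrame with one row per class, the metric columns, and a 'defects' column (the column names and the use of pandas' default Pearson correlation are our assumptions, not taken from the paper):

```python
import pandas as pd

METRIC_COLUMNS = ["wmc", "dit", "noc", "cbo", "rfc", "lcom", "loc"]  # subset of the 19 metrics

def correlation_vector(release: pd.DataFrame) -> pd.Series:
    """One vector per project release: the correlation of every metric with
    the number of defects, extended with the ratio of defects per class."""
    vector = release[METRIC_COLUMNS].corrwith(release["defects"])
    vector["defects_per_class"] = release["defects"].sum() / len(release)
    return vector
```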
In order to uncover the project clusters, a hierarchical clustering
procedure and then k-means clustering were used. The complete
linkage clustering indicated a two-group solution. Additionally,
Kohonen's neural network was used. The results returned by the
Kohonen's neural network differ between separate runs of the
network. Therefore, the network was executed several times, and
those releases of projects that were predominantly classified into
the same neuron (cluster) were later investigated in order to
determine whether it is a cluster from the defect prediction point
of view. The obtained results were investigated with
discriminant analysis. Several configurations of the
Kohonen network with different numbers of output neurons
were used, but no more than 4 clusters were obtained, even when
the number of output neurons was increased up to 16.
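A rough sketch of the clustering step over those correlation vectors, using scipy and scikit-learn as stand-ins for whatever tools were actually used (the library choice and parameter values are ours; a Kohonen self-organizing map would additionally need a dedicated library such as MiniSom):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

def cluster_releases(vectors: np.ndarray, k: int = 2):
    """vectors: one row per project release (metric-defect correlations plus
    the defect ratio). Complete-linkage hierarchical clustering suggests the
    number of groups; k-means then produces the final partition."""
    hierarchical = fcluster(linkage(vectors, method="complete"),
                            t=k, criterion="maxclust")
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(vectors)
    return hierarchical, kmeans.labels_
```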
For each of the identified clusters a defect prediction model was
created. In order to create the model, all metrics were used and
stepwise linear regression was applied. Due to the stepwise
regression, a typical model used five to ten metrics (not all of
them). Subsequently, the models were evaluated by applying them
to all releases of projects that belonged to the investigated cluster.
In order to evaluate the efficiency of a model in predicting defects
in a release of a project, all classes that belong to the given
release were sorted according to the model output, in descending
order of the predicted number of defects. Next, the
number of classes that must be visited in order to find 80% of the
defects was calculated and used as the model's efficiency in
predicting defects in the given release of the project. A general
defect prediction model was built too. The general model used
data from all the releases of all the projects as the training set. In
order to determine whether a cluster exists from the defect
prediction point of view, the efficiency of the model created for the
cluster was compared with the efficiency of the general model.
Those two models were applied only to those releases of software
projects that belonged to the investigated cluster. When the
efficiency of the model created for the cluster is significantly
better than the efficiency of the general model, one may assume
that the cluster exists. In order to investigate whether the
difference was significant, statistical tests were used.
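The evaluation measure can be written down as a short sketch (per-release vectors of predicted and actual defects per class; expressing the result as a percentage of classes, as it is reported in the tables below, is our reading of the procedure):

```python
import numpy as np

def efficiency(predicted: np.ndarray, actual: np.ndarray) -> float:
    """Percentage of classes that must be inspected, in descending order of
    predicted defects, before 80% of the actual defects are found."""
    order = np.argsort(-predicted)                 # descending predicted defects
    cumulative = np.cumsum(actual[order])
    # smallest k with D_k > 0.8 * D_n (strict inequality, as in the definition below)
    k = np.searchsorted(cumulative, 0.8 * actual.sum(), side="right") + 1
    return 100.0 * k / len(actual)

# 4 classes, defects concentrated in the two most fault-prone predictions -> 50.0
print(efficiency(np.array([5.0, 3.0, 1.0, 0.0]), np.array([3.0, 1.0, 0.0, 0.0])))
```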
To put this more formally, assume that R is the set of all releases of all
projects and r is a single release of a project. C is the set of all r that
were selected into a cluster; C is a subset of R (C ⊆ R). There are two
defect prediction models, M_R and M_C. M_R is the general model that
was trained with all r ∈ R. M_C is a cluster model that was trained with
all r ∈ C. E(M, r) is the evaluation of the efficiency of model M in
predicting defects in release r. Let c_1, c_2, …, c_n be the classes from
release r in descending order of predicted defects according to the model
M_X, and d_1, d_2, …, d_n be the number of defects in each class.
D_i = d_1 + … + d_i, i.e., the total number of defects in the first i classes.
Let k be the smallest index such that D_k > 0.8 * D_n; then E(M_X, r) = k
(in the results this value is reported as a percentage of all classes in the
release).
E(M_R, r) and E(M_C, r) were calculated for all r ∈ C. In order to
decide whether the cluster exists from the defect prediction point
of view, hypotheses must be defined:
H_0 – There is no difference in the efficiency of defect prediction
between the general model and the cluster model:
E(M_R, r) = E(M_C, r) for r ∈ C.
H_1 – There is a difference in the efficiency of defect prediction
between the general model and the cluster model:
E(M_R, r) > E(M_C, r) for r ∈ C.
The hypotheses are evaluated by the parametric t-test for
dependent samples. The following general assumptions should be
checked in order to use a parametric test: level of measurement
(the variables must be measured at the interval or ratio level),
independence of observations, homogeneity of variance
and normal distribution of the sample. The homogeneity of
variance is checked by Levene's test, while the assumption that
the sample comes from a normally distributed population is tested
by the Shapiro-Wilk test [13]. When some of the assumptions are
violated, the Wilcoxon matched-pairs test is used.
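The assumption checks and test selection, sketched with scipy.stats (the 0.05 threshold for the assumption checks and the one-sided alternative matching H1 are our reading of the procedure, not explicit parameters from the paper):

```python
from scipy import stats

def compare_models(e_general, e_cluster, alpha: float = 0.05):
    """Paired comparison of E(M_R, r) and E(M_C, r) over the releases of a
    cluster. Uses the dependent-samples t-test when the normality and
    homogeneity-of-variance assumptions hold, the Wilcoxon matched-pairs
    test otherwise."""
    normal = (stats.shapiro(e_general).pvalue > alpha
              and stats.shapiro(e_cluster).pvalue > alpha)
    equal_variance = stats.levene(e_general, e_cluster).pvalue > alpha
    if normal and equal_variance:
        # one-sided: the cluster model should need fewer classes than the general one
        return stats.ttest_rel(e_general, e_cluster, alternative="greater")
    return stats.wilcoxon(e_general, e_cluster, alternative="greater")
```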
There is an overlap between the training and testing sets. In order to
avoid this overlap, a separate model would have to be created for each
of the releases from the investigated cluster: M_{C-r}. In such a case we
would get n different models (where n is the number of cluster
members) and each of the models would use a different set of
releases as the training set. As a result, the definition of the
cluster would be fuzzy. On the other hand, excluding one release
from the training set affects the model very slightly. Therefore,
we decided to use the overlapping approach.
3.4 Threats to validity
A number of limitations that may compromise to some extent the
quality of the results of this study are listed below.
It is possible that there are mistakes in the defect identification.
The comments in the source code version control system are not
always well written and, therefore, it was sometimes very hard to
decide whether a change is connected with a defect or not. In
some cases the comment could be cross-checked against the bug
tracking system, but unfortunately this was not possible for all projects.
The defects are assigned to classes according to the bugfix date. It
would probably be better to assign the defect to the version in which
the defect was found, but unfortunately the source code
version control system does not contain such information.
We were not able to track operations like changing a class name or
moving a class between packages. Therefore, after such a change,
the class is interpreted as a new class. Similar difficulties were
created by anonymous classes; hence, anonymous classes
were ignored in the analysis.
The defects are identified according to the comments in the
source code version control system. The guidelines of
commenting bugfixes may vary among different projects.
Therefore, it is possible that interpretation of the term defect is
not unique among the investigated projects.
4. RESULTS
The results of two different approaches to clustering, using
hierarchical and k-means clustering as well as Kohonen's neural
network, are presented below.
4.1 Study 1 – two clusters
In the first study all the releases of all the projects were divided
into two clusters, since the complete linkage hierarchical
clustering has suggested the possibility of a “natural” partition
into two sets of projects. Hence, the k-means two group solution
is analyzed and the results are presented in Tables 1-3.
Table 1. Descriptive statistics – cluster 1st of 2

                      Num. of cases   Mean    Std deviation
E(M_R, r): r ∈ C      61              49.73   19.64
E(M_C, r): r ∈ C      61              49.67   18.37

Table 2. Hypothesis tests – cluster 1st of 2

                          E(M_R, r): r ∈ C    E(M_C, r): r ∈ C
Shapiro-Wilk test   W     0.987               0.991
                    p     0.782               0.931
Levene's test       df        118
                    F(1,df)   0.434
                    p         0.511
T-test              T         0.057
                    df        60
                    p         0.954

According to Tables 1-2, the cluster 1st of 2 does not exist from
the defect prediction point of view.
Table 3. Descriptive statistics – cluster 2nd of 2

                      Num. of cases   Mean    Std deviation
E(M_R, r): r ∈ C      31              47.18   17.80
E(M_C, r): r ∈ C      31              47.41   17.29

According to Table 3, on average 47.18% of classes must be
tested in order to find 80% of defects when the general model is
used, and 47.41% of classes when the 2nd cluster model is used.
Therefore, the mean efficiency of the 2nd cluster model was worse
than the mean efficiency of the general model. In consequence,
there is no point in testing the hypothesis.
The conducted analysis showed that neither of the two investigated
clusters exists from the defect prediction point of view.
4.2 Study 2 – Kohonen's neural network
In the second approach Kohonen's neural network was used. Four
clusters were identified according to the network's output. Some
releases were not classified into any of those clusters.
