Journal Article

Cross-Project and Within-Project Semisupervised Software Defect Prediction: A Unified Approach

TLDR
A unified and effective solution for both CSDP and WSDP problems is provided in the form of a cost-sensitive kernelized semisupervised dictionary learning (CKSDL) approach. Experiments indicate that CKSDL outperforms state-of-the-art WSDP methods, that using unlabeled cross-project defect data can help improve WSDP performance, and that CKSDL generally obtains significantly better prediction performance than related SSDP methods in the CSDP scenario.
Abstract
When there are not enough historical defect data to build an accurate prediction model, semisupervised defect prediction (SSDP) and cross-project defect prediction (CPDP) are two feasible solutions. Existing CPDP methods assume that the available source data are well labeled. However, because labeling a large amount of defect data requires expensive human effort, usually we can only utilize suitable unlabeled source data. We call CPDP in this scenario cross-project semisupervised defect prediction (CSDP). Although some within-project semisupervised defect prediction (WSDP) methods have been developed in recent years, there is still much room for improvement in prediction performance. In this paper, we aim to provide a unified and effective solution for both CSDP and WSDP problems. We introduce the semisupervised dictionary learning technique and propose a cost-sensitive kernelized semisupervised dictionary learning (CKSDL) approach. CKSDL can make full use of the limited labeled defect data and a large amount of unlabeled data in the kernel space. In addition, CKSDL considers the misclassification costs in the dictionary learning process. Extensive experiments on 16 projects indicate that CKSDL outperforms state-of-the-art WSDP methods, that using unlabeled cross-project defect data can help improve WSDP performance, and that CKSDL generally obtains significantly better prediction performance than related SSDP methods in the CSDP scenario.
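
The abstract notes that misclassification costs are taken into account. As a rough illustration of what cost sensitivity means in defect prediction, the sketch below applies a minimum-expected-cost decision rule to a predicted defect probability. This is not the CKSDL algorithm, which incorporates costs into dictionary learning itself. The function name, the cost values (a missed defect assumed five times as costly as a false alarm), and the example probabilities are all hypothetical.

```python
def cost_sensitive_label(p_defective, cost_fn=5.0, cost_fp=1.0):
    """Return 1 (defective) or 0 (clean), whichever has the lower expected cost.

    cost_fn and cost_fp are hypothetical costs of a false negative (missed
    defect) and a false positive (false alarm); they are not from the paper.
    """
    # Predicting "clean" costs cost_fn whenever the module is actually defective.
    expected_cost_clean = p_defective * cost_fn
    # Predicting "defective" costs cost_fp whenever the module is actually clean.
    expected_cost_defective = (1.0 - p_defective) * cost_fp
    return 1 if expected_cost_defective < expected_cost_clean else 0


# Hypothetical defect probabilities produced by some upstream classifier.
probabilities = [0.05, 0.20, 0.50, 0.90]

# With asymmetric costs, the decision threshold drops below 0.5.
labels = [cost_sensitive_label(p) for p in probabilities]
print(labels)
```

With these assumed costs, a module with a defect probability of only 0.20 is still flagged as defective, because missing a defect is weighted more heavily than raising a false alarm.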

Citations
Journal Article

Software Defect Prediction via Attention-Based Recurrent Neural Network

TL;DR: This paper proposes a framework called defect prediction via attention-based recurrent neural network (DP-ARNN), which first parses abstract syntax trees of programs and extracts them as vectors and employs the attention mechanism to further generate significant features for accurate defect prediction.
Journal Article

Seml: A Semantic LSTM Model for Software Defect Prediction

TL;DR: Seml, a novel framework that combines word embedding and deep learning methods for defect prediction, is proposed; it outperforms three state-of-the-art defect prediction approaches on most of the datasets for both within-project and cross-project defect prediction.
Journal Article

Revisiting Supervised and Unsupervised Methods for Effort-Aware Cross-Project Defect Prediction

TL;DR: According to the results on 82 projects in terms of 11 performance measures, it is found that the supervised CPDP methods are more promising than the unsupervised method in practical application scenarios, since the limitation of testing resource and the impact on developers cannot be ignored in these scenarios.
Journal Article

Effort-aware and just-in-time defect prediction with neural network.

TL;DR: Evaluation results on a well-known data set suggest that the proposed deep learning based approach for effort-aware just-in-time defect prediction outperforms the state-of-the-art approaches on each of the subject projects.
Journal Article

Multiview Transfer Learning for Software Defect Prediction

TL;DR: A heterogeneous-data-oriented multiview transfer learning approach for software defect prediction, denoted MTDP, is proposed; it obtains features of different dimensions and granularities and learns labels automatically through neural network models.
References
Journal Article

Statistical Comparisons of Classifiers over Multiple Data Sets

TL;DR: A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.
Journal Article

Robust Face Recognition via Sparse Representation

TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.
Proceedings Article

Face recognition using eigenfaces

TL;DR: An approach to the detection and identification of human faces is presented, and a working, near-real-time face recognition system which tracks a subject's head and then recognizes the person by comparing characteristics of the face to those of known individuals is described.
Book Chapter

Kernel Principal Component Analysis

TL;DR: A new method for performing a nonlinear form of Principal Component Analysis by the use of integral operator kernel functions is proposed and experimental results on polynomial feature extraction for pattern recognition are presented.
Journal Article

A Systematic Literature Review on Fault Prediction Performance in Software Engineering

TL;DR: Although there are a set of fault prediction studies in which confidence is possible, more studies are needed that use a reliable methodology and which report their context, methodology, and performance comprehensively.