Proceedings Article

Is it a bug or an enhancement?: a text-based approach to classify change requests

TLDR
This paper investigates whether the text of issues posted in bug tracking systems is enough to classify them into corrective maintenance and other kinds of activities, and shows that alternating decision trees, naive Bayes classifiers, and logistic regression can be used to accurately distinguish bugs from other kinds of issues.
Abstract
Bug tracking systems are valuable assets for managing maintenance activities. They are widely used in open-source projects as well as in the software industry. They collect many different kinds of issues: requests for defect fixing, enhancements, refactoring/restructuring activities, and organizational issues. These different kinds of issues are often simply labeled as "bug", either for lack of better classification support or for lack of knowledge about the possible kinds. This paper investigates whether the text of the issues posted in bug tracking systems is enough to classify them into corrective maintenance and other kinds of activities. We show that alternating decision trees, naive Bayes classifiers, and logistic regression can be used to accurately distinguish bugs from other kinds of issues. Results from empirical studies performed on issues for Mozilla, Eclipse, and JBoss indicate that issues can be classified correctly in between 77% and 82% of cases.
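To make the idea of text-based issue classification concrete, the following is a minimal sketch of one of the three classifier families named in the abstract, a multinomial naive Bayes classifier over bag-of-words features. It is not the authors' pipeline: the tokenizer, the Laplace smoothing, the function names, and the six training issues in `TOY_ISSUES` are all assumptions invented for illustration, and the toy data says nothing about the accuracy reported for Mozilla, Eclipse, or JBoss.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: (issue text, label). Invented for illustration only.
TOY_ISSUES = [
    ("Browser crashes when opening a large file", "bug"),
    ("Null pointer exception on startup", "bug"),
    ("Wrong result returned after saving the form", "bug"),
    ("Add support for dark theme", "other"),
    ("Please allow export to PDF", "other"),
    ("Refactor the parser module for readability", "other"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Collect per-label word counts, per-label issue counts, and the vocabulary."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training issues
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest log-posterior under naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        counts = word_counts[label]
        total_words = sum(counts.values())
        # Log prior: fraction of training issues carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Laplace (add-one) smoothed log likelihood of the word.
            score += math.log((counts[token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train(TOY_ISSUES)
print(classify(MODEL, "Editor crashes with exception on startup"))
print(classify(MODEL, "Add support for export to PDF"))
```

On this toy data the first issue is scored as a bug and the second as another kind of issue, because words such as "crashes" and "exception" appear only in the bug examples while "add" and "export" appear only in the others. A real study would need a labeled corpus far larger than six issues and a held-out evaluation set.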

