Journal ArticleDOI

Medical diagnosis with C4.5 rule preceded by artificial neural network ensemble

Zhi-Hua Zhou, +1 more
Vol. 7, Iss. 1, pp. 37-42
TLDR
Case studies on diabetes, hepatitis, and breast cancer show that C4.5 Rule-PANE could generate rules with strong generalization ability, which benefits from the artificial neural network ensemble, and strong comprehensibility, which benefits from rule induction.
Abstract
Comprehensibility is very important when machine learning techniques are used in computer-aided medical diagnosis. Since an artificial neural network ensemble is composed of multiple artificial neural networks, its comprehensibility is worse than that of a single artificial neural network. In this paper, C4.5 Rule-PANE, which combines an artificial neural network ensemble with rule induction by regarding the former as a preprocess of the latter, is proposed. First, an artificial neural network ensemble is trained. Then, a new training data set is generated by feeding the feature vectors of the original training instances to the trained ensemble and replacing the expected class labels of the original training instances with the class labels output by the ensemble. Additional training data may also be appended by randomly generating feature vectors and combining them with their corresponding class labels output by the ensemble. Finally, a specific rule induction approach, i.e., C4.5 Rule, is used to learn rules from the new training data set. Case studies on diabetes, hepatitis, and breast cancer show that C4.5 Rule-PANE could generate rules with strong generalization ability, which benefits from the artificial neural network ensemble, and strong comprehensibility, which benefits from rule induction.
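The pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: trivial bootstrap-trained threshold classifiers ("decision stumps") stand in for the neural networks, majority voting stands in for the ensemble's combination scheme, and the final C4.5 Rule step is omitted. All names and the toy data are illustrative.

```python
import random

def train_stump(data):
    """Pick the (feature, threshold, orientation) split with best training accuracy.

    A toy stand-in for training one neural network in the ensemble.
    """
    best = (-1.0, None)
    n = len(data)
    for f in range(len(data[0][0])):
        for x, _ in data:
            t = x[f]
            acc = sum((xi[f] > t) == yi for xi, yi in data) / n
            for sign, a in ((True, acc), (False, 1 - acc)):
                if a > best[0]:
                    best = (a, (f, t, sign))
    f, t, sign = best[1]
    return lambda x: (x[f] > t) == sign

def ensemble_predict(members, x):
    """Majority vote over the ensemble members."""
    return 2 * sum(m(x) for m in members) >= len(members)

random.seed(0)
# Toy binary data: label is True iff x0 + x1 > 1 (illustrative, not a medical data set).
data = [[random.random(), random.random()] for _ in range(60)]
data = [(x, x[0] + x[1] > 1.0) for x in data]

# Step 1: train an ensemble, each member on a bootstrap sample.
members = [train_stump(random.choices(data, k=len(data))) for _ in range(7)]

# Step 2: replace the original class labels with the ensemble's outputs.
relabeled = [(x, ensemble_predict(members, x)) for x, _ in data]

# Step 3: append randomly generated feature vectors labeled by the ensemble.
extra = [[random.random(), random.random()] for _ in range(40)]
relabeled += [(x, ensemble_predict(members, x)) for x in extra]

# Step 4 (omitted): run a rule learner such as C4.5 Rule on `relabeled`,
# so the induced rules mimic the ensemble's decision function.
```

The key design point mirrors the paper: the rule learner never sees the original labels, only the ensemble's, so it approximates the ensemble's (presumably more accurate) decision function while producing comprehensible rules.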


Citations
Journal ArticleDOI

Exploratory Undersampling for Class-Imbalance Learning

TL;DR: Experiments show that the proposed algorithms, BalanceCascade and EasyEnsemble, have better AUC scores than many existing class-imbalance learning methods and have approximately the same training time as that of under-sampling, which trains significantly faster than other methods.
Journal ArticleDOI

Improve Computer-Aided Diagnosis With Machine Learning Techniques Using Undiagnosed Samples

TL;DR: Case studies on three medical data sets and a successful application to microcalcification detection for breast cancer diagnosis show that undiagnosed samples are helpful in building CAD systems, and that Co-Forest is able to enhance the performance of a hypothesis learned from only a small amount of diagnosed samples by utilizing the available undiagnosed samples.
Proceedings ArticleDOI

Exploratory Under-Sampling for Class-Imbalance Learning

TL;DR: Experiments show that the proposed algorithms, BalanceCascade and EasyEnsemble, have better AUC scores than many existing class-imbalance learning methods and have approximately the same training time as that of under-sampling, which trains significantly faster than other methods.
Journal ArticleDOI

Using Three Machine Learning Techniques for Predicting Breast Cancer Recurrence

TL;DR: The present research studied the application of data mining techniques to develop predictive models for breast cancer recurrence in patients who were followed up for two years, and found that the prediction accuracy of the DT model is the lowest of all.

Predicting breast cancer survivability using data mining techniques

TL;DR: This paper investigated three data mining techniques: the Naive Bayes, the back-propagated neural network, and the C4.5 decision tree algorithms, and found that the C4.5 algorithm has a much better performance than the other two techniques.
References
Book

An introduction to the bootstrap

TL;DR: This article presents bootstrap methods for estimation, using simple arguments, with Minitab macros for implementing these methods, as well as some examples of how these methods could be used for estimation purposes.
Journal ArticleDOI

Learning representations by back-propagating errors

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.
Journal ArticleDOI

Bagging predictors

Leo Breiman
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.

Programs for Machine Learning

TL;DR: In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments, which will be a welcome addition to the library of many researchers and students.