scispace - formally typeset
Journal ArticleDOI

Induction of ripple-down rules applied to modeling large databases

TLDR
A methodology for the modeling of large data sets is described which results in rule sets having minimal inter-rule interactions, and being simply maintained.
Abstract
A methodology forthe modeling of large data sets is described which results in rule sets having minimal inter-rule interactions, and being simply maintained. An algorithm for developing such rule sets automatically is described and its efficacy shown with standard test data sets. Comparative studies of manual and automatic modeling of a data set of some nine thousand five hundred cases are reported. A study is reported in which ten years of patient data have been modeled on a month by month basis to determine how well a diagnostic system developed by automated induction would have performed had it been in use throughout the project.

read more

Citations
More filters
Book

Data Mining: Practical Machine Learning Tools and Techniques

TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Journal ArticleDOI

Do we need hundreds of classifiers to solve real world classification problems

TL;DR: The random forest is clearly the best family of classifiers (3 out of 5 bests classifiers are RF), followed by SVM (4 classifiers in the top-10), neural networks and boosting ensembles (5 and 3 members in theTop-20, respectively).
Proceedings ArticleDOI

Interactive machine learning

TL;DR: An interactive machine-learning (IML) model that allows users to train, classify/view and correct the classifications and the Crayons tool embodies the notions of interactive machine learning is proposed.
Journal ArticleDOI

Predicting students' final performance from participation in on-line discussion forums

TL;DR: To determine how the selection of instances and attributes, the use of different classification algorithms and the date when data is gathered affect the accuracy and comprehensibility of the prediction, a new Moodle module for gathering forum indicators was developed and different executions were carried out.
References
More filters
Book

C4.5: Programs for Machine Learning

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and over hitting.
Book

Classification and regression trees

Leo Breiman
TL;DR: The methodology used to construct tree structured rules is the focus of a monograph as mentioned in this paper, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
Book

Numerical Recipes in C: The Art of Scientific Computing

TL;DR: Numerical Recipes: The Art of Scientific Computing as discussed by the authors is a complete text and reference book on scientific computing with over 100 new routines (now well over 300 in all), plus upgraded versions of many of the original routines, with many new topics presented at the same accessible level.

Programs for Machine Learning

TL;DR: In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments, which will be a welcome addition to the library of many researchers and students.