Programs for Machine Learning

Abstract
Algorithms for constructing decision trees are among the best known and most widely used of all machine learning methods. Among decision tree algorithms, J. Ross Quinlan's ID3 and its successor, C4.5, are probably the most popular in the machine learning community. These algorithms and variations on them have been the subject of numerous research papers since Quinlan introduced ID3. Until recently, most researchers looking for an introduction to decision trees turned to Quinlan's seminal 1986 Machine Learning journal article [Quinlan, 1986]. In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much-needed description of his complete system, including the latest developments. As such, this book will be a welcome addition to the library of many researchers and students.


Citations
Book Chapter

Protein classification with multiple algorithms

TL;DR: A comparative evaluation of several algorithms that learn such classification models from data concerning patterns of proteins with known structure is presented, and several approaches that combine multiple learning algorithms to increase the accuracy of predictions are evaluated.
Book Chapter

Logistic Regression and Boosting for Labeled Bags of Instances

TL;DR: This paper upgrades linear logistic regression and boosting to multi-instance data, where each example consists of a labeled bag of instances, and presents empirical results for artificial data generated according to the underlying generative model that is assumed.
Book Chapter

Pairwise preference learning and ranking

TL;DR: The main objective of this work is to investigate the trade-off between the quality of the induced ranking function and the computational complexity of the algorithm, both depending on the amount of preference information given for each example.
Proceedings Article

Augmenting API documentation with insights from stack overflow

TL;DR: SISE, a novel machine learning based approach that uses as features the sentences themselves, their formatting, their question, their answer, and their authors as well as part-of-speech tags and the similarity of a sentence to the corresponding API documentation, resulted in the highest number of sentences that were considered to add useful information not found in the API documentation.
Proceedings Article

Discretizing continuous attributes while learning Bayesian networks

TL;DR: A method for learning Bayesian networks that handles the discretization of continuous variables as an integral part of the learning process is introduced, using a new metric based on the Minimal Description Length principle for choosing the threshold values for the discretization while learning the Bayesian network structure.
References
Journal Article

Induction of Decision Trees

J. R. Quinlan, 25 Mar 1986
TL;DR: This paper describes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
Book

Classification and regression trees

Leo Breiman
TL;DR: This monograph focuses on the methodology used to construct tree-structured rules, covering the use of trees as a data analysis method and, in a more mathematical framework, proving some of their fundamental properties.
Journal Article

An Empirical Comparison of Pruning Methods for Decision Tree Induction

TL;DR: This paper compares five methods for pruning decision trees, developed from sets of examples, and shows that three methods—critical value, error complexity and reduced error—perform well, while the other two may cause problems.
Book Chapter

Unknown attribute values in induction

TL;DR: This paper compares the effectiveness of several approaches to the development and use of decision tree classifiers as measured by their performance on a collection of datasets.