Open Access
Programs for Machine Learning
Steven L. Salzberg, Alberto Segre +1 more
Abstract
Algorithms for constructing decision trees are among the most well known and widely used of all machine learning methods. Among decision tree algorithms, J. Ross Quinlan's ID3 and its successor, C4.5, are probably the most popular in the machine learning community. These algorithms and variations on them have been the subject of numerous research papers since Quinlan introduced ID3. Until recently, most researchers looking for an introduction to decision trees turned to Quinlan's seminal 1986 Machine Learning journal article [Quinlan, 1986]. In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments. As such, this book will be a welcome addition to the library of many researchers and students.
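The core idea behind ID3, which the abstract refers to, is greedy splitting on the attribute that maximizes information gain. The sketch below is a minimal illustration of that criterion on invented toy data; it is not code from the book, and the attribute and label names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction from partitioning `rows` by the value of `attr`."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

# Toy data: "outlook" perfectly predicts the label, "windy" is uninformative.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain",  "windy": False},
    {"outlook": "rain",  "windy": True},
]
labels = ["no", "no", "yes", "yes"]

# ID3 greedily picks the highest-gain attribute at each tree node.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, labels, a))
# best == "outlook"
```

Note that C4.5 refines this criterion: it uses gain ratio (information gain normalized by the split's own entropy) to avoid favoring attributes with many distinct values.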
Citations
Journal ArticleDOI
Multivariate convex regression with adaptive partitioning
Lauren A. Hannah, David B. Dunson +1 more
TL;DR: This work introduces convex adaptive partitioning (CAP), which creates a globally convex regression model from locally linear estimates fit on adaptively selected covariate partitions, and demonstrates its empirical performance by comparing CAP to other shape-constrained and unconstrained regression methods on predicting weekly wages and on value function approximation for pricing American basket options.
Journal ArticleDOI
A Distance Measure Approach to Exploring the Rough Set Boundary Region for Attribute Reduction
TL;DR: This paper examines a rough set feature selection technique that uses information gathered from both the lower approximation dependency value and a distance metric that considers the number of objects in the boundary region and their distance from the lower approximation.
Journal ArticleDOI
Classifying imbalanced data sets using similarity based hierarchical decomposition
Cigdem Beyan, Robert B. Fisher +1 more
TL;DR: This paper proposes a new hierarchical decomposition method for imbalanced data sets that differs from previously proposed solutions to the class imbalance problem and, unlike many of those solutions, does not require any data pre-processing step.
Journal ArticleDOI
Protein classification with imbalanced data.
TL;DR: Generally, protein classification is a multi-class classification problem and can be reduced to a set of binary classification problems, where one classifier is designed for each class; in this case, however, the number of proteins in one class is usually much smaller than the number of proteins outside the class.
Journal ArticleDOI
Creating Evolving User Behavior Profiles Automatically
TL;DR: This paper combines an evolving classifier with trie-based user profiling to obtain a powerful self-learning online scheme, and further develops the recursive formula for the potential of a data point to become a cluster center, using cosine distance.
References
Journal ArticleDOI
Classification and Regression Trees.
Journal ArticleDOI
Induction of Decision Trees
TL;DR: This paper describes an approach to synthesizing decision trees that has been used in a variety of systems, presents one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
Book
Classification and regression trees
TL;DR: This monograph focuses on the methodology used to construct tree-structured rules, covering the use of trees as a data analysis method and, in a more mathematical framework, proving some of their fundamental properties.
Journal ArticleDOI
An Empirical Comparison of Pruning Methods for Decision Tree Induction
TL;DR: This paper compares five methods for pruning decision trees developed from sets of examples, and shows that three of the methods (critical value, error complexity, and reduced error) perform well, while the other two may cause problems.
Book ChapterDOI
Unknown attribute values in induction
TL;DR: This paper compares the effectiveness of several approaches to handling unknown attribute values in the development and use of decision tree classifiers, as measured by their performance on a collection of datasets.