Open Access
Programs for Machine Learning
Steven L. Salzberg, Alberto Segre +1 more
TLDR
In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments, which will be a welcome addition to the library of many researchers and students.
Abstract:
Algorithms for constructing decision trees are among the most well known and widely used of all machine learning methods. Among decision tree algorithms, J. Ross Quinlan's ID3 and its successor, C4.5, are probably the most popular in the machine learning community. These algorithms and variations on them have been the subject of numerous research papers since Quinlan introduced ID3. Until recently, most researchers looking for an introduction to decision trees turned to Quinlan's seminal 1986 Machine Learning journal article [Quinlan, 1986]. In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments. As such, this book will be a welcome addition to the library of many researchers and students.
Citations
Journal Article
A New Sequential Covering Strategy for Inducing Classification Rules With Ant Colony Algorithms
TL;DR: This paper proposes a new sequential covering strategy for ACO classification algorithms to mitigate the problem of rule interaction, where the order of the rules is implicitly encoded as pheromone values and the search is guided by the quality of a candidate list of rules.
Proceedings Article
Parameterized generation of labeled datasets for text categorization based on a hierarchical directory
TL;DR: ACCIO is a system for automatically acquiring labeled datasets for text categorization from the World Wide Web, by capitalizing on the body of knowledge encoded in the structure of existing hierarchical directories such as the Open Directory.
Journal Article
Combining Information Extraction Systems Using Voting and Stacked Generalization
TL;DR: Investigation of the effectiveness of voting and stacked generalization in the context of information extraction (IE) finds that both voting and stacking work better when relying on probabilistic estimates by the base-level systems.
Journal Article
Applying Machine Learning Toward an Automatic Classification of It
TL;DR: After a survey of previous treatments of the pronoun 'it' in the literature, some features of instances of 'it' are proposed that can be used in a novel memory-based learning method to automatically classify those instances.
Multi-perspective process mining
TL;DR: This thesis addresses research challenges in which a multi-perspective view on processes is needed and that look beyond the control-flow perspective, which defines the sequence of activities of a process.
References
Journal Article
Classification and Regression Trees.
Journal Article
Induction of Decision Trees
TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
Book
Classification and regression trees
TL;DR: The methodology used to construct tree-structured rules is the focus of this monograph, which covers the use of trees as a data analysis method and, in a more mathematical framework, proves some of their fundamental properties.
Journal Article
An Empirical Comparison of Pruning Methods for Decision Tree Induction
TL;DR: This paper compares five methods for pruning decision trees, developed from sets of examples, and shows that three methods—critical value, error complexity and reduced error—perform well, while the other two may cause problems.
Book Chapter
Unknown attribute values in induction
TL;DR: This paper compares the effectiveness of several approaches to the development and use of decision tree classifiers as measured by their performance on a collection of datasets.