Open Access
Programs for Machine Learning
Steven L. Salzberg, Alberto Segre +1 more
TLDR
In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments, which will be a welcome addition to the library of many researchers and students.
Abstract:
Algorithms for constructing decision trees are among the most well known and widely used of all machine learning methods. Among decision tree algorithms, J. Ross Quinlan's ID3 and its successor, C4.5, are probably the most popular in the machine learning community. These algorithms and variations on them have been the subject of numerous research papers since Quinlan introduced ID3. Until recently, most researchers looking for an introduction to decision trees turned to Quinlan's seminal 1986 Machine Learning journal article [Quinlan, 1986]. In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments. As such, this book will be a welcome addition to the library of many researchers and students.
Citations
Journal Article
Heuristic decision making in medicine
TL;DR: Those features of heuristics that make them useful in health care settings are outlined, including their surprising accuracy, transparency, and wide accessibility, as well as the low costs and little time required to employ them.
Journal Article
Detecting unknown malicious code by applying classification techniques on OpCode patterns
TL;DR: The imbalance problem is investigated with reference to several real-life scenarios in which malicious files are expected to make up about 10% of the total inspected files, and a chronological evaluation showed a clear trend in which performance improves as the training set is kept more up to date.
Proceedings Article
Newsjunkie: providing personalized newsfeeds via analysis of information novelty
TL;DR: Newsjunkie is described, a system that personalizes news for users by identifying the novelty of stories in the context of stories they have already reviewed, and employs novelty-analysis algorithms that represent articles as words and named entities.
Proceedings Article
Redundancy based feature selection for microarray data
TL;DR: The relationship between feature relevance and redundancy is studied, and an efficient method for removing redundant genes is proposed; its effectiveness is demonstrated through an empirical study using public microarray data sets.
Journal Article
Test-cost-sensitive attribute reduction
TL;DR: This paper points out that when tests must be undertaken in parallel, attribute reduction is mandatory in dealing with the minimal test cost reduct problem, and proposes a framework for a heuristic algorithm to deal with the new problem.
References
Journal Article
Classification and Regression Trees.
Journal Article
Induction of Decision Trees
TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
Book
Classification and regression trees
TL;DR: This monograph focuses on the methodology used to construct tree-structured rules, covering the use of trees as a data analysis method and, in a more mathematical framework, proving some of their fundamental properties.
Journal Article
An Empirical Comparison of Pruning Methods for Decision Tree Induction
TL;DR: This paper compares five methods for pruning decision trees, developed from sets of examples, and shows that three methods—critical value, error complexity and reduced error—perform well, while the other two may cause problems.
Book Chapter
Unknown attribute values in induction
TL;DR: This paper compares the effectiveness of several approaches to the development and use of decision tree classifiers as measured by their performance on a collection of datasets.