Boolean Feature Discovery in Empirical Learning
Giulia Pagallo,David Haussler +1 more
Reads0
Chats0
TLDR
Two new methods that adaptively introduce relevant features while learning a decision tree from examples are presented, showing empirically that these methods outperform a standard decision tree algorithm for learning small random DNF functions when the examples are drawn at random from the uniform distribution.Abstract:
We investigate the problem of learning Boolean functions with a short DNF representation using decision trees as a concept description language. Unfortunately, Boolean concepts with a short description may not have a small decision tree representation when the tests at the nodes are limited to the primitive attributes. This representational shortcoming may be overcome by using Boolean features at the decision nodes. We present two new methods that adaptively introduce relevant features while learning a decision tree from examples. We show empirically that these methods outperform a standard decision tree algorithm for learning small random DNF functions when the examples are drawn at random from the uniform distribution.read more
Citations
More filters
Book ChapterDOI
Fast effective rule induction
TL;DR: This paper evaluates the recently-proposed rule learning algorithm IREP on a large and diverse collection of benchmark problems, and proposes a number of modifications resulting in an algorithm RIPPERk that is very competitive with C4.5 and C 4.5rules with respect to error rates, but much more efficient on large samples.
Journal ArticleDOI
Feature Selection for Classification
Manoranjan Dash,Huan Liu +1 more
TL;DR: This survey identifies the future research areas in feature selection, introduces newcomers to this field, and paves the way for practitioners who search for suitable methods for solving domain-specific real-world applications.
Selection of relevant features and examples in machine
Avrim L. Bluma,Pat Langley +1 more
TL;DR: A survey of machine learning methods for handling data sets containing large amounts of irrelevant information can be found in this article, where the authors focus on two key issues: selecting relevant features and selecting relevant examples.
Journal ArticleDOI
Selection of relevant features and examples in machine learning
Avrim Blum,Pat Langley +1 more
TL;DR: This survey reviews work in machine learning on methods for handling data sets containing large amounts of irrelevant information and describes the advances that have been made in both empirical and theoretical work in this area.
Book ChapterDOI
Irrelevant features and the subset selection problem
TL;DR: A method for feature subset selection using cross-validation that is applicable to any induction algorithm is described, and experiments conducted with ID3 and C4.5 on artificial and real datasets are discussed.
References
More filters
Journal ArticleDOI
Classification and Regression Trees.
Journal ArticleDOI
Induction of Decision Trees
TL;DR: In this paper, an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail, is described, and a reported shortcoming of the basic algorithm is discussed.
Book
Classification and regression trees
TL;DR: The methodology used to construct tree structured rules is the focus of a monograph as mentioned in this paper, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
Proceedings ArticleDOI
A theory of the learnable
TL;DR: This paper regards learning as the phenomenon of knowledge acquisition in the absence of explicit programming, and gives a precise methodology for studying this phenomenon from a computational viewpoint.
Book
Estimation of Dependences Based on Empirical Data
TL;DR: In this article, the Big Picture of Inference: Direct Inference Instead of Generalization (INFI) instead of generalization (2000-2010) is presented. But this is not the case in this paper.