Learning with many irrelevant features

Open Access, Proceedings Article

Hussein Almuallim, +1 more

- pp 547-552

TLDR

The paper shows that any learning algorithm implementing the MIN-FEATURES bias requires Θ((1/ε) ln(1/δ) + (1/ε)[2^p + p ln n]) training examples to guarantee PAC-learning a concept having p relevant features out of n available features, and it suggests that training data should be preprocessed to remove irrelevant features before being given to ID3 or FRINGE.

Abstract:

In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires Θ((1/ε) ln(1/δ) + (1/ε)[2^p + p ln n]) training examples to guarantee PAC-learning a concept having p relevant features out of n available features. This bound is only logarithmic in the number of irrelevant features. The paper also presents a quasi-polynomial time algorithm, FOCUS, which implements MIN-FEATURES. Experimental studies are presented that compare FOCUS to the ID3 and FRINGE algorithms. These experiments show that, contrary to expectations, these algorithms do not implement good approximations of MIN-FEATURES. The coverage, sample complexity, and generalization performance of FOCUS are substantially better than those of either ID3 or FRINGE on learning problems where the MIN-FEATURES bias is appropriate. This suggests that, in practical applications, training data should be preprocessed to remove irrelevant features before being given to ID3 or FRINGE.
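
The MIN-FEATURES idea can be illustrated with a minimal Python sketch. It is not the authors' FOCUS implementation; it only assumes that a feature subset counts as sufficient when no two training examples agree on every feature in the subset yet carry different labels, and it searches subsets in order of increasing size. The function names `is_sufficient` and `min_feature_subset` and the toy data are hypothetical, chosen for illustration.

```python
from itertools import combinations


def is_sufficient(examples, subset):
    """Return True if no two examples agree on `subset` but differ in label."""
    seen = {}
    for features, label in examples:
        key = tuple(features[i] for i in subset)
        # setdefault stores the first label seen for this projection.
        if seen.setdefault(key, label) != label:
            return False
    return True


def min_feature_subset(examples, n_features):
    """Try subsets by increasing size; return the first sufficient one."""
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if is_sufficient(examples, subset):
                return subset
    return None  # reached only if identical examples carry different labels


# Toy data (an assumption): label = x0 XOR x2; features 1 and 3 are irrelevant.
examples = [
    ((0, 0, 0, 1), 0),
    ((0, 1, 1, 0), 1),
    ((1, 0, 0, 0), 1),
    ((1, 1, 1, 1), 0),
    ((0, 1, 0, 0), 0),
    ((1, 0, 1, 1), 0),
    ((0, 0, 1, 1), 1),
    ((1, 1, 0, 1), 1),
]
selected = min_feature_subset(examples, 4)
print(selected)  # (0, 2)
```

Because the search visits every subset up to the size of the smallest sufficient one, its cost grows roughly as n^p for p relevant features out of n. Projecting the training data onto `selected` before handing it to a decision-tree learner is one way to realize the preprocessing step the abstract recommends.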

Citations

Book

Data Mining: Concepts and Techniques

Jiawei Han, +2 more

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Book

Data Mining: Practical Machine Learning Tools and Techniques

Ian H. Witten, +2 more

TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.

Journal ArticleDOI

Wrappers for feature subset selection

Ron Kohavi, +1 more

- 01 Dec 1997 -

Artificial Intelligence

TL;DR: The wrapper method searches for an optimal feature subset tailored to a particular algorithm and domain. The paper compares the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection.

Book

Evolutionary algorithms for solving multi-objective problems

Gary B. Lamont, +1 more

TL;DR: This paper presents a meta-anatomy of the multi-criteria decision-making process, which aims to provide a scaffolding for the future development of multi-criteria decision-making systems.

Correlation-based Feature Selection for Machine Learning

Mark Hall

TL;DR: This thesis addresses the problem of feature selection for machine learning through a correlation based approach with CFS (Correlation based Feature Selection), an algorithm that couples this evaluation formula with an appropriate correlation measure and a heuristic search strategy.

Occam's razor

Anselm Blumer, +3 more

- 06 Apr 1987 -

Information Processing Letters

TL;DR: It is shown that a polynomial learning algorithm, as defined by Valiant (1984), is obtained whenever there exists a polynomial-time method of producing, for any sequence of observations, a nearly minimum hypothesis that is consistent with these observations.
