A review of feature selection methods on synthetic data

doi:10.1007/S10115-012-0487-8

Journal Article

Verónica Bolón-Canedo, +2 more

- 01 Mar 2013 -

Knowledge and Information Systems

- Vol. 34, Iss: 3, pp 483-519

TLDR

Several synthetic datasets are used to review the performance of feature selection methods in the presence of a growing number of irrelevant features, noise in the data, redundancy and interaction between attributes, and a small ratio between the number of samples and the number of features.

Abstract:

With the advent of high dimensionality, adequate identification of the relevant features of the data has become indispensable in real-world scenarios. In this context, the importance of feature selection is beyond doubt, and different methods have been developed. However, with such a vast body of algorithms available, choosing an adequate feature selection method is not easy, and it is necessary to check their effectiveness in different situations. Moreover, assessing which features are relevant is difficult in real datasets, so an interesting option is to use artificial data. In this paper, several synthetic datasets are employed for this purpose, aiming at reviewing the performance of feature selection methods in the presence of a growing number of irrelevant features, noise in the data, redundancy and interaction between attributes, as well as a small ratio between the number of samples and the number of features. Seven filters, two embedded methods, and two wrappers are applied over eleven synthetic datasets and tested by four classifiers, so as to be able to choose a robust method, paving the way for its application to real datasets.
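The appeal of synthetic data described in the abstract is that the relevant features are known in advance, so a selection method can be scored directly on whether it recovers them. The following is a minimal sketch of that idea, not the paper's datasets, filters, or scoring measure: the generator `make_synthetic`, the correlation-based ranking `rank_features`, and the `recovery_rate` score are all hypothetical names chosen for illustration, and the label rule (sign of the sum of the relevant features) is an assumption.

```python
import math
import random


def make_synthetic(n_samples, n_relevant, n_irrelevant, seed=0):
    """Gaussian features; only the first n_relevant columns determine the label (assumed rule)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_relevant + n_irrelevant)]
        # Label depends on the relevant columns only; the rest are pure noise.
        y.append(1 if sum(row[:n_relevant]) > 0 else 0)
        X.append(row)
    return X, y


def abs_pearson(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return abs(cov) / math.sqrt(vx * vy)


def rank_features(X, y):
    """A simple univariate filter: order feature indices by decreasing |correlation|."""
    n_features = len(X[0])
    scores = [abs_pearson([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def recovery_rate(selected, n_relevant):
    """Fraction of the known relevant features (indices < n_relevant) that were selected."""
    return sum(1 for j in selected if j < n_relevant) / n_relevant


# 3 relevant features hidden among 20 irrelevant ones.
X, y = make_synthetic(n_samples=500, n_relevant=3, n_irrelevant=20, seed=0)
selected = rank_features(X, y)[:3]
score = recovery_rate(selected, n_relevant=3)
print(sorted(selected), score)
```

Raising `n_irrelevant` or lowering `n_samples` in this sketch mimics two of the stress conditions the abstract lists. A univariate filter like this one would not be expected to handle feature interaction (for example, an XOR-style label), which is one reason a review of this kind compares several families of methods.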
