A review of feature selection techniques in bioinformatics
TLDR
A basic taxonomy of feature selection techniques is provided, providing their use, variety and potential in a number of both common as well as upcoming bioinformatics applications.Abstract:
Feature selection techniques have become an apparent need in many bioinformatics applications. In addition to the large pool of techniques that have already been developed in the machine learning and data mining fields, specific applications in bioinformatics have led to a wealth of newly proposed techniques.
In this article, we make the interested reader aware of the possibilities of feature selection, providing a basic taxonomy of feature selection techniques, and discussing their use, variety and potential in a number of both common as well as upcoming bioinformatics applications.
Contact: yvan.saeys@psb.ugent.be
Supplementary information: http://bioinformatics.psb.ugent.be/supplementary_data/yvsae/fsreviewread more
Citations
More filters
Book
Applied Predictive Modeling
Max Kuhn,Kjell Johnson +1 more
TL;DR: This research presents a novel and scalable approach called “Smartfitting” that automates the very labor-intensive and therefore time-heavy and therefore expensive and expensive process of designing and implementing statistical models for regression models.
Journal ArticleDOI
Random forest in remote sensing: A review of applications and future directions
Mariana Belgiu,Lucian Drăguţ +1 more
TL;DR: This review has revealed that RF classifier can successfully handle high data dimensionality and multicolinearity, being both fast and insensitive to overfitting.
Proceedings Article
Efficient and Robust Feature Selection via Joint ℓ2,1-Norms Minimization
TL;DR: A new robust feature selection method with emphasizing joint l2,1-norm minimization on both loss function and regularization is proposed, which has been applied into both genomic and proteomic biomarkers discovery.
Journal ArticleDOI
Feature Selection: A Data Perspective
TL;DR: This survey revisits feature selection research from a data perspective and reviews representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data, and categorizes them into four main groups: similarity- based, information-theoretical-based, sparse-learning-based and statistical-based.
Journal ArticleDOI
Learning from class-imbalanced data
TL;DR: An in depth review of rare event detection from an imbalanced learning perspective and a comprehensive taxonomy of the existing application domains of im balanced learning are provided.
References
More filters
Book
Adaptation in natural and artificial systems
TL;DR: Names of founding work in the area of Adaptation and modiication, which aims to mimic biological optimization, and some (Non-GA) branches of AI.
Book
Data Mining: Practical Machine Learning Tools and Techniques
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Journal ArticleDOI
An introduction to variable and feature selection
Isabelle Guyon,André Elisseeff +1 more
TL;DR: The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
Journal ArticleDOI
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Todd R. Golub,Todd R. Golub,Donna K. Slonim,Pablo Tamayo,Christine Huard,Michelle Gaasenbeek,Jill P. Mesirov,Hilary A. Coller,Mignon L. Loh,James R. Downing,Michael A. Caligiuri,Clara D. Bloomfield,Eric S. Lander +12 more
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.