Proceedings Article

Hybrid feature selection and peptide binding affinity prediction using an EDA based algorithm

Abstract
Protein function prediction is an important problem in functional genomics. Protein sequences are typically represented by feature vectors, and the large number of features in protein datasets increases the complexity of classification models. Drug discovery often relies on quantitative structure-activity relationship (QSAR) models, which are regression or classification models used in the chemical and biological sciences, to identify chemical structures that are likely to inhibit specific targets while showing low toxicity (non-specific activity). Because these datasets are high-dimensional, feature selection becomes necessary. In this study, we therefore employ a hybrid Estimation of Distribution Algorithm (EDA) based filter-wrapper methodology to simultaneously extract informative feature subsets and build robust QSAR models. The performance of the algorithm was tested on the benchmark classification challenge datasets obtained from the CoePRa competition platform, developed in 2006. Our results demonstrate the efficacy of the hybrid EDA filter-wrapper algorithm in comparison with previously reported results.
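
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a hybrid EDA filter-wrapper loop can look like, not the authors' implementation. It assumes a univariate (UMDA-style) probability model over binary feature masks, a correlation-based filter to seed that model, and a nearest-centroid classifier as the wrapper. The function names (`filter_scores`, `wrapper_accuracy`, `eda_select`) and all parameter values are hypothetical, and the labels are assumed to be binary (0/1) with both classes present in the training split.

```python
import numpy as np

def filter_scores(X, y):
    """Filter step: absolute Pearson correlation of each feature with the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(Xc.T @ yc) / denom

def wrapper_accuracy(X_tr, y_tr, X_va, y_va, mask):
    """Wrapper step: validation accuracy of a nearest-centroid classifier
    trained on the features selected by the boolean mask."""
    if not mask.any():
        return 0.0  # an empty subset cannot classify anything
    A, B = X_tr[:, mask], X_va[:, mask]
    c0 = A[y_tr == 0].mean(axis=0)
    c1 = A[y_tr == 1].mean(axis=0)
    d0 = np.linalg.norm(B - c0, axis=1)
    d1 = np.linalg.norm(B - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred == y_va).mean())

def eda_select(X_tr, y_tr, X_va, y_va, pop=40, top=10, gens=15, seed=0):
    """UMDA-style EDA: sample masks from per-feature inclusion probabilities,
    score them with the wrapper, and re-estimate the probabilities from the elite."""
    rng = np.random.default_rng(seed)
    s = filter_scores(X_tr, y_tr)
    # Hybrid part (assumed): filter scores seed the initial probabilities.
    # Clipping keeps every feature reachable, so the search can still explore.
    p = np.clip(0.1 + 0.8 * s / (s.max() + 1e-12), 0.05, 0.95)
    best_mask, best_fit = None, -1.0
    for _ in range(gens):
        masks = rng.random((pop, X_tr.shape[1])) < p
        fits = np.array([wrapper_accuracy(X_tr, y_tr, X_va, y_va, m) for m in masks])
        order = np.argsort(fits)[::-1]
        if fits[order[0]] > best_fit:
            best_fit, best_mask = float(fits[order[0]]), masks[order[0]].copy()
        elite = masks[order[:top]]
        # Smoothed update of the distribution toward the elite's feature frequencies.
        p = np.clip(0.5 * p + 0.5 * elite.mean(axis=0), 0.05, 0.95)
    return best_mask, best_fit
```

Calling `eda_select(X_tr, y_tr, X_va, y_va)` returns a boolean mask over the features and the validation accuracy that mask achieved. Any filter score or wrapper model could be swapped in (for example an SVM, as the LIBSVM reference on the original page suggests); the nearest-centroid classifier is used here only to keep the sketch dependency-free.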
