Open Access Journal Article

Feature Selection: A Practitioner View

Saptarsi Goswami, +1 more
08 Oct 2014, Vol. 6, Iss. 11, pp. 66-77
TLDR
A near-comprehensive list of problems that have been solved using feature selection across technical and commercial domains is produced, which can serve as a valuable tool for practitioners across industry and academia.
Abstract
Feature selection is one of the most important preprocessing steps in data mining and knowledge engineering. In this short review paper, apart from giving a brief taxonomy of current feature selection methods, we review the feature selection methods that are being used in practice. We then produce a near-comprehensive list of problems that have been solved using feature selection across technical and commercial domains. This list can serve as a valuable tool for practitioners across industry and academia. We also present empirical results of filter-based methods on various datasets. The empirical study covers the tasks of classification, regression, text classification, and clustering. Finally, we compare filter-based ranking methods using rank correlation.
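
The last point of the abstract, comparing filter-based rankings by rank correlation, can be illustrated with a minimal sketch. The abstract does not say which filter criteria or which rank correlation coefficient the authors use, so everything here is an assumption made for illustration: the two criteria (absolute Pearson correlation with the target, and feature variance), the choice of Spearman's rho, the synthetic data, and all function names.

```python
import random
from statistics import mean, pvariance


def ranks(values):
    """Rank values from 1 (largest) to n, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(scores_a, scores_b):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(ranks(scores_a), ranks(scores_b))


# Synthetic data (hypothetical): four features, one numeric target.
random.seed(0)
n = 200
target = [random.gauss(0, 1) for _ in range(n)]
features = [
    [t + random.gauss(0, 0.1) for t in target],  # strongly related to target
    [t + random.gauss(0, 1.0) for t in target],  # weakly related to target
    [random.gauss(0, 1) for _ in range(n)],      # pure noise
    [random.gauss(0, 3) for _ in range(n)],      # high-variance noise
]

# Two assumed filter criteria: one supervised, one unsupervised.
corr_scores = [abs(pearson(f, target)) for f in features]
var_scores = [pvariance(f) for f in features]

# Agreement between the two feature rankings, in [-1, 1].
rho = spearman(corr_scores, var_scores)
print("correlation filter ranks:", ranks(corr_scores))
print("variance filter ranks:   ", ranks(var_scores))
print("Spearman rho:", round(rho, 3))
```

A rho near 1 would mean the two filters order the features almost identically, a value near 0 that they are unrelated, and a negative value that they tend to disagree.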

