Journal Article

Recent advances in feature selection and its applications

TLDR
This review paper presents a selection of challenges of particular current interest, such as feature selection for high-dimensional small-sample-size data, large-scale data, and secure feature selection, as well as some representative applications of feature selection.
Abstract
Feature selection is one of the key problems in machine learning and data mining. This review paper gives a brief historical background of the field, followed by a selection of challenges of particular current interest, such as feature selection for high-dimensional small-sample-size data, large-scale data, and secure feature selection. Along with these challenges, several hot topics in feature selection have emerged, e.g., stable feature selection, multi-view feature selection, distributed feature selection, multi-label feature selection, online feature selection, and adversarial feature selection. Recent advances on these topics are then surveyed. For each topic, the existing problems are analyzed, and current solutions to these problems are presented and discussed. Besides these topics, some representative applications of feature selection are also introduced, such as applications in bioinformatics, social media, and multimedia retrieval.

Citations
Journal Article

Binary grasshopper optimisation algorithm approaches for feature selection problems

TL;DR: Binary variants of the recent Grasshopper Optimisation Algorithm are proposed and employed to select the optimal feature subset for classification within a wrapper-based framework; the comparative results show the superior performance of the BGOA and BGOA-M methods compared to other similar techniques in the literature.
Journal Article

Heart Disease Identification Method Using Machine Learning Classification in E-Healthcare

TL;DR: The experimental results show that the proposed feature selection algorithm (FCMIM), combined with a support vector machine classifier, is feasible for designing a high-level intelligent system to identify heart disease, and that it achieves good accuracy compared to previously proposed methods.
Journal Article

A recent overview of the state-of-the-art elements of text classification

TL;DR: Six baseline elements of text classification including data collection, data analysis for labelling, feature construction and weighing, feature selection and projection, training of a classification model, and solution evaluation are described.
Journal Article

Stability of feature selection algorithm: A review

TL;DR: An overview of feature selection techniques and of the instability of feature selection algorithms is provided, and some solutions that can handle the different sources of instability are presented.
Journal Article

Manifold regularized discriminative feature selection for multi-label learning

TL;DR: A low-dimensional embedding is constructed from the original feature space to fit the label distribution and capture label correlations locally; the embedding is further constrained by label information reflecting the co-occurrence relationships of label pairs, and convergence is guaranteed.
References
Journal Article

Regression Shrinkage and Selection via the Lasso

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets; the implementation runs on a large cluster of commodity machines and is highly scalable.
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Journal Article

Induction of Decision Trees

J. R. Quinlan - 25 Mar 1986
TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
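
As a rough illustration of the attribute-selection idea behind ID3, the sketch below scores categorical attributes by information gain, the criterion ID3 uses to choose which attribute to split on. It is not code from any of the papers listed here: the function names `entropy` and `information_gain` and the toy weather-style data are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, invented for illustration only.
labels = ["yes", "yes", "no", "no"]
outlook = ["sunny", "sunny", "rain", "rain"]  # separates the classes perfectly
windy = ["true", "false", "true", "false"]    # carries no class information

gains = {
    "outlook": information_gain(outlook, labels),
    "windy": information_gain(windy, labels),
}
best = max(gains, key=gains.get)  # attribute with the highest gain
```

Ranking attributes by such a score and keeping the highest-scoring ones is also a simple filter-style way to select features, under the assumption that the attributes are categorical.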