
Showing papers on "Feature selection published in 2017"


Journal ArticleDOI
TL;DR: This survey revisits feature selection research from a data perspective and reviews representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data, and categorizes them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based and statistical-based.
Abstract: Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data-mining and machine-learning problems. The objectives of feature selection include building simpler and more comprehensible models, improving data-mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented some substantial challenges and opportunities to feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based, and statistical-based methods. To facilitate and promote the research in this community, we also present an open source feature selection repository that consists of most of the popular feature selection algorithms (http://featureselection.asu.edu/). Also, we use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we present a discussion about some open problems and challenges that require more attention in future research.
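
As a minimal illustration of the statistical-based family the survey describes, the sketch below applies a chi-squared filter with scikit-learn; the dataset and k=10 are illustrative choices, not taken from the survey or its repository.

```python
# A minimal sketch of a statistical-based filter, assuming scikit-learn is
# available; the dataset and k=10 are illustrative placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_breast_cancer(return_X_y=True)

# Score each feature against the class label and keep the 10 best.
selector = SelectKBest(score_func=chi2, k=10)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)  # (569, 30) -> (569, 10)
```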

1,566 citations


Journal ArticleDOI
TL;DR: The experimental results confirm the efficiency of the proposed approaches in improving the classification accuracy compared to other wrapper-based algorithms, which demonstrates the ability of the WOA algorithm to search the feature space and select the most informative attributes for classification tasks.

853 citations


Book ChapterDOI
01 Nov 2017
TL;DR: In this article, the authors describe S functions for tree-based modeling, which provides an alternative to linear and additive models for regression problems and to linear logistic and additive logistic models for classification problems.
Abstract: This chapter describes S functions for tree-based modeling. Tree-based models provide an alternative to linear and additive models for regression problems and to linear logistic and additive logistic models for classification problems. Tree-based modeling is an exploratory technique for uncovering structure in data. Specifically, the technique is useful for classification and regression problems where one has a set of classification or predictor variables and a single-response variable. Statistical inference for tree-based models is in its infancy and far behind that for logistic and linear regression analyses. This is partly because a particular type of variable selection underlies tree-based models. Our approach is not to have a single function for tree-based modeling, but rather a collection of functions, which, together with existing S functions, form a basis for building and assessing this new class of models. Implementation centers around the idea of a tree object. A subtree of a tree object can be selected or deleted in a natural way through subscripting.
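
The chapter's S functions predate today's toolkits; as a rough modern analogue (not the chapter's own code), a classification tree can be grown and inspected in Python:

```python
# A rough modern analogue of tree-based modeling (not the chapter's S code),
# assuming scikit-learn; the dataset and depth are illustrative.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Inspect the fitted tree, loosely analogous to examining a tree object in S.
print(export_text(tree))
```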

662 citations


Journal ArticleDOI
TL;DR: It is found that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework; furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest shows the best performance in object-based classification.
Abstract: Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.

608 citations


Journal ArticleDOI
TL;DR: This paper provides a theoretical study of the permutation importance measure for an additive regression model and motivates the use of the recursive feature elimination (RFE) algorithm for variable selection in this context.
Abstract: This paper is about variable selection with the random forests algorithm in the presence of correlated predictors. In high-dimensional regression or classification frameworks, variable selection is a difficult task that becomes even more challenging in the presence of highly correlated predictors. First, we provide a theoretical study of the permutation importance measure for an additive regression model. This allows us to describe how the correlation between predictors impacts the permutation importance. Our results motivate the use of the recursive feature elimination (RFE) algorithm for variable selection in this context. This algorithm recursively eliminates variables using the permutation importance measure as a ranking criterion. Next, various simulation experiments illustrate the efficiency of the RFE algorithm for selecting a small number of variables together with a good prediction error. Finally, this selection algorithm is tested on the Landsat Satellite data from the UCI Machine Learning Repository.
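
A compact sketch of the idea, using permutation importance as the ranking criterion inside a backward-elimination loop; the random forest settings and the stopping point are illustrative, not the authors' implementation:

```python
# A sketch of RFE ranked by permutation importance, assuming scikit-learn;
# the random forest settings and elimination schedule are illustrative.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       random_state=0)
active = list(range(X.shape[1]))

while len(active) > 5:
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[:, active], y)
    imp = permutation_importance(model, X[:, active], y,
                                 n_repeats=5, random_state=0)
    # Drop the feature whose permutation hurts performance the least.
    worst = int(np.argmin(imp.importances_mean))
    del active[worst]

print("selected features:", active)
```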

525 citations


Proceedings ArticleDOI
22 Jul 2017
TL;DR: Across all four experiments, with the best traffic representation and the fine-tuned model, the results outperform the state-of-the-art method on 11 of 12 evaluation metrics, which indicates the effectiveness of the proposed method.
Abstract: Traffic classification plays an important and basic role in network management and cyberspace security. With the widespread use of encryption techniques in network applications, encrypted traffic has recently become a great challenge for traditional traffic classification methods. In this paper we propose an end-to-end encrypted traffic classification method with one-dimensional convolution neural networks. This method integrates feature extraction, feature selection and classifier into a unified end-to-end framework, intending to automatically learn the nonlinear relationship between raw input and expected output. To the best of our knowledge, this is the first time an end-to-end method has been applied to the encrypted traffic classification domain. The method is validated on the public ISCX VPN-nonVPN traffic dataset. Across all four experiments, with the best traffic representation and the fine-tuned model, the results outperform the state-of-the-art method on 11 of 12 evaluation metrics, which indicates the effectiveness of the proposed method.
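
A minimal sketch of such a 1D-CNN classifier over raw, normalized byte sequences; the layer sizes, the 784-byte input length, and the 12-class head are assumptions in the spirit of the paper, not its exact architecture:

```python
# A minimal 1D-CNN traffic classifier sketch in PyTorch; layer sizes, the
# 784-byte input length, and 12 classes are assumptions, not the paper's
# exact architecture.
import torch
import torch.nn as nn

class TrafficCNN(nn.Module):
    def __init__(self, n_classes=12):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=25, stride=1, padding=12),
            nn.ReLU(),
            nn.MaxPool1d(3),
            nn.Conv1d(32, 64, kernel_size=25, stride=1, padding=12),
            nn.ReLU(),
            nn.MaxPool1d(3),
        )
        # 784 -> pool(3) -> 261 -> pool(3) -> 87 positions, 64 channels each.
        self.classifier = nn.Linear(64 * 87, n_classes)

    def forward(self, x):          # x: (batch, 1, 784) normalized bytes
        z = self.features(x)
        return self.classifier(z.flatten(1))

model = TrafficCNN()
out = model(torch.rand(8, 1, 784))
print(out.shape)  # torch.Size([8, 12])
```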

496 citations


Journal ArticleDOI
TL;DR: A sequential ensemble credit scoring model based on a variant of the gradient boosting machine (i.e., extreme gradient boosting, XGBoost) is proposed, and results demonstrate that Bayesian hyper-parameter optimization performs better than random search, grid search, and manual search.
Abstract: Credit scoring is an effective tool for banks to properly guide profitable decisions on granting loans. Ensemble methods, which according to their structures can be divided into parallel and sequential ensembles, have been recently developed in the credit scoring domain. These methods have proven their superiority in discriminating borrowers accurately. However, among the ensemble models, little consideration has been given to the following: (1) highlighting the hyper-parameter tuning of the base learner, despite it being critical to well-performed ensemble models; (2) building sequential models (i.e., boosting), as most have focused on developing the same or different algorithms in parallel; and (3) focusing on the comprehensibility of models. This paper proposes a sequential ensemble credit scoring model based on a variant of the gradient boosting machine (i.e., extreme gradient boosting, XGBoost). The model mainly comprises three steps. First, data pre-processing is employed to scale the data and handle missing values. Second, a model-based feature selection system based on the relative feature importance scores is utilized to remove redundant variables. Third, the hyper-parameters of XGBoost are adaptively tuned with Bayesian hyper-parameter optimization and used to train the model with the selected feature subset. Several hyper-parameter optimization methods and baseline classifiers are considered as reference points in the experiment. Results demonstrate that Bayesian hyper-parameter optimization performs better than random search, grid search, and manual search. Moreover, the proposed model outperforms baseline models on average over four evaluation measures: accuracy, error rate, the area under the curve (AUC) H measure (AUC-H measure), and Brier score. The proposed model also provides feature importance scores and a decision chart, which enhance the interpretability of the credit scoring model.
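
A sketch of the model-based feature-selection step using XGBoost importance scores; the threshold is illustrative and the Bayesian hyper-parameter search is omitted, so this is not the paper's full pipeline:

```python
# A sketch of model-based feature selection with XGBoost importances,
# assuming the xgboost and scikit-learn packages; the threshold is
# illustrative and the Bayesian hyper-parameter search is omitted.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           random_state=0)

booster = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
booster.fit(X, y)

# Keep features whose relative importance exceeds the median importance.
selector = SelectFromModel(booster, threshold="median", prefit=True)
X_sel = selector.transform(X)
print(X.shape, "->", X_sel.shape)
```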

495 citations


Journal ArticleDOI
TL;DR: A new hybrid model is proposed to estimate the intrusion scope threshold degree based on the optimal features of network transaction data made available for training; the results revealed that the hybrid approach significantly reduced the computational and time complexity involved in determining the feature association impact scale.

484 citations


Journal ArticleDOI
TL;DR: The group Lasso penalty, originally proposed in the linear regression literature, is extended to impose group-level sparsity on the network's connections, where each group is defined as the set of outgoing weights from a unit.
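
A sketch of that penalty in PyTorch, under the assumption that a group is one input unit's outgoing weights (a column of a Linear layer's weight matrix); the sizes and the regularization constant are illustrative:

```python
# A sketch of the described group Lasso penalty in PyTorch: each group is the
# set of outgoing weights from one unit. Layer sizes and lambda are
# illustrative assumptions.
import torch
import torch.nn as nn

layer = nn.Linear(in_features=100, out_features=50)

# nn.Linear stores weight as (out_features, in_features), so the outgoing
# weights of input unit j form column j; penalize the L2 norm of each column.
# Driving a whole column to zero effectively removes that input unit.
def group_lasso(weight: torch.Tensor) -> torch.Tensor:
    return weight.norm(p=2, dim=0).sum()

x = torch.randn(32, 100)
target = torch.randn(32, 50)
loss = nn.functional.mse_loss(layer(x), target) + 1e-3 * group_lasso(layer.weight)
loss.backward()
```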

424 citations


Journal ArticleDOI
TL;DR: A novel feature selection method using the particle swarm optimization (PSO) algorithm (FSPSOTC) is proposed to solve the feature selection problem by creating a new subset of informative text features, which can improve the performance of the text clustering technique and reduce the computational time.

401 citations


Journal ArticleDOI
TL;DR: Promisingly, the proposed CMFOFS-KELM can serve as an effective and efficient computer-aided tool for medical diagnosis in the field of medical decision making.

Journal ArticleDOI
TL;DR: In this paper, semi-supervised feature selection methods are fully investigated and two taxonomies of these methods are presented based on two different perspectives, which represent the hierarchical structure of semi-supervised feature selection methods.

Journal ArticleDOI
TL;DR: The results show that the proposed hybrid algorithm (H-FSPSOTC) improved the performance of the clustering algorithm by generating a new subset of more informative features; it is also compared with other algorithms published in the literature.
Abstract: The text clustering technique is an appropriate method used to partition a huge amount of text documents into groups. The size of these documents affects text clustering by degrading its performance. Moreover, text documents contain sparse and uninformative features, which reduce the performance of the underlying text clustering algorithm and increase the computational time. Feature selection is a fundamental unsupervised learning technique used to select a new subset of informative text features that improves the performance of text clustering and reduces the computational time. This paper proposes a hybrid of the particle swarm optimization algorithm with genetic operators for the feature selection problem. K-means clustering is used to evaluate the effectiveness of the obtained feature subsets. The experiments were conducted using eight common text datasets with varying characteristics. The results show that the proposed hybrid algorithm (H-FSPSOTC) improved the performance of the clustering algorithm by generating a new subset of more informative features. The proposed algorithm is compared with other algorithms published in the literature. Finally, the feature selection technique encourages the clustering algorithm to obtain accurate clusters.
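
A stripped-down wrapper sketch in the same spirit: binary particles encode feature subsets and a clustering score serves as fitness; the paper's genetic operators are omitted and all constants are illustrative, so this is not H-FSPSOTC itself:

```python
# A stripped-down binary-PSO wrapper sketch; the silhouette-based fitness,
# swarm size, and constants are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X, _ = make_blobs(n_samples=200, n_features=15, centers=4, random_state=0)

def fitness(bits):
    mask = bits.astype(bool)
    if mask.sum() < 2:
        return -1.0
    sub = X[:, mask]
    labels = KMeans(n_clusters=4, n_init=5, random_state=0).fit_predict(sub)
    return silhouette_score(sub, labels)

n_particles, n_feats = 10, X.shape[1]
pos = (rng.random((n_particles, n_feats)) > 0.5).astype(float)
vel = rng.normal(0.0, 1.0, (n_particles, n_feats))
pbest = pos.copy()
pbest_fit = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(20):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    # Sigmoid transfer function turns velocities into bit-flip probabilities.
    pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
    fit = np.array([fitness(p) for p in pos])
    better = fit > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[pbest_fit.argmax()].copy()

print("selected features:", np.flatnonzero(gbest))
```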

Journal ArticleDOI
TL;DR: This work develops a learning algorithm that identifies the terms in the underlying partial differential equations and approximates their coefficients using only data, employing sparse optimization to perform feature selection and parameter estimation.
Abstract: We investigate the problem of learning an evolution equation directly from some given data. This work develops a learning algorithm to identify the terms in the underlying partial differential equations and to approximate the coefficients of the terms only using data. The algorithm uses sparse optimization in order to perform feature selection and parameter estimation. The features are data driven in the sense that they are constructed using nonlinear algebraic equations on the spatial derivatives of the data. Several numerical experiments show the proposed method's robustness to data noise and size, its ability to capture the true features of the data, and its capability of performing additional analytics. Examples include shock equations, pattern formation, fluid flow and turbulence, and oscillatory convection.
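
A toy version of the idea on a 1-D example: build a library of candidate terms from spatial derivatives and let a sparse regression pick the few that explain u_t; the grid, library, and Lasso penalty are illustrative choices, not the paper's exact algorithm:

```python
# A toy sketch of the sparse-regression idea: recover the advection equation
# u_t = -u_x from data alone. Grid, library terms, and the Lasso penalty are
# illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso

x = np.linspace(0, 2 * np.pi, 256)
t = np.linspace(0, 2, 101)
U = np.sin(x[None, :] - t[:, None])        # exact solution of u_t + u_x = 0

u_t = np.gradient(U, t, axis=0)
u_x = np.gradient(U, x, axis=1)
u_xx = np.gradient(u_x, x, axis=1)

# Candidate library of (nonlinear) terms built from spatial derivatives.
library = np.column_stack([c.ravel() for c in (U, u_x, u_xx, U * u_x)])
names = ["u", "u_x", "u_xx", "u*u_x"]

model = Lasso(alpha=1e-3, fit_intercept=False).fit(library, u_t.ravel())
for name, coef in zip(names, model.coef_):
    if abs(coef) > 1e-2:
        print(f"u_t ~ {coef:+.3f} * {name}")   # expect roughly -1.000 * u_x
```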

Journal ArticleDOI
TL;DR: The comprehensive results and various comparisons reveal that the EPD has a remarkable impact on the efficacy of the GOA, and that the selection mechanisms enhance the capability of the proposed approach to outperform other optimizers and find the best solutions with improved convergence trends.
Abstract: Searching for the optimal subset of features is known as a challenging problem in the feature selection process. To deal with the difficulties involved in this problem, a robust and reliable optimization algorithm is required. In this paper, the Grasshopper Optimization Algorithm (GOA) is employed as a search strategy to design a wrapper-based feature selection method. The GOA is a recent population-based metaheuristic that mimics the swarming behaviors of grasshoppers. In this work, an efficient optimizer based on the simultaneous use of the GOA, selection operators, and Evolutionary Population Dynamics (EPD) is proposed in the form of four different strategies to mitigate the premature convergence and stagnation drawbacks of the conventional GOA. In the first two approaches, one of the top three agents and a randomly generated one are selected to reposition a solution from the worst half of the population. In the third and fourth approaches, to give the low-fitness solutions a chance at reforming the population, Roulette Wheel Selection (RWS) and Tournament Selection (TS) are utilized to select the guiding agent from the first half. The proposed GOA_EPD approaches are employed to tackle various feature selection tasks. The proposed approaches are benchmarked on 22 UCI datasets. The comprehensive results and various comparisons reveal that the EPD has a remarkable impact on the efficacy of the GOA, and that using the selection mechanism enhances the capability of the proposed approach to outperform other optimizers and find the best solutions with improved convergence trends. Furthermore, the comparative experiments demonstrate the superiority of the proposed approaches when compared to similar methods in the literature.

Journal ArticleDOI
15 Nov 2017-Energy
TL;DR: A new prediction model for small-scale load prediction (i.e., buildings or sites) is outlined, based on an improved version of empirical mode decomposition (EMD) called sliding window EMD (SWEMD), a new feature selection algorithm, and a hybrid forecast engine.

Journal ArticleDOI
TL;DR: It is emphasized that variable selection and all the problems related to it can often be avoided by the use of expert knowledge, and it is discussed how five common misconceptions often lead to inappropriate application of variable selection.
Abstract: Multivariable regression models are often used in transplantation research to identify or to confirm baseline variables which have an independent association, causally or only evidenced by statistical correlation, with transplantation outcome. Although sound theory is lacking, variable selection is a popular statistical method which seemingly reduces the complexity of such models. However, in fact, variable selection often complicates analysis as it invalidates common tools of statistical inference such as P-values and confidence intervals. This is a particular problem in transplantation research where sample sizes are often only small to moderate. Furthermore, variable selection requires computer-intensive stability investigations and a particularly cautious interpretation of results. We discuss how five common misconceptions often lead to inappropriate application of variable selection. We emphasize that variable selection and all problems related with it can often be avoided by the use of expert knowledge.

Journal ArticleDOI
TL;DR: This paper proposes a new unsupervised spectral feature selection model that embeds a graph regularizer into the framework of joint sparse regression to preserve the local structures of data, realized through a novel joint graph sparse coding (JGSC) model.
Abstract: In this paper, we propose a new unsupervised spectral feature selection model by embedding a graph regularizer into the framework of joint sparse regression for preserving the local structures of data. To do this, we first extract the bases of training data by previous dictionary learning methods and, then, map original data into the basis space to generate their new representations, by proposing a novel joint graph sparse coding (JGSC) model. In JGSC, we first formulate its objective function by simultaneously taking subspace learning and joint sparse regression into account, then, design a new optimization solution to solve the resulting objective function, and further prove the convergence of the proposed solution. Furthermore, we extend JGSC to a robust JGSC (RJGSC) via replacing the least square loss function with a robust loss function, for achieving the same goals and also avoiding the impact of outliers. Finally, experimental results on real data sets showed that both JGSC and RJGSC outperformed the state-of-the-art algorithms in terms of k-nearest neighbor classification performance.

Journal ArticleDOI
TL;DR: The main idea behind this model is to construct a multi-class SVM, which has not been adopted for IDS so far, in order to decrease the training and testing time and increase the individual classification accuracy of network attacks.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed PSO-based multi-objective feature selection algorithm can automatically evolve a set of nondominated solutions, and it is a highly competitive feature selection method for solving cost-based feature selection problems.
Abstract: Feature selection is an important data-preprocessing technique in classification problems such as bioinformatics and signal processing. Generally, there are some situations where a user is interested in not only maximizing the classification performance but also minimizing the cost that may be associated with features. This kind of problem is called cost-based feature selection. However, most existing feature selection approaches treat this task as a single-objective optimization problem. This paper presents the first study of multi-objective particle swarm optimization (PSO) for cost-based feature selection problems. The task of this paper is to generate a Pareto front of nondominated solutions, that is, feature subsets, to meet different requirements of decision-makers in real-world applications. In order to enhance the search capability of the proposed algorithm, a probability-based encoding technology and an effective hybrid operator, together with the ideas of the crowding distance, the external archive, and the Pareto domination relationship, are applied to PSO. The proposed PSO-based multi-objective feature selection algorithm is compared with several multi-objective feature selection algorithms on five benchmark datasets. Experimental results show that the proposed algorithm can automatically evolve a set of nondominated solutions, and it is a highly competitive feature selection method for solving cost-based feature selection problems.
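
The core bookkeeping of the formulation can be sketched as follows: each candidate subset is scored on (classification error, feature cost) and only nondominated subsets are kept; the PSO update, probability-based encoding, archive, and crowding distance are omitted, and the per-feature costs are invented for illustration:

```python
# A sketch of the nondominated-sorting step for cost-based feature selection;
# the random subsets and per-feature costs are illustrative placeholders.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
rng = np.random.default_rng(1)
cost = rng.uniform(1, 5, X.shape[1])          # assumed per-feature costs

def objectives(mask):
    err = 1 - cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()
    return err, cost[mask].sum()

def dominates(a, b):
    # a dominates b if it is no worse on both objectives and better on one.
    return all(ai <= bi for ai, bi in zip(a, b)) and a != b

subsets = [rng.random(X.shape[1]) > 0.5 for _ in range(30)]
subsets = [s for s in subsets if s.sum() >= 1]
scores = [objectives(s) for s in subsets]

front = [s for s, sc in zip(subsets, scores)
         if not any(dominates(other, sc) for other in scores)]
print(f"{len(front)} nondominated subsets out of {len(subsets)}")
```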

Proceedings ArticleDOI
19 Aug 2017
TL;DR: A method for joint learning of local and global feature selection losses is proposed, designed to optimise person re-id when using only generic matching metrics such as the L2 distance.
Abstract: Existing person re-identification (re-id) methods rely mostly on either localised or global feature representation alone. This ignores their joint benefit and mutual complementary effects. In this work, we show the advantages of jointly learning local and global features in a Convolutional Neural Network (CNN) by aiming to discover correlated local and global features in different contexts. Specifically, we formulate a method for joint learning of local and global feature selection losses designed to optimise person re-id when using only generic matching metrics such as the L2 distance. We design a novel CNN architecture for Jointly Learning Multi-Loss (JLML) of local and global discriminative feature optimisation subject concurrently to the same re-id labelled information. Extensive comparative evaluations demonstrate the advantages of this new JLML model for person re-id over a wide range of state-of-the-art re-id methods on five benchmarks (VIPeR, GRID, CUHK01, CUHK03, Market-1501).

Journal ArticleDOI
TL;DR: An ensemble approach for feature selection is presented, which aggregates several individual feature lists obtained by different feature selection methods so that a more robust and efficient feature subset can be obtained.
Abstract: Sentiment analysis is an important research direction of natural language processing, text mining and web mining which aims to extract subjective information from source materials. The main challenge encountered in machine learning method-based sentiment classification is the abundant amount of data available. This amount makes it difficult to train the learning algorithms in a feasible time and degrades the classification accuracy of the built model. Hence, feature selection becomes an essential task in developing robust and efficient classification models whilst reducing the training time. In text mining applications, individual filter-based feature selection methods have been widely utilized owing to their simplicity and relatively high performance. This paper presents an ensemble approach for feature selection, which aggregates the several individual feature lists obtained by the different feature selection methods so that a more robust and efficient feature subset can be obtained. In order to aggregate the individual feature lists, a genetic algorithm has been utilized. Experimental evaluations indicated that the proposed aggregation model is an efficient method and it outperforms individual filter-based feature selection methods on sentiment classification.
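
The aggregation step can be illustrated with a simple Borda-style scheme over several filter rankings; the paper aggregates with a genetic algorithm, so the Borda count below is a deliberately simpler stand-in, and the filters and data are placeholders:

```python
# An illustrative aggregation of several filter rankings via Borda count;
# the paper itself aggregates with a genetic algorithm, so this simpler
# scheme only shows the ensemble idea. Filters and data are placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif

X, y = make_classification(n_samples=400, n_features=20, n_informative=6,
                           random_state=0)
X = X - X.min(axis=0)                     # chi2 needs non-negative values

score_lists = [chi2(X, y)[0], f_classif(X, y)[0], mutual_info_classif(X, y)]

# Borda: each filter awards n-1 points to its top feature, 0 to its last.
n = X.shape[1]
borda = np.zeros(n)
for scores in score_lists:
    order = np.argsort(scores)            # worst ... best
    borda[order] += np.arange(n)

print("top 6 aggregated features:", np.argsort(borda)[::-1][:6])
```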

Journal ArticleDOI
TL;DR: A filter-based feature selection method for temporal gene expression data based on maximum relevance and minimum redundancy criteria is developed, which outperforms alternatives widely used in gene expression studies.
Abstract: Feature selection, aiming to identify a subset of features among a possibly large set of features that are relevant for predicting a response, is an important preprocessing step in machine learning. In gene expression studies this is not a trivial task for several reasons, including the potential temporal character of data. However, most feature selection approaches developed for microarray data cannot handle multivariate temporal data without previous data flattening, which results in loss of temporal information. We propose a temporal minimum redundancy - maximum relevance (TMRMR) feature selection approach, which is able to handle multivariate temporal data without previous data flattening. In the proposed approach we compute the relevance of a gene by averaging F-statistic values calculated across individual time steps, and we compute redundancy between genes by using a dynamic time warping approach. The proposed method is evaluated on three temporal gene expression datasets from human viral challenge studies. The obtained results show that the proposed method outperforms alternatives widely used in gene expression studies. In particular, the proposed method achieved an improvement in accuracy in 34 out of 54 experiments, while the other methods outperformed it in no more than 4 experiments. We developed a filter-based feature selection method for temporal gene expression data based on maximum relevance and minimum redundancy criteria. The proposed method incorporates temporal information by combining relevance, which is calculated as an average F-statistic value across different time steps, with redundancy, which is calculated by employing a dynamic time warping approach. As evident in our experiments, incorporating the temporal information into the feature selection process leads to selection of more discriminative features.
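
A sketch of the relevance half of the criterion, with per-time-step F-statistics averaged per gene as the abstract describes; the data shapes are assumed, and the DTW-based redundancy term plus the greedy selection are omitted:

```python
# A sketch of the relevance term only: per-time-step F-statistics averaged
# per gene. Data shapes are assumed; the DTW-based redundancy term and the
# greedy mRMR-style selection are omitted.
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(0)
n_subjects, n_timesteps, n_genes = 40, 8, 100
expr = rng.normal(size=(n_subjects, n_timesteps, n_genes))
labels = rng.integers(0, 2, n_subjects)

# Relevance of each gene: F-statistic vs. labels at each time step, averaged.
f_per_step = np.stack([f_classif(expr[:, t, :], labels)[0]
                       for t in range(n_timesteps)])
relevance = f_per_step.mean(axis=0)

print("10 most relevant genes:", np.argsort(relevance)[::-1][:10])
```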

Journal ArticleDOI
TL;DR: This paper not only compares the differences and commonalities of these methods based on regression and regularization strategies, but also provides useful guidelines to practitioners working in related fields on how to perform feature selection.
Abstract: Feature selection (FS) is an important component of many pattern recognition tasks. In these tasks, one is often confronted with very high-dimensional data. FS algorithms are designed to identify the relevant feature subset from the original features, which can facilitate subsequent analysis, such as clustering and classification. Structured sparsity-inducing feature selection (SSFS) methods have been widely studied in the last few years, and a number of algorithms have been proposed. However, there is no comprehensive study concerning the connections between different SSFS methods, and how they have evolved. In this paper, we attempt to provide a survey on various SSFS methods, including their motivations and mathematical representations. We then explore the relationship among different formulations and propose a taxonomy to elucidate their evolution. We group the existing SSFS methods into two categories, i.e., vector-based feature selection (feature selection based on lasso) and matrix-based feature selection (feature selection based on the l_{r,p}-norm). Furthermore, FS has been combined with other machine learning algorithms for specific applications, such as multitask learning, multilabel learning, multiview learning, classification, and clustering. This paper not only compares the differences and commonalities of these methods based on regression and regularization strategies, but also provides useful guidelines to practitioners working in related fields on how to perform feature selection.
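
For reference, the matrix norm behind the second category is conventionally defined as follows (a standard definition, with r = 2, p = 1 recovering the familiar l_{2,1} norm; the notation is assumed, not quoted from the paper):

```latex
% Standard l_{r,p} norm of a matrix W in R^{d x m} with rows w^i;
% r = 2, p = 1 gives the common l_{2,1} norm used for row-sparse selection.
\[
  \|W\|_{r,p}
  = \Bigl( \sum_{i=1}^{d} \bigl\| w^{i} \bigr\|_r^{\,p} \Bigr)^{1/p}
  = \Bigl( \sum_{i=1}^{d} \Bigl( \sum_{j=1}^{m} |w_{ij}|^{r} \Bigr)^{p/r} \Bigr)^{1/p}
\]
```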

Journal ArticleDOI
TL;DR: Experimental results show that the proposed MIMAGA-Selection method significantly reduces the dimension of gene expression data and removes the redundancies for classification, and the reduced gene expression dataset provides the highest classification accuracy compared to conventional feature selection algorithms.

Journal ArticleDOI
TL;DR: Numerical results show that, compared to single-algorithm models, the developed multi-model framework with a deep feature selection procedure has improved the forecasting accuracy by up to 30%.

Journal ArticleDOI
TL;DR: A Gini index-based feature selection method with a Support Vector Machine (SVM) classifier is proposed for sentiment classification on a large movie review dataset, and the results show that the Gini index method has better classification performance in terms of reduced error rate and improved accuracy.
Abstract: With the rapid development of the World Wide Web, electronic word-of-mouth interaction has made consumers active participants. Nowadays, a large number of reviews posted by consumers on the Web provide valuable information to other consumers. Such information is highly essential for decision making and hence popular among internet users. This information is very valuable not only for prospective consumers to make decisions but also for businesses in predicting success and sustainability. In this paper, a Gini index-based feature selection method with a Support Vector Machine (SVM) classifier is proposed for sentiment classification on a large movie review dataset. The results show that our Gini index method has better classification performance in terms of reduced error rate and improved accuracy.
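
One common formulation scores a term t by Gini(t) = sum_c P(c|t)^2; the sketch below implements that generic variant with an SVM on the selected terms, and is illustrative rather than the paper's exact pipeline:

```python
# A sketch of Gini-index feature scoring for text, using the common
# formulation Gini(t) = sum_c P(c|t)^2; the tiny corpus and the SVM step
# are illustrative placeholders.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

docs = ["great acting and a great plot", "terrible pacing, awful script",
        "a great film", "awful acting"]
labels = np.array([1, 0, 1, 0])

X = (CountVectorizer().fit_transform(docs) > 0).astype(int)  # term presence

def gini_index(X, y):
    scores = []
    for j in range(X.shape[1]):
        present = np.asarray(X[:, j].todense()).ravel() > 0
        if present.sum() == 0:
            scores.append(0.0)
            continue
        # Squared class-conditional probabilities given the term is present.
        p = np.array([np.mean(y[present] == c) for c in np.unique(y)])
        scores.append(float((p ** 2).sum()))
    return np.array(scores)

scores = gini_index(X, labels)
top = np.argsort(scores)[::-1][:5]            # keep the highest-scoring terms
clf = LinearSVC().fit(X[:, top], labels)
print("training accuracy:", clf.score(X[:, top], labels))
```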

Journal ArticleDOI
TL;DR: A wrapper approach based on a genetic algorithm as a search strategy and logistic regression as a learning algorithm is proposed for network intrusion detection systems, selecting the best subset of features to increase the accuracy and classification performance of the IDS.

Journal ArticleDOI
TL;DR: The proposed approach is compared against the original GA and GWO on two common disease diagnosis problems in terms of a set of performance metrics, including classification accuracy, sensitivity, specificity, precision, G-mean, F-measure, and the size of selected features.
Abstract: In this study, a new predictive framework is proposed by integrating an improved grey wolf optimization (IGWO) and kernel extreme learning machine (KELM), termed IGWO-KELM, for medical diagnosis. The proposed IGWO feature selection approach is used for the purpose of finding the optimal feature subset for medical data. In the proposed approach, the genetic algorithm (GA) was first adopted to generate diversified initial positions, and then grey wolf optimization (GWO) was used to update the current positions of the population in the discrete searching space, thus obtaining the optimal feature subset for better classification based on KELM. The proposed approach is compared against the original GA and GWO on two common disease diagnosis problems in terms of a set of performance metrics, including classification accuracy, sensitivity, specificity, precision, G-mean, F-measure, and the size of selected features. The simulation results have proven the superiority of the proposed method over the other two competitive counterparts.
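
For context, the canonical continuous GWO position update that IGWO builds on can be sketched as below; the GA-based initialization, the binarization for feature subsets, and the KELM fitness are omitted, and the sphere objective is a placeholder:

```python
# A sketch of the canonical continuous GWO position update; the IGWO
# extensions (GA initialization, discrete encoding, KELM fitness) are
# omitted, and all dimensions and constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_wolves, dim, iters = 12, 10, 50
X = rng.uniform(-1, 1, (n_wolves, dim))

def fitness(x):                      # placeholder objective (sphere function)
    return (x ** 2).sum()

for t in range(iters):
    a = 2 - 2 * t / iters            # a decreases linearly from 2 to 0
    fits = np.apply_along_axis(fitness, 1, X)
    alpha, beta, delta = X[np.argsort(fits)[:3]]
    for i in range(n_wolves):
        steps = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(dim), rng.random(dim)
            A, C = 2 * a * r1 - a, 2 * r2
            steps.append(leader - A * np.abs(C * leader - X[i]))
        X[i] = np.mean(steps, axis=0)  # average guidance by the three leaders

print("best solution:", X[np.argmin(np.apply_along_axis(fitness, 1, X))])
```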

Journal ArticleDOI
TL;DR: This review paper presents a selection of challenges of particular current interest, such as feature selection for high-dimensional small sample size data, large-scale data, and secure feature selection, as well as some representative applications of feature selection.
Abstract: Feature selection is one of the key problems for machine learning and data mining. In this review paper, a brief historical background of the field is given, followed by a selection of challenges which are of particular current interest, such as feature selection for high-dimensional small sample size data, large-scale data, and secure feature selection. Along with these challenges, some hot topics for feature selection have emerged, e.g., stable feature selection, multi-view feature selection, distributed feature selection, multi-label feature selection, online feature selection, and adversarial feature selection. Then, the recent advances in these topics are surveyed in this paper. For each topic, the existing problems are analyzed, and then current solutions to these problems are presented and discussed. Besides these topics, some representative applications of feature selection are also introduced, such as applications in bioinformatics, social media, and multimedia retrieval.