
Showing papers on "Feature selection published in 2019"


Journal ArticleDOI
TL;DR: This paper proposes a novel method for finding significant features by applying machine learning techniques, improving the accuracy of cardiovascular disease prediction with a hybrid random forest with a linear model (HRFLM).
Abstract: Heart disease is one of the most significant causes of mortality in the world today. Prediction of cardiovascular disease is a critical challenge in the area of clinical data analysis. Machine learning (ML) has been shown to be effective in assisting in making decisions and predictions from the large quantity of data produced by the healthcare industry. We have also seen ML techniques being used in recent developments in different areas of the Internet of Things (IoT). Various studies give only a glimpse into predicting heart disease with ML techniques. In this paper, we propose a novel method that aims at finding significant features by applying machine learning techniques, thereby improving the accuracy of cardiovascular disease prediction. The prediction model is introduced with different combinations of features and several known classification techniques. We achieve an enhanced performance level, with an accuracy of 88.7%, through the prediction model for heart disease with the hybrid random forest with a linear model (HRFLM).
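
As a rough sketch of the idea in code (the exact HRFLM combination rule is not given here, so this hypothetical pipeline simply pairs random-forest feature ranking with a linear model on synthetic stand-in data):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a 13-feature clinical dataset (hypothetical data).
X, y = make_classification(n_samples=300, n_features=13, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A random forest ranks features; a linear model predicts from the survivors.
model = make_pipeline(
    SelectFromModel(RandomForestClassifier(n_estimators=100, random_state=0)),
    LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))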

783 citations


Proceedings ArticleDOI
02 Mar 2019
TL;DR: The FSAF module robustly improves the baseline RetinaNet by a large margin under various settings, while introducing nearly free inference overhead, and the resulting best model can achieve a state-of-the-art 44.6% mAP, outperforming all existing single-shot detectors on COCO.
Abstract: We motivate and present the feature selective anchor-free (FSAF) module, a simple and effective building block for single-shot object detectors. It can be plugged into single-shot detectors with feature pyramid structure. The FSAF module addresses two limitations brought up by the conventional anchor-based detection: 1) heuristic-guided feature selection; 2) overlap-based anchor sampling. The general concept of the FSAF module is online feature selection applied to the training of multi-level anchor-free branches. Specifically, an anchor-free branch is attached to each level of the feature pyramid, allowing box encoding and decoding in the anchor-free manner at an arbitrary level. During training, we dynamically assign each instance to the most suitable feature level. At the time of inference, the FSAF module can work independently or jointly with anchor-based branches. We instantiate this concept with simple implementations of anchor-free branches and online feature selection strategy. Experimental results on the COCO detection track show that our FSAF module performs better than anchor-based counterparts while being faster. When working jointly with anchor-based branches, the FSAF module robustly improves the baseline RetinaNet by a large margin under various settings, while introducing nearly free inference overhead, and the resulting best model achieves a state-of-the-art 44.6% mAP, outperforming all existing single-shot detectors on COCO.
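
A minimal sketch of the online selection step as we read it from the abstract (the level names and loss values are illustrative, not the authors' code):

import numpy as np

def select_level(per_level_losses):
    # per_level_losses: anchor-free loss of one instance evaluated at every
    # feature pyramid level (e.g., P3..P7); pick the level that fits best.
    return int(np.argmin(per_level_losses))

losses = np.array([1.8, 1.2, 0.9, 1.1, 1.6])  # made-up per-level losses
print("assign instance to P%d" % (3 + select_level(losses)))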

578 citations


Journal ArticleDOI
TL;DR: This paper reviews the recent literature on machine learning models that have been used for condition monitoring in wind turbines and shows that most models use SCADA or simulated data, with almost two-thirds of methods using classification and the rest relying on regression.

482 citations


Journal ArticleDOI
TL;DR: Based on this study, the best variable selection methods for most datasets are Jiang's method and the method implemented in the VSURF R package; for datasets with many predictors, the methods implemented in the R packages varSelRF and Boruta are preferable due to computational efficiency.
Abstract: Random forest classification is a popular machine learning method for developing prediction models in many research settings. Often in prediction modeling, a goal is to reduce the number of variables needed to obtain a prediction in order to reduce the burden of data collection and improve efficiency. Several variable selection methods exist for the setting of random forest classification; however, there is a paucity of literature to guide users as to which method may be preferable for different types of datasets. Using 311 classification datasets freely available online, we evaluate the prediction error rates, number of variables, computation times and area under the receiver operating characteristic curve for many random forest variable selection methods. We compare random forest variable selection methods for different types of datasets (datasets with binary outcomes, datasets with many predictors, and datasets with imbalanced outcomes) and for different types of methods (standard random forest versus conditional random forest methods and test based versus performance based methods). Based on our study, the best variable selection methods for most datasets are Jiang's method and the method implemented in the VSURF R package. For datasets with many predictors, the methods implemented in the R packages varSelRF and Boruta are preferable due to computational efficiency. A significant contribution of this study is the ability to assess different variable selection techniques in the setting of random forest classification in order to identify preferable methods based on applications in expert and intelligent systems.
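
For readers working in Python rather than R, a comparable Boruta run is available through the community boruta package; a sketch under that assumption (API details may vary across versions):

import numpy as np
from boruta import BorutaPy
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           random_state=0)
rf = RandomForestClassifier(n_jobs=-1, max_depth=5, random_state=0)
selector = BorutaPy(rf, n_estimators="auto", random_state=0)
selector.fit(X, y)                               # expects numpy arrays
print("confirmed features:", np.where(selector.support_)[0])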

446 citations


Book
03 Jan 2019
TL;DR: A new feature selection method using particle swarm optimization algorithm with a novel weighting scheme and a detailed dimension reduction technique are proposed to obtain a new subset of more informative features with low-dimensional space to improve the performance of the text clustering (TC) algorithm.
Abstract: Text document (TD) clustering is a new trend in text mining in which the TDs are separated into several coherent clusters, where documents in the same cluster are similar. In this study, a new method for solving the TD clustering problem works in the following two stages: (i) A new feature selection method using the particle swarm optimization algorithm with a novel weighting scheme and a detailed dimension reduction technique are proposed to obtain a new subset of more informative features with a low-dimensional space. This new subset is used to improve the performance of the text clustering (TC) algorithm in the subsequent stage and reduce its computation time. The k-means clustering algorithm is used to evaluate the effectiveness of the obtained subsets. (ii) Four krill herd algorithms (KHAs), namely, (a) the basic KHA, (b) a modified KHA, (c) a hybrid KHA, and (d) a multi-objective hybrid KHA, are proposed to solve the TC problem; these algorithms are incremental improvements of the preceding versions. For the evaluation process, seven benchmark text datasets are used with different characterizations and complexities. Results show that the proposed methods and algorithms obtained the best results in comparison with the other comparative methods published in the literature.
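
The first stage can be pictured as a wrapper loop in which each particle is a binary feature mask scored by clustering quality; a sketch of such a fitness function (our construction, with silhouette score standing in for the paper's evaluation measure):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def subset_fitness(X, mask, k):
    # Score a candidate feature subset by how well k-means separates the
    # documents using only the selected columns.
    if mask.sum() < 2:
        return -1.0
    Xs = X[:, mask]
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Xs)
    return silhouette_score(Xs, labels)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))          # stand-in for a TF-IDF matrix
mask = rng.random(50) > 0.5             # one particle's binary position
print("fitness:", subset_fitness(X, mask, k=3))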

414 citations


Journal ArticleDOI
TL;DR: The effectiveness and feasibility of the 1D CNN based fault diagnosis method are validated by applying it to two commonly used benchmark real vibration data sets and comparing the results with those of other competing intelligent fault diagnosis methods.
Abstract: Timely and accurate bearing fault detection and diagnosis is important for reliable and safe operation of industrial systems. In this study, the performance of a generic real-time induction bearing fault diagnosis system employing a compact adaptive 1D Convolutional Neural Network (CNN) classifier is extensively studied. In the literature, although many studies have developed highly accurate algorithms for detecting bearing faults, their results have generally been limited to relatively small train/test data sets. As opposed to conventional intelligent fault diagnosis systems that usually encapsulate feature extraction, feature selection and classification as distinct blocks, the proposed system directly takes raw time-series sensor data as input and can efficiently learn optimal features with proper training. The main advantages of the 1D CNN based approach are 1) its compact architecture configuration (rather than a complex deep architecture), which performs only 1D convolutions, making it suitable for real-time fault detection and monitoring; 2) its cost-effective and practical real-time hardware implementation; 3) its ability to work without any pre-determined transformation (such as FFT or DWT), hand-crafted feature extraction or feature selection; and 4) its capability to provide efficient training of the classifier with a limited size of training data set and a limited number of BP iterations. The effectiveness and feasibility of the 1D CNN based fault diagnosis method are validated by applying it to two commonly used benchmark real vibration data sets and comparing the results with those of other competing intelligent fault diagnosis methods.
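
A compact 1D CNN in this spirit might look as follows (layer sizes and the four-class output are our placeholders, not the paper's configuration):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1024, 1)),                   # raw vibration window
    tf.keras.layers.Conv1D(16, 9, strides=2, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(32, 9, strides=2, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(4, activation="softmax")     # fault classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()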

362 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed feature selection method effectively reduces the dimensions of the dataset and achieves superior classification accuracy using the selected features.

353 citations


Journal ArticleDOI
TL;DR: Experimental results reveal the capability of CCSA to find an optimal feature subset which maximizes the classification performance and minimizes the number of selected features, and show that CCSA is superior compared to CSA and the other algorithms.
Abstract: The crow search algorithm (CSA) is a new nature-inspired algorithm proposed by Askarzadeh in 2016. The main inspiration of CSA comes from the mechanism crows use to hide their food. Like most optimization algorithms, CSA suffers from a low convergence rate and entrapment in local optima. In this paper, a novel meta-heuristic optimizer, namely the chaotic crow search algorithm (CCSA), is proposed to overcome these problems. The proposed CCSA is applied to optimize the feature selection problem for 20 benchmark datasets. Ten chaotic maps are employed during the optimization process of CSA. The performance of CCSA is compared with other well-known and recent optimization algorithms. Experimental results reveal the capability of CCSA to find an optimal feature subset which maximizes the classification performance and minimizes the number of selected features. Moreover, the results show that CCSA is superior compared to CSA and the other algorithms. In addition, the experiments show that the sine chaotic map is the appropriate map to significantly boost the performance of CSA.
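
To make the mechanism concrete, here is a simplified rendering (ours) of one crow update in which a sine chaotic map supplies the random-like numbers; parameter values are illustrative:

import numpy as np

def sine_map(x):
    # Chaotic sine map on (0, 1); used in place of uniform random draws.
    return np.sin(np.pi * x)

def crow_step(x_i, memory_j, c, flight_length=2.0, awareness=0.1,
              bounds=(-5.0, 5.0)):
    if c > awareness:            # crow j unaware: follow its hiding place
        return x_i + c * flight_length * (memory_j - x_i)
    return np.random.uniform(*bounds, size=x_i.shape)  # random relocation

x, mem, c = np.zeros(4), np.ones(4), 0.7
for _ in range(3):
    c = sine_map(c)
    x = crow_step(x, mem, c)
print(x)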

349 citations


Journal ArticleDOI
TL;DR: Vita is considerably faster than Boruta and thus more suitable for large data sets, but only Boruta can also be applied in low-dimensional settings, while Vita was the most robust approach under a pure null model without any predictor variables related to the outcome.
Abstract: Machine learning methods and in particular random forests are promising approaches for prediction based on high dimensional omics data sets. They provide variable importance measures to rank predictors according to their predictive power. If building a prediction model is the main goal of a study, often a minimal set of variables with good prediction performance is selected. However, if the objective is the identification of involved variables to find active networks and pathways, approaches that aim to select all relevant variables should be preferred. We evaluated several variable selection procedures based on simulated data as well as publicly available experimental methylation and gene expression data. Our comparison included the Boruta algorithm, the Vita method, recurrent relative variable importance, a permutation approach and its parametric variant (Altmann) as well as recursive feature elimination (RFE). In our simulation studies, Boruta was the most powerful approach, followed closely by the Vita method. Both approaches demonstrated similar stability in variable selection, while Vita was the most robust approach under a pure null model without any predictor variables related to the outcome. In the analysis of the different experimental data sets, Vita demonstrated slightly better stability in variable selection and was less computationally intensive than Boruta. In conclusion, we recommend the Boruta and Vita approaches for the analysis of high-dimensional data sets. Vita is considerably faster than Boruta and thus more suitable for large data sets, but only Boruta can also be applied in low-dimensional settings.
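
Of the compared procedures, the permutation idea is the easiest to sketch; the following uses scikit-learn's generic permutation importance as a simplified stand-in (not the exact Altmann or Vita procedures):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
result = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
ranking = result.importances_mean.argsort()[::-1]
print("top 5 variables:", ranking[:5])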

342 citations


Journal ArticleDOI
TL;DR: The most recent feature selection methods developed for and applied in medical problems are reviewed, covering prolific research fields such as medical imaging, biomedical signal processing, and DNA microarray data analysis.

320 citations


Journal ArticleDOI
TL;DR: This work provides the reader with the basic concepts necessary to build an ensemble for feature selection, reviews the up-to-date advances, and comments on the future trends that still have to be faced.

Journal ArticleDOI
TL;DR: Binary variants of the recent Grasshopper Optimisation Algorithm are proposed in this work and employed to select the optimal feature subset for classification purposes within a wrapper-based framework; the comparative results show the superior performance of the BGOA and BGOA-M methods compared to other similar techniques in the literature.
Abstract: Feature Selection (FS) is a challenging machine learning-related task that aims at reducing the number of features by removing irrelevant, redundant and noisy data while maintaining an acceptable level of classification accuracy. FS can be considered as an optimisation problem. Due to the difficulty of this problem and its large number of local solutions, stochastic optimisation algorithms are promising techniques to solve it. As a seminal attempt, binary variants of the recent Grasshopper Optimisation Algorithm (GOA) are proposed in this work and employed to select the optimal feature subset for classification purposes within a wrapper-based framework. Two mechanisms are employed to design a binary GOA: the first is based on Sigmoid and V-shaped transfer functions, indicated by BGOA-S and BGOA-V, respectively, while the second uses a novel technique that combines the best solution obtained so far. In addition, a mutation operator is employed to enhance the exploration phase of the BGOA algorithm (BGOA-M). The proposed methods are evaluated using 25 standard UCI datasets and compared with 8 well-regarded metaheuristic wrapper-based approaches and six well-known filter-based (e.g., correlation FS) approaches. The comparative results show the superior performance of the BGOA and BGOA-M methods compared to other similar techniques in the literature.
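
The S-shaped binarization can be written in a few lines; a sketch under our reading of BGOA-S (step values are arbitrary):

import numpy as np

def binarize_sigmoid(step, rng):
    # Map a continuous grasshopper move to a 0/1 feature mask: the sigmoid
    # turns each dimension's step into a flip probability.
    prob = 1.0 / (1.0 + np.exp(-step))
    return (rng.random(step.shape) < prob).astype(int)

rng = np.random.default_rng(0)
continuous_step = rng.normal(size=10)      # one candidate's update vector
print(binarize_sigmoid(continuous_step, rng))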

Journal ArticleDOI
TL;DR: A novel unsupervised context-sensitive framework—deep change vector analysis (DCVA)—for CD in multitemporal VHR images that exploits convolutional neural network (CNN) features is proposed; experimental results on multitemporal data sets of Worldview-2, Pleiades, and Quickbird images confirm the effectiveness of the proposed method.
Abstract: Change detection (CD) in multitemporal images is an important application of remote sensing. Recent technological evolution has provided very high spatial resolution (VHR) multitemporal optical satellite images showing high spatial correlation among pixels and requiring an effective modeling of spatial context to accurately capture change information. Here, we propose a novel unsupervised context-sensitive framework—deep change vector analysis (DCVA)—for CD in multitemporal VHR images that exploits convolutional neural network (CNN) features. To have an unsupervised system, DCVA starts from a suboptimal pretrained multilayered CNN for obtaining deep features that can model spatial relationships among neighboring pixels and thus complex objects. An automatic feature selection strategy is employed layerwise to select features emphasizing both high and low prior probability change information. Selected features from multiple layers are combined into a deep feature hypervector providing a multiscale scene representation. The use of the same pretrained CNN for semantic segmentation of single images enables us to obtain coherent multitemporal deep feature hypervectors that can be compared pixelwise to obtain deep change vectors that also model spatial context information. Deep change vectors are analyzed based on their magnitude to identify changed pixels. Then, deep change vectors corresponding to identified changed pixels are binarized to obtain compressed binary deep change vectors that preserve information about the direction (kind) of change. Changed pixels are analyzed for multiple CD based on the binary features, thus implicitly using the spatial information. Experimental results on multitemporal data sets of Worldview-2, Pleiades, and Quickbird images confirm the effectiveness of the proposed method.
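
Stripped of the CNN details, the core comparison reduces to a per-pixel vector difference and a magnitude threshold; a conceptual sketch on stand-in arrays (random features and a percentile cut replace real deep features and the paper's thresholding):

import numpy as np

rng = np.random.default_rng(0)
feat_t1 = rng.random((64, 64, 256))       # deep feature hypervector, date 1
feat_t2 = rng.random((64, 64, 256))       # deep feature hypervector, date 2

dcv = feat_t2 - feat_t1                   # deep change vectors
magnitude = np.linalg.norm(dcv, axis=-1)  # per-pixel change magnitude
changed = magnitude > np.percentile(magnitude, 95)
print("changed pixels:", int(changed.sum()))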

Journal ArticleDOI
TL;DR: The results show that TQWT performs better than or comparably to the state-of-the-art speech signal processing techniques used in PD classification, and that Mel-frequency cepstral and tunable-Q wavelet coefficients, which give the highest accuracies, contain complementary information in the PD classification problem, resulting in an improved system when combined using a filter feature selection technique.

Journal ArticleDOI
TL;DR: The results of the study indicate that ensemble techniques, such as bagging and boosting, are effective in improving the prediction accuracy of weak classifiers, and exhibit satisfactory performance in identifying risk of heart disease.

Journal ArticleDOI
TL;DR: A binary version of the hybrid grey wolf optimization (GWO) and particle swarm optimization (PSO) is proposed to solve feature selection problems in this paper; it significantly outperformed the binary GWO (BGWO), the binary PSO, the binary genetic algorithm, and the whale optimization algorithm with simulated annealing on several performance measures.
Abstract: A binary version of the hybrid grey wolf optimization (GWO) and particle swarm optimization (PSO) is proposed to solve feature selection problems in this paper. The original PSOGWO is a new hybrid optimization algorithm that benefits from the strengths of both GWO and PSO. Despite its superior performance, the original hybrid approach is appropriate for problems with a continuous search space. Feature selection, however, is a binary problem. Therefore, a binary version of the hybrid PSOGWO, called BGWOPSO, is proposed to find the best feature subset. To find the best solutions, a wrapper-based K-nearest neighbors classifier with a Euclidean distance metric is utilized. For performance evaluation of the proposed binary algorithm, 18 standard benchmark datasets from the UCI repository are employed. The results show that BGWOPSO significantly outperformed the binary GWO (BGWO), the binary PSO, the binary genetic algorithm, and the whale optimization algorithm with simulated annealing when using several performance measures, including accuracy, selection of the best optimal features, and computational time.
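
The wrapper fitness such a binary optimizer evaluates can be sketched as below; the accuracy/size weights alpha and beta are our assumption, not necessarily the paper's formulation:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def fitness(mask, X, y, alpha=0.99, beta=0.01):
    # Reward KNN accuracy on the selected columns, lightly penalize size.
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=5),
                          X[:, mask.astype(bool)], y, cv=5).mean()
    return alpha * acc + beta * (1.0 - mask.sum() / mask.size)

X, y = make_classification(n_samples=150, n_features=20, random_state=0)
mask = np.random.default_rng(0).integers(0, 2, size=20)
print("fitness:", fitness(mask, X, y))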

Journal ArticleDOI
TL;DR: Experimental results confirm the efficiency of the proposed approaches in improving the classification accuracy compared to other wrapper-based algorithms, which proves the ability of BOA algorithm in searching the feature space and selecting the most informative attributes for classification tasks.
Abstract: In this paper, binary variants of the Butterfly Optimization Algorithm (BOA) are proposed and used to select the optimal feature subset for classification purposes in a wrapper mode. BOA is a recently proposed algorithm that has not yet been systematically applied to feature selection problems. BOA can efficiently explore the feature space for an optimal or near-optimal feature subset minimizing a given fitness function. The two proposed binary variants of BOA are applied to select the optimal feature combination that maximizes classification accuracy while minimizing the number of selected features. In these variants, the native BOA is retained while its continuous steps are squashed by a suitable transfer function and thresholded to produce binary values. The proposed binary algorithms are compared with five state-of-the-art approaches and four of the latest high-performing optimization algorithms. A number of assessment indicators are utilized to properly assess and compare the performance of these algorithms over 21 datasets from the UCI repository. The experimental results confirm the efficiency of the proposed approaches in improving classification accuracy compared to other wrapper-based algorithms, which proves the ability of the BOA algorithm to search the feature space and select the most informative attributes for classification tasks.

Journal ArticleDOI
TL;DR: Simple multinomial methods, including generalized principal component analysis (GLM-PCA) for non-normal distributions, and feature selection using deviance are proposed, which outperform the current practice in a downstream clustering assessment using ground truth datasets.
Abstract: Single-cell RNA-Seq (scRNA-Seq) profiles gene expression of individual cells. Recent scRNA-Seq datasets have incorporated unique molecular identifiers (UMIs). Using negative controls, we show UMI counts follow multinomial sampling with no zero inflation. Current normalization procedures such as log of counts per million and feature selection by highly variable genes produce false variability in dimension reduction. We propose simple multinomial methods, including generalized principal component analysis (GLM-PCA) for non-normal distributions, and feature selection using deviance. These methods outperform the current practice in a downstream clustering assessment using ground truth datasets.
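
The deviance criterion can be coded directly from the stated null model; our rendering of the binomial deviance per gene:

import numpy as np

def binomial_deviance(Y):
    # Y: cells x genes matrix of raw UMI counts. Under the null model each
    # gene j has a constant proportion pi_j across cells; high deviance
    # marks genes that deviate most, i.e. informative features.
    n = Y.sum(axis=1, keepdims=True)          # total counts per cell
    pi = Y.sum(axis=0) / n.sum()              # null proportion per gene
    mu = n * pi                               # expected counts
    with np.errstate(divide="ignore", invalid="ignore"):
        term1 = np.where(Y > 0, Y * np.log(Y / mu), 0.0)
        rest = n - Y
        term2 = np.where(rest > 0, rest * np.log(rest / (n - mu)), 0.0)
    return 2.0 * (term1 + term2).sum(axis=0)

Y = np.random.poisson(1.0, size=(100, 500))   # toy count matrix
top_genes = np.argsort(binomial_deviance(Y))[::-1][:50]
print("selected genes:", top_genes[:10])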

Journal ArticleDOI
TL;DR: A CNN-based framework is suggested, that can be applied on a collection of data from a variety of sources, including different markets, in order to extract features for predicting the future of those markets.
Abstract: Feature extraction from financial data is one of the most important problems in market prediction domain for which many approaches have been suggested. Among other modern tools, convolutional neural networks (CNN) have recently been applied for automatic feature selection and market prediction. However, in experiments reported so far, less attention has been paid to the correlation among different markets as a possible source of information for extracting features. In this paper, we suggest a CNN-based framework, that can be applied on a collection of data from a variety of sources, including different markets, in order to extract features for predicting the future of those markets. The suggested framework has been applied for predicting the next day’s direction of movement for the indices of S&P 500, NASDAQ, DJI, NYSE, and RUSSELL based on various sets of initial variables. The evaluations show a significant improvement in prediction’s performance compared to the state of the art baseline algorithms.

Journal ArticleDOI
TL;DR: A hybrid optimization method is presented for the FS problem; it combines the salp swarm algorithm (SSA) with particle swarm optimization (PSO) to create an algorithm called SSAPSO, in which the efficacy of the exploration and the exploitation steps is improved.
Abstract: Feature selection (FS) is a machine learning process commonly used to reduce the high dimensionality problems of datasets. This task makes it possible to extract the most representative information from high-sized pools of data, reducing the computational effort of other tasks such as classification. This article presents a hybrid optimization method for the FS problem; it combines the salp swarm algorithm (SSA) with particle swarm optimization. The hybridization of both approaches creates an algorithm called SSAPSO, in which the efficacy of the exploration and the exploitation steps is improved. To verify the performance of the proposed algorithm, it is tested over two experimental series: in the first one, it is compared with other similar approaches using benchmark functions; in the second set of experiments, SSAPSO is used to determine the best set of features using different UCI datasets, where the redundant or confusing features are removed from the original dataset while keeping or yielding a better accuracy. The experimental results provide evidence of the enhancement in SSAPSO regarding performance and accuracy without affecting the computational effort.

Journal ArticleDOI
TL;DR: This paper presents a survey of feature selection methods and concludes that most FS methods use static data, while the existing DR algorithms do not address the issues raised by dynamic data.
Abstract: Nowadays, in the digital era, the data generated by various applications are increasing drastically both row-wise and column-wise; this creates a bottleneck for analytics and also increases the burden on machine learning algorithms that work for pattern recognition. This curse of dimensionality can be handled through reduction techniques. Dimensionality Reduction (DR) can be handled in two ways, namely Feature Selection (FS) and Feature Extraction (FE). This paper focuses on a survey of feature selection methods, and from this extensive survey we can conclude that most of the FS methods use static data. However, after the emergence of IoT and web-based applications, data are generated dynamically and grow at a fast rate, so the data are likely to be noisy, which also hinders the performance of the algorithms. With the increase in the size of the data set, the scalability of the FS methods becomes jeopardized. The existing DR algorithms therefore do not address the issues with dynamic data. Using FS methods not only reduces the burden of the data but also avoids overfitting of the model.

Journal ArticleDOI
TL;DR: This paper generalizes variable selection methods in a simple manner to introduce their classifications, merits and drawbacks, to provide a better understanding of their characteristics, similarities and differences.
Abstract: With the advances in innovative instrumentation and various valuable applications, near-infrared (NIR) spectroscopy has become a mature analytical technique in various fields. Variable (wavelength) selection is a critical step in multivariate calibration of NIR spectra, which can improve the prediction performance, make the calibration reliable and provide simpler interpretation. During the last several decades, there have been a large number of variable selection methods proposed in NIR spectroscopy. In this paper, we generalize variable selection methods in a simple manner to introduce their classifications, merits and drawbacks, to provide a better understanding of their characteristics, similarities and differences. We also introduce some hybrid and modified methods, highlighting their improvements. Finally, we summarize the limitations of existing variable selection methods, providing our remarks and suggestions on the development of variable selection methods, to promote the development of NIR spectroscopy.

Journal ArticleDOI
Gao Xianwei, Chun Shan, Changzhen Hu, Zequn Niu, Liu Zhen
TL;DR: It is shown that the ensemble model effectively improves detection accuracy, and that the quality of the data features is an important factor in determining the detection effect.
Abstract: In recent years, advanced threat attacks have been increasing, but the traditional network intrusion detection system based on feature filtering has some drawbacks which make it difficult to find new attacks in time. This paper takes the NSL-KDD data set as the research object, analyses the latest progress and existing problems in the field of intrusion detection technology, and proposes an adaptive ensemble learning model. By adjusting the proportion of training data and setting up multiple decision trees, we construct a MultiTree algorithm. In order to improve the overall detection effect, we choose several base classifiers, including decision tree, random forest, kNN and DNN, and design an ensemble adaptive voting algorithm. We use NSL-KDD Test+ to verify our approach; the accuracy of the MultiTree algorithm is 84.2%, while the final accuracy of the adaptive voting algorithm reaches 85.2%. Compared with other published work, our ensemble model effectively improves detection accuracy. In addition, through the analysis of the data, we find that the quality of the data features is an important factor in determining the detection effect. In the future, we should optimize the feature selection and preprocessing of intrusion detection data to achieve better results.
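
The voting stage maps naturally onto scikit-learn; a simplified stand-in (plain soft voting over three of the base classifiers on synthetic data, without the paper's adaptive weighting):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-class stand-in for an intrusion detection data set.
X, y = make_classification(n_samples=500, n_features=20, n_classes=3,
                           n_informative=6, random_state=0)
ensemble = VotingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("knn", KNeighborsClassifier(n_neighbors=5))],
    voting="soft")
ensemble.fit(X, y)
print("train accuracy:", ensemble.score(X, y))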

Journal ArticleDOI
TL;DR: The experimental results show that the solution size obtained by the SaPSO algorithm is smaller than that of its EC counterparts on all datasets, and that it performs better than its non-EC and EC counterparts in terms of classification accuracy not only on most training sets but also on most test sets.
Abstract: Many evolutionary computation (EC) methods have been used to solve feature selection problems and they perform well on most small-scale feature selection problems. However, as the dimensionality of feature selection problems increases, the solution space increases exponentially. Meanwhile, there are more irrelevant features than relevant features in datasets, which leads to many local optima in the huge solution space. Therefore, the existing EC methods still suffer from the problem of stagnation in local optima on large-scale feature selection problems. Furthermore, large-scale feature selection problems with different datasets may have different properties. Thus, an existing EC method that has only one candidate solution generation strategy (CSGS) may perform poorly across different large-scale feature selection problems. In addition, it is time-consuming to find a suitable EC method and corresponding suitable parameter values for a given large-scale feature selection problem if we want to solve it effectively and efficiently. In this article, we propose a self-adaptive particle swarm optimization (SaPSO) algorithm for feature selection, particularly for large-scale feature selection. First, an encoding scheme for the feature selection problem is employed in the SaPSO. Second, three important issues related to self-adaptive algorithms are investigated. After that, the SaPSO algorithm with a typical self-adaptive mechanism is proposed. The experimental results on 12 datasets show that the solution size obtained by the SaPSO algorithm is smaller than that of its EC counterparts on all datasets. The SaPSO algorithm performs better than its non-EC and EC counterparts in terms of classification accuracy not only on most training sets but also on most test sets. Furthermore, as the dimensionality of the feature selection problem increases, the advantages of SaPSO become more prominent. This highlights that the SaPSO algorithm is suitable for solving feature selection problems, particularly large-scale feature selection problems.

Journal ArticleDOI
TL;DR: A new discriminative correlation filter (DCF) based tracking method is proposed that enables joint spatial-temporal filter learning in a lower-dimensional discriminative manifold and applies structured spatial sparsity constraints to multi-channel filters.
Abstract: With efficient appearance learning models, discriminative correlation filter (DCF) has been proven to be very successful in recent video object tracking benchmarks and competitions. However, the existing DCF paradigm suffers from two major issues, i.e., spatial boundary effect and temporal filter degradation. To mitigate these challenges, we propose a new DCF-based tracking method. The key innovations of the proposed method include adaptive spatial feature selection and temporal consistent constraints, with which the new tracker enables joint spatial-temporal filter learning in a lower dimensional discriminative manifold. More specifically, we apply structured spatial sparsity constraints to multi-channel filters. Consequently, the process of learning spatial filters can be approximated by the lasso regularization. To encourage temporal consistency, the filter model is restricted to lie around its historical value and updated locally to preserve the global structure in the manifold. Last, a unified optimization framework is proposed to jointly select temporal consistency preserving spatial features and learn discriminative filters with the augmented Lagrangian method. Qualitative and quantitative evaluations have been conducted on a number of well-known benchmarking datasets such as OTB2013, OTB50, OTB100, Temple-Colour, UAV123, and VOT2018. The experimental results demonstrate the superiority of the proposed method over the state-of-the-art approaches.
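
For orientation, the classical closed-form correlation filter that DCF trackers build on fits in a few lines (MOSSE-style, not the paper's regularized spatial-temporal learner):

import numpy as np

def learn_filter(patch, response, lam=1e-2):
    # Closed-form DCF in the Fourier domain:
    # H* = (G . conj(F)) / (F . conj(F) + lam)
    F = np.fft.fft2(patch)
    G = np.fft.fft2(response)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def correlate(H_conj, patch):
    return np.real(np.fft.ifft2(H_conj * np.fft.fft2(patch)))

patch = np.random.rand(64, 64)                     # grayscale template
yy, xx = np.mgrid[0:64, 0:64]
target = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 3.0 ** 2))
H = learn_filter(patch, target)
resp = correlate(H, patch)
print("peak at:", np.unravel_index(resp.argmax(), resp.shape))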

Journal ArticleDOI
TL;DR: A comprehensive review of feature selection techniques for text classification is given, covering popular text classifiers including the Nearest Neighbor (NN) method, Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), and Neural Networks.
Abstract: Big multimedia data is heterogeneous in essence, that is, the data may be a mixture of video, audio, text, and images. This is due to the prevalence of novel applications in recent years, such as social media, video sharing, and location based services (LBS), etc. In many multimedia applications, for example, video/image tagging and multimedia recommendation, text classification techniques have been used extensively to facilitate multimedia data processing. In this paper, we give a comprehensive review on feature selection techniques for text classification. We begin by introducing some popular representation schemes for documents, and similarity measures used in text classification. Then, we review the most popular text classifiers, including Nearest Neighbor (NN) method, Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), and Neural Networks. Next, we survey four feature selection models, namely the filter, wrapper, embedded and hybrid, discussing pros and cons of the state-of-the-art feature selection approaches. Finally, we conclude the paper and give a brief introduction to some interesting feature selection work that does not belong to the four models.
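
As a tiny example of the filter model on text (our toy corpus, with the chi-squared statistic as the scoring function):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

docs = ["cheap pills online", "meeting at noon", "buy cheap pills now",
        "project meeting notes"]
labels = [1, 0, 1, 0]                    # 1 = spam-like, 0 = normal

vec = CountVectorizer()
X = vec.fit_transform(docs)
selector = SelectKBest(chi2, k=3).fit(X, labels)
terms = vec.get_feature_names_out()
print("selected terms:",
      [terms[i] for i in selector.get_support(indices=True)])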

Journal ArticleDOI
TL;DR: The proposed work deploys a filter- and wrapper-based method, with the firefly algorithm used in the wrapper to select features, and shows that 10 features are sufficient to detect intrusions with improved accuracy.

Journal ArticleDOI
01 Feb 2019
TL;DR: This paper proposes an IDS based on feature selection and a clustering algorithm using filter and wrapper methods, which achieves high accuracy and detection rate with a low false positive rate compared to existing methods in the literature.
Abstract: Due to the widespread diffusion of network connectivity, the demand for network security and protection against cyber-attacks is ever increasing. Intrusion detection systems (IDS) perform an essential role in today's network security. This paper proposes an IDS based on feature selection and a clustering algorithm using filter and wrapper methods. The filter and wrapper methods are named the feature grouping based on linear correlation coefficient (FGLCC) algorithm and the cuttlefish algorithm (CFA), respectively. A decision tree is used as the classifier in the proposed method. For performance verification, the proposed method was applied to the large KDD Cup 99 data set. The results verified a high accuracy (95.03%) and detection rate (95.23%) with a low false positive rate (1.65%) compared to existing methods in the literature.
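
The filter half rests on linear correlation; a bare-bones rendering (ours) of correlation-based feature scoring, without the FGLCC grouping logic:

import numpy as np

def pearson_scores(X, y):
    # Absolute linear correlation of each feature column with the label.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = (Xc * yc[:, None]).sum(axis=0)
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(num / den)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))
y = (X[:, 3] + X[:, 7] > 0).astype(float)   # label driven by two features
keep = np.argsort(pearson_scores(X, y))[::-1][:10]
print("top features:", keep)                 # 3 and 7 should rank high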

Journal ArticleDOI
TL;DR: Comparisons with other state-of-the-art deep neural networks and traditional methods prove that the proposed method can overcome the defects of traditional signal processing and hand-crafted feature selection.