
Showing papers on "Dimensionality reduction" published in 2017


Posted Content
TL;DR: Multi-task learning (MTL) as mentioned in this paper is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks.
Abstract: Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks. In this paper, we give a survey for MTL from the perspective of algorithmic modeling, applications and theoretical analyses. For algorithmic modeling, we give a definition of MTL and then classify different MTL algorithms into five categories, including feature learning approach, low-rank approach, task clustering approach, task relation learning approach and decomposition approach as well as discussing the characteristics of each approach. In order to improve the performance of learning tasks further, MTL can be combined with other learning paradigms including semi-supervised learning, active learning, unsupervised learning, reinforcement learning, multi-view learning and graphical models. When the number of tasks is large or the data dimensionality is high, we review online, parallel and distributed MTL models as well as dimensionality reduction and feature hashing to reveal their computational and storage advantages. Many real-world applications use MTL to boost their performance and we review representative works in this paper. Finally, we present theoretical analyses and discuss several future directions for MTL.

1,202 citations


Journal ArticleDOI
TL;DR: It is shown that SIMLR is scalable and greatly enhances clustering performance while improving the visualization and interpretability of single-cell sequencing data.
Abstract: We present single-cell interpretation via multikernel learning (SIMLR), an analytic framework and software which learns a similarity measure from single-cell RNA-seq data in order to perform dimension reduction, clustering and visualization. On seven published data sets, we benchmark SIMLR against state-of-the-art methods. We show that SIMLR is scalable and greatly enhances clustering performance while improving the visualization and interpretability of single-cell sequencing data.

530 citations


Proceedings Article
06 Aug 2017
TL;DR: A joint DR and K-means clustering approach is proposed in which DR is accomplished via learning a deep neural network (DNN), exploiting the DNN's ability to approximate any nonlinear function.
Abstract: Most learning approaches treat dimensionality reduction (DR) and clustering separately (i.e., sequentially), but recent research has shown that optimizing the two tasks jointly can substantially improve the performance of both. The premise behind the latter genre is that the data samples are obtained via linear transformation of latent representations that are easy to cluster; but in practice, the transformation from the latent space to the data can be more complicated. In this work, we assume that this transformation is an unknown and possibly nonlinear function. To recover the 'clustering-friendly' latent representations and to better cluster the data, we propose a joint DR and K-means clustering approach in which DR is accomplished via learning a deep neural network (DNN). The motivation is to keep the advantages of jointly optimizing the two tasks, while exploiting the deep neural network's ability to approximate any nonlinear function. This way, the proposed approach can work well for a broad class of generative models. Towards this end, we carefully design the DNN structure and the associated joint optimization criterion, and propose an effective and scalable algorithm to handle the formulated optimization problem. Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach.
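The following is a minimal sketch of the joint idea, not the paper's exact algorithm: an autoencoder supplies the nonlinear DR, and its latent space is alternately clustered with K-means while a reconstruction-plus-clustering loss updates the network. The layer sizes, trade-off weight, and synthetic data are illustrative assumptions.

```python
import torch
import torch.nn as nn

d, latent, K, lam = 100, 10, 5, 0.1                        # illustrative sizes
X = torch.randn(2000, d)                                   # placeholder data

enc = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, latent))
dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, d))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
centroids = torch.randn(K, latent)

for it in range(100):
    with torch.no_grad():                                  # K-means step in the latent space
        z = enc(X)
        assign = torch.cdist(z, centroids).argmin(dim=1)
        for k in range(K):
            if (assign == k).any():
                centroids[k] = z[assign == k].mean(dim=0)
    opt.zero_grad()                                        # network step: reconstruction + clustering loss
    z = enc(X)
    loss = nn.functional.mse_loss(dec(z), X) \
           + lam * ((z - centroids[assign]) ** 2).sum(dim=1).mean()
    loss.backward()
    opt.step()
```

The alternation mirrors the joint criterion described above: the assignment and centroid updates keep the latent space clustering-friendly, while the network update trades reconstruction fidelity against cluster compactness.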

521 citations


Journal ArticleDOI
TL;DR: A solid intuition is built for what LDA is and how it works, enabling readers of all levels to get a better understanding of LDA and to know how to apply this technique in different applications.
Abstract: Linear Discriminant Analysis (LDA) is a very common technique for dimensionality reduction problems as a preprocessing step for machine learning and pattern classification applications. At the same time, it is usually used as a black box, but (sometimes) not well understood. The aim of this paper is to build a solid intuition for what LDA is and how it works, enabling readers of all levels to gain a better understanding of LDA and to know how to apply this technique in different applications. The paper first gives the basic definitions and steps of how the LDA technique works, supported with visual explanations of these steps. Moreover, the two methods of computing the LDA space, i.e. the class-dependent and class-independent methods, are explained in detail. Then, in a step-by-step approach, two numerical examples are demonstrated to show how the LDA space can be calculated in the case of the class-dependent and class-independent methods. Furthermore, two of the most common LDA problems (i.e. the Small Sample Size (SSS) and non-linearity problems) are highlighted and illustrated, and state-of-the-art solutions to these problems are investigated and explained. Finally, a number of experiments were conducted with different datasets to (1) investigate the effect of the eigenvectors used in the LDA space on the robustness of the extracted features for classification accuracy, and (2) show when the SSS problem occurs and how it can be addressed.
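As a companion to the tutorial's steps, here is a minimal NumPy sketch of class-independent LDA: build the within-class and between-class scatter matrices, then project onto the leading eigenvectors of Sw^{-1}Sb. The dataset choice and the pseudo-inverse (a simple guard against a singular Sw, related to the SSS problem discussed above) are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
classes = np.unique(y)
mean_all = X.mean(axis=0)

# Within-class scatter Sw and between-class scatter Sb (class-independent LDA)
Sw = sum((y == c).sum() * np.cov(X[y == c].T, bias=True) for c in classes)
Sb = sum((y == c).sum() * np.outer(X[y == c].mean(axis=0) - mean_all,
                                   X[y == c].mean(axis=0) - mean_all) for c in classes)

# Discriminant directions: leading eigenvectors of Sw^{-1} Sb (at most C-1 of them)
evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
order = np.argsort(evals.real)[::-1]
W = evecs.real[:, order[:2]]
X_lda = X @ W                      # data projected onto the 2-D LDA space
```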

518 citations


Journal ArticleDOI
TL;DR: A generic computer vision system is presented that exploits trained deep Convolutional Neural Networks as a generic feature extractor and mixes these features with more traditional hand-crafted features, demonstrating the generalizability of the proposed approach.

376 citations


Proceedings ArticleDOI
01 Oct 2017
TL;DR: Experimental results show the proposed CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes outperforms existing deep architectures, and can localize images in hard conditions, where classic SIFT-based methods fail.
Abstract: In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination changes. We make use of LSTM units on the CNN output, which play the role of a structured dimensionality reduction on the feature vector, leading to drastic improvements in localization performance. We provide extensive quantitative comparison of CNN-based and SIFT-based localization methods, showing the weaknesses and strengths of each. Furthermore, we present a new large-scale indoor dataset with accurate ground truth from a laser scanner. Experimental results on both indoor and outdoor public datasets show our method outperforms existing deep architectures, and can localize images in hard conditions, e.g., in the presence of mostly textureless surfaces, where classic SIFT-based methods fail.
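A minimal PyTorch sketch of the CNN-to-LSTM idea follows; it is not the paper's architecture (which builds on a GoogLeNet backbone and applies LSTMs over the feature map in several directions) but illustrates treating the CNN feature vector as a sequence so that LSTM units act as a structured dimensionality reduction before regressing a 7-D pose (position plus orientation quaternion). The backbone choice, sequence length, and hidden size are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class CNNLSTMPose(nn.Module):
    """Sketch: CNN feature vector -> LSTM as structured reduction -> 7-D pose."""
    def __init__(self, hidden=256, steps=32):
        super().__init__()
        self.steps = steps
        backbone = models.resnet18(weights=None)                     # stand-in backbone
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])    # 512-d feature
        self.lstm = nn.LSTM(input_size=512 // steps, hidden_size=hidden,
                            batch_first=True)
        self.fc = nn.Linear(hidden, 7)                               # 3-d position + 4-d quaternion

    def forward(self, img):
        f = self.cnn(img).flatten(1)                                 # (B, 512)
        seq = f.view(f.size(0), self.steps, -1)                      # feature vector as a sequence
        _, (h, _) = self.lstm(seq)
        return self.fc(h[-1])

pose = CNNLSTMPose()(torch.randn(2, 3, 224, 224))                    # shape (2, 7)
```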

322 citations


Journal ArticleDOI
TL;DR: This paper proposes a new unsupervised spectral feature selection model, a joint graph sparse coding (JGSC) model, which embeds a graph regularizer into the framework of joint sparse regression to preserve the local structures of the data.
Abstract: In this paper, we propose a new unsupervised spectral feature selection model by embedding a graph regularizer into the framework of joint sparse regression for preserving the local structures of data. To do this, we first extract the bases of training data by previous dictionary learning methods and, then, map original data into the basis space to generate their new representations, by proposing a novel joint graph sparse coding (JGSC) model. In JGSC, we first formulate its objective function by simultaneously taking subspace learning and joint sparse regression into account, then, design a new optimization solution to solve the resulting objective function, and further prove the convergence of the proposed solution. Furthermore, we extend JGSC to a robust JGSC (RJGSC) via replacing the least square loss function with a robust loss function, for achieving the same goals and also avoiding the impact of outliers. Finally, experimental results on real data sets showed that both JGSC and RJGSC outperformed the state-of-the-art algorithms in terms of k-nearest neighbor classification performance.

321 citations


Journal ArticleDOI
TL;DR: The main idea behind this model is to construct a multi-class SVM, which has not been adopted for IDS so far, in order to decrease the training and testing time and increase the individual classification accuracy of the network attacks.

321 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed PSO-based multi-objective feature selection algorithm can automatically evolve a set of nondominated solutions, and it is a highly competitive feature selection method for solving cost-based feature selection problems.
Abstract: Feature selection is an important data-preprocessing technique in classification problems such as bioinformatics and signal processing. Generally, there are some situations where a user is interested in not only maximizing the classification performance but also minimizing the cost that may be associated with features. This kind of problem is called cost-based feature selection. However, most existing feature selection approaches treat this task as a single-objective optimization problem. This paper presents the first study of multi-objective particle swarm optimization (PSO) for cost-based feature selection problems. The task of this paper is to generate a Pareto front of nondominated solutions, that is, feature subsets, to meet different requirements of decision-makers in real-world applications. In order to enhance the search capability of the proposed algorithm, a probability-based encoding technology and an effective hybrid operator, together with the ideas of the crowding distance, the external archive, and the Pareto domination relationship, are applied to PSO. The proposed PSO-based multi-objective feature selection algorithm is compared with several multi-objective feature selection algorithms on five benchmark datasets. Experimental results show that the proposed algorithm can automatically evolve a set of nondominated solutions, and it is a highly competitive feature selection method for solving cost-based feature selection problems.
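To make the cost-based, bi-objective formulation concrete, here is a small sketch that scores candidate feature subsets on (error rate, total feature cost) and keeps the nondominated ones. Random subsets stand in for PSO particles, and the per-feature costs, dataset, and classifier are illustrative assumptions; the paper's probability-based encoding, hybrid operator, crowding distance, and external archive are not reproduced.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)
costs = rng.uniform(1, 10, size=X.shape[1])        # hypothetical per-feature costs

def objectives(mask):
    """Return (error rate, total cost) for one candidate feature subset."""
    if not mask.any():
        return 1.0, 0.0
    acc = cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()
    return 1.0 - acc, costs[mask].sum()

# Random subsets stand in for PSO particles in this sketch
cands = [rng.random(X.shape[1]) < 0.3 for _ in range(30)]
objs = np.array([objectives(m) for m in cands])

# Nondominated candidates form the Pareto front offered to the decision-maker
front = [i for i, a in enumerate(objs)
         if not any((b <= a).all() and (b < a).any() for b in objs)]
print(objs[front])
```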

291 citations


Journal ArticleDOI
TL;DR: FIt-SNE, a sped-up version of t-SNE, enables visualization of rare cell types in large datasets by obviating the need for downsampling.
Abstract: t-distributed Stochastic Neighborhood Embedding (t-SNE) is a method for dimensionality reduction and visualization that has become widely popular in recent years. Efficient implementations of t-SNE are available, but they scale poorly to datasets with hundreds of thousands to millions of high dimensional data-points. We present Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE), which dramatically accelerates the computation of t-SNE. The most time-consuming step of t-SNE is a convolution that we accelerate by interpolating onto an equispaced grid and subsequently using the fast Fourier transform to perform the convolution. We also optimize the computation of input similarities in high dimensions using multi-threaded approximate nearest neighbors. We further present a modification to t-SNE called "late exaggeration," which allows for easier identification of clusters in t-SNE embeddings. Finally, for datasets that cannot be loaded into the memory, we present out-of-core randomized principal component analysis (oocPCA), so that the top principal components of a dataset can be computed without ever fully loading the matrix, hence allowing for t-SNE of large datasets to be computed on resource-limited machines.
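The core acceleration described above, evaluating a kernel sum on an equispaced grid as an FFT-based convolution, can be illustrated in one dimension with NumPy/SciPy. This is only a toy demonstration of the idea, not the FIt-SNE implementation (which interpolates scattered points onto the grid and works with the t-SNE gradient); the sizes and the Cauchy-kernel sampling are assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve

N, h = 2000, 0.01
q = np.random.rand(N)                         # "charge" at each equispaced grid node
x = np.arange(N) * h

# Direct O(N^2) evaluation of phi[i] = sum_j K(x_i - x_j) q[j] with the Cauchy kernel
phi_direct = (1.0 / (1.0 + (x[:, None] - x[None, :]) ** 2)) @ q

# Same sum in O(N log N): on an equispaced grid it is a Toeplitz matvec, i.e. a convolution
offsets = np.arange(-(N - 1), N) * h
kern = 1.0 / (1.0 + offsets ** 2)             # kernel sampled once per grid offset
phi_fft = fftconvolve(kern, q, mode="valid")

print(np.allclose(phi_direct, phi_fft))       # True up to floating-point tolerance
```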

281 citations


Journal ArticleDOI
TL;DR: Through experiments conducted on three traditional image classification benchmark datasets, it is shown how visualization can provide highly valuable feedback for network designers, revealing interpretable clusters of learned representations and a partitioning of artificial neurons into groups with apparently related discriminative roles.
Abstract: In machine learning, pattern classification assigns high-dimensional vectors (observations) to classes based on generalization from examples. Artificial neural networks currently achieve state-of-the-art results in this task. Although such networks are typically used as black-boxes, they are also widely believed to learn (high-dimensional) higher-level representations of the original observations. In this paper, we propose using dimensionality reduction for two tasks: visualizing the relationships between learned representations of observations, and visualizing the relationships between artificial neurons. Through experiments conducted in three traditional image classification benchmark datasets, we show how visualization can provide highly valuable feedback for network designers. For instance, our discoveries in one of these datasets (SVHN) include the presence of interpretable clusters of learned representations, and the partitioning of artificial neurons into groups with apparently related discriminative roles.
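In practice, this kind of inspection amounts to projecting high-dimensional activations to 2-D. A minimal sketch with scikit-learn's t-SNE follows; the activation and label files are hypothetical placeholders for whatever a trained network produces on a test set, and t-SNE is only one of several projections one could use here.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

acts = np.load("activations.npy")     # hypothetical: penultimate-layer activations, shape (n, d)
labels = np.load("labels.npy")        # hypothetical: ground-truth class per sample

emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(acts)
plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=3, cmap="tab10")
plt.title("Learned representations projected to 2-D")
plt.show()
```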

Journal ArticleDOI
TL;DR: An ensemble approach for feature selection is presented, which aggregates the several individual feature lists obtained by the different feature selection methods so that a more robust and efficient feature subset can be obtained.
Abstract: Sentiment analysis is an important research direction of natural language processing, text mining and web mining which aims to extract subjective information in source materials. The main challenge encountered in machine learning method-based sentiment classification is the abundant amount of data available. This amount makes it difficult to train the learning algorithms in a feasible time and degrades the classification accuracy of the built model. Hence, feature selection becomes an essential task in developing robust and efficient classification models whilst reducing the training time. In text mining applications, individual filter-based feature selection methods have been widely utilized owing to their simplicity and relatively high performance. This paper presents an ensemble approach for feature selection, which aggregates the several individual feature lists obtained by the different feature selection methods so that a more robust and efficient feature subset can be obtained. In order to aggregate the individual feature lists, a genetic algorithm has been utilized. Experimental evaluations indicated that the proposed aggregation model is an efficient method and it outperforms individual filter-based feature selection methods on sentiment classification.

Journal ArticleDOI
TL;DR: A group of hypothesis tests are performed to show that combining the ANNs with the PCA gives slightly higher classification accuracy than the other two combinations, and that the trading strategies guided by the comprehensive classification mining procedures based on PCA and ANNs gain significantly higher risk-adjusted profits than the comparison benchmarks.
Abstract: A data mining procedure to forecast daily stock market return is proposed. The raw data includes 60 financial and economic features over a 10-year period. Combining ANNs with PCA gives slightly higher classification accuracy. Combining ANNs with PCA provides significantly higher risk-adjusted profits. In financial markets, it is both important and challenging to forecast the daily direction of the stock market return. Among the few studies that focus on predicting daily stock market returns, the data mining procedures utilized are either incomplete or inefficient, especially when a large amount of features are involved. This paper presents a complete and efficient data mining process to forecast the daily direction of the S&P 500 Index ETF (SPY) return based on 60 financial and economic features. Three mature dimensionality reduction techniques, including principal component analysis (PCA), fuzzy robust principal component analysis (FRPCA), and kernel-based principal component analysis (KPCA) are applied to the whole data set to simplify and rearrange the original data structure. Corresponding to different levels of the dimensionality reduction, twelve new data sets are generated from the entire cleaned data using each of the three different dimensionality reduction methods. Artificial neural networks (ANNs) are then used with the thirty-six transformed data sets for classification to forecast the daily direction of future market returns. Moreover, the three different dimensionality reduction methods are compared with respect to the natural data set. A group of hypothesis tests are then performed over the classification and simulation results to show that combining the ANNs with the PCA gives slightly higher classification accuracy than the other two combinations, and that the trading strategies guided by the comprehensive classification mining procedures based on PCA and ANNs gain significantly higher risk-adjusted profits than the comparison benchmarks, while also being slightly higher than those strategies guided by the forecasts based on the FRPCA and KPCA models.
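A minimal sketch of the PCA-plus-ANN classification step described above, using scikit-learn; the placeholder data, the number of retained components, the network size, and the time-series cross-validation are assumptions rather than the paper's exact experimental setup.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2500, 60))        # placeholder: 60 daily financial/economic features
y = rng.integers(0, 2, size=2500)      # placeholder: next-day up/down direction

model = make_pipeline(StandardScaler(),
                      PCA(n_components=15),                      # dimensionality reduction
                      MLPClassifier(hidden_layer_sizes=(30,), max_iter=500))
scores = cross_val_score(model, X, y, cv=TimeSeriesSplit(n_splits=5))
print(scores.mean())
```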

Journal ArticleDOI
TL;DR: This paper not only compares the differences and commonalities of these methods based on regression and regularization strategies, but also provides useful guidelines to practitioners working in related fields on how to do feature selection.
Abstract: Feature selection (FS) is an important component of many pattern recognition tasks. In these tasks, one is often confronted with very high-dimensional data. FS algorithms are designed to identify the relevant feature subset from the original features, which can facilitate subsequent analysis, such as clustering and classification. Structured sparsity-inducing feature selection (SSFS) methods have been widely studied in the last few years, and a number of algorithms have been proposed. However, there is no comprehensive study concerning the connections between different SSFS methods, and how they have evolved. In this paper, we attempt to provide a survey on various SSFS methods, including their motivations and mathematical representations. We then explore the relationship among different formulations and propose a taxonomy to elucidate their evolution. We group the existing SSFS methods into two categories, i.e., vector-based feature selection (feature selection based on lasso) and matrix-based feature selection (feature selection based on the $l_{r,p}$-norm). Furthermore, FS has been combined with other machine learning algorithms for specific applications, such as multitask learning, multilabel learning, multiview learning, classification, and clustering. This paper not only compares the differences and commonalities of these methods based on regression and regularization strategies, but also provides useful guidelines to practitioners working in related fields on how to perform feature selection.
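As a concrete instance of the vector-based (lasso) branch of the taxonomy, the following scikit-learn sketch selects features via an L1-penalized linear model; the dataset, penalty strength, and thresholding behavior are illustrative assumptions, and the matrix-based $l_{r,p}$-norm methods surveyed above are not covered by this snippet.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel

X, y = make_regression(n_samples=200, n_features=100, n_informative=10,
                       noise=0.1, random_state=0)

# The L1 penalty drives most coefficients to zero; nonzero ones define the selected subset
selector = SelectFromModel(Lasso(alpha=0.1)).fit(X, y)
print(selector.get_support().sum(), "features kept")
X_selected = selector.transform(X)
```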

Journal ArticleDOI
TL;DR: This monograph builds on Tensor Networks for Dimensionality Reduction and Large-scale Optimization by discussing tensor network models for super-compressed higher-order representation of data/parameters and cost functions, together with an outline of their applications in machine learning and data analytics.
Abstract: Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.
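To make the tensor train (TT) format mentioned above concrete, here is a minimal NumPy sketch of the standard TT-SVD procedure: sequentially reshape the tensor, truncate an SVD, and peel off one core per mode. The rank cap and the random test tensor are assumptions, and the sketch omits the error-controlled truncation and sophisticated contractions discussed in the monograph.

```python
import numpy as np

def tt_svd(A, max_rank):
    """Decompose a dense tensor into tensor-train (TT) cores via truncated SVDs."""
    shape, d = A.shape, A.ndim
    cores, r_prev = [], 1
    C = A.reshape(1, -1)
    for k in range(d - 1):
        C = C.reshape(r_prev * shape[k], -1)
        U, S, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(S))
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))   # core for mode k
        C = S[:r, None] * Vt[:r]                               # remainder still to decompose
        r_prev = r
    cores.append(C.reshape(r_prev, shape[-1], 1))              # last core
    return cores

A = np.random.rand(4, 5, 6, 7)
print([c.shape for c in tt_svd(A, max_rank=3)])   # [(1,4,3), (3,5,3), (3,6,3), (3,7,1)]
```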

Journal ArticleDOI
TL;DR: In this paper, a modification of an autoencoder type deep neural network was applied to the task of dimension reduction of molecular dynamics data, which can reliably find low-dimensional embeddings for high-dimensional feature spaces which capture the slow dynamics of the underlying stochastic processes.
Abstract: Inspired by the success of deep learning techniques in the physical and chemical sciences, we apply a modification of an autoencoder type deep neural network to the task of dimension reduction of molecular dynamics data. We can show that our time-lagged autoencoder reliably finds low-dimensional embeddings for high-dimensional feature spaces which capture the slow dynamics of the underlying stochastic processes - beyond the capabilities of linear dimension reduction techniques.
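A minimal PyTorch sketch of the time-lagged autoencoder idea: the network reconstructs the frame at time t+tau from the frame at time t, so the bottleneck is pushed toward the slow coordinates. The random trajectory, layer sizes, lag, and training schedule are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn

tau, d, latent = 10, 30, 2
traj = torch.randn(10000, d)                # placeholder for an MD feature trajectory

enc = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, latent))
dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, d))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

x_t, x_lag = traj[:-tau], traj[tau:]        # pairs (x(t), x(t + tau))
for epoch in range(50):
    opt.zero_grad()
    loss = nn.functional.mse_loss(dec(enc(x_t)), x_lag)   # reconstruct the future frame
    loss.backward()
    opt.step()

z = enc(traj).detach()                      # low-dimensional embedding of the trajectory
```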

Book
01 Jan 2017
TL;DR: A textbook that walks through the machine learning workflow end to end, from training simple classifiers with scikit-learn, preprocessing training data, and compressing data via dimensionality reduction, to ensemble learning, sentiment analysis, regression, clustering, and deep neural networks with TensorFlow, including GANs and reinforcement learning.
Abstract: Table of Contents: Giving Computers the Ability to Learn from Data; Training Simple ML Algorithms for Classification; ML Classifiers Using scikit-learn; Building Good Training Datasets - Data Preprocessing; Compressing Data via Dimensionality Reduction; Best Practices for Model Evaluation and Hyperparameter Tuning; Combining Different Models for Ensemble Learning; Applying ML to Sentiment Analysis; Embedding a ML Model into a Web Application; Predicting Continuous Target Variables with Regression Analysis; Working with Unlabeled Data - Clustering Analysis; Implementing Multilayer Artificial Neural Networks; Parallelizing Neural Network Training with TensorFlow; TensorFlow Mechanics; Classifying Images with Deep Convolutional Neural Networks; Modeling Sequential Data Using Recurrent Neural Networks; GANs for Synthesizing New Data; RL for Decision Making in Complex Environments

Journal ArticleDOI
TL;DR: A computational model named Laplacian Regularized Sparse Subspace Learning for MiRNA-Disease Association prediction (LRSSLMDA), which projected miRNAs/diseases’ statistical feature profile and graph theoretical feature profile to a common subspace and would be a valuable computational tool for miRNA-disease association prediction.
Abstract: Predicting novel microRNA (miRNA)-disease associations is clinically significant due to miRNAs' potential roles of diagnostic biomarkers and therapeutic targets for various human diseases. Previous studies have demonstrated the viability of utilizing different types of biological data to computationally infer new disease-related miRNAs. Yet researchers face the challenge of how to effectively integrate diverse datasets and make reliable predictions. In this study, we presented a computational model named Laplacian Regularized Sparse Subspace Learning for MiRNA-Disease Association prediction (LRSSLMDA), which projected miRNAs/diseases' statistical feature profile and graph theoretical feature profile to a common subspace. It used Laplacian regularization to preserve the local structures of the training data and a L1-norm constraint to select important miRNA/disease features for prediction. The strength of dimensionality reduction enabled the model to be easily extended to much higher dimensional datasets than those exploited in this study. Experimental results showed that LRSSLMDA outperformed ten previous models: the AUC of 0.9178 in global leave-one-out cross validation (LOOCV) and the AUC of 0.8418 in local LOOCV indicated the model's superior prediction accuracy; and the average AUC of 0.9181+/-0.0004 in 5-fold cross validation justified its accuracy and stability. In addition, three types of case studies further demonstrated its predictive power. Potential miRNAs related to Colon Neoplasms, Lymphoma, Kidney Neoplasms, Esophageal Neoplasms and Breast Neoplasms were predicted by LRSSLMDA. Respectively, 98%, 88%, 96%, 98% and 98% out of the top 50 predictions were validated by experimental evidences. Therefore, we conclude that LRSSLMDA would be a valuable computational tool for miRNA-disease association prediction.

Journal ArticleDOI
TL;DR: Hierarchical Stochastic Neighbor Embedding (HSNE) is introduced, a method for analysis of mass cytometry data that can handle very large datasets and allows their intuitive and hierarchical exploration.
Abstract: Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high-dimensionality, large size, and non-linear structure of the data poses considerable challenges for the data analysis. In particular, dimensionality reduction-based techniques like t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a stepwise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.

Journal ArticleDOI
TL;DR: A novel approach called joint sparse principal component analysis (JSPCA) is proposed to jointly select useful features and enhance robustness to outliers and the experimental results demonstrate that the proposed approach is feasible and effective.

Journal ArticleDOI
Jiaming Xu, Bo Xu, Peng Wang, Suncong Zheng, Guanhua Tian, Jun Zhao
TL;DR: A flexible Self-Taught Convolutional neural network framework for Short Text Clustering, which can flexibly and successfully incorporate more useful semantic features and learn non-biased deep text representation in an unsupervised manner is proposed.

Journal ArticleDOI
TL;DR: This work uses a deep neural network of the variational autoencoder type to construct a parametric low-dimensional base model parameterization of complex binary geological media and finds that the dimensionality reduction (DR) approach outperforms principal component analysis (PCA), optimization-PCA, and discrete cosine transform (DCT) DR techniques for unconditional geostatistical simulation of a channelized prior model.

Journal ArticleDOI
TL;DR: Three meta-heuristic algorithms are adapted to solve the feature selection problem and a new dynamic dimension reduction (DDR) method is provided to reduce the number of features used in clustering and thus improve the performance of the algorithms.
Abstract: Three meta-heuristic algorithms are adapted to solve the feature selection problem. Feature selection methods are established based on a novel weighting scheme. Dimension reduction technique is proposed to reduce the number of features. K-mean clustering algorithm is used based on the features obtained. The proposed methods outperform the comparative methods. This paper proposes three feature selection algorithms with feature weight scheme and dynamic dimension reduction for the text document clustering problem. Text document clustering is a new trend in text mining; in this process, text documents are separated into several coherent clusters according to carefully selected informative features by using proper evaluation function, which usually depends on term frequency. Informative features in each document are selected using feature selection methods. Genetic algorithm (GA), harmony search (HS) algorithm, and particle swarm optimization (PSO) algorithm are the most successful feature selection methods established using a novel weighting scheme, namely, length feature weight (LFW), which depends on term frequency and appearance of features in other documents. A new dynamic dimension reduction (DDR) method is also provided to reduce the number of features used in clustering and thus improve the performance of the algorithms. Finally, k-mean, which is a popular clustering method, is used to cluster the set of text documents based on the terms (or features) obtained by dynamic reduction. Seven text mining benchmark text datasets of different sizes and complexities are evaluated. Analysis with k-mean shows that particle swarm optimization with length feature weight and dynamic reduction produces the optimal outcomes for almost all datasets tested. This paper provides new alternatives for text mining community to cluster text documents by using cohesive and informative features.
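The pipeline above (select informative terms, then cluster with k-means) can be sketched in a few lines of scikit-learn. The toy corpus is invented, and simple frequency-based term pruning in TF-IDF stands in for the paper's length feature weight scheme and metaheuristic selection, which are not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = ["the cat sat on the mat", "dogs chase cats around the yard",
        "stocks rallied today", "markets fell on rate fears",
        "kittens and puppies play"]                    # toy corpus

# Crude stand-in for informative-feature selection: keep only the top terms
vec = TfidfVectorizer(stop_words="english", max_features=20)
X = vec.fit_transform(docs)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)
```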

Journal ArticleDOI
Shinhyun Choi, Jong Hoon Shin, Jihang Lee, Patrick Sheridan, Wei Lu
TL;DR: It is experimentally demonstrated that memristor arrays can be used to perform principal component analysis, one of the most commonly used feature extraction techniques, through online, unsupervised learning.
Abstract: Memristors have been considered as a leading candidate for a number of critical applications ranging from nonvolatile memory to non-Von Neumann computing systems. Feature extraction, which aims to transform input data from a high-dimensional space to a space with fewer dimensions, is an important technique widely used in machine learning and pattern recognition applications. Here, we experimentally demonstrate that memristor arrays can be used to perform principal component analysis, one of the most commonly used feature extraction techniques, through online, unsupervised learning. Using Sanger’s rule, that is, the generalized Hebbian algorithm, the principal components were obtained as the memristor conductances in the network after training. The network was then used to analyze sensory data from a standard breast cancer screening database with high classification success rate (97.1%).
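Sanger's rule (the generalized Hebbian algorithm) mentioned above is easy to state in software terms, independently of the memristor hardware: each online update nudges the weight matrix toward the leading principal components. The toy data, learning rate, and number of epochs below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy zero-mean data with two dominant directions (variances roughly 25 and 9)
X = rng.normal(size=(5000, 8)) * np.array([5, 3, 1, 1, 1, 1, 1, 1])
X -= X.mean(axis=0)

k, lr = 2, 1e-3
W = rng.normal(scale=0.1, size=(k, X.shape[1]))     # plays the role of the conductances

for epoch in range(20):
    for x in X:
        y = W @ x
        # Sanger's rule: Hebbian term minus lower-triangular decorrelation term
        W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

# Rows of W approximate the top-k principal components (up to sign)
print(W / np.linalg.norm(W, axis=1, keepdims=True))
```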

Journal ArticleDOI
TL;DR: The proposed approach is extensively evaluated on three challenging benchmark scene datasets and the experimental results show that the proposed approach leads to superior classification performance compared with the state-of-the-art classification methods.
Abstract: In this paper, a fused global saliency-based multiscale multiresolution multistructure local binary pattern (salM3LBP) feature and local codebookless model (CLM) feature is proposed for high-resolution image scene classification. First, two different but complementary types of descriptors (pixel intensities and differences) are developed to extract global features, characterizing the dominant spatial features in multiple scale, multiple resolution, and multiple structure manner. The micro/macrostructure information and rotation invariance are guaranteed in the global feature extraction process. For dense local feature extraction, CLM is utilized to model local enrichment scale invariant feature transform descriptor and dimension reduction is conducted via joint low-rank learning with support vector machine. Finally, a fused feature representation between salM3LBP and CLM as the scene descriptor to train a kernel-based extreme learning machine for scene classification is presented. The proposed approach is extensively evaluated on three challenging benchmark scene datasets (the 21-class land-use scene, 19-class satellite scene, and a newly available 30-class aerial scene), and the experimental results show that the proposed approach leads to superior classification performance compared with the state-of-the-art classification methods.

Posted Content
09 Apr 2017
TL;DR: A strategy for incorporating dimensionality reduction via Principal Component Analysis to enhance the resilience of machine learning, targeting both the classification and the training phase is presented and investigated.
Abstract: We propose the use of dimensionality reduction as a defense against evasion attacks on ML classifiers. We present and investigate a strategy for incorporating dimensionality reduction via Principal Component Analysis to enhance the resilience of machine learning, targeting both the classification and the training phase. We empirically evaluate and demonstrate the feasibility of dimensionality reduction of data as a defense mechanism against evasion attacks using multiple real-world datasets. Our key findings are that the defenses are (i) effective against strategic evasion attacks in the literature, increasing the resources required by an adversary for a successful attack by a factor of about two, (ii) applicable across a range of ML classifiers, including Support Vector Machines and Deep Neural Networks, and (iii) generalizable to multiple application domains, including image classification, malware classification, and human activity classification.
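A minimal scikit-learn sketch of the defense's core mechanic, training and classifying in a reduced PCA subspace, follows; the dataset, classifier, and number of components are illustrative assumptions, and actually measuring robustness would require generating evasion attacks, which is outside this snippet.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Both training and classification happen in the reduced principal-component space
defended = make_pipeline(PCA(n_components=20), SVC()).fit(X_tr, y_tr)
print(defended.score(X_te, y_te))
```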

Journal ArticleDOI
TL;DR: The ability for t-SNE to reveal population stratification at different scales could be useful for human genetic association studies.
Abstract: The t-distributed stochastic neighbor embedding (t-SNE) is a new dimension reduction and visualization technique for high-dimensional data. t-SNE is rarely applied to human genetic data, even though it is commonly used in other data-intensive biological fields, such as single-cell genomics. We explore the applicability of t-SNE to human genetic data and make these observations: (i) similar to previously used dimension reduction techniques such as principal component analysis (PCA), t-SNE is able to separate samples from different continents; (ii) unlike PCA, t-SNE is more robust with respect to the presence of outliers; (iii) t-SNE is able to display both continental and sub-continental patterns in a single plot. We conclude that the ability for t-SNE to reveal population stratification at different scales could be useful for human genetic association studies.

Journal ArticleDOI
TL;DR: A new unsupervised feature selection method is proposed by integrating a subspace learning method (i.e., Locality Preserving Projection) into a new feature selection framework and adding a graph regularization term into the resulting model, so as to simultaneously conduct feature selection and subspace learning.

Journal ArticleDOI
TL;DR: A novel dimensionality reduction algorithm, locality adaptive discriminant analysis (LADA) for HSI classification that aims to learn a representative subspace of data, and focuses on the data points with close relationship in spectral and spatial domains.
Abstract: Linear discriminant analysis (LDA) is a popular technique for supervised dimensionality reduction, but with less concern about a local data structure. This makes LDA inapplicable to many real-world situations, such as hyperspectral image (HSI) classification. In this letter, we propose a novel dimensionality reduction algorithm, locality adaptive discriminant analysis (LADA) for HSI classification. The proposed algorithm aims to learn a representative subspace of data, and focuses on the data points with close relationship in spectral and spatial domains. An intuitive motivation is that data points of the same class have similar spectral feature and the data points among spatial neighborhood are usually associated with the same class. Compared with traditional LDA and its variants, LADA is able to adaptively exploit the local manifold structure of data. Experiments carried out on several real hyperspectral data sets demonstrate the effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: A novel method based on genetic algorithm-back propagation neural network (GA-BPNN) for classifying ECG signals with feature extraction using wavelet packet decomposition (WPD) and could be efficiently applied in the automatic identification of cardiac arrhythmias.
Abstract: Feature extraction and classification of electrocardiogram (ECG) signals are necessary for the automatic diagnosis of cardiac diseases. In this study, a novel method based on genetic algorithm-back propagation neural network (GA-BPNN) for classifying ECG signals with feature extraction using wavelet packet decomposition (WPD) is proposed. WPD combined with the statistical method is utilized to extract the effective features of ECG signals. The statistical features of the wavelet packet coefficients are calculated as the feature sets. GA is employed to decrease the dimensions of the feature sets and to optimize the weights and biases of the back propagation neural network (BPNN). Thereafter, the optimized BPNN classifier is applied to classify six types of ECG signals. In addition, an experimental platform is constructed for ECG signal acquisition to supply the ECG data for verifying the effectiveness of the proposed method. The GA-BPNN method with the MIT-BIH arrhythmia database achieved a dimension reduction of nearly 50% and produced good classification results with an accuracy of 97.78%. The experimental results based on the established acquisition platform indicated that the GA-BPNN method achieved a high classification accuracy of 99.33% and could be efficiently applied in the automatic identification of cardiac arrhythmias.