Showing papers on "Support vector machine" published in 2015


BookDOI
07 May 2015
TL;DR: Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data and extract useful and reproducible patterns from big datasets.
Abstract: Discover New Methods for Dealing with High-Dimensional Data. A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data. Top experts in this rapidly evolving field, the authors describe the lasso for linear regression and a simple coordinate descent algorithm for its computation. They discuss the application of ℓ1 penalties to generalized linear models and support vector machines, cover generalized penalties such as the elastic net and group lasso, and review numerical methods for optimization. They also present statistical inference methods for fitted (lasso) models, including the bootstrap, Bayesian methods, and recently developed approaches. In addition, the book examines matrix decomposition, sparse multivariate analysis, graphical models, and compressed sensing. It concludes with a survey of theoretical results for the lasso. In this age of big data, the number of features measured on a person or object can be large and might be larger than the number of observations. This book shows how the sparsity assumption allows us to tackle these problems and extract useful and reproducible patterns from big datasets. Data analysts, computer scientists, and theorists will appreciate this thorough and up-to-date treatment of sparse statistical modeling.
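The coordinate descent algorithm for the lasso mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration and not the book's own code: the function names, the objective scaling (1/(2n)), and the omission of an intercept and of feature standardization are all assumptions made to keep the example short.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows (each a list of p floats), y is a list of n floats.
    Assumption for this sketch: no intercept and no standardization.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta
```

With a penalty `lam` larger than a feature's correlation with the response, that coefficient is set exactly to zero, which is the sparsity property the book builds on.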

2,275 citations


Journal ArticleDOI
TL;DR: Experimental results based on several hyperspectral image data sets demonstrate that the proposed method can achieve better classification performance than some traditional methods, such as support vector machines and the conventional deep learning-based methods.
Abstract: Recently, convolutional neural networks have demonstrated excellent performance on various visual tasks, including the classification of common two-dimensional images. In this paper, deep convolutional neural networks are employed to classify hyperspectral images directly in spectral domain. More specifically, the architecture of the proposed classifier contains five layers with weights which are the input layer, the convolutional layer, the max pooling layer, the full connection layer, and the output layer. These five layers are implemented on each spectral signature to discriminate against others. Experimental results based on several hyperspectral image data sets demonstrate that the proposed method can achieve better classification performance than some traditional methods, such as support vector machines and the conventional deep learning-based methods.

1,316 citations


Journal ArticleDOI
TL;DR: A new feature extraction (FE) and image classification framework is proposed for hyperspectral data analysis based on deep belief network (DBN), together with a novel deep architecture that combines spectral–spatial FE and classification to get high classification accuracy.
Abstract: Hyperspectral data classification is a hot topic in the remote sensing community. In recent years, significant effort has been focused on this issue. However, most of the methods extract the features of original data in a shallow manner. In this paper, we introduce a deep learning approach into hyperspectral image classification. A new feature extraction (FE) and image classification framework is proposed for hyperspectral data analysis based on deep belief network (DBN). First, we verify the eligibility of restricted Boltzmann machine (RBM) and DBN by the following spectral information-based classification. Then, we propose a novel deep architecture, which combines the spectral–spatial FE and classification together to get high classification accuracy. The framework is a hybrid of principal component analysis (PCA), hierarchical learning-based FE, and logistic regression (LR). Experimental results with hyperspectral data indicate that the classifier provides a competitive solution with the state-of-the-art methods. In addition, this paper reveals that deep learning systems have huge potential for hyperspectral data classification.

1,028 citations


Journal ArticleDOI
TL;DR: DANN uses the same feature set and training data as CADD to train a deep neural network (DNN); DNNs can capture non-linear relationships among features and are better suited than SVMs for problems with a large number of samples and features.
Abstract: Summary: Annotating genetic variants, especially non-coding variants, for the purpose of identifying pathogenic variants remains a challenge. Combined annotation-dependent depletion (CADD) is an algorithm designed to annotate both coding and non-coding variants, and has been shown to outperform other annotation algorithms. CADD trains a linear kernel support vector machine (SVM) to differentiate evolutionarily derived, likely benign, alleles from simulated, likely deleterious, variants. However, SVMs cannot capture non-linear relationships among the features, which can limit performance. To address this issue, we have developed DANN. DANN uses the same feature set and training data as CADD to train a deep neural network (DNN). DNNs can capture non-linear relationships among features and are better suited than SVMs for problems with a large number of samples and features. We exploit Compute Unified Device Architecture-compatible graphics processing units and deep learning techniques such as dropout and momentum training to accelerate the DNN training. DANN achieves about a 19% relative reduction in the error rate and about a 14% relative increase in the area under the curve (AUC) metric over CADD’s SVM methodology. Availability and implementation: All data and source code are available at https://cbcl.ics.uci.edu/public_data/DANN/. Contact: xhx@ics.uci.edu

773 citations


Proceedings ArticleDOI
26 Jul 2015
TL;DR: This work proposes a deep learning based classification method that hierarchically constructs high-level features in an automated way and exploits a Convolutional Neural Network to encode pixels' spectral and spatial information and a Multi-Layer Perceptron to conduct the classification task.
Abstract: Spectral observations along the spectrum in many narrow spectral bands through hyperspectral imaging provide valuable information towards material and object recognition, which can be considered as a classification task. Most of the existing studies and research efforts are following the conventional pattern recognition paradigm, which is based on the construction of complex handcrafted features. However, it is rarely known which features are important for the problem at hand. In contrast to these approaches, we propose a deep learning based classification method that hierarchically constructs high-level features in an automated way. Our method exploits a Convolutional Neural Network to encode pixels' spectral and spatial information and a Multi-Layer Perceptron to conduct the classification task. Experimental results and quantitative validation on widely used datasets showcase the potential of the developed approach for accurate hyperspectral data classification.

738 citations


Journal ArticleDOI
TL;DR: The results of applying the above algorithms to epithermal Au prospectivity mapping of the Rodalquilar district, Spain, indicate that the RF outperformed the other MLA algorithms (ANNs, RTs and SVMs), showing higher stability and robustness with varying training parameters and better success rates and ROC analysis results.

670 citations


Posted Content
TL;DR: This paper proposes an online visual tracking algorithm that learns a discriminative saliency map using a Convolutional Neural Network (CNN), taking outputs from hidden layers of the network as feature descriptors since they show excellent representation performance in various general visual recognition problems.
Abstract: We propose an online visual tracking algorithm by learning a discriminative saliency map using a Convolutional Neural Network (CNN). Given a CNN pre-trained offline on a large-scale image repository, our algorithm takes outputs from hidden layers of the network as feature descriptors since they show excellent representation performance in various general visual recognition problems. The features are used to learn discriminative target appearance models using an online Support Vector Machine (SVM). In addition, we construct a target-specific saliency map by backpropagating CNN features with guidance of the SVM, and obtain the final tracking result in each frame based on the appearance model generatively constructed with the saliency map. Since the saliency map visualizes the spatial configuration of the target effectively, it improves target localization accuracy and enables us to achieve pixel-level target segmentation. We verify the effectiveness of our tracking algorithm through extensive experiments on a challenging benchmark, where our method illustrates outstanding performance compared to the state-of-the-art tracking algorithms.

665 citations


Journal ArticleDOI
TL;DR: An introductory tutorial on the usage of the Hyperopt library, including the description of search spaces, minimization (in serial and parallel), and the analysis of the results collected in the course of minimization.
Abstract: Sequential model-based optimization (also known as Bayesian optimization) is one of the most efficient methods (per function evaluation) of function minimization. This efficiency makes it appropriate for optimizing the hyperparameters of machine learning algorithms that are slow to train. The Hyperopt library provides algorithms and parallelization infrastructure for performing hyperparameter optimization (model selection) in Python. This paper presents an introductory tutorial on the usage of the Hyperopt library, including the description of search spaces, minimization (in serial and parallel), and the analysis of the results collected in the course of minimization. This paper also gives an overview of Hyperopt-Sklearn, a software project that provides automatic algorithm configuration of the Scikit-learn machine learning library. Following Auto-Weka, we take the view that the choice of classifier and even the choice of preprocessing module can be taken together to represent a single large hyperparameter optimization problem. We use Hyperopt to define a search space that encompasses many standard components (e.g. SVM, RF, KNN, PCA, TFIDF) and common patterns of composing them together. We demonstrate, using search algorithms in Hyperopt and standard benchmarking data sets (MNIST, 20-newsgroups, convex shapes), that searching this space is practical and effective. In particular, we improve on best-known scores for the model space for both MNIST and convex shapes. The paper closes with some discussion of ongoing and future work.

657 citations


01 Jan 2015
TL;DR: These experiments indicate that the “one-against-one” and DAG methods are more suitable for practical use than the other methods, and show that for large problems methods by considering all data at once in general need fewer support vectors.
Abstract: Support vector machines (SVM) were originally designed for binary classification. How to effectively extend them for multi-class classification is still an on-going research issue. Several methods have been proposed where typically we construct a multi-class classifier by combining several binary classifiers. Some authors also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using large-scale problems have not been seriously conducted. Especially for methods solving multi-class SVM in one step, a much larger optimization problem is required so up to now experiments are limited to small data sets. In this paper we give decomposition implementations for two such “all-together” methods: [25], [27] and [7]. We then compare their performance with three methods based on binary classifications: “one-against-all,” “one-against-one,” and DAGSVM [23]. Our experiments indicate that the “one-against-one” and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems methods by considering all data at once in general need fewer support vectors.
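The "one-against-one" scheme named in the abstract above trains one binary classifier per pair of classes and predicts by majority vote. The sketch below illustrates only the voting step. `binary_predict` and `nearest_center_rule` are hypothetical stand-ins for trained pairwise SVMs, and the tie-breaking rule is an assumption of this example rather than something the paper specifies.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(x, classes, binary_predict):
    """Predict a label by majority vote over all k*(k-1)/2 pairwise classifiers.

    binary_predict(a, b, x) is a hypothetical callable that returns a or b.
    Ties are broken in favor of the class listed first in `classes`.
    """
    votes = Counter({c: 0 for c in classes})
    for a, b in combinations(classes, 2):
        votes[binary_predict(a, b, x)] += 1
    return max(classes, key=lambda c: votes[c])


def nearest_center_rule(centers):
    """Toy stand-in for a trained binary SVM: pick the closer of two 1-D class centers."""
    def rule(a, b, x):
        return a if abs(x - centers[a]) <= abs(x - centers[b]) else b
    return rule
```

For k classes this evaluates k(k-1)/2 binary decisions per prediction, whereas "one-against-all" evaluates k; each pairwise problem, however, is trained on only two classes' worth of data.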

588 citations


Journal ArticleDOI
TL;DR: The proposed framework employs local binary patterns to extract local image features, such as edges, corners, and spots, and employs the efficient extreme learning machine with a very simple structure as the classifier.
Abstract: It is of great interest to exploit texture information for classification of hyperspectral imagery (HSI) at high spatial resolution. In this paper, a classification paradigm to exploit rich texture information of HSI is proposed. The proposed framework employs local binary patterns (LBPs) to extract local image features, such as edges, corners, and spots. Two levels of fusion (i.e., feature-level fusion and decision-level fusion) are applied to the extracted LBP features along with global Gabor features and original spectral features. Feature-level fusion concatenates multiple features before the pattern classification process, while decision-level fusion operates on the probability outputs of each individual classification pipeline, with a soft-decision fusion rule adopted to merge results from the classifier ensemble. Moreover, the efficient extreme learning machine with a very simple structure is employed as the classifier. Experimental results on several HSI data sets demonstrate that the proposed framework is superior to some traditional alternatives.
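To make the local binary pattern (LBP) idea in the abstract above concrete, here is a minimal sketch of the basic 8-neighbor LBP on a grayscale image stored as a list of rows. The neighbor ordering, the `>=` threshold convention, and the function names are assumptions of this example; the paper may use a different LBP variant (for instance a uniform or rotation-invariant one).

```python
def lbp_code(patch):
    """Basic 8-neighbor local binary pattern of the center pixel of a 3x3 patch.

    Neighbors are read clockwise starting at the top-left corner. A neighbor
    that is >= the center contributes a 1 bit; the first neighbor read is the
    most significant bit of the resulting 8-bit code.
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for r, c in offsets:
        code = (code << 1) | (1 if patch[r][c] >= center else 0)
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a 2-D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The histogram, computed per band or per local window, is the kind of texture descriptor that could then be concatenated with other features in a feature-level fusion step.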

574 citations


Journal ArticleDOI
11 Dec 2015-Sensors
TL;DR: A review of different classification techniques used to recognize human activities from wearable inertial sensor data shows that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms.
Abstract: This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.
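The evaluation measures listed in the abstract above (correct classification rate, F-measure, recall, precision, and specificity) all derive from confusion-matrix counts. The sketch below computes them for the binary case only; treating one activity as "positive" and the rest as "negative" is an assumption made for illustration, since the study itself is multi-class, and returning 0.0 for an undefined ratio is likewise a choice of this example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, specificity and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Convention for this sketch: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }
```

Applying the function once per activity class and averaging the results gives a macro-averaged summary for a multi-class problem.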

Journal ArticleDOI
TL;DR: The experimental results show that the use of the HHT, the SVM, and the SVR is a suitable strategy to improve the detection, diagnostic, and prognostic of bearing degradation.
Abstract: The detection, diagnostic, and prognostic of bearing degradation play a key role in increasing the reliability and safety of electrical machines, especially in key industrial sectors. This paper presents a new approach that combines the Hilbert-Huang transform (HHT), the support vector machine (SVM), and the support vector regression (SVR) for the monitoring of ball bearings. The proposed approach uses the HHT to extract new health indicators from stationary/nonstationary vibration signals able to track the degradation of the critical components of bearings. The degradation states are detected by a supervised classification technique called SVM, and the fault diagnostic is given by analyzing the extracted health indicators. The estimation of the remaining useful life is obtained by a one-step time-series prediction based on SVR. A set of experimental data collected from degraded bearings is used to validate the proposed approach. The experimental results show that the use of the HHT, the SVM, and the SVR is a suitable strategy to improve the detection, diagnostic, and prognostic of bearing degradation.

Journal ArticleDOI
TL;DR: In this paper, a hybrid model for fault detection and classification of motor bearing is presented, where the permutation entropy (PE) of the vibration signal is calculated to detect the malfunctions of the bearing.

Journal ArticleDOI
TL;DR: This study demonstrates how applying signal classification to Gaussian random signals can yield decoding accuracies of up to 70% or higher in two-class decoding with small sample sets, taking sample size into account.

Book ChapterDOI
01 Jan 2015
TL;DR: The SVM concepts presented in Chapter 3 can be generalized to regression problems; the resulting support vector regression (SVR) is characterized by the use of kernels, sparse solutions, and VC control of the margin and the number of support vectors.
Abstract: Rooted in statistical learning or Vapnik-Chervonenkis (VC) theory, support vector machines (SVMs) are well positioned to generalize on yet-to-be-seen data. The SVM concepts presented in Chapter 3 can be generalized to become applicable to regression problems. As in classification, support vector regression (SVR) is characterized by the use of kernels, sparse solution, and VC control of the margin and the number of support vectors. Although less popular than SVM, SVR has been proven to be an effective tool in real-value function estimation. As a supervised-learning approach, SVR trains using a symmetrical loss function, which equally penalizes high and low misestimates. Using Vapnik’s ε-insensitive approach, a flexible tube of minimal radius is formed symmetrically around the estimated function, such that the absolute values of errors less than a certain threshold are ignored both above and below the estimate. In this manner, points outside the tube are penalized, but those within the tube, either above or below the function, receive no penalty. One of the main advantages of SVR is that its computational complexity does not depend on the dimensionality of the input space. Additionally, it has excellent generalization capability, with high prediction accuracy.
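The insensitive-tube loss described in the abstract above has a one-line form, max(0, |y - f(x)| - ε). The sketch below is a direct transcription; the function names and the default ε of 0.1 are assumptions chosen for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss that ignores errors inside the tube of radius epsilon.

    Errors with absolute value <= epsilon cost nothing; larger errors are
    penalized linearly by how far they fall outside the tube, symmetrically
    above and below the estimate.
    """
    return max(0.0, abs(y_true - y_pred) - epsilon)


def mean_epsilon_insensitive_loss(ys, preds, epsilon=0.1):
    """Average tube loss over paired targets and predictions."""
    return sum(epsilon_insensitive_loss(y, p, epsilon) for y, p in zip(ys, preds)) / len(ys)
```

Only points with nonzero loss (those on or outside the tube) end up as support vectors, which is where the sparsity of the SVR solution comes from.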

Journal ArticleDOI
TL;DR: ELM theories manage to address the open problem which has puzzled the neural networks, machine learning and neuroscience communities for 60 years: whether hidden nodes/neurons need to be tuned in learning, and proved that, in contrast to the common knowledge and conventional neural network learning tenets, hidden nodes/neurons do not need to be iteratively tuned in wide types of neural networks and learning models.
Abstract: The emergent machine learning technique—extreme learning machines (ELMs)—has become a hot area of research over the past years, which is attributed to the growing research activities and significant contributions made by numerous researchers around the world. Recently, it has come to our attention that a number of misplaced notions and misunderstandings are being dissipated on the relationships between ELM and some earlier works. This paper wishes to clarify that (1) ELM theories manage to address the open problem which has puzzled the neural networks, machine learning and neuroscience communities for 60 years: whether hidden nodes/neurons need to be tuned in learning, and proved that in contrast to the common knowledge and conventional neural network learning tenets, hidden nodes/neurons do not need to be iteratively tuned in wide types of neural networks and learning models (Fourier series, biological learning, etc.). Unlike ELM theories, none of those earlier works provides theoretical foundations on feedforward neural networks with random hidden nodes; (2) ELM is proposed for both generalized single-hidden-layer feedforward network and multi-hidden-layer feedforward networks (including biological neural networks); (3) homogeneous architecture-based ELM is proposed for feature learning, clustering, regression and (binary/multi-class) classification. (4) Compared to ELM, SVM and LS-SVM tend to provide suboptimal solutions, and SVM and LS-SVM do not consider feature representations in hidden layers of multi-hidden-layer feedforward networks either.

Journal ArticleDOI
TL;DR: A novel feature representation approach, the cluster center and nearest neighbor (CANN) approach, is proposed; experiments show that the CANN classifier performs better than or similar to k-NN and support vector machines trained and tested on the original feature representation in terms of classification accuracy, detection rates, and false alarms.
Abstract: The aim of an intrusion detection system (IDS) is to detect various types of malicious network traffic and computer usage, which cannot be detected by a conventional firewall. Many IDS have been developed based on machine learning techniques. Specifically, advanced detection approaches created by combining or integrating multiple learning techniques have shown better detection performance than general single learning techniques. The feature representation method is an important pattern classifier that facilitates correct classifications; however, there have been very few related studies focusing on how to extract more representative features for normal connections and effective detection of attacks. This paper proposes a novel feature representation approach, namely the cluster center and nearest neighbor (CANN) approach. In this approach, two distances are measured and summed: the first is the distance between each data sample and its cluster center, and the second is the distance between the data sample and its nearest neighbor in the same cluster. Then, this new one-dimensional distance-based feature is used to represent each data sample for intrusion detection by a k-Nearest Neighbor (k-NN) classifier. The experimental results based on the KDD-Cup 99 dataset show that the CANN classifier performs better than or similar to k-NN and support vector machines trained and tested by the original feature representation in terms of classification accuracy, detection rates, and false alarms. It also provides high computational efficiency for the time of classifier training and testing (i.e., detection).
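The two-distance feature described in the abstract above can be sketched as follows. This follows the abstract's wording only (distance to the sample's own cluster center plus distance to its nearest neighbor in the same cluster). The cluster labels and centers are assumed to come from a clustering step not shown here, and the function name and the 0.0 fallback for single-member clusters are assumptions of this example.

```python
import math


def cann_feature(points, labels, centers):
    """One-dimensional CANN-style feature for every point.

    points  : list of coordinate tuples
    labels  : cluster index of each point (assumed given by a prior clustering step)
    centers : list of cluster-center coordinate tuples, indexed by cluster
    """
    features = []
    for i, (x, k) in enumerate(zip(points, labels)):
        d_center = math.dist(x, centers[k])
        same_cluster = [
            math.dist(x, points[j])
            for j in range(len(points))
            if j != i and labels[j] == k
        ]
        # Assumption: a point alone in its cluster gets neighbor distance 0.0.
        d_neighbor = min(same_cluster) if same_cluster else 0.0
        features.append(d_center + d_neighbor)
    return features
```

The resulting scalar per sample would then replace the original feature vector as the input to a k-NN classifier.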

Journal ArticleDOI
TL;DR: This paper proposes a special procedure called initial adjustments, which adjusts the weights of ν-SVC based on the Karush-Kuhn-Tucker conditions to prepare an initial solution for the incremental learning of the INSVR learning algorithm.

Proceedings Article
25 Jan 2015
TL;DR: It is shown that optimal training-set attack can be formulated as a bilevel optimization problem and solved efficiently using gradient methods on an implicit function for machine learners with certain Karush-Kuhn-Tucker conditions.
Abstract: We investigate a problem at the intersection of machine learning and security: training-set attacks on machine learners. In such attacks an attacker contaminates the training data so that a specific learning algorithm would produce a model profitable to the attacker. Understanding training-set attacks is important as more intelligent agents (e.g. spam filters and robots) are equipped with learning capability and can potentially be hacked via data they receive from the environment. This paper identifies the optimal training-set attack on a broad family of machine learners. First we show that optimal training-set attack can be formulated as a bilevel optimization problem. Then we show that for machine learners with certain Karush-Kuhn-Tucker conditions we can solve the bilevel problem efficiently using gradient methods on an implicit function. As examples, we demonstrate optimal training-set attacks on Support Vector Machines, logistic regression, and linear regression with extensive experiments. Finally, we discuss potential defenses against such attacks.

Journal ArticleDOI
TL;DR: A new feature selection approach that is based on the integration of a genetic algorithm and particle swarm optimization is proposed and is able to automatically select the most informative features in terms of classification accuracy within an acceptable CPU processing time.
Abstract: A new feature selection approach that is based on the integration of a genetic algorithm and particle swarm optimization is proposed. The overall accuracy of a support vector machine classifier on validation samples is used as a fitness value. The new approach is carried out on the well-known Indian Pines hyperspectral data set. Results confirm that the new approach is able to automatically select the most informative features in terms of classification accuracy within an acceptable CPU processing time without requiring the number of desired features to be set a priori by users. Furthermore, the usefulness of the proposed method is also tested for road detection. Results confirm that the proposed method is capable of discriminating between road and background pixels and performs better than the other approaches used for comparison in terms of performance metrics.

Proceedings ArticleDOI
06 Jul 2015
TL;DR: This work demonstrates the effectiveness of word2vec by showing that tf-idf and word2vec combined can outperform tf-idf alone, because word2vec provides complementary features (e.g. semantics that tf-idf can't capture) to tf-idf.
Abstract: With the rapid expansion of new available information presented to us online on a daily basis, text classification becomes imperative in order to classify and maintain it. Word2vec offers a unique perspective to the text mining community. By converting words and phrases into a vector representation, word2vec takes an entirely new approach on text classification. Based on the assumption that word2vec brings extra semantic features that help in text classification, our work demonstrates the effectiveness of word2vec by showing that tf-idf and word2vec combined can outperform tf-idf because word2vec provides complementary features (e.g. semantics that tf-idf can't capture) to tf-idf. Our results show that the combination of word2vec weighted by tf-idf and tf-idf does not outperform tf-idf consistently. It is consistent enough to say the combination of the two can outperform either individually.

Journal ArticleDOI
TL;DR: The results indicate that Random Forest is the top algorithm followed by Support Vector Machines, Kernel Factory, AdaBoost, Neural Networks, K-Nearest Neighbors and Logistic Regression in the domain of stock price direction prediction.
Abstract: We predict long term stock price direction. We benchmark three ensemble methods against four single classifiers. We use five times twofold cross-validation and AUC as a performance measure. Random Forest is the top algorithm. This study is the first to make such an extensive benchmark in this domain. Stock price direction prediction is an important issue in the financial world. Even small improvements in predictive performance can be very profitable. The purpose of this paper is to benchmark ensemble methods (Random Forest, AdaBoost and Kernel Factory) against single classifier models (Neural Networks, Logistic Regression, Support Vector Machines and K-Nearest Neighbor). We gathered data from 5767 publicly listed European companies and used the area under the receiver operating characteristic curve (AUC) as a performance measure. Our predictions are one year ahead. The results indicate that Random Forest is the top algorithm followed by Support Vector Machines, Kernel Factory, AdaBoost, Neural Networks, K-Nearest Neighbors and Logistic Regression. This study contributes to literature in that it is, to the best of our knowledge, the first to make such an extensive benchmark. The results clearly suggest that novel studies in the domain of stock price direction prediction should include ensembles in their sets of algorithms. Our extensive literature review evidently indicates that this is currently not the case.

Journal ArticleDOI
TL;DR: An improved algorithm SVM-RFE + CBR is proposed by incorporating the correlation bias reduction (CBR) strategy into the feature elimination procedure, which outperforms the original SVM-RFE and other typical algorithms.
Abstract: Support vector machine recursive feature elimination (SVM-RFE) is a powerful feature selection algorithm. However, when the candidate feature set contains highly correlated features, the ranking criterion of SVM-RFE will be biased, which would hinder the application of SVM-RFE on gas sensor data. In this paper, the linear and nonlinear SVM-RFE algorithms are studied. After investigating the correlation bias, an improved algorithm SVM-RFE + CBR is proposed by incorporating the correlation bias reduction (CBR) strategy into the feature elimination procedure. Experiments are conducted on a synthetic dataset and two breath analysis datasets, one of which contains temperature modulated sensors. Large and comprehensive sets of transient features are extracted from the sensor responses. The classification accuracy with feature selection proves the efficacy of the proposed SVM-RFE + CBR. It outperforms the original SVM-RFE and other typical algorithms. An ensemble method is further studied to improve the stability of the proposed method. By statistically analyzing the features’ rankings, some knowledge is obtained, which can guide future design of e-noses and feature extraction algorithms.

Proceedings Article
06 Jul 2015
TL;DR: An online visual tracking algorithm that learns a discriminative saliency map using a Convolutional Neural Network, taking outputs from hidden layers of the network as features to improve target localization accuracy and achieve pixel-level target segmentation.
Abstract: We propose an online visual tracking algorithm by learning a discriminative saliency map using a Convolutional Neural Network (CNN). Given a CNN pre-trained offline on a large-scale image repository, our algorithm takes outputs from hidden layers of the network as feature descriptors since they show excellent representation performance in various general visual recognition problems. The features are used to learn discriminative target appearance models using an online Support Vector Machine (SVM). In addition, we construct a target-specific saliency map by backprojecting CNN features with guidance of the SVM, and obtain the final tracking result in each frame based on the appearance model generatively constructed with the saliency map. Since the saliency map reveals the spatial configuration of the target effectively, it improves target localization accuracy and enables us to achieve pixel-level target segmentation. We verify the effectiveness of our tracking algorithm through extensive experiments on a challenging benchmark, where our method illustrates outstanding performance compared to the state-of-the-art tracking algorithms.

Journal ArticleDOI
TL;DR: A novel method for real-time RUL estimation of Li-ion batteries is proposed that integrates the classification and regression attributes of a Support Vector (SV) based machine learning technique.

Journal ArticleDOI
TL;DR: This letter proposes to adaptively learn a suitable feature representation from unlabeled data by learning a feature mapping function based on stacked sparse autoencoder that embeds the learned spectral-spatial feature into a linear support vector machine for classification.
Abstract: In this letter, unlike traditional methods that use original spectral features or handcrafted spectral–spatial features, we propose to adaptively learn a suitable feature representation from unlabeled data. This is achieved by learning a feature mapping function based on a stacked sparse autoencoder. Considering that hyperspectral imagery (HSI) is intrinsically defined in both the spectral and spatial domains, we further establish two variants of feature learning procedures for sparse spectral feature learning and multiscale spatial feature learning. Finally, we embed the learned spectral–spatial feature into a linear support vector machine for classification. Experiments on two hyperspectral images indicate the following: 1) the learned spectral–spatial feature representation is more discriminative for HSI classification compared to previously hand-engineered spectral–spatial features, especially when the training data are limited and 2) the learned features appear not to be specific to a particular image but general in that they are applicable to multiple related images (e.g., images acquired by the same sensor but varying with location or time).

Journal ArticleDOI
TL;DR: Experimental results on three widely used real HSIs indicate that the proposed SC-MK approach outperforms several well-known classification methods.
Abstract: For the classification of hyperspectral images (HSIs), this paper presents a novel framework to effectively utilize the spectral–spatial information of superpixels via multiple kernels, which is termed as superpixel-based classification via multiple kernels (SC-MK). In the HSI, each superpixel can be regarded as a shape-adaptive region, which consists of a number of spatial neighboring pixels with very similar spectral characteristics. First, the proposed SC-MK method adopts an oversegmentation algorithm to cluster the HSI into many superpixels. Then, three kernels are separately employed for the utilization of the spectral information, as well as spatial information, within and among superpixels. Finally, the three kernels are combined together and incorporated into a support vector machine classifier. Experimental results on three widely used real HSIs indicate that the proposed SC-MK approach outperforms several well-known classification methods.
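The abstract says that three kernels are combined and passed to a support vector machine, but it does not say how they are combined. One common choice, assumed here purely for illustration, is a weighted sum of kernels, which is itself a valid kernel when the weights are non-negative. In the sketch below the RBF form, the weights, the `gammas`, and the names `rbf_kernel`, `combined_kernel` and `gram_matrix` are all hypothetical; each sample is taken to be a tuple of three feature vectors (spectral, spatial within a superpixel, spatial among superpixels).

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(sample_a, sample_b, weights, gammas):
    """Weighted sum of one RBF kernel per feature view (assumed combination rule)."""
    return sum(
        w * rbf_kernel(fa, fb, g)
        for w, fa, fb, g in zip(weights, sample_a, sample_b, gammas)
    )


def gram_matrix(samples, weights, gammas):
    """Pairwise kernel matrix, the form a precomputed-kernel SVM solver expects."""
    return [
        [combined_kernel(a, b, weights, gammas) for b in samples]
        for a in samples
    ]
```

A Gram matrix built this way could be handed to any SVM implementation that accepts a precomputed kernel.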

Proceedings ArticleDOI
26 May 2015
TL;DR: This paper addresses the cost of creating and annotating RGB-D datasets with transfer learning from deep convolutional neural networks that are pre-trained for image categorization and provide a rich, semantically meaningful feature set; depth information is incorporated by rendering objects from a canonical perspective and colorizing the depth channel according to distance from the object center.
Abstract: Object recognition and pose estimation from RGB-D images are important tasks for manipulation robots which can be learned from examples. Creating and annotating datasets for learning is expensive, however. We address this problem with transfer learning from deep convolutional neural networks (CNN) that are pre-trained for image categorization and provide a rich, semantically meaningful feature set. We incorporate depth information, which the CNN was not trained with, by rendering objects from a canonical perspective and colorizing the depth channel according to distance from the object center. We evaluate our approach on the Washington RGB-D Objects dataset, where we find that the generated feature set naturally separates classes and instances well and retains pose manifolds. We outperform state-of-the-art on a number of subtasks and show that our approach can yield superior results when only little training data is available.

Journal ArticleDOI
TL;DR: In this article, a support vector machine (SVM) was used to forecast M-and X-class solar flares using four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager.
Abstract: We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprising 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and estimate its performance using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities.
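For readers unfamiliar with the true skill statistic emphasised above, a short sketch of its standard definition may help: TSS is the hit rate minus the false alarm rate, computed from the four cells of a binary confusion matrix. The function name `true_skill_statistic` and the example counts in the usage note are made up for illustration and are not taken from the paper.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = TP / (TP + FN) - FP / (FP + TN); ranges from -1 to 1."""
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative example")
    hit_rate = tp / (tp + fn)           # fraction of flaring regions correctly forecast
    false_alarm_rate = fp / (fp + tn)   # fraction of quiet regions wrongly flagged
    return hit_rate - false_alarm_rate
```

A forecaster that never predicts a flare scores 0 regardless of how rare flares are, for instance `true_skill_statistic(0, 10, 0, 1000)`, which is why the metric suits heavily imbalanced problems such as flare forecasting.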

Journal ArticleDOI
Jun Zhang, Xiao Chen, Yang Xiang, Wanlei Zhou, Jie Wu
TL;DR: The proposed RTC scheme has the capability of identifying the traffic of zero-day applications as well as accurately discriminating predefined application classes and is significantly better than four state-of-the-art methods.
Abstract: As a fundamental tool for network management and security, traffic classification has attracted increasing attention in recent years. A significant challenge to the robustness of classification performance comes from zero-day applications previously unknown in traffic classification systems. In this paper, we propose a new scheme of Robust statistical Traffic Classification (RTC) by combining supervised and unsupervised machine learning techniques to meet this challenge. The proposed RTC scheme has the capability of identifying the traffic of zero-day applications as well as accurately discriminating predefined application classes. In addition, we develop a new method for automating the RTC scheme parameters optimization process. The empirical study on real-world traffic data confirms the effectiveness of the proposed scheme. When zero-day applications are present, the classification performance of the new scheme is significantly better than four state-of-the-art methods: random forest, correlation-based classification, semi-supervised clustering, and one-class SVM.