
Showing papers on "Support vector machine published in 2014"


Proceedings ArticleDOI
23 Jun 2014
TL;DR: In this paper, features extracted from the OverFeat network are used as a generic image representation to tackle a diverse range of recognition tasks (object image classification, scene recognition, fine-grained recognition, attribute detection and image retrieval) applied to a diverse set of datasets.
Abstract: Recent results indicate that the generic descriptors extracted from convolutional neural networks are very powerful. This paper adds to the mounting evidence that this is indeed the case. We report on a series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network, which was trained to perform object classification on ILSVRC13. We use features extracted from the OverFeat network as a generic image representation to tackle a diverse range of recognition tasks, namely object image classification, scene recognition, fine-grained recognition, attribute detection and image retrieval, applied to a diverse set of datasets. We selected these tasks and datasets as they gradually move further away from the original task and data the OverFeat network was trained to solve. Astonishingly, we report consistently superior results compared to the highly tuned state-of-the-art systems in all the visual classification tasks on various datasets. For instance retrieval, it consistently outperforms low-memory-footprint methods except on the sculptures dataset. The results are achieved using a linear SVM classifier (or L2 distance in the case of retrieval) applied to a feature representation of size 4096 extracted from a layer in the net. The representations are further modified using simple augmentation techniques, e.g., jittering. The results strongly suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual recognition tasks.
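
As a rough illustration of the evaluation protocol described above (not the authors' released code), the sketch below trains a linear SVM on L2-normalised 4096-dimensional features; the feature and label arrays are hypothetical placeholders standing in for CNN activations.

```python
# Sketch: linear SVM on precomputed 4096-D deep features (placeholder data).
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import normalize
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4096))   # placeholder for CNN activations
y = rng.integers(0, 10, size=1000)  # placeholder class labels

X = normalize(X)                    # L2-normalise each feature vector
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LinearSVC(C=1.0)
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```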

3,346 citations


Journal ArticleDOI
TL;DR: The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
Abstract: We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods), implemented in Weka, R (with and without the caret package), C and Matlab, including all the relevant classifiers available today. We use 121 data sets, which represent the whole UCI data base (excluding the large-scale problems) and other own real problems, in order to achieve significant conclusions about classifier behavior that do not depend on the data set collection. The classifiers most likely to be the best are the random forest (RF) versions, the best of which (implemented in R and accessed via caret) achieves 94.1% of the maximum accuracy, exceeding 90% on 84.3% of the data sets. However, the difference is not statistically significant with respect to the second best, the SVM with Gaussian kernel implemented in C using LibSVM, which achieves 92.3% of the maximum accuracy. A few models are clearly better than the remaining ones: random forest, SVM with Gaussian and polynomial kernels, extreme learning machine with Gaussian kernel, C5.0 and avNNet (a committee of multi-layer perceptrons implemented in R with the caret package). The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
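
A minimal sketch of the kind of comparison reported above, using scikit-learn stand-ins for two of the top families (random forest and Gaussian-kernel SVM) on a small built-in dataset; it is illustrative only and does not reproduce the 121-dataset study or its tuning protocol.

```python
# Sketch: cross-validated comparison of a random forest and an RBF-kernel SVM.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)

models = {
    "random forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "RBF SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale")),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```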

2,616 citations


Journal ArticleDOI
TL;DR: The concept of deep learning is introduced into hyperspectral data classification for the first time, and a new way of classifying with spatial-dominated information is proposed; the framework is a hybrid of principal component analysis (PCA), a deep learning architecture, and logistic regression.
Abstract: Classification is one of the most popular topics in hyperspectral remote sensing. In the last two decades, a huge number of methods were proposed to deal with the hyperspectral data classification problem. However, most of them do not hierarchically extract deep features. In this paper, the concept of deep learning is introduced into hyperspectral data classification for the first time. First, we verify the eligibility of stacked autoencoders by following classical spectral information-based classification. Second, a new way of classifying with spatial-dominated information is proposed. We then propose a novel deep learning framework to merge the two kinds of features, from which we can get the highest classification accuracy. The framework is a hybrid of principal component analysis (PCA), a deep learning architecture, and logistic regression. Specifically, as the deep learning architecture, stacked autoencoders are used to obtain useful high-level features. Experimental results with widely used hyperspectral data indicate that classifiers built in this deep learning-based framework provide competitive performance. In addition, the proposed joint spectral-spatial deep neural network opens a new window for future research, showcasing the huge potential of deep learning-based methods for accurate hyperspectral data classification.
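
A hedged sketch of the spectral-classification pipeline described above: since scikit-learn has no stacked autoencoder, an MLP classifier stands in for the deep feature learner, and the pixel and label arrays are synthetic placeholders for a real hyperspectral cube.

```python
# Sketch: PCA + neural network, a rough stand-in for the PCA /
# stacked-autoencoder / logistic-regression framework (placeholder data).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
y = rng.integers(0, 9, size=5000)                    # placeholder land-cover labels
X = rng.normal(size=(5000, 200)) + 0.2 * y[:, None]  # placeholder pixels x spectral bands

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
model = make_pipeline(
    PCA(n_components=10),                       # spectral dimensionality reduction
    MLPClassifier(hidden_layer_sizes=(60, 60),  # stand-in for stacked-autoencoder features
                  max_iter=300, random_state=0),
)
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```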

2,071 citations



Journal ArticleDOI
TL;DR: In this article, the authors introduce attribute-based classification, where objects are identified based on a high-level description that is phrased in terms of semantic attributes, such as the object's color or shape.
Abstract: We study the problem of object recognition for categories for which we have no training examples, a task also called zero-data or zero-shot learning. This situation has hardly been studied in computer vision research, even though it occurs frequently; the world contains tens of thousands of different object classes, and image collections have been formed and suitably annotated for only a few of them. To tackle the problem, we introduce attribute-based classification: Objects are identified based on a high-level description that is phrased in terms of semantic attributes, such as the object's color or shape. Because the identification of each such property transcends the specific learning task at hand, the attribute classifiers can be prelearned independently, for example, from existing image data sets unrelated to the current task. Afterward, new classes can be detected based on their attribute representation, without the need for a new training phase. In this paper, we also introduce a new data set, Animals with Attributes, of over 30,000 images of 50 animal classes, annotated with 85 semantic attributes. Extensive experiments on this and two more data sets show that attribute-based classification indeed is able to categorize images without access to any training images of the target classes.
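
A simplified sketch of attribute-based zero-shot classification in the spirit of the paper: per-attribute classifiers are trained on seen classes, and an unseen class is predicted by matching the predicted attribute scores to class attribute signatures. All arrays are synthetic placeholders, and this is not the authors' exact Direct Attribute Prediction model.

```python
# Sketch: attribute-based zero-shot classification (synthetic placeholder data).
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_attr, n_seen, n_unseen = 8, 5, 2
classes = np.arange(n_seen + n_unseen)

# Binary attribute signature for every class (rows: classes, cols: attributes).
signatures = rng.integers(0, 2, size=(len(classes), n_attr))

def make_images(class_ids, n_per_class=100):
    """Synthetic features loosely correlated with the class attribute signature."""
    X = np.vstack([signatures[c] + 0.5 * rng.normal(size=(n_per_class, n_attr))
                   for c in class_ids])
    y = np.repeat(class_ids, n_per_class)
    return X, y

seen, unseen = classes[:n_seen], classes[n_seen:]
X_tr, y_tr = make_images(seen)
X_te, y_te = make_images(unseen)

# One classifier per attribute, trained on seen-class images only; skip any
# attribute that is constant over the seen classes (nothing to learn there).
usable = [a for a in range(n_attr) if len(np.unique(signatures[seen, a])) > 1]
attr_clfs = {a: LinearSVC(C=1.0).fit(X_tr, signatures[y_tr, a]) for a in usable}

# Predict attribute scores, then match against the unseen-class signatures.
pred_attrs = np.column_stack([attr_clfs[a].decision_function(X_te) for a in usable])
scores = pred_attrs @ signatures[np.ix_(unseen, usable)].T
pred = unseen[scores.argmax(axis=1)]
print("zero-shot accuracy:", (pred == y_te).mean())
```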

1,559 citations


Book ChapterDOI
06 Sep 2014
TL;DR: A novel method to mine discriminative parts using Random Forests (RF) is proposed, which allows parts to be mined simultaneously for all classes while sharing knowledge among them, and which compares favorably with other state-of-the-art component-based classification methods.
Abstract: In this paper we address the problem of automatically recognizing pictured dishes. To this end, we introduce a novel method to mine discriminative parts using Random Forests (RF), which allows us to mine parts simultaneously for all classes and to share knowledge among them. To improve the efficiency of mining and classification, we only consider patches that are aligned with image superpixels, which we call components. To measure the performance of our RF component mining for food recognition, we introduce a novel and challenging dataset of 101 food categories, with 101,000 images. With an average accuracy of 50.76%, our model outperforms alternative classification methods except for CNN, beating SVM classification on Improved Fisher Vectors and existing discriminative part-mining algorithms by 11.88% and 8.13%, respectively. On the challenging MIT-Indoor dataset, our method compares favorably with other state-of-the-art component-based classification methods.

1,216 citations


Journal ArticleDOI
TL;DR: This work shows that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples, and an exponential speedup is obtained.
Abstract: Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.

1,078 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed edge-preserving filtering based classification method can improve the classification accuracy significantly in a very short time and can be easily applied in real applications.
Abstract: The integration of spatial context in the classification of hyperspectral images is known to be an effective way of improving classification accuracy. In this paper, a novel spectral-spatial classification framework based on edge-preserving filtering is proposed. The proposed framework consists of the following three steps. First, the hyperspectral image is classified using a pixelwise classifier, e.g., the support vector machine classifier. Then, the resulting classification map is represented as multiple probability maps, and edge-preserving filtering is conducted on each probability map, with the first principal component or the first three principal components of the hyperspectral image serving as the gray or color guidance image. Finally, according to the filtered probability maps, the class of each pixel is selected based on the maximum probability. Experimental results demonstrate that the proposed edge-preserving filtering based classification method can improve the classification accuracy significantly in a very short time. Thus, it can be easily applied in real applications.
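
A coarse sketch of the three-step pipeline described above, on a synthetic image: a pixelwise probabilistic SVM produces per-class probability maps, each map is smoothed (a Gaussian filter is used here as a simple stand-in for the paper's guided, edge-preserving filter), and each pixel takes the class with maximum filtered probability.

```python
# Sketch: pixelwise SVM + probability-map filtering + per-pixel argmax
# (synthetic image; Gaussian smoothing stands in for edge-preserving filtering).
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.svm import SVC

rng = np.random.default_rng(0)
h, w, bands, n_classes = 60, 60, 10, 3

# Synthetic "hyperspectral" image: three vertical strips with different means.
labels = np.repeat(np.arange(n_classes), w // n_classes)[None, :].repeat(h, axis=0)
image = labels[..., None] + rng.normal(scale=1.5, size=(h, w, bands))

# Step 1: pixelwise probabilistic SVM trained on a small random sample of pixels.
idx = rng.choice(h * w, size=500, replace=False)
X_all = image.reshape(-1, bands)
svm = SVC(kernel="rbf", probability=True, random_state=0)
svm.fit(X_all[idx], labels.reshape(-1)[idx])

# Step 2: per-class probability maps, filtered one by one.
prob_maps = svm.predict_proba(X_all).reshape(h, w, n_classes)
filtered = np.stack([gaussian_filter(prob_maps[..., c], sigma=2)
                     for c in range(n_classes)], axis=-1)

# Step 3: the final label is the class with maximum filtered probability.
final = filtered.argmax(axis=-1)
print("pixel accuracy:", (final == labels).mean())
```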

640 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose an ensemble weights-of-evidence (WoE) and support vector machine (SVM) model to assess, through bivariate statistical analysis, the impact of the classes of each conditioning factor on flooding.

608 citations


Journal ArticleDOI
TL;DR: From experimental results, it is found that the power spectrum feature is superior to the other two kinds of features; a linear dynamic system-based feature smoothing method can significantly improve emotion classification accuracy; and the trajectory of emotion changes can be visualized by reducing subject-independent features with manifold learning.

561 citations


Journal ArticleDOI
TL;DR: A novel transfer learning framework, referred to as Adaptation Regularization based Transfer Learning (ARTL), to model adaptive classifiers in a unified way based on the structural risk minimization principle and the regularization theory, and can significantly outperform state-of-the-art learning methods on several public text and image datasets.
Abstract: Domain transfer learning, which learns a target classifier using labeled data from a different distribution, has shown promising value in knowledge discovery yet remains a challenging problem. Most previous works designed adaptive classifiers by exploring two learning strategies independently: distribution adaptation and label propagation. In this paper, we propose a novel transfer learning framework, referred to as Adaptation Regularization based Transfer Learning (ARTL), to model them in a unified way based on the structural risk minimization principle and regularization theory. Specifically, ARTL learns the adaptive classifier by simultaneously optimizing the structural risk functional, the joint distribution matching between domains, and the manifold consistency underlying the marginal distributions. Based on this framework, we propose two novel methods using Regularized Least Squares (RLS) and Support Vector Machines (SVMs), respectively, and use the Representer theorem in reproducing kernel Hilbert space to derive the corresponding solutions. Comprehensive experiments verify that ARTL can significantly outperform state-of-the-art learning methods on several public text and image datasets.
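
Schematically, and with notation that is illustrative rather than copied from the paper, the unified objective described above couples a structural risk term over a kernel classifier with distribution-matching and manifold regularizers:

```latex
% Schematic ARTL-style objective (illustrative notation, not verbatim from the paper)
f^{*} = \arg\min_{f \in \mathcal{H}_K}
        \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr)        % structural risk on labeled source data
      + \sigma \,\|f\|_K^{2}                               % regularization in the RKHS
      + \lambda \, D_{f,K}(\mathcal{D}_s, \mathcal{D}_t)   % joint distribution matching between domains
      + \gamma \, M_{f,K}(\mathcal{D}_s, \mathcal{D}_t)    % manifold consistency of the marginal distributions
```

Plugging the hinge loss or the squared loss into the risk term yields the SVM-based and RLS-based variants mentioned in the abstract.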

Journal ArticleDOI
TL;DR: The general idea of open space risk limiting classification is extended to accommodate non-linear classifiers in a multiclass setting, and a new open set recognition model called compact abating probability (CAP) is introduced, in which the probability of class membership decreases in value as points move from known data toward open space.
Abstract: Real-world tasks in computer vision often touch upon open set recognition: multi-class recognition with incomplete knowledge of the world and many unknown inputs. Recent work on this problem has proposed a model incorporating an open space risk term to account for the space beyond the reasonable support of known classes. This paper extends the general idea of open space risk limiting classification to accommodate non-linear classifiers in a multiclass setting. We introduce a new open set recognition model called compact abating probability (CAP), where the probability of class membership decreases in value (abates) as points move from known data toward open space. We show that CAP models improve open set recognition for multiple algorithms. Leveraging the CAP formulation, we go on to describe the novel Weibull-calibrated SVM (W-SVM) algorithm, which combines the useful properties of statistical extreme value theory for score calibration with one-class and binary support vector machines. Our experiments show that the W-SVM is significantly better for open set object detection and OCR problems when compared to the state-of-the-art for the same tasks.

Journal ArticleDOI
TL;DR: This paper reviews the state-of-the-art and focuses on a wide range of applications of SVMs in the field of hydrology, providing a brief synopsis of SVM techniques and other emerging ones (hybrid models) that have proven useful in the analysis of various hydrological parameters.

Journal ArticleDOI
21 Jun 2014
TL;DR: The runtime of the framework is analyzed, and rates are obtained that improve state-of-the-art results for various key machine learning optimization problems, including SVM, logistic regression, ridge regression, Lasso, and multiclass SVM.
Abstract: We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art results for various key machine learning optimization problems including SVM, logistic regression, ridge regression, Lasso, and multiclass SVM. Experiments validate our theoretical findings.
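
The stochastic dual coordinate ascent idea underlying the framework above is easiest to see in its plain form for the hinge-loss SVM; the sketch below implements that basic, unaccelerated variant on synthetic data, not the proximal inner-outer scheme of the paper.

```python
# Sketch: plain stochastic dual coordinate ascent (SDCA) for a hinge-loss SVM,
# the unaccelerated special case of the framework described above.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 500, 20, 0.01

# Synthetic, roughly linearly separable data.
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

alpha = np.zeros(n)            # dual variables, kept in [0, 1]
w = np.zeros(d)                # primal vector: w = (1 / (lam * n)) * sum_i alpha_i y_i x_i
sq_norms = (X ** 2).sum(axis=1)

for epoch in range(20):
    for i in rng.permutation(n):
        # Closed-form coordinate maximization of the dual for the hinge loss.
        residual = 1.0 - y[i] * (X[i] @ w)
        new_alpha = np.clip(alpha[i] + lam * n * residual / sq_norms[i], 0.0, 1.0)
        w += (new_alpha - alpha[i]) * y[i] * X[i] / (lam * n)
        alpha[i] = new_alpha

print("training accuracy:", (np.sign(X @ w) == y).mean())
```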

Journal ArticleDOI
TL;DR: The gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two, and the general utility of this method is demonstrated using a Naïve-Bayes classifier.
Abstract: Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome-wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that, compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue-specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a naive Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem.
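
A toy illustration of the gapped k-mer idea (not the gkm-SVM tree-based kernel itself): each length-l window contributes counts for every way of keeping k informative positions and masking the rest, and the resulting feature vectors feed a linear SVM. The sequences and the planted motif are synthetic placeholders.

```python
# Sketch: gapped k-mer features + linear SVM (toy stand-in for gkm-SVM).
from itertools import combinations
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

def gapped_kmer_counts(seq, l=6, k=4):
    """Count gapped k-mers: length-l windows with k kept positions, rest masked."""
    counts = {}
    keep_sets = list(combinations(range(l), k))
    for start in range(len(seq) - l + 1):
        window = seq[start:start + l]
        for keep in keep_sets:
            key = "".join(window[p] if p in keep else "." for p in range(l))
            counts[key] = counts.get(key, 0) + 1
    return counts

rng = np.random.default_rng(0)
def random_seq(n=200, motif=None):
    s = "".join(rng.choice(list("ACGT"), size=n))
    if motif:                                  # plant a motif in positive sequences
        pos = rng.integers(0, n - len(motif))
        s = s[:pos] + motif + s[pos + len(motif):]
    return s

seqs = [random_seq(motif="GATAAG") for _ in range(100)] + [random_seq() for _ in range(100)]
y = np.array([1] * 100 + [0] * 100)

vec = DictVectorizer()
X = vec.fit_transform(gapped_kmer_counts(s) for s in seqs)
clf = LinearSVC(C=1.0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```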

Journal ArticleDOI
TL;DR: A hybrid of K-means and support vector machine (K-SVM) algorithms is developed to diagnose breast cancer based on the extracted tumor features and shows time savings during the training phase.
Abstract: With the development of clinical technologies, different tumor features have been collected for breast cancer diagnosis. Filtering all the pertinent feature information to support the clinical disease diagnosis is a challenging and time consuming task. The objective of this research is to diagnose breast cancer based on the extracted tumor features. Feature extraction and selection are critical to the quality of classifiers founded through data mining methods. To extract useful information and diagnose the tumor, a hybrid of K-means and support vector machine (K-SVM) algorithms is developed. The K-means algorithm is utilized to recognize the hidden patterns of the benign and malignant tumors separately. The membership of each tumor to these patterns is calculated and treated as a new feature in the training model. Then, a support vector machine (SVM) is used to obtain the new classifier to differentiate the incoming tumors. Based on 10-fold cross validation, the proposed methodology improves the accuracy to 97.38% when tested on the Wisconsin Diagnostic Breast Cancer (WDBC) data set from the University of California, Irvine machine learning repository. Six abstract tumor features are extracted from the 32 original features for the training phase. The results not only illustrate the capability of the proposed approach on breast cancer diagnosis, but also show time savings during the training phase. Physicians can also benefit from the mined abstract tumor features by better understanding the properties of different types of tumors.
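
A minimal sketch in the spirit of the K-SVM pipeline described above, using scikit-learn's built-in copy of the WDBC data: K-means patterns are fitted separately to the benign and malignant training tumors, distances to those patterns become the new features, and an SVM is trained on them. It illustrates the idea only and is not the authors' exact feature construction.

```python
# Sketch: class-wise K-means features + SVM on the WDBC data (illustrative only).
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

class ClasswiseKMeansFeatures(BaseEstimator, TransformerMixin):
    """Distances to per-class K-means centroids as a compact feature set."""
    def __init__(self, n_clusters=3):
        self.n_clusters = n_clusters
    def fit(self, X, y):
        self.kmeans_ = [KMeans(n_clusters=self.n_clusters, n_init=10, random_state=0)
                        .fit(X[y == c]) for c in np.unique(y)]
        return self
    def transform(self, X):
        return np.hstack([km.transform(X) for km in self.kmeans_])

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(),
                      ClasswiseKMeansFeatures(n_clusters=3),
                      SVC(kernel="rbf"))
scores = cross_val_score(model, X, y, cv=10)
print("10-fold accuracy: %.4f" % scores.mean())
```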

Journal ArticleDOI
TL;DR: This paper proposes a novel method called Heterogeneous Feature Augmentation (HFA) based on SVM which can simultaneously learn the target classifier as well as infer the labels of unlabeled target samples and shows that the SHFA and HFA outperform the existing HDA methods.
Abstract: In this paper, we study the heterogeneous domain adaptation (HDA) problem, in which the data from the source domain and the target domain are represented by heterogeneous features with different dimensions. By introducing two different projection matrices, we first transform the data from two domains into a common subspace such that the similarity between samples across different domains can be measured. We then propose a new feature mapping function for each domain, which augments the transformed samples with their original features and zeros. Existing supervised learning methods ( e.g., SVM and SVR) can be readily employed by incorporating our newly proposed augmented feature representations for supervised HDA. As a showcase, we propose a novel method called Heterogeneous Feature Augmentation (HFA) based on SVM. We show that the proposed formulation can be equivalently derived as a standard Multiple Kernel Learning (MKL) problem, which is convex and thus the global solution can be guaranteed. To additionally utilize the unlabeled data in the target domain, we further propose the semi-supervised HFA (SHFA) which can simultaneously learn the target classifier as well as infer the labels of unlabeled target samples. Comprehensive experiments on three different applications clearly demonstrate that our SHFA and HFA outperform the existing HDA methods.
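
The augmented feature maps described above ("the transformed samples with their original features and zeros") can be written schematically as follows, with P and Q the two projection matrices into the common subspace; the notation is illustrative rather than copied from the paper.

```latex
% Schematic HFA-style augmented feature maps (illustrative notation)
\varphi_s(\mathbf{x}_s) =
\begin{bmatrix} P\mathbf{x}_s \\ \mathbf{x}_s \\ \mathbf{0}_{d_t} \end{bmatrix},
\qquad
\varphi_t(\mathbf{x}_t) =
\begin{bmatrix} Q\mathbf{x}_t \\ \mathbf{0}_{d_s} \\ \mathbf{x}_t \end{bmatrix}
```

Here x_s and x_t are source and target samples of dimensions d_s and d_t, and a standard SVM is then trained on the augmented representations, with the choice of P and Q cast as the MKL problem mentioned in the abstract.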

Journal ArticleDOI
TL;DR: A deep learning network (DLN) is proposed to discover unknown feature correlation between input signals that is crucial for the learning task and provides better performance compared to SVM and naive Bayes classifiers.
Abstract: Automatic emotion recognition is one of the most challenging tasks. To detect emotion from nonstationary EEG signals, a sophisticated learning algorithm that can represent high-level abstraction is required. This study proposes the utilization of a deep learning network (DLN) to discover unknown feature correlation between input signals that is crucial for the learning task. The DLN is implemented with a stacked autoencoder (SAE) using a hierarchical feature learning approach. Input features of the network are power spectral densities of 32-channel EEG signals from 32 subjects. To alleviate the overfitting problem, principal component analysis (PCA) is applied to extract the most important components of the initial input features. Furthermore, covariate shift adaptation of the principal components is implemented to minimize the nonstationary effect of EEG signals. Experimental results show that the DLN is capable of classifying three different levels of valence and arousal with accuracies of 49.52% and 46.03%, respectively. Principal component based covariate shift adaptation enhances the respective classification accuracies by 5.55% and 6.53%. Moreover, the DLN provides better performance compared to SVM and naive Bayes classifiers.

Journal ArticleDOI
TL;DR: The plain DBN-based model gives a call-routing classification accuracy that is equal to the best of the other models; however, using additional unlabeled data for DBN pre-training and combining DBN-based learned features with the original features provides significant gains over SVMs, which, in turn, performed better than both MaxEnt and Boosting.
Abstract: Applications of Deep Belief Nets (DBN) to various problems have been the subject of a number of recent studies ranging from image classification and speech recognition to audio classification. In this study we apply DBNs to a natural language understanding problem. The recent surge of activity in this area was largely spurred by the development of a greedy layer-wise pretraining method that uses an efficient learning algorithm called Contrastive Divergence (CD). CD allows DBNs to learn a multi-layer generative model from unlabeled data and the features discovered by this model are then used to initialize a feed-forward neural network which is fine-tuned with backpropagation. We compare a DBN-initialized neural network to three widely used text classification algorithms: Support Vector Machines (SVM), boosting and Maximum Entropy (MaxEnt). The plain DBN-based model gives a call-routing classification accuracy that is equal to the best of the other models. However, using additional unlabeled data for DBN pre-training and combining DBN-based learned features with the original features provides significant gains over SVMs, which, in turn, performed better than both MaxEnt and Boosting.
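
scikit-learn has no DBN, but its BernoulliRBM (trained with a contrastive-divergence-style update) chained with logistic regression gives a rough, single-layer flavour of the pretrain-then-discriminate recipe described above; the data here are a built-in digits set rather than call-routing text.

```python
# Sketch: RBM feature pretraining + logistic regression, a one-layer stand-in
# for the DBN-initialized networks discussed above.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import minmax_scale

X, y = load_digits(return_X_y=True)
X = minmax_scale(X)                        # RBM expects inputs in [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=100, learning_rate=0.06, n_iter=15, random_state=0)),
    ("logreg", LogisticRegression(max_iter=1000)),
])
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```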

Journal ArticleDOI
TL;DR: This work identifies Random Forests as a good first choice algorithm for the supervised classification of lithology using remotely sensed geophysical data and indicates that as training data becomes increasingly dispersed across the region under investigation, MLA predictive accuracy improves dramatically.

Journal ArticleDOI
TL;DR: A hybrid model is proposed that combines input selection by deep quantitative analysis with the Wavelet Transform, a Genetic Algorithm (GA), and Support Vector Machines (SVM); it outperforms the comparison models in predicting wind speed.

Journal ArticleDOI
TL;DR: This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use; particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naive Bayes classifiers.
Abstract: Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naive Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183

Journal ArticleDOI
01 May 2014
TL;DR: In order to reduce the noise caused by feature differences and improve the performance of SVM, an improved kernel function (N-RBF) is proposed by embedding the mean value and the mean square difference values of feature attributes in the RBF kernel function.
Abstract: A novel support vector machine (SVM) model combining kernel principal component analysis (KPCA) with a genetic algorithm (GA) is proposed for intrusion detection. In the proposed model, a multi-layer SVM classifier is adopted to estimate whether the action is an attack, and KPCA is used as a preprocessor of the SVM to reduce the dimension of feature vectors and shorten training time. In order to reduce the noise caused by feature differences and improve the performance of the SVM, an improved kernel function (N-RBF) is proposed by embedding the mean value and the mean square difference values of feature attributes in the RBF kernel function. GA is employed to optimize the punishment factor C, the kernel parameter σ and the tube size ε of the SVM. By comparison with other detection algorithms, the experimental results show that the proposed model achieves higher predictive accuracy, faster convergence speed and better generalization.
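
A hedged sketch of the pipeline structure described above: kernel PCA reduces the feature vectors before an RBF-SVM, and a randomized hyperparameter search stands in for the paper's genetic algorithm over C and the kernel width. The data are synthetic placeholders rather than intrusion records, and the improved N-RBF kernel is not reproduced.

```python
# Sketch: KPCA preprocessing + RBF-SVM, with randomized search standing in
# for the GA over (C, gamma). Synthetic placeholder data.
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=40, n_informative=10, random_state=0)

pipe = Pipeline([
    ("kpca", KernelPCA(n_components=10, kernel="rbf")),   # dimensionality reduction
    ("svm", SVC(kernel="rbf")),                           # attack / normal decision
])
search = RandomizedSearchCV(
    pipe,
    param_distributions={"svm__C": loguniform(1e-1, 1e3),
                         "svm__gamma": loguniform(1e-3, 1e1)},
    n_iter=20, cv=3, random_state=0,
)
search.fit(X, y)
print("best params:", search.best_params_, "cv accuracy:", round(search.best_score_, 3))
```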

Proceedings ArticleDOI
20 Jan 2014
TL;DR: An ensemble of deep learning belief networks (DBN) is proposed for regression and time series forecasting and the advantage of the proposed method on three electricity load demand datasets, one artificial time series dataset and three regression datasets over other benchmark methods is shown.
Abstract: In this paper, for the first time, an ensemble of deep learning belief networks (DBN) is proposed for regression and time series forecasting. Another novel contribution is to aggregate the outputs from various DBNs by a support vector regression (SVR) model. We show the advantage of the proposed method on three electricity load demand datasets, one artificial time series dataset and three regression datasets over other benchmark methods.
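
The aggregation step described above (combining several networks' outputs with support vector regression) is essentially stacking; the hedged sketch below uses small MLP regressors as stand-ins for the DBNs and an SVR as the meta-learner, on a built-in regression dataset rather than load-demand series.

```python
# Sketch: stacking several neural-network regressors with an SVR aggregator,
# a stand-in for the DBN-ensemble + SVR scheme described above.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import StackingRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

base = [(f"mlp{i}", make_pipeline(StandardScaler(),
                                  MLPRegressor(hidden_layer_sizes=(50,), max_iter=500,
                                               random_state=i)))
        for i in range(3)]
model = StackingRegressor(estimators=base,
                          final_estimator=make_pipeline(StandardScaler(), SVR()))
model.fit(X_tr, y_tr)
print("R^2 on held-out data:", round(model.score(X_te, y_te), 3))
```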

Journal ArticleDOI
TL;DR: A novel two-step hierarchical classification approach is proposed where the nonlesions or false positives are rejected in the first step; in the second step, the bright lesions are classified as hard exudates and cotton wool spots, and the red lesions are classified as hemorrhages and micro-aneurysms.
Abstract: This paper presents a computer-aided screening system (DREAM) that analyzes fundus images with varying illumination and fields of view, and generates a severity grade for diabetic retinopathy (DR) using machine learning. Classifiers such as the Gaussian Mixture model (GMM), k-nearest neighbor (kNN), support vector machine (SVM), and AdaBoost are analyzed for classifying retinopathy lesions from nonlesions. GMM and kNN classifiers are found to be the best classifiers for bright and red lesion classification, respectively. A main contribution of this paper is the reduction in the number of features used for lesion classification by feature ranking using Adaboost where 30 top features are selected out of 78. A novel two-step hierarchical classification approach is proposed where the nonlesions or false positives are rejected in the first step. In the second step, the bright lesions are classified as hard exudates and cotton wool spots, and the red lesions are classified as hemorrhages and micro-aneurysms. This lesion classification problem deals with unbalanced datasets and SVM or combination classifiers derived from SVM using the Dempster-Shafer theory are found to incur more classification error than the GMM and kNN classifiers due to the data imbalance. The DR severity grading system is tested on 1200 images from the publicly available MESSIDOR dataset. The DREAM system achieves 100% sensitivity, 53.16% specificity, and 0.904 AUC, compared to the best reported 96% sensitivity, 51% specificity, and 0.875 AUC, for classifying images as with or without DR. The feature reduction further reduces the average computation time for DR severity per image from 59.54 to 3.46 s.

Journal ArticleDOI
18 Jun 2014-Sensors
TL;DR: An automated fall detection system with wearable motion sensor units fitted to the subjects' body at six different positions is developed and successfully distinguish falls from ADLs using six machine learning techniques (classifiers): the k-nearest neighbor (k-NN) classifier, least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), dynamic time warping (DTW), and artificial neural networks (ANNs).
Abstract: Falls are a serious public health problem and possibly life threatening for people in fall risk groups. We develop an automated fall detection system with wearable motion sensor units fitted to the subjects' body at six different positions. Each unit comprises three tri-axial devices (accelerometer, gyroscope, and magnetometer/compass). Fourteen volunteers perform a standardized set of movements including 20 voluntary falls and 16 activities of daily living (ADLs), resulting in a large dataset with 2520 trials. To reduce the computational complexity of training and testing the classifiers, we focus on the raw data for each sensor in a 4 s time window around the point of peak total acceleration of the waist sensor, and then perform feature extraction and reduction. Most earlier studies on fall detection employ rule-based approaches that rely on simple thresholding of the sensor outputs. We successfully distinguish falls from ADLs using six machine learning techniques (classifiers): the k-nearest neighbor (k-NN) classifier, least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), dynamic time warping (DTW), and artificial neural networks (ANNs). We compare the performance and the computational complexity of the classifiers and achieve the best results with the k-NN classifier and LSM, with sensitivity, specificity, and accuracy all above 99%. These classifiers also have acceptable computational requirements for training and testing. Our approach would be applicable in real-world scenarios where data records of indeterminate length, containing multiple activities in sequence, are recorded.
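
A toy sketch of the classification stage described above: simple statistics are extracted from fixed-length windows of synthetic tri-axial accelerometer signals and fed to a k-NN classifier. The real system uses six sensor units, richer features, and recorded falls and ADLs rather than simulated spikes.

```python
# Sketch: window statistics from synthetic accelerometer signals + k-NN
# fall-vs-ADL classification (toy stand-in for the wearable-sensor pipeline).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def make_trial(is_fall, length=100):
    """Synthetic tri-axial acceleration; falls get a short high-magnitude spike."""
    sig = rng.normal(scale=0.3, size=(length, 3)) + np.array([0.0, 0.0, 9.81])
    if is_fall:
        t = rng.integers(20, length - 20)
        sig[t:t + 5] += rng.normal(scale=6.0, size=(5, 3))
    return sig

def features(sig):
    """Per-axis mean/std/min/max plus peak total acceleration."""
    mag = np.linalg.norm(sig, axis=1)
    return np.concatenate([sig.mean(0), sig.std(0), sig.min(0), sig.max(0), [mag.max()]])

trials = [make_trial(is_fall=(i % 2 == 0)) for i in range(400)]
X = np.vstack([features(t) for t in trials])
y = np.array([i % 2 == 0 for i in range(400)], dtype=int)

knn = KNeighborsClassifier(n_neighbors=5)
print("cross-validated accuracy:", cross_val_score(knn, X, y, cv=5).mean())
```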

Journal ArticleDOI
TL;DR: Experimental results show that the proposed kernel-based feature selection method, whose criterion integrates the previous work with a linear combination of features, improves the classification performance of the SVM.
Abstract: Hyperspectral imaging fully portrays materials through numerous and contiguous spectral bands. It is a very useful technique in various fields, including astronomy, medicine, food safety, forensics, and target detection. However, hyperspectral images include redundant measurements, and most classification studies encountered the Hughes phenomenon. Finding a small subset of effective features to model the characteristics of classes represented in the data for classification is a critical preprocessing step required to render a classifier effective in hyperspectral image classification. In our previous work, an automatic method for selecting the radial basis function (RBF) parameter (i.e., σ) for a support vector machine (SVM) was proposed. A criterion that contains the between-class and within-class information was proposed to measure the separability of the feature space with respect to the RBF kernel. Thereafter, the optimal RBF kernel parameter was obtained by optimizing the criterion. This study proposes a kernel-based feature selection method with a criterion that is an integration of the previous work and the linear combination of features. In this new method, two properties can be achieved according to the magnitudes of the coefficients being calculated: the small subset of features and the ranking of features. Experimental results on both one simulated dataset and two hyperspectral images (the Indian Pine Site dataset and the Pavia University dataset) show that the proposed method improves the classification performance of the SVM.

Journal ArticleDOI
TL;DR: This paper presents an unsupervised feature selection method based on ant colony optimization, called UFSACO, which seeks to find the optimal feature subset through several iterations without using any learning algorithms.

Proceedings ArticleDOI
24 Aug 2014
TL;DR: A local skeleton descriptor that encodes the relative position of joint quadruples is proposed; it outperforms state-of-the-art algorithms that rely only on joints, while competing with methods that combine joints with extra cues.
Abstract: Recent advances on human motion analysis have made the extraction of human skeleton structure feasible, even from single depth images. This structure has been proven quite informative for discriminating actions in a recognition scenario. In this context, we propose a local skeleton descriptor that encodes the relative position of joint quadruples. Such a coding implies a similarity normalisation transform that leads to a compact (6D) view-invariant skeletal feature, referred to as skeletal quad. Further, the use of a Fisher kernel representation is suggested to describe the skeletal quads contained in a (sub)action. A Gaussian mixture model is learnt from training data, so that the generation of any set of quads is encoded by its Fisher vector. Finally, a multi-level representation of Fisher vectors leads to an action description that roughly carries the order of sub-action within each action sequence. Efficient classification is here achieved by linear SVMs. The proposed action representation is tested on widely used datasets, MSRAction3D and HDM05. The experimental evaluation shows that the proposed method outperforms state-of-the-art algorithms that rely only on joints, while it competes with methods that combine joints with extra cues.

Journal ArticleDOI
TL;DR: In this paper, two unsupervised and 13 supervised classification algorithms, including a number of machine learning algorithms, are tested with the same Landsat Thematic Mapper (TM) data set and the same classification scheme over Guangzhou City, China, focusing on the spectral information provided by the TM data.
Abstract: Although a large number of new image classification algorithms have been developed, they are rarely tested with the same classification task. In this research, with the same Landsat Thematic Mapper (TM) data set and the same classification scheme over Guangzhou City, China, we tested two unsupervised and 13 supervised classification algorithms, including a number of machine learning algorithms that became popular in remote sensing during the past 20 years. Our analysis focused primarily on the spectral information provided by the TM data. We assessed all algorithms in a per-pixel classification decision experiment and all supervised algorithms in a segment-based experiment. We found that when sufficiently representative training samples were used, most algorithms performed reasonably well. Lack of training samples led to greater classification accuracy discrepancies than the classification algorithms themselves. Some algorithms were more tolerant of insufficient (less representative) training samples than others. Many algorithms improved the overall accuracy marginally with per-segment decision making.