Journal ArticleDOI

LIBSVM: A library for support vector machines

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
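As a rough illustration of how "easily apply SVM" looks in practice, the sketch below trains and evaluates a C-SVC model through LIBSVM's bundled Python interface (svmutil). The toy data, the import path, and the parameter string '-c 1 -g 0.5' are assumptions for illustration, not values taken from the article.

# Minimal sketch: C-SVC training and prediction with LIBSVM's Python interface.
# Assumes the "libsvm-official" pip package; toy data and parameters are illustrative.
from libsvm.svmutil import svm_train, svm_predict

# Features are sparse {index: value} dicts, labels are +1 / -1.
x = [{1: 0.0, 2: 0.1}, {1: 0.9, 2: 1.0}, {1: 0.1, 2: 0.0}, {1: 1.0, 2: 0.8}]
y = [-1, 1, -1, 1]

# '-c' sets the regularization parameter C, '-g' the RBF kernel gamma.
model = svm_train(y, x, '-c 1 -g 0.5')

# Predict on the training points; returns predicted labels, accuracy, decision values.
labels, accuracy, values = svm_predict(y, x, model)
print(labels, accuracy)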


Citations
Journal ArticleDOI
TL;DR: Experimental results show that the proposed kernel-based feature selection method, whose criterion integrates the authors' previous work with a linear combination of features, improves the classification performance of the SVM.
Abstract: Hyperspectral imaging fully portrays materials through numerous and contiguous spectral bands. It is a very useful technique in various fields, including astronomy, medicine, food safety, forensics, and target detection. However, hyperspectral images include redundant measurements, and most classification studies encountered the Hughes phenomenon. Finding a small subset of effective features to model the characteristics of classes represented in the data for classification is a critical preprocessing step required to render a classifier effective in hyperspectral image classification. In our previous work, an automatic method for selecting the radial basis function (RBF) parameter (i.e., σ) for a support vector machine (SVM) was proposed. A criterion that contains the between-class and within-class information was proposed to measure the separability of the feature space with respect to the RBF kernel. Thereafter, the optimal RBF kernel parameter was obtained by optimizing the criterion. This study proposes a kernel-based feature selection method with a criterion that is an integration of the previous work and the linear combination of features. In this new method, two properties can be achieved according to the magnitudes of the coefficients being calculated: the small subset of features and the ranking of features. Experimental results on both one simulated dataset and two hyperspectral images (the Indian Pine Site dataset and the Pavia University dataset) show that the proposed method improves the classification performance of the SVM.
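The criterion itself is not reproduced here, but the workflow the abstract describes — rank features by the magnitude of learned coefficients, keep a small subset, then classify with an RBF-kernel SVM — can be sketched as follows. A linear-SVM coefficient ranking is used as a generic stand-in for the paper's kernel-based criterion, and the synthetic data and sizes are assumptions.

# Sketch only: coefficient-magnitude feature ranking followed by an RBF-kernel SVM.
# Generic stand-in for the paper's kernel-based criterion; data are synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC, SVC

X, y = make_classification(n_samples=400, n_features=100, n_informative=8, random_state=0)

# Rank features by the absolute value of linear-SVM coefficients.
ranker = LinearSVC(C=1.0, dual=False).fit(X, y)
order = np.argsort(-np.abs(ranker.coef_).ravel())
top_k = order[:10]                      # keep the 10 highest-ranked features

# Evaluate an RBF-kernel SVM on the reduced feature set.
score = cross_val_score(SVC(kernel="rbf", gamma="scale"), X[:, top_k], y, cv=5).mean()
print("top features:", top_k, "CV accuracy:", round(score, 3))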

304 citations


Cites background from "LIBSVM: A library for support vecto..."

  • ...According to [19] and [28], the dataset was normalized to the range [0, 1]....


Journal ArticleDOI
03 Apr 2012
TL;DR: A comprehensive survey of recent developments in linear classification is given, covering efficient optimization methods for constructing linear classifiers and their application to large-scale problems.
Abstract: Linear classification is a useful tool in machine learning and data mining. For some data in a rich dimensional space, the performance (i.e., testing accuracy) of linear classifiers has shown to be close to that of nonlinear classifiers such as kernel methods, but training and testing speed is much faster. Recently, many research works have developed efficient optimization methods to construct linear classifiers and applied them to some large-scale applications. In this paper, we give a comprehensive survey on the recent development of this active research area.
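The trade-off the survey emphasizes can be seen with a small experiment: on high-dimensional sparse text features, a linear classifier (LIBLINEAR) often reaches accuracy comparable to a kernel SVM (LIBSVM) while training far faster. The dataset and parameters below are assumptions for illustration, and the kernel model may take several minutes on this data.

# Sketch: linear vs. kernel SVM on high-dimensional sparse text features.
import time
from sklearn.datasets import fetch_20newsgroups_vectorized
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC, SVC

X, y = fetch_20newsgroups_vectorized(subset="train", return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, clf in [("linear (LIBLINEAR)", LinearSVC(C=1.0)),
                  ("RBF kernel (LIBSVM)", SVC(kernel="rbf", C=1.0, gamma="scale"))]:
    start = time.time()
    clf.fit(X_tr, y_tr)
    print(name, "accuracy:", round(clf.score(X_te, y_te), 3),
          "train seconds:", round(time.time() - start, 1))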

303 citations

Posted Content
TL;DR: This work casts hyperparameter optimization as an instance of non-stochastic best-arm identification, identifies a known algorithm that is well-suited for this setting, and empirically evaluates its behavior.
Abstract: Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem. Within the multi-armed bandit literature, the cumulative regret objective enjoys algorithms and analyses for both the non-stochastic and stochastic settings, while, to the best of our knowledge, the best-arm identification framework has only been considered in the stochastic setting. We introduce the non-stochastic setting under this framework, identify a known algorithm that is well-suited for this setting, and analyze its behavior. Next, by leveraging the iterative nature of standard machine learning algorithms, we cast hyperparameter optimization as an instance of non-stochastic best-arm identification, and empirically evaluate our proposed algorithm on this task. Our empirical results show that, by allocating more resources to promising hyperparameter settings, we typically achieve comparable test accuracies an order of magnitude faster than baseline methods.
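The resource-allocation idea in the abstract can be sketched with a generic successive-halving loop: give every hyperparameter setting a small budget, keep the better half, double the budget, and repeat. This is an illustrative reconstruction, not the authors' exact algorithm; the toy loss function is an assumption.

# Generic successive-halving sketch: allocate more resource to promising settings.
import random

def successive_halving(configs, partial_eval, budget_per_round=1):
    """configs: list of hyperparameter settings.
    partial_eval(config, budget): validation loss after training `config`
    with `budget` total units of resource (lower is better)."""
    spent = {id(c): 0 for c in configs}
    while len(configs) > 1:
        scored = []
        for c in configs:
            spent[id(c)] += budget_per_round
            scored.append((partial_eval(c, spent[id(c)]), c))
        scored.sort(key=lambda t: t[0])
        configs = [c for _, c in scored[: max(1, len(scored) // 2)]]  # keep the better half
        budget_per_round *= 2                                         # double the budget
    return configs[0]

# Toy usage: each "config" is a learning rate; the loss shrinks as the budget grows.
configs = [10 ** random.uniform(-4, 0) for _ in range(16)]
best = successive_halving(configs,
                          lambda lr, b: abs(lr - 0.01) + 1.0 / b + random.gauss(0, 0.01))
print("selected learning rate:", best)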

303 citations


Cites background or methods from "LIBSVM: A library for support vecto..."

  • ...We ignore this technicality in our experiments. such as LibSVM (Chang & Lin, 2011), scikit-learn (Pedregosa et al., 2011) and MLlib (Kraska et al., 2013)....


  • ...source machine learning packages such as LibSVM [12], scikit-learn [13] and MLlib [14]....


  • ...Indeed, most packages including LibSVM [12], scikit-learn [13], and MLlib [14] train each hyperparameter setting to convergence....


Journal ArticleDOI
TL;DR: The main results include a simple asymptotic convergence proof, a general explanation of the shrinking and caching techniques, and the linear convergence of the methods.
Abstract: Decomposition methods are currently one of the major methods for training support vector machines. They vary mainly according to different working set selections. Existing implementations and analysis usually consider some specific selection rules. This paper studies sequential minimal optimization type decomposition methods under a general and flexible way of choosing the two-element working set. The main results include: 1) a simple asymptotic convergence proof, 2) a general explanation of the shrinking and caching techniques, and 3) the linear convergence of the methods. Extensions to some support vector machine variants are also discussed.
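To make "two-element working set" concrete, the sketch below implements the classical first-order maximal-violating-pair selection rule for the SVM dual problem, one member of the SMO-type family analyzed in the paper. LIBSVM's current second-order rule is not reproduced, and the inputs in the example call are placeholder assumptions.

# Sketch: first-order maximal-violating-pair working-set selection for the SVM dual.
import numpy as np

def select_working_set(grad, y, alpha, C, eps=1e-3):
    """grad: gradient of the dual objective; y: labels in {+1, -1};
    alpha: current dual variables; C: upper bound. Returns (i, j) or None."""
    minus_y_grad = -y * grad
    in_up = ((alpha < C) & (y == 1)) | ((alpha > 0) & (y == -1))
    in_low = ((alpha < C) & (y == -1)) | ((alpha > 0) & (y == 1))

    up_vals = np.where(in_up, minus_y_grad, -np.inf)
    low_vals = np.where(in_low, minus_y_grad, np.inf)

    i = int(np.argmax(up_vals))   # most violating index in the "up" set
    j = int(np.argmin(low_vals))  # most violating index in the "low" set

    if up_vals[i] - low_vals[j] <= eps:   # stopping condition: m(alpha) - M(alpha) <= eps
        return None
    return i, j

# Placeholder call: six points, all alpha at zero, random gradient.
rng = np.random.default_rng(0)
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
print(select_working_set(rng.normal(size=6), y, np.zeros(6), C=1.0))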

302 citations

Journal ArticleDOI
TL;DR: This paper explores how to utilize a DCNN to bridge the affective gap in speech signals, and finds that a DCNN model pretrained for image applications performs reasonably well in affective speech feature extraction.
Abstract: Speech emotion recognition is challenging because of the affective gap between the subjective emotions and low-level features. Integrating multilevel feature learning and model training, deep convolutional neural networks (DCNN) have exhibited remarkable success in bridging the semantic gap in visual tasks such as image classification and object detection. This paper explores how to utilize a DCNN to bridge the affective gap in speech signals. To this end, we first extract three channels of log Mel-spectrograms (static, delta, and delta delta) similar to the red, green, blue (RGB) image representation as the DCNN input. Then, the AlexNet DCNN model pretrained on the large ImageNet dataset is employed to learn high-level feature representations on each segment divided from an utterance. The learned segment-level features are aggregated by a discriminant temporal pyramid matching (DTPM) strategy. DTPM combines temporal pyramid matching and optimal Lp-norm pooling to form a global utterance-level feature representation, followed by linear support vector machines for emotion classification. Experimental results on four public datasets, namely EMO-DB, RML, eNTERFACE05, and BAUM-1s, show the promising performance of our DCNN model and the DTPM strategy. Another interesting finding is that the DCNN model pretrained for image applications performs reasonably well in affective speech feature extraction. Further fine-tuning on the target emotional speech datasets substantially improves recognition performance.
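The three-channel input described above can be sketched directly: compute a log Mel-spectrogram and its first- and second-order deltas, then stack them like RGB planes before segmenting and feeding them to the pretrained DCNN. The librosa parameters and the synthetic placeholder signal are assumptions, not the paper's exact settings.

# Sketch: static / delta / delta-delta log Mel-spectrogram channels with librosa.
import numpy as np
import librosa

sr = 16000
signal = np.random.randn(sr * 2).astype(np.float32)   # 2 s placeholder "utterance"

mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=64)
log_mel = librosa.power_to_db(mel)                     # static channel
delta = librosa.feature.delta(log_mel)                 # first-order differences
delta2 = librosa.feature.delta(log_mel, order=2)       # second-order differences

three_channel = np.stack([log_mel, delta, delta2], axis=-1)
print(three_channel.shape)   # (n_mels, frames, 3), ready to be split into segments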

302 citations


Cites methods from "LIBSVM: A library for support vecto..."

  • ...We employ the LIBSVM package [60] with the linear kernel function and the one-versus-...


References
Journal ArticleDOI
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

37,861 citations


"LIBSVM: A library for support vecto..." refers background in this paper

  • ...{1, −1}, C-SVC [Boser et al. 1992; Cortes and Vapnik 1995] solves the following primal optimization problem:

        \min_{w,b,\xi} \; \tfrac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i \qquad (1)
        \text{subject to} \; y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, l,

    where \phi(x_i) maps x_i into a ... (Footnote 4: LIBSVM Tools: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools.)

01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.

26,531 citations


"LIBSVM: A library for support vecto..." refers background in this paper

  • ...Under given parameters C > 0 and \epsilon > 0, the standard form of support vector regression [Vapnik 1998] is

        \min_{w,b,\xi,\xi^*} \; \tfrac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i + C \sum_{i=1}^{l} \xi_i^*
        \text{subject to} \; w^T \phi(x_i) + b - z_i \le \epsilon + \xi_i, \quad z_i - w^T \phi(x_i) - b \le \epsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0, \quad i = 1, \dots, l....

  • ...It can be clearly seen that C-SVC and one-class SVM are already in the form of problem (11)....


  • ..., l, in two classes, and a vector y ∈ R^l such that y_i ∈ {1, −1}, C-SVC (Cortes and Vapnik, 1995; Vapnik, 1998) solves the following primal problem:...

  • ...Then, according to the SVM formulation, svm_train_one calls a corresponding subroutine such as solve_c_svc for C-SVC and solve_nu_svc for ....

  • ...Note that b of C-SVC and ε-SVR plays the same role as −ρ in one-class SVM, so we define ....

Proceedings ArticleDOI
01 Jul 1992
TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented, applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions.
Abstract: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of the classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.

11,211 citations


"LIBSVM: A library for support vecto..." refers background in this paper

  • ...It can be clearly seen that C-SVC and one-class SVM are already in the form of problem (11)....


  • ...Then, according to the SVM formulation, svm_train_one calls a corresponding subroutine such as solve_c_svc for C-SVC and solve_nu_svc for ....

  • ...Note that b of C-SVC and ε-SVR plays the same role as −ρ in one-class SVM, so we define ....

  • ...In Section 2, we describe SVM formulations supported in LIBSVM: C-Support Vector Classification (C-SVC), ....

  • ...{1, −1}, C-SVC [Boser et al. 1992; Cortes and Vapnik 1995] solves the following primal optimization problem:

        \min_{w,b,\xi} \; \tfrac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i \qquad (1)
        \text{subject to} \; y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, l,

    where \phi(x_i) maps x_i into a higher-dimensional space and C > 0 is the regularization parameter. (Footnote 4: LIBSVM Tools: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools.)

01 Jan 2008
TL;DR: A simple procedure is proposed, which usually gives reasonable results and is suitable for beginners who are not familiar with SVM.
Abstract: Support vector machine (SVM) is a popular technique for classification. However, beginners who are not familiar with SVM often get unsatisfactory results since they miss some easy but significant steps. In this guide, we propose a simple procedure, which usually gives reasonable results.
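The procedure the guide recommends (scale each feature, use the RBF kernel, and pick C and gamma by cross-validation over an exponential grid) can be sketched with scikit-learn, whose SVC is built on LIBSVM. The dataset and the exact grid below are illustrative assumptions.

# Sketch of the guide's procedure: scaling, RBF kernel, cross-validated grid search.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

pipe = Pipeline([("scale", MinMaxScaler(feature_range=(-1, 1))),  # scale features to [-1, 1]
                 ("svm", SVC(kernel="rbf"))])                     # RBF kernel

param_grid = {"svm__C": 2.0 ** np.arange(-5, 16, 2),              # exponential grid for C
              "svm__gamma": 2.0 ** np.arange(-15, 4, 2)}          # and for gamma

search = GridSearchCV(pipe, param_grid, cv=5)                     # 5-fold cross-validation
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))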

7,069 citations


"LIBSVM: A library for support vecto..." refers methods in this paper

  • ...A Simple Example of Running LIBSVM While detailed instructions of using LIBSVM are available in the README file of the package and the practical guide by Hsu et al. [2003], here we give a simple example....


  • ...For instructions of using LIBSVM, see the README file included in the package, the LIBSVM FAQ,3 and the practical guide by Hsu et al. [2003]. LIBSVM supports the following learning tasks....


Journal ArticleDOI
TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given, and it is shown that for large problems, methods that consider all data at once generally need fewer support vectors.
Abstract: Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend it for multiclass classification is still an ongoing research issue. Several methods have been proposed where typically we construct a multiclass classifier by combining several binary classifiers. Some authors also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using large-scale problems have not been seriously conducted. Especially for methods solving multiclass SVM in one step, a much larger optimization problem is required so up to now experiments are limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems methods by considering all data at once in general need fewer support vectors.
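The binary-decomposition strategies compared in this reference can be sketched with scikit-learn's multiclass meta-estimators wrapped around a kernel SVM (LIBSVM itself adopts the one-against-one approach). The iris data and the parameters are assumptions for illustration.

# Sketch: one-against-one vs. one-against-all multiclass SVMs.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
base = SVC(kernel="rbf", C=1.0, gamma="scale")

for name, clf in [("one-against-one", OneVsOneClassifier(base)),
                  ("one-against-all", OneVsRestClassifier(base))]:
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(name, "CV accuracy:", round(score, 3))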

6,562 citations