Journal ArticleDOI

Training ν-Support Vector Classifiers: Theory and Algorithms

01 Sep 2001-Neural Computation (MIT Press)-Vol. 13, Iss: 9, pp 2119-2147
TL;DR: A decomposition method for ν-SVM is proposed that is competitive with existing methods for C-SVM, and it is shown that in general ν-SVM and C-SVM are two different problems with the same optimal solution set.
Abstract: The ν-support vector machine (ν-SVM) for classification proposed by Schölkopf, Smola, Williamson, and Bartlett (2000) has the advantage of using a parameter ν for controlling the number of support vectors. In this article, we investigate the relation between ν-SVM and C-SVM in detail. We show that in general they are two different problems with the same optimal solution set. Hence, we may expect that many numerical aspects of solving them are similar. However, compared to regular C-SVM, the formulation of ν-SVM is more complicated, so up to now there have been no effective methods for solving large-scale ν-SVM. We propose a decomposition method for ν-SVM that is competitive with existing methods for C-SVM. We also discuss the behavior of ν-SVM through some numerical experiments.
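As a concrete illustration of the ν-parameterization described above (not code from the article): the sketch below uses scikit-learn's NuSVC, which wraps LIBSVM's ν-SVC, to show that ν acts as a lower bound on the fraction of training points that become support vectors, whereas the C-SVM formulation offers no such direct control. The data set, ν values, and kernel settings are arbitrary choices for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC, SVC

# Toy two-class data; any small classification set works for this illustration.
X, y = make_classification(n_samples=300, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)

for nu in (0.1, 0.3, 0.5):
    clf = NuSVC(nu=nu, kernel="rbf", gamma="scale").fit(X, y)
    frac_sv = clf.support_.size / X.shape[0]
    # nu is a lower bound on the fraction of support vectors
    # (and an upper bound on the fraction of margin errors).
    print(f"nu={nu:.1f}  fraction of support vectors={frac_sv:.2f}")

# The corresponding C-SVM has no such direct control; C must be tuned instead.
c_clf = SVC(C=1.0, kernel="rbf", gamma="scale").fit(X, y)
print("C-SVM support vectors:", c_clf.support_.size)
```
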
Citations
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
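A minimal usage sketch of LIBSVM through its bundled Python interface (svmutil); the exact import path depends on how the package was installed, and the data and parameter values below are made up for illustration.

```python
# Minimal LIBSVM usage sketch via its bundled Python interface (svmutil).
# Depending on installation, the import may be `from libsvm.svmutil import *`
# (the libsvm-official PyPI package) or `from svmutil import *` (source tree).
from libsvm.svmutil import svm_train, svm_predict

# Tiny hand-made two-class problem: labels and sparse feature dicts (index: value).
y = [+1, +1, -1, -1]
x = [{1: 0.9, 2: 1.1}, {1: 1.2, 2: 0.8}, {1: -1.0, 2: -0.9}, {1: -1.1, 2: -1.2}]

# '-s 1' selects nu-SVC (the formulation studied in the cited article),
# '-t 2' the RBF kernel, '-n 0.3' the parameter nu, '-g 0.5' the kernel width.
model = svm_train(y, x, '-s 1 -t 2 -n 0.3 -g 0.5')

# Predict on the training points; svm_predict returns labels, accuracy, values.
labels, acc, vals = svm_predict(y, x, model)
```
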

40,826 citations


Cites background from "Training ν -Support Vector Classifi..."

  • ...The decision function is sgn(∑_{i=1}^{l} y_i α_i K(x_i, x) + b). It is shown that e^T α ≥ ν can be replaced by e^T α = ν [Crisp and Burges 2000; Chang and Lin 2001]....


  • ...In (Crisp and Burges, 2000; Chang and Lin, 2001), it has been shown that e^T α ≥ ν can be replaced by e^T α = ν....

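For context, the optimization problem these excerpts refer to can be sketched as follows (notation as in the abstract above, with Q_ij = y_i y_j K(x_i, x_j)); this is a summary, not a quotation from either paper:

```latex
% nu-SVM dual (sketch, following Schölkopf et al., 2000, and Chang & Lin, 2001):
(D_\nu)\quad \min_{\alpha}\ \tfrac{1}{2}\,\alpha^{T} Q \alpha
\quad \text{s.t.}\quad y^{T}\alpha = 0,\quad e^{T}\alpha \ge \nu,\quad 0 \le \alpha_i \le \tfrac{1}{l},\ i = 1,\dots,l,
% where Q_{ij} = y_i y_j K(x_i, x_j). The cited result is that the inequality
% e^T alpha >= nu can be replaced by the equality e^T alpha = nu without changing
% the optimal solution set; the resulting decision function is
% sgn( \sum_{i=1}^{l} y_i \alpha_i K(x_i, x) + b ).
```
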

Journal ArticleDOI
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Abstract: In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
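As a small illustration of SV function estimation in the spirit of this tutorial (not an example taken from it), the sketch below fits an ε-SVR with an RBF kernel to noisy samples of a sine function using scikit-learn; all data and parameter values are arbitrary.

```python
import numpy as np
from sklearn.svm import SVR

# Noisy samples of a smooth target function.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)

# epsilon-SVR with an RBF kernel: residuals smaller than epsilon are ignored,
# so epsilon directly sets the width of the insensitive tube.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma="scale").fit(X, y)

X_test = np.linspace(0, 5, 5).reshape(-1, 1)
print(np.round(svr.predict(X_test), 2))
```
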

10,696 citations

Journal ArticleDOI
TL;DR: A new class of support vector algorithms for regression and classification is proposed in which a parameter ν eliminates one of the other free parameters of the algorithm: the accuracy parameter ε in the regression case, and the regularization constant C in the classification case.
Abstract: We propose a new class of support vector algorithms for regression and classification. In these algorithms, a parameter ν lets one effectively control the number of support vectors. While this can be useful in its own right, the parameterization has the additional benefit of enabling us to eliminate one of the other free parameters of the algorithm: the accuracy parameter epsilon in the regression case, and the regularization constant C in the classification case. We describe the algorithms, give some theoretical results concerning the meaning and the choice of ν, and report experimental results.
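The reparameterization described above is exposed by common libraries; below is a hedged sketch using scikit-learn's ν-variants (not the authors' code): NuSVR replaces the accuracy parameter ε by ν in regression, and NuSVC analogously replaces the constant C by ν in classification. Data and parameter values are arbitrary.

```python
import numpy as np
from sklearn.svm import NuSVR, SVR

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)

# epsilon-SVR: the tube width epsilon must be chosen by hand.
eps_svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)

# nu-SVR: nu replaces epsilon as the free parameter; the tube width adapts to
# the data, and nu bounds the fraction of support vectors.
nu_svr = NuSVR(kernel="rbf", C=10.0, nu=0.3).fit(X, y)

print("epsilon-SVR support vectors:", eps_svr.support_.size)
print("nu-SVR support vectors:     ", nu_svr.support_.size)
```
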

2,737 citations

Journal ArticleDOI
TL;DR: The behavior of the SVM classifier when these hyperparameters take very small or very large values is analyzed, which helps in understanding the hyperparameter space and leads to an efficient heuristic method of searching for hyperparameter values with small generalization errors.
Abstract: Support vector machines (SVMs) with the gaussian (RBF) kernel have been popular for practical use. Model selection in this class of SVMs involves two hyperparameters: the penalty parameter C and the kernel width σ. This letter analyzes the behavior of the SVM classifier when these hyperparameters take very small or very large values. Our results help in understanding the hyperparameter space that leads to an efficient heuristic method of searching for hyperparameter values with small generalization errors. The analysis also indicates that if complete model selection using the gaussian kernel has been conducted, there is no need to consider linear SVM.
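A generic illustration of searching the (C, γ) space on a log scale, where γ = 1/(2σ²) for the gaussian kernel; this is a plain grid search with scikit-learn, not the specific heuristic proposed in the letter, and the data set and grid ranges are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Log-spaced grid over the penalty C and RBF width gamma (gamma ~ 1 / (2*sigma^2)).
# Extreme values at either end of these ranges give the degenerate
# underfitting/overfitting behaviors analyzed in the letter.
param_grid = {"C": np.logspace(-2, 3, 6), "gamma": np.logspace(-4, 1, 6)}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```
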

1,586 citations


Cites background or methods from "Training ν -Support Vector Classifi..."

  • ...The results also apply if Q is a bounded function of C since theorem 5 of Chang and Lin (2001b) holds for this case....


  • ...function of C since Theorem 5 of (Chang and Lin 2001b) holds for this case....


  • ...(Chang and Lin 2001b, Lemma 4)), A ≤ 0. However, A cannot be negative as otherwise ∑_{i=1}^{l} ξ_i goes to −∞ as C increases....


  • ...By corollary 1 of Chang and Lin (2001b), we get linear separability in z-space....


  • ...It can be shown (see the proof of Theorem 5 in (Chang and Lin 2001b) for...


Journal ArticleDOI
TL;DR: A brief introduction to SVMs is provided, many applications are described, and challenges and trends are summarized.

611 citations

References
Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations

01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, applying these estimates to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, applying these estimates to real-life problems, and much more.

26,531 citations


"Training ν -Support Vector Classifi..." refers methods in this paper

  • ...This formulation is different from the original C-SVM (Vapnik, 1998): (P_C) min (1/2) w^T w + C ∑_{i=1}^{l} ξ_i subject to y_i(w^T φ(x_i) + b) ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, . . . , l. (1.2) In equation 1.2, a parameter C is used to penalize…...


  • ...This formulation is different from the original C-SVM (Vapnik, 1998):...

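For comparison with equation 1.2 quoted above, the ν-SVM primal of Schölkopf et al. (2000) that the article analyzes can be sketched as follows (a summary, not a quotation from the article):

```latex
% nu-SVM primal (sketch, following Schölkopf, Smola, Williamson & Bartlett, 2000):
(P_\nu)\quad \min_{w,\,b,\,\xi,\,\rho}\ \tfrac{1}{2}\, w^{T} w - \nu\rho + \tfrac{1}{l}\sum_{i=1}^{l}\xi_i
\quad\text{s.t.}\quad y_i\,(w^{T}\phi(x_i) + b) \ge \rho - \xi_i,\quad \xi_i \ge 0,\ i=1,\dots,l,\quad \rho \ge 0.
% Compared with (P_C) in equation 1.2, the penalty C is replaced by nu, and the
% margin \rho/\|w\| becomes a variable of the optimization.
```
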

01 Jan 1998

12,940 citations


"Training ν -Support Vector Classifi..." refers background in this paper

  • ...Problems adult4 and web7 are compiled by Platt (1998) from the UCI Machine Learning Repository (Blake & Merz, 1998; Murphy & Aha 1994)....


Proceedings ArticleDOI
08 Feb 1999
TL;DR: An edited collection on support vector learning covering theory, implementations, applications, and extensions, with chapters including support vector machines for dynamic reconstruction of a chaotic system, using support vector machines for time series prediction, and pairwise classification and support vector machines.
Abstract: Introduction to support vector learning roadmap. Part 1 Theory: three remarks on the support vector method of function estimation, Vladimir Vapnik generalization performance of support vector machines and other pattern classifiers, Peter Bartlett and John Shawe-Taylor Bayesian voting schemes and large margin classifiers, Nello Cristianini and John Shawe-Taylor support vector machines, reproducing kernel Hilbert spaces, and randomized GACV, Grace Wahba geometry and invariance in kernel based methods, Christopher J.C. Burges on the annealed VC entropy for margin classifiers - a statistical mechanics study, Manfred Opper entropy numbers, operators and support vector kernels, Robert C. Williamson et al. Part 2 Implementations: solving the quadratic programming problem arising in support vector classification, Linda Kaufman making large-scale support vector machine learning practical, Thorsten Joachims fast training of support vector machines using sequential minimal optimization, John C. Platt. Part 3 Applications: support vector machines for dynamic reconstruction of a chaotic system, Davide Mattera and Simon Haykin using support vector machines for time series prediction, Klaus-Robert Muller et al pairwise classification and support vector machines, Ulrich Kressel. Part 4 Extensions of the algorithm: reducing the run-time complexity in support vector machines, Edgar E. Osuna and Federico Girosi support vector regression with ANOVA decomposition kernels, Mark O. Stitson et al support vector density estimation, Jason Weston et al combining support vector and mathematical programming methods for classification, Bernhard Scholkopf et al.

5,506 citations

01 Jan 1999
TL;DR: SMO breaks the large quadratic programming problem arising in SVM training into a series of smallest-possible QP subproblems, which avoids using a time-consuming numerical QP optimization as an inner loop; hence, SMO is fastest for linear SVMs and sparse data sets.

5,350 citations


"Training ν -Support Vector Classifi..." refers background or methods in this paper

  • ...Currently major methods of solving large DC (for example, decomposition methods (Osuna, Freund, & Girosi, 1997; Joachims, 1998; Platt, 1998; Saunders et al., 1998) and the method of nearest points (Keerthi, Shevade, & Murthy, 2000)) use the simple structure of constraints....


  • ...Problems adult4 and web7 are compiled by Platt (1998) from the UCI Machine Learning Repository (Blake & Merz, 1998; Murphy & Aha 1994)....

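The SMO strategy summarized in this reference optimizes two Lagrange multipliers at a time in closed form. Below is a minimal sketch of that pairwise update for the C-SVM dual (toy code, not Platt's or LIBSVM's implementation); the kernel matrix K, labels y ∈ {−1, +1}, current multipliers alpha, bias b, and the chosen pair (i, j) are all assumed given.

```python
import numpy as np

def smo_pair_update(i, j, alpha, y, K, C, b):
    """One SMO-style update of (alpha[i], alpha[j]) for the C-SVM dual.

    Keeps y^T alpha = 0 and 0 <= alpha <= C; returns updated (alpha, b, changed).
    """
    if i == j:
        return alpha, b, False
    f = (alpha * y) @ K + b                  # current decision values
    E_i, E_j = f[i] - y[i], f[j] - y[j]      # prediction errors
    # Feasible segment [L, H] for alpha[j] along the equality-constraint line.
    if y[i] != y[j]:
        L, H = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
    else:
        L, H = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
    eta = K[i, i] + K[j, j] - 2.0 * K[i, j]  # curvature along that line
    if H <= L or eta <= 1e-12:
        return alpha, b, False
    # Closed-form optimum for alpha[j], clipped to the box, then alpha[i] follows.
    a_j = np.clip(alpha[j] + y[j] * (E_i - E_j) / eta, L, H)
    a_i = alpha[i] + y[i] * y[j] * (alpha[j] - a_j)
    # Update the bias so a free (unbounded) multiplier satisfies the KKT conditions.
    b_i = b - E_i - y[i] * (a_i - alpha[i]) * K[i, i] - y[j] * (a_j - alpha[j]) * K[i, j]
    b_j = b - E_j - y[i] * (a_i - alpha[i]) * K[i, j] - y[j] * (a_j - alpha[j]) * K[j, j]
    if 0.0 < a_i < C:
        b_new = b_i
    elif 0.0 < a_j < C:
        b_new = b_j
    else:
        b_new = 0.5 * (b_i + b_j)
    alpha = alpha.copy()
    alpha[i], alpha[j] = a_i, a_j
    return alpha, b_new, True
```

A full solver would repeatedly pick pairs that violate the KKT conditions and apply this update until all multipliers are optimal within a tolerance.
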
