scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Support-Vector Networks

15 Sep 1995-Machine Learning (Kluwer Academic Publishers)-Vol. 20, Iss: 3, pp 273-297
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support- vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensures high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

Content maybe subject to copyright    Report

Citations
More filters
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems theoretical convergence multiclass classification probability estimates and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems theoretical convergence multiclass classification probability estimates and parameter selection are discussed in detail.

40,826 citations


Cites background from "Support-Vector Networks"

  • ...{1,-1}, C-SVC [Boser et al. 1992; Cortes and Vapnik 1995] solves 4LIBSVM Tools: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools. the following primal optimization problem: l t min 1 w T w +C .i (1) w,b,. 2 i=1 subject to yi(w T f(xi) +b) =1 -.i, .i =0,i =1,...,l, where f(xi)maps xi into a…...

    [...]

Book
18 Nov 2016
TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Book
25 Oct 1999
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Abstract: Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research. *Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects *Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods *Includes downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks-in an updated, interactive interface. Algorithms in toolkit cover: data pre-processing, classification, regression, clustering, association rules, visualization

20,196 citations

Journal ArticleDOI
TL;DR: There are several arguments which support the observed high accuracy of SVMs, which are reviewed and numerous examples and proofs of most of the key theorems are given.
Abstract: The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.

15,696 citations


Cites background or methods from "Support-Vector Networks"

  • ...Keywords: support vector machines, statistical learning theory, VC dimension, pattern recognition...

    [...]

  • ...For the pattern recognition case, SVMs have been used for isolated handwritten digit recognition (Cortes and Vapnik, 1995; Schölkopf, Burges and Vapnik, 1995; Schölkopf, Burges and Vapnik, 1996; Burges and Schölkopf, 1997), object recognition (Blanz et al., 1996), speaker identification…...

    [...]

  • ...This can be done by introducing positive slack variables ξi, i = 1, · · · , l in the constraints (Cortes and Vapnik, 1995), which then become: xi · w+ b ≥ +1 − ξi for yi = +1 (40) xi · w+ b ≤ −1 + ξi for yi = −1 (41) ξi ≥ 0 ∀i. (42) Thus, for an error to occur, the corresponding ξi must exceed…...

    [...]

Journal ArticleDOI
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Abstract: In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.

10,696 citations


Cites methods from "Support-Vector Networks"

  • ...Analogously to the “soft margin” loss function (Bennett and Mangasarian 1992) which was used in SV machines by Cortes and Vapnik (1995), one can introduce slack variables ξi , ξ ∗ i to cope with otherwise infeasible constraints of the optimization problem (2). Hence we arrive at the formulation stated in Vapnik (1995)....

    [...]

  • ...In its present form, the SV machine was largely developed at AT&T Bell Laboratories by Vapnik and co-workers (Boser, Guyon and Vapnik 1992, Guyon, Boser and Vapnik 1993, Cortes and Vapnik, 1995, Schölkopf, Burges and Vapnik 1995, 1996, Vapnik, Golowich and Smola 1997)....

    [...]

  • ...Analogously to the “soft margin” loss function (Bennett and Mangasarian 1992) which was used in SV machines by Cortes and Vapnik (1995), one can introduce slack variables ξi , ξ ∗ i to cope with otherwise infeasible constraints of the optimization problem (2)....

    [...]

  • ...Analogously to the “soft margin” loss function (Bennett and Mangasarian 1992) which was used in SV machines by Cortes and Vapnik (1995), one can introduce slack variables ξi , ξ ∗i to cope with otherwise infeasible constraints of the optimization problem (2)....

    [...]

References
More filters
Journal ArticleDOI
01 Jan 1988-Nature
TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.
Abstract: We describe a new learning procedure, back-propagation, for networks of neurone-like units. The procedure repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector. As a result of the weight adjustments, internal ‘hidden’ units which are not part of the input or output come to represent important features of the task domain, and the regularities in the task are captured by the interactions of these units. The ability to create useful new features distinguishes back-propagation from earlier, simpler methods such as the perceptron-convergence procedure1.

23,814 citations

Book ChapterDOI
01 Jan 1988
TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

17,604 citations

Journal ArticleDOI

14,009 citations


"Support-Vector Networks" refers background in this paper

  • ...More than 60 years ago R.A. Fisher (Fisher, 1936) suggested the first algorithm for pattern recognition....

    [...]

Proceedings ArticleDOI
01 Jul 1992
TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented, applicable to a wide variety of the classification functions, including Perceptrons, polynomials, and Radial Basis Functions.
Abstract: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of the classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.

11,211 citations

Book
01 Jan 1947
TL;DR: In this paper, the authors present an algebraic extension of LINEAR TRANSFORMATIONS and QUADRATIC FORMS, and apply it to EIGEN-VARIATIONS.
Abstract: Partial table of contents: THE ALGEBRA OF LINEAR TRANSFORMATIONS AND QUADRATIC FORMS. Transformation to Principal Axes of Quadratic and Hermitian Forms. Minimum-Maximum Property of Eigenvalues. SERIES EXPANSION OF ARBITRARY FUNCTIONS. Orthogonal Systems of Functions. Measure of Independence and Dimension Number. Fourier Series. Legendre Polynomials. LINEAR INTEGRAL EQUATIONS. The Expansion Theorem and Its Applications. Neumann Series and the Reciprocal Kernel. The Fredholm Formulas. THE CALCULUS OF VARIATIONS. Direct Solutions. The Euler Equations. VIBRATION AND EIGENVALUE PROBLEMS. Systems of a Finite Number of Degrees of Freedom. The Vibrating String. The Vibrating Membrane. Green's Function (Influence Function) and Reduction of Differential Equations to Integral Equations. APPLICATION OF THE CALCULUS OF VARIATIONS TO EIGENVALUE PROBLEMS. Completeness and Expansion Theorems. Nodes of Eigenfunctions. SPECIAL FUNCTIONS DEFINED BY EIGENVALUE PROBLEMS. Bessel Functions. Asymptotic Expansions. Additional Bibliography. Index.

7,426 citations