
Showing papers on "Support vector machine published in 1997"


Proceedings ArticleDOI
17 Jun 1997
TL;DR: A decomposition algorithm that guarantees global optimality and can be used to train SVMs over very large data sets is presented, and the feasibility of the approach is demonstrated on a face detection problem involving a data set of 50,000 points.
Abstract: We investigate the application of Support Vector Machines (SVMs) in computer vision. SVM is a learning technique developed by V. Vapnik and his team (AT&T Bell Labs., 1985) that can be seen as a new method for training polynomial, neural network, or Radial Basis Function classifiers. The decision surfaces are found by solving a linearly constrained quadratic programming problem. This optimization problem is challenging because the quadratic form is completely dense and the memory requirements grow with the square of the number of data points. We present a decomposition algorithm that guarantees global optimality and can be used to train SVMs over very large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterates and to establish the stopping criteria for the algorithm. We present experimental results of our implementation of SVM and demonstrate the feasibility of our approach on a face detection problem that involves a data set of 50,000 data points.
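
To make the decomposition idea concrete, here is a minimal sketch (not the authors' implementation): the dual QP is optimized over a small working set of the worst violators of the optimality conditions while the remaining variables are held fixed, and the loop stops once no violators remain. The kernel, the working-set size q, and the use of SciPy's SLSQP solver for the sub-problems are illustrative assumptions.

```python
# Minimal working-set decomposition for the SVM dual (illustrative sketch).
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, Z, gamma=0.5):
    """Gaussian kernel matrix between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def decompose_train(X, y, C=1.0, q=20, tol=1e-3, max_iter=100):
    """Iteratively optimise the SVM dual over small working sets of size q."""
    n = len(y)
    alpha = np.zeros(n)
    K = rbf_kernel(X, X)                       # cached here; blocked for very large n
    Q = (y[:, None] * y[None, :]) * K          # Hessian of the dual objective

    for _ in range(max_iter):
        grad = Q @ alpha - 1.0                 # gradient of 0.5 a'Qa - sum(a)
        s = -y * grad                          # violation scores (first-order criterion)
        up = ((y > 0) & (alpha < C - 1e-8)) | ((y < 0) & (alpha > 1e-8))
        low = ((y < 0) & (alpha < C - 1e-8)) | ((y > 0) & (alpha > 1e-8))
        if not up.any() or not low.any() or s[up].max() - s[low].min() < tol:
            break                              # optimality conditions satisfied
        # working set: most violating indices from each side
        B = np.concatenate([np.where(up)[0][np.argsort(-s[up])][:q // 2],
                            np.where(low)[0][np.argsort(s[low])][:q // 2]])
        B = np.unique(B)
        N = np.setdiff1d(np.arange(n), B)
        rhs = -alpha[N] @ y[N]                 # keeps the equality constraint feasible

        def sub_obj(a_B, B=B):
            a = alpha.copy(); a[B] = a_B
            return 0.5 * a @ Q @ a - a.sum()

        res = minimize(sub_obj, alpha[B], method="SLSQP",
                       bounds=[(0.0, C)] * len(B),
                       constraints={"type": "eq",
                                    "fun": lambda a_B, B=B: a_B @ y[B] - rhs})
        alpha[B] = res.x
    return alpha

# Toy usage: 200 points from two Gaussians, labels in {-1, +1}.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(100, 2) + 2, rng.randn(100, 2) - 2])
y = np.hstack([np.ones(100), -np.ones(100)])
alpha = decompose_train(X, y)
print("support vectors:", int((alpha > 1e-6).sum()))
```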

2,764 citations


Proceedings Article
08 Jul 1997
TL;DR: In this paper, the authors show that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero.
Abstract: One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this phenomenon is related to the distribution of margins of the training examples with respect to the generated voting classification rule, where the margin of an example is simply the difference between the number of correct votes and the maximum number of votes received by any incorrect label. We show that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error. We also show theoretically and experimentally that boosting is especially effective at increasing the margins of the training examples. Finally, we compare our explanation to those based on the bias-variance decomposition.
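
As a small illustration of the margin quantity used above, the sketch below computes, for each training example, the normalized weight of votes for its correct label minus the largest weight received by any incorrect label; the classifiers, weights, and labels are hypothetical placeholders.

```python
# Voting margins of training examples under a weighted-vote classifier.
import numpy as np

def voting_margins(vote_matrix, weights, y_true, labels):
    """vote_matrix[t, i] = label predicted by base classifier t on example i."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                  # normalise total vote to 1
    n = vote_matrix.shape[1]
    margins = np.empty(n)
    for i in range(n):
        # weighted vote received by each candidate label on example i
        vote_for = {c: weights[vote_matrix[:, i] == c].sum() for c in labels}
        correct = vote_for[y_true[i]]
        wrong = max(v for c, v in vote_for.items() if c != y_true[i])
        margins[i] = correct - wrong                   # in [-1, 1]; > 0 iff correct
    return margins

# Example: three weak classifiers voting on four binary-labelled examples.
votes = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 1],
                  [1, 1, 1, 0]])
print(voting_margins(votes, weights=[0.5, 0.3, 0.2],
                     y_true=[1, 1, 0, 0], labels=[0, 1]))
```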

2,423 citations


Book ChapterDOI
08 Oct 1997
TL;DR: A new method for performing a nonlinear form of Principal Component Analysis by the use of integral operator kernel functions is proposed and experimental results on polynomial feature extraction for pattern recognition are presented.
Abstract: A new method for performing a nonlinear form of Principal Component Analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map; for instance, the space of all possible d-pixel products in images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.
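
A minimal sketch of the construction described above, assuming a polynomial kernel and illustrative parameter choices: form the kernel matrix, center it in feature space, and read nonlinear principal components off its leading eigenvectors.

```python
# Kernel PCA sketch: nonlinear principal components via a kernel matrix.
import numpy as np

def kernel_pca(X, n_components=2, degree=2):
    n = X.shape[0]
    K = (X @ X.T + 1.0) ** degree                # polynomial kernel (e.g. pixel products)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # centre the data in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)        # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam, A = eigvals[idx], eigvecs[:, idx]
    A = A / np.sqrt(np.maximum(lam, 1e-12))      # normalise so lam * ||a||^2 = 1
    return Kc @ A                                # nonlinear principal components of X

features = kernel_pca(np.random.RandomState(0).randn(30, 5))
print(features.shape)   # (30, 2)
```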

2,223 citations


Journal ArticleDOI
TL;DR: The results show that on the United States postal service database of handwritten digits, the SV machine achieves the highest recognition accuracy, followed by the hybrid system, and the SV approach is thus not only theoretically well-founded but also superior in a practical application.
Abstract: The support vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights, and threshold that minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by k-means clustering and the weights are computed using error backpropagation. We consider three machines, namely, a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the United States Postal Service database of handwritten digits, the SV machine achieves the highest recognition accuracy, followed by the hybrid system. The SV approach is thus not only theoretically well-founded but also superior in a practical application.
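
A hedged sketch of this kind of comparison, on synthetic data rather than the USPS digits: a classical RBF machine with k-means centers (output weights fitted here by least squares rather than backpropagation, for brevity) against an SV machine with Gaussian kernel, which selects its centers, weights, and threshold automatically. Dataset and hyper-parameters are illustrative.

```python
# Classical k-means-centred RBF network vs. SVM with Gaussian kernel (sketch).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

def rbf_design(X, centers, gamma):
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Classical RBF machine: centres fixed by k-means, output weights fitted on top.
centers = KMeans(n_clusters=30, n_init=10, random_state=0).fit(Xtr).cluster_centers_
Phi = rbf_design(Xtr, centers, gamma=0.05)
w, *_ = np.linalg.lstsq(Phi, 2 * ytr - 1, rcond=None)
rbf_acc = ((rbf_design(Xte, centers, 0.05) @ w > 0).astype(int) == yte).mean()

# SV machine with Gaussian kernel: centres, weights and threshold come from
# the margin-maximisation problem itself.
svm_acc = SVC(kernel="rbf", C=10.0, gamma=0.05).fit(Xtr, ytr).score(Xte, yte)
print(f"k-means RBF net: {rbf_acc:.3f}   SVM (RBF kernel): {svm_acc:.3f}")
```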

1,385 citations


Proceedings ArticleDOI
24 Sep 1997
TL;DR: This paper presents a decomposition algorithm that is guaranteed to solve the QP problem and that does not make assumptions on the expected number of support vectors.
Abstract: We investigate the problem of training a support vector machine (SVM) on a very large database in the case in which the number of support vectors is also very large. Training an SVM is equivalent to solving a linearly constrained quadratic programming (QP) problem in a number of variables equal to the number of data points. This optimization problem is known to be challenging when the number of data points exceeds a few thousand. In previous work done by us as well as by other researchers, the strategy used to solve the large scale QP problem takes advantage of the fact that the expected number of support vectors is small (<3,000). Therefore, the existing algorithms cannot deal with more than a few thousand support vectors. In this paper we present a decomposition algorithm that is guaranteed to solve the QP problem and that does not make assumptions on the expected number of support vectors. To demonstrate the feasibility of our approach we consider a foreign exchange rate time series database with 110,000 data points that generates 100,000 support vectors.

1,173 citations


Book ChapterDOI
08 Oct 1997
TL;DR: Two different cost functions for Support Vector training are used, an ε-insensitive loss and Huber's robust loss function, and how to choose the regularization parameters in these models is discussed.
Abstract: Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an ε-insensitive loss and (ii) Huber's robust loss function, and we discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey-Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show an excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by 29%.
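
For reference, here are the two cost functions mentioned above written out explicitly as a small sketch; the tube width eps and the Huber threshold mu are illustrative values.

```python
# The epsilon-insensitive and Huber losses used for SV regression (sketch).
import numpy as np

def eps_insensitive(residual, eps=0.1):
    """Zero inside the epsilon-tube, linear outside it."""
    return np.maximum(np.abs(residual) - eps, 0.0)

def huber(residual, mu=1.0):
    """Quadratic for small residuals, linear (robust) for large ones."""
    r = np.abs(residual)
    return np.where(r <= mu, 0.5 * r ** 2, mu * (r - 0.5 * mu))

r = np.linspace(-2, 2, 9)
print(eps_insensitive(r))
print(huber(r))
```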

988 citations


Dissertation
01 Dec 1997
TL;DR: Preliminary results are presented from applying SVM to the problem of detecting frontal human faces in real images; the main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterates and to establish the stopping criteria for the algorithm.
Abstract: The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, its relationship with SRM, and its geometrical insight are discussed in this paper. Training an SVM is equivalent to solving a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds a few thousand the problem is very challenging, because the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory, and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVMs over large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterates and to establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm using a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVMs, we present preliminary results obtained by applying SVM to the problem of detecting frontal human faces in real images.
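
The QP referred to above can be written out explicitly. A standard formulation (the notation in the thesis may differ) is:

```latex
% Dual SVM training problem with linear and box constraints
\max_{\alpha \in \mathbb{R}^{\ell}} \;
  \sum_{i=1}^{\ell} \alpha_i
  \;-\; \frac{1}{2} \sum_{i=1}^{\ell} \sum_{j=1}^{\ell}
      \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\qquad \text{subject to} \qquad
  \sum_{i=1}^{\ell} \alpha_i y_i = 0,
  \qquad 0 \le \alpha_i \le C, \quad i = 1, \dots, \ell .
```

The Hessian of this objective is the dense ℓ×ℓ matrix with entries y_i y_j K(x_i, x_j), which is why memory grows with the square of the number of data points and why decomposition into sub-problems over subsets of the α_i becomes necessary.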

804 citations


01 Jan 1997
TL;DR: This book provides a comprehensive analysis of what can be done using Support Vector Machines, achieving record results in real-life pattern recognition problems, and proposes a new form of nonlinear Principal Component Analysis using Support Vector kernel techniques, which is considered the most natural and elegant generalization of classical Principal Component Analysis.
Abstract: Foreword The Support Vector Machine has recently been introduced as a new technique for solving various function estimation problems, including the pattern recognition problem. To develop such a technique, it was necessary to first extract factors responsible for future generalization, to obtain bounds on generalization that depend on these factors, and lastly to develop a technique that constructively minimizes these bounds. The subject of this book is methods based on combining advanced branches of statistics and functional analysis, developing these theories into practical algorithms that perform better than existing heuristic approaches. The book provides a comprehensive analysis of what can be done using Support Vector Machines, achieving record results in real-life pattern recognition problems. In addition, it proposes a new form of nonlinear Principal Component Analysis using Support Vector kernel techniques, which I consider the most natural and elegant generalization of classical Principal Component Analysis. In many ways the Support Vector machine became so popular thanks to the work of Bernhard Schölkopf. The work, submitted for the title of Doktor der Naturwissenschaften, is excellent. It is a substantial contribution to Machine Learning technology.

603 citations


Proceedings ArticleDOI
24 Sep 1997
TL;DR: The SVM is implemented and tested on a database of chaotic time series previously used to compare the performances of different approximation techniques, including polynomial and rational approximation, local polynomial techniques, radial basis functions, and neural networks; the SVM performs better than the other approaches.
Abstract: A novel method for regression has recently been proposed by Vapnik et al. (1995, 1996). The technique, called support vector machine (SVM), is very well founded from the mathematical point of view and seems to provide new insight into function approximation. We implemented the SVM and tested it on a database of chaotic time series previously used to compare the performances of different approximation techniques, including polynomial and rational approximation, local polynomial techniques, radial basis functions, and neural networks. The SVM performs better than the other approaches. We also study, for a particular time series, the variability in performance with respect to the few free parameters of SVM.
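
A hedged sketch of this style of experiment: delay-embed a scalar chaotic series and fit support vector regression to predict the next value. The series (a logistic map), the embedding dimension, and the SVR parameters are illustrative stand-ins, not the database or settings used in the paper.

```python
# SV regression on a delay-embedded chaotic time series (illustrative sketch).
import numpy as np
from sklearn.svm import SVR

# Generate a chaotic scalar series (logistic map).
x = np.empty(1200); x[0] = 0.4
for t in range(1199):
    x[t + 1] = 3.9 * x[t] * (1.0 - x[t])

def delay_embed(series, dim):
    """Rows are (x_t, ..., x_{t+dim-1}); the target is x_{t+dim}."""
    X = np.array([series[t:t + dim] for t in range(len(series) - dim)])
    return X, series[dim:]

X, y = delay_embed(x, dim=4)
Xtr, ytr, Xte, yte = X[:800], y[:800], X[800:], y[800:]

model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma=1.0).fit(Xtr, ytr)
err = np.sqrt(np.mean((model.predict(Xte) - yte) ** 2))
print(f"one-step-ahead RMSE: {err:.4f}")
```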

554 citations


Proceedings Article
01 Dec 1997
TL;DR: It is shown that both invariances under group transformations and prior knowledge about locality in images can be incorporated by constructing appropriate kernel functions by exploring methods for incorporating prior knowledge in Support Vector learning machines.
Abstract: We explore methods for incorporating prior knowledge about a problem at hand in Support Vector learning machines. We show that both invariances under group transformations and prior knowledge about locality in images can be incorporated by constructing appropriate kernel functions.
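
As one illustration of encoding locality in a kernel, loosely in the spirit of the constructions discussed there (details such as patch weighting are omitted and all parameters are assumptions): correlations are first taken within small local patches and raised to a power before being combined globally, so that nearby pixels interact at a lower polynomial degree than distant ones.

```python
# A locality-aware image kernel (hedged sketch, not the authors' construction).
import numpy as np

def local_kernel(img_a, img_b, patch=4, d1=2, d2=2):
    """img_a, img_b: 2-D arrays of equal shape."""
    h, w = img_a.shape
    total = 0.0
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            pa = img_a[r:r + patch, c:c + patch].ravel()
            pb = img_b[r:r + patch, c:c + patch].ravel()
            total += (pa @ pb) ** d1          # local (low-order) interactions
    return total ** d2                        # global combination of patch scores

rng = np.random.RandomState(0)
a, b = rng.rand(16, 16), rng.rand(16, 16)
print(local_kernel(a, b))
```

Each patch term is a polynomial kernel on that patch, their sum is again a kernel, and raising it to an integer power keeps it positive definite, so the construction remains a valid SV kernel.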

336 citations



Proceedings Article
01 Dec 1997
TL;DR: A strategy for polychotomous classification that involves estimating class probabilities for each pair of classes, and then coupling the estimates together is discussed, similar to the Bradley-Terry method for paired comparisons.
Abstract: We discuss a strategy for polychotomous classification that involves estimating class probabilities for each pair of classes, and then coupling the estimates together. The coupling model is similar to the Bradley-Terry method for paired comparisons. We study the nature of the class probability estimates that arise, and examine the performance of the procedure on simulated datasets. The classifiers used include linear discriminants and nearest neighbors; application to support vector machines is also briefly described.
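
A hedged sketch of the coupling step: given pairwise estimates r[i, j] ≈ P(class i | class i or j) from the binary classifiers, the iteration below rescales class probabilities p until the induced pairwise odds p_i / (p_i + p_j) match the estimates. It follows the commonly cited iterative scheme for this problem, with illustrative data.

```python
# Pairwise coupling of two-class probability estimates (illustrative sketch).
import numpy as np

def couple(r, n=None, n_iter=100):
    """r[i, j] ~ P(class i | i or j); n[i, j] = optional pairwise sample counts."""
    k = r.shape[0]
    n = np.ones_like(r) if n is None else n
    p = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        mu = p[:, None] / (p[:, None] + p[None, :] + 1e-12)   # model pairwise odds
        for i in range(k):
            num = sum(n[i, j] * r[i, j] for j in range(k) if j != i)
            den = sum(n[i, j] * mu[i, j] for j in range(k) if j != i)
            p[i] *= num / max(den, 1e-12)                      # rescale class i
        p /= p.sum()                                           # renormalise
    return p

# Example: three classes with pairwise estimates (r[i, j] + r[j, i] = 1).
r = np.array([[0.0, 0.6, 0.8],
              [0.4, 0.0, 0.7],
              [0.2, 0.3, 0.0]])
print(couple(r))   # coupled class-probability estimates, summing to 1
```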

Book ChapterDOI
Vladimir Vapnik1
08 Oct 1997
TL;DR: The general idea of the Support Vector method is described and theorems demonstrating that the generalization ability of the SV method is based on factors which classical statistics do not take into account are presented.
Abstract: The Support Vector (SV) method is a new general method of function estimation which does not depend explicitly on the dimensionality of input space. It was applied to pattern recognition, regression estimation, and density estimation problems, as well as to problems of solving linear operator equations. In this article we describe the general idea of the SV method and present theorems demonstrating that the generalization ability of the SV method is based on factors which classical statistics do not take into account. We also describe the SV method for density estimation in a set of functions defined by a mixture of an infinite number of Gaussians.

Proceedings Article
01 Dec 1997
TL;DR: It is proved that the Green's Functions associated with regularization operators are suitable Support Vector Kernels with equivalent regularization properties, and that a large number of Radial Basis Functions, namely conditionally positive definite functions, may be used as Support Vector kernels.
Abstract: We derive the correspondence between regularization operators used in Regularization Networks and Hilbert-Schmidt Kernels appearing in Support Vector Machines. More specifically, we prove that the Green's Functions associated with regularization operators are suitable Support Vector Kernels with equivalent regularization properties. As a by-product we show that a large number of Radial Basis Functions, namely conditionally positive definite functions, may be used as Support Vector kernels.
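
A compact way to state the correspondence (the paper's notation may differ): a regularization operator P and a kernel k match when k is the Green's function of P*P, where P* is the adjoint of P.

```latex
% A kernel k and a regularization operator P correspond when k is the
% Green's function of P^{\ast}P:
(P^{\ast} P \, k)(x_i, \cdot) = \delta_{x_i}(\cdot),
% in which case the SV expansion f(x) = \sum_i \alpha_i k(x_i, x) + b
% minimizes a regularized risk with smoothness term \tfrac{\lambda}{2}\|P f\|^2 .
```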

Journal ArticleDOI
TL;DR: This paper discusses the possibility of constructing classifiers based entirely on distances or similarities, without reference to a feature space, as an alternative approach to automatic pattern recognition.


Proceedings ArticleDOI
09 Nov 1997
TL;DR: Simulation results indicate that the choice of the SVM technique to be used will depend on the optimized criteria under consideration, whether it is the torque/current ripple, the harmonic loss or the switching loss.
Abstract: Space vector modulation (SVM) techniques are becoming an industry standard, especially when it comes to fully digital AC motor drives. In this paper, four different SVM techniques are evaluated based on defined performance indexes. A new performance index based on switching loss is defined for the comparison; a comparison based on the method of hardware implementation is also discussed. Simulation results indicate that the choice of the SVM technique to be used will depend on the optimization criterion under consideration, whether it is the torque/current ripple, the harmonic loss or the switching loss.

Book ChapterDOI
08 Oct 1997
TL;DR: Different selection strategies are presented to reduce the complete quadratic classifier, which lower the required computing and memory resources by a factor of more than ten without affecting the generalization performance.
Abstract: Polynomial support vector machines have shown a competitive performance for the problem of handwritten digit recognition. However, there is a large gap in performance vs. computing resources between the linear and the quadratic approach. By computing the complete quadratic classifier out of the quadratic support vector machine, a pivot point is found to trade between performance and effort. Different selection strategies are presented to reduce the complete quadratic classifier, which lower the required computing and memory resources by a factor of more than ten without affecting the generalization performance.
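
A hedged sketch of the expansion underlying this idea: a polynomial SVM with kernel (x·z + 1)² can be rewritten as an explicit quadratic form f(x) = xᵀMx + wᵀx + c, after which evaluation no longer touches the support vectors and selection strategies can prune the resulting coefficients. Names and data below are illustrative.

```python
# Expand a degree-2 polynomial SVM into an explicit quadratic classifier.
import numpy as np

def expand_quadratic_svm(support_vectors, coeffs, bias):
    """coeffs[i] = alpha_i * y_i for support vector i; kernel is (x.z + 1)^2."""
    d = support_vectors.shape[1]
    M = np.zeros((d, d))
    w = np.zeros(d)
    c = bias
    for sv, beta in zip(support_vectors, coeffs):
        M += beta * np.outer(sv, sv)   # second-order terms of (x.sv + 1)^2
        w += 2.0 * beta * sv           # first-order terms
        c += beta                      # constant term
    return M, w, c

def decide(x, M, w, c):
    return x @ M @ x + w @ x + c       # same value as the kernel expansion

# Quick consistency check against the kernel form on random data.
rng = np.random.RandomState(0)
SV, beta, b = rng.randn(50, 8), rng.randn(50), 0.3
M, w, c = expand_quadratic_svm(SV, beta, b)
x = rng.randn(8)
kernel_form = beta @ (SV @ x + 1.0) ** 2 + b
print(np.isclose(decide(x, M, w, c), kernel_form))   # True
```

Evaluation then needs only on the order of d(d+1)/2 + d + 1 stored coefficients rather than one d-dimensional dot product per support vector, which is the resource gap the selection strategies exploit.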

Proceedings ArticleDOI
25 May 1997
TL;DR: A modified Kohonen layer in its recall mode is used to calculate the on-durations of the adjacent switching state vectors, which are then used for the space vector modulation (SVM) implementation.
Abstract: This paper presents a new algorithm for the implementation of space vector modulation (SVM). The approach employs a modified Kohonen layer in its recall mode to calculate the on-durations of the adjacent switching state vectors. When compared with conventional implementation techniques, accurate results are obtained with less computing time. The proposed scheme is implemented on a DSP-controlled 3 kVA unit (voltage source inverter) and experimental results confirm the validity of the proposed approach.
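
For context, here is a hedged sketch of the conventional on-duration calculation that schemes like this aim to reproduce with less computation: the reference vector is projected onto the two adjacent active switching vectors of a two-level inverter. The conventions used (active-vector magnitude 2·Vdc/3, sector-relative angle) are the usual textbook ones, not necessarily this paper's exact formulation.

```python
# Conventional space vector modulation on-duration calculation (sketch).
import math

def on_durations(v_ref, angle, v_dc, t_s):
    """v_ref: reference magnitude; angle: radians within the current 60-degree sector."""
    v_active = 2.0 * v_dc / 3.0                    # magnitude of the active vectors
    k = v_ref * t_s / (v_active * math.sin(math.pi / 3.0))
    t1 = k * math.sin(math.pi / 3.0 - angle)       # first adjacent active vector
    t2 = k * math.sin(angle)                       # second adjacent active vector
    t0 = t_s - t1 - t2                             # remaining time on zero vectors
    return t1, t2, t0

print(on_durations(v_ref=300.0, angle=math.radians(20), v_dc=600.0, t_s=1e-4))
```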

Book ChapterDOI
17 Sep 1997
TL;DR: The excellent recognition rates achieved in all the performed experiments indicate that the method is well-suited for aspect-based recognition.
Abstract: In this paper a method for 3-D object recognition based on Support Vector Machines (SVM) is proposed. Given a set of points which belong to either of two classes, an SVM finds the hyperplane that leaves the largest possible fraction of points of the same class on the same side, while maximizing the distance of the closest points to the hyperplane. Recognition with SVMs does not require feature extraction and can be performed directly on images regarded as points of an N-dimensional object space. The potential of the proposed method is illustrated on a database of 7,200 images of 100 different objects. The excellent recognition rates achieved in all the performed experiments indicate that the method is well-suited for aspect-based recognition.