
Showing papers on "Support vector machine published in 2000"


Book
01 Jan 2000
TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.
Abstract: From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequences analysis, etc., and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.

13,736 citations


Book
01 Mar 2000
TL;DR: This book is the first comprehensive introduction to Support Vector Machines, a new generation learning system based on recent advances in statistical learning theory, and introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods.
Abstract: This book is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. The book also introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequences analysis, etc. Their first introduction in the early 1990s led to a recent explosion of applications and deepening theoretical analysis that has now established Support Vector Machines along with neural networks as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and application of these techniques. The concepts are introduced gradually in accessible and self-contained stages, though in each stage the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book will equip the practitioner to apply the techniques and an associated web site will provide pointers to updated literature, new applications, and on-line software.

4,327 citations


Journal ArticleDOI
TL;DR: A new class of support vector algorithms for regression and classification in which a parameter ν controls the number of support vectors, eliminating one of the other free parameters of the algorithm: the accuracy parameter ε in the regression case, and the regularization constant C in the classification case.
Abstract: We propose a new class of support vector algorithms for regression and classification. In these algorithms, a parameter ν lets one effectively control the number of support vectors. While this can be useful in its own right, the parameterization has the additional benefit of enabling us to eliminate one of the other free parameters of the algorithm: the accuracy parameter ε in the regression case, and the regularization constant C in the classification case. We describe the algorithms, give some theoretical results concerning the meaning and the choice of ν, and report experimental results.

2,737 citations
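The ν parameterization is implemented in standard toolkits. A minimal sketch, assuming scikit-learn's NuSVC (which implements this formulation) and synthetic data, showing how ν bounds the support-vector fraction:

```python
# Minimal nu-SVM sketch, assuming scikit-learn; data is synthetic.
# nu lower-bounds the fraction of support vectors and upper-bounds the
# fraction of margin errors, replacing C (classification) / epsilon (regression).
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

for nu in (0.1, 0.3, 0.5):
    clf = NuSVC(nu=nu).fit(X, y)
    print(f"nu={nu:.1f}: support-vector fraction = {len(clf.support_) / len(X):.2f}")
```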


Journal ArticleDOI
TL;DR: A method of functionally classifying genes using gene expression data from DNA microarray hybridization experiments is introduced, based on the theory of support vector machines (SVMs).
Abstract: We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.

2,395 citations
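A hedged sketch of the workflow, with synthetic data standing in for microarray measurements and scikit-learn's SVC for the authors' implementation; the kernels compared are illustrative:

```python
# Sketch only: rows are genes, columns are expression measurements, and the
# binary label marks membership in a functional class. Several kernels
# (similarity functions) are compared by cross-validated accuracy.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 80))                   # synthetic expression profiles
y = (X[:, :5].sum(axis=1) > 0).astype(int)      # synthetic functional-class label

for kernel in ("linear", "poly", "rbf"):
    acc = cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
    print(f"{kernel}: mean CV accuracy = {acc:.2f}")
```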


Journal ArticleDOI
TL;DR: A new method, generalized discriminant analysis (GDA), that is close to support vector machines insofar as it provides a mapping of the input vectors into a high-dimensional feature space, where nonlinear discriminant analysis is performed using a kernel function operator.
Abstract: We present a new method that we call generalized discriminant analysis (GDA) to deal with nonlinear discriminant analysis using kernel function operator. The underlying theory is close to the support vector machines (SVM) insofar as the GDA method provides a mapping of the input vectors into high-dimensional feature space. In the transformed space, linear properties make it easy to extend and generalize the classical linear discriminant analysis (LDA) to nonlinear discriminant analysis. The formulation is expressed as an eigenvalue problem resolution. Using a different kernel, one can cover a wide class of nonlinearities. For both simulated data and alternate kernels, we give classification results, as well as the shape of the decision function. The results are confirmed using real data to perform seed classification.

1,743 citations
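The eigenvalue-problem formulation lends itself to a short sketch. The hedged numpy/scipy version below assumes a Baudat-Anouar-style generalized eigenproblem matches the paper's construction; the RBF kernel, the ridge term, and the helper name gda_directions are illustrative choices:

```python
# Hedged sketch of kernel (generalized) discriminant analysis: solve the
# feature-space analogue of LDA's between/within scatter ratio as a
# generalized eigenproblem over the centered kernel matrix.
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def gda_directions(X, y, gamma=1.0, reg=1e-6, n_components=1):
    K = np.exp(-gamma * cdist(X, X, "sqeuclidean"))   # RBF kernel matrix
    n = len(y)
    J = np.eye(n) - np.ones((n, n)) / n               # centering operator
    Kc = J @ K @ J                                    # centered kernel
    W = np.zeros((n, n))                              # block class indicator
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        W[np.ix_(idx, idx)] = 1.0 / len(idx)
    # Maximize alpha' Kc W Kc alpha / alpha' Kc Kc alpha (ridge-stabilized).
    vals, vecs = eigh(Kc @ W @ Kc, Kc @ Kc + reg * np.eye(n))
    return vecs[:, ::-1][:, :n_components]            # leading dual directions

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
y = np.repeat([0, 1], 30)
alpha = gda_directions(X, y)   # project a new x via sum_i alpha_i k(x, x_i)
                               # (centering of test kernels omitted for brevity)
```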


Proceedings Article
01 Jan 2000
TL;DR: An on-line recursive algorithm for training support vector machines, one vector at a time, is presented and interpretation of decremental unlearning in feature space sheds light on the relationship between generalization and geometry of the data.
Abstract: An on-line recursive algorithm for training support vector machines, one vector at a time, is presented. Adiabatic increments retain the Kuhn-Tucker conditions on all previously seen training data, in a number of steps each computed analytically. The incremental procedure is reversible, and decremental "unlearning" offers an efficient method to exactly evaluate leave-one-out generalization performance. Interpretation of decremental unlearning in feature space sheds light on the relationship between generalization and geometry of the data.

1,319 citations
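The adiabatic bookkeeping of the increments is too intricate for a short sketch. As a hedged baseline only, the quantity that decremental unlearning evaluates efficiently, exact leave-one-out error, can be computed the slow way by retraining (scikit-learn assumed):

```python
# Hedged baseline, NOT the paper's algorithm: decremental "unlearning" yields
# exact leave-one-out error without retraining; this naive version retrains
# an SVM n times to obtain the same number, for comparison on small data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, random_state=0)

errors = 0
for i in range(len(X)):
    mask = np.arange(len(X)) != i                 # hold out example i
    clf = SVC(kernel="linear").fit(X[mask], y[mask])
    errors += clf.predict(X[i:i + 1])[0] != y[i]
print(f"exact LOO error = {errors / len(X):.3f}")
```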


Journal ArticleDOI
TL;DR: Regularization Networks and Support Vector Machines are both reviewed in the context of Vapnik's theory of statistical learning, which provides a general foundation for the learning problem, combining functional analysis and statistics.
Abstract: Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular, the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik's theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.

1,305 citations
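As a small illustration of the special case named above, a squared-loss kernel method fit to sparse one-dimensional data; scikit-learn's KernelRidge and the synthetic sinc target are assumptions, not the paper's setup:

```python
# Hedged sketch: an RBF regularization network (squared loss plus an RKHS
# norm penalty) approximating a multivariate function from sparse data.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))                # sparse sample
y = np.sinc(X).ravel() + 0.1 * rng.normal(size=40)  # noisy target

model = KernelRidge(kernel="rbf", gamma=0.5, alpha=1e-2).fit(X, y)
print("fit at x=0:", model.predict([[0.0]])[0])
```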


Proceedings Article
01 Jan 2000
TL;DR: The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.
Abstract: We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error. This search can be efficiently performed via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.

1,112 citations
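The paper's gradient descent on leave-one-out bounds is not reproduced here. As a loose, hedged illustration only, the sketch below greedily removes features using the radius/margin quantity R^2 ||w||^2 (one bound of this family), with the radius crudely approximated by the largest distance to the centroid:

```python
# Illustrative stand-in, not the paper's method: score a candidate feature
# subset by R^2 * ||w||^2 from a linear SVM, and greedily drop the feature
# whose removal lowers the bound the most.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def radius_margin(X, y, C=10.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w_sq = np.sum(clf.coef_ ** 2)                          # ||w||^2
    R_sq = np.max(np.sum((X - X.mean(0)) ** 2, axis=1))    # crude radius^2
    return R_sq * w_sq

X, y = make_classification(n_samples=120, n_features=8, n_informative=3,
                           random_state=0)
features = list(range(X.shape[1]))
for _ in range(4):                                         # drop 4 features
    scores = [radius_margin(np.delete(X[:, features], j, axis=1), y)
              for j in range(len(features))]
    features.pop(int(np.argmin(scores)))                   # lowest bound wins
print("kept features:", features)
```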


Book
01 Oct 2000
TL;DR: This book provides an overview of recent developments in large margin classifiers, examines connections with other methods, and identifies strengths and weaknesses of the method, as well as directions for future research.
Abstract: From the Publisher: The concept of large margins is a unifying principle for the analysis of many different approaches to the classification of data from examples, including boosting, mathematical programming, neural networks, and support vector machines. The fact that it is the margin, or confidence level, of a classification--that is, a scale parameter--rather than a raw training error that matters has become a key tool for dealing with classifiers. This book shows how this idea applies to both the theoretical analysis and the design of algorithms. The book provides an overview of recent developments in large margin classifiers, examines connections with other methods (e.g., Bayesian inference), and identifies strengths and weaknesses of the method, as well as directions for future research. Among the contributors are Manfred Opper, Vladimir Vapnik, and Grace Wahba.

1,059 citations


Proceedings ArticleDOI
01 Jul 2000
TL;DR: This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content using support vector machine (SVM) classifiers, which have been shown to be efficient and effective for classification, but not previously explored in the context of hierarchical classification.
Abstract: This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train different second-level classifiers. In the hierarchical case, a model is learned to distinguish a second-level category from other categories within the same top level. In the flat non-hierarchical case, a model distinguishes a second-level category from all other second-level categories. Scoring rules can further take advantage of the hierarchy by considering only second-level categories that exceed a threshold at the top level. We use support vector machine (SVM) classifiers, which have been shown to be efficient and effective for classification, but not previously explored in the context of hierarchical classification. We found small advantages in accuracy for hierarchical models over flat models. For the hierarchical approach, we found the same accuracy using a sequential Boolean decision rule and a multiplicative decision rule. Since the sequential approach is much more efficient, requiring only 14%-16% of the comparisons used in the other approaches, we find it to be a good choice for classifying text into large hierarchical structures.

946 citations
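A toy sketch of the hierarchical arrangement with a sequential decision rule, so only classifiers under the winning top-level node are evaluated; the synthetic two-level labels and scikit-learn SVC are assumptions, and the paper's probability thresholds are omitted:

```python
# Hedged sketch: one SVM picks the top-level category, then a per-top-level
# SVM picks the second-level category within that subtree.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y2 = make_classification(n_samples=400, n_features=20, n_classes=4,
                            n_informative=6, random_state=0)
y1 = y2 // 2                          # two top-level nodes, two children each

top = SVC(kernel="linear").fit(X, y1)
second = {t: SVC(kernel="linear").fit(X[y1 == t], y2[y1 == t])
          for t in np.unique(y1)}

def predict(x):
    t = top.predict(x.reshape(1, -1))[0]            # top level first
    return second[t].predict(x.reshape(1, -1))[0]   # then within that subtree

print("prediction:", predict(X[0]), "true:", y2[0])
```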


Proceedings Article
29 Jun 2000
TL;DR: A simple active learning heuristic is described which greatly enhances the generalization behavior of support vector machines (SVMs) on several practical document classification tasks and frequently does so in less time than the naive approach of training on all available data.
Abstract: We describe a simple active learning heuristic which greatly enhances the generalization behavior of support vector machines (SVMs) on several practical document classification tasks. We observe a number of benefits, the most surprising of which is that an SVM trained on a well-chosen subset of the available corpus frequently performs better than one trained on all available data. The heuristic for choosing this subset is simple to compute, and makes no use of information about the test set. Given that the training time of SVMs depends heavily on the training set size, our heuristic not only offers better performance with fewer data, it frequently does so in less time than the naive approach of training on all available data.
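A hedged sketch in the same spirit: query labels for the pool points closest to the current hyperplane. The selection rule and all data here are illustrative, not taken from the paper:

```python
# Hedged margin-based active learning sketch: repeatedly train an SVM on the
# labelled set and request labels for the unlabelled points nearest the
# decision boundary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
labelled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(X)) if i not in set(labelled)]

for _ in range(10):                                  # 10 query rounds
    clf = SVC(kernel="linear").fit(X[labelled], y[labelled])
    dist = np.abs(clf.decision_function(X[pool]))    # distance to hyperplane
    for j in sorted(np.argsort(dist)[:5], reverse=True):  # 5 most uncertain
        labelled.append(pool.pop(int(j)))
print("accuracy after 60 labels:", clf.fit(X[labelled], y[labelled]).score(X, y))
```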


Book
01 Jan 2000
TL;DR: A textbook covering data fitting with linear models, the design and training of multilayer perceptrons (MLPs), and function approximation with MLPs, radial basis functions, and support vector machines.
Abstract: Contents: Data Fitting with Linear Models; Pattern Recognition; Multilayer Perceptrons; Designing and Training MLPs; Function Approximation with MLPs, Radial Basis Functions, and Support Vector Machines; Hebbian Learning and Principal Component Analysis; Competitive and Kohonen Networks; Principles of Digital Signal Processing; Adaptive Filters; Temporal Processing with Neural Networks; Training and Using Recurrent Networks; Appendices; Glossary; Index.

Journal ArticleDOI
TL;DR: An intuitive explanation of SVMs from a geometric perspective is provided and the classification problem is used to investigate the basic concepts behind SVMs and to examine their strengths and weaknesses from a data mining perspective.
Abstract: Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification problem is used to investigate the basic concepts behind SVMs and to examine their strengths and weaknesses from a data mining perspective. While this overview is not comprehensive, it does provide resources for those interested in further exploring SVMs.

Journal ArticleDOI
TL;DR: It is proved that the value of the span is always smaller (and can be much smaller) than the diameter of the smallest sphere containing the support vectors, used in previous bounds.
Abstract: We introduce the concept of span of support vectors (SV) and show that the generalization ability of support vector machines (SVM) depends on this new geometrical concept. We prove that the value of the span is always smaller (and can be much smaller) than the diameter of the smallest sphere containing the support vectors, used in previous bounds (Vapnik, 1998). We also demonstrate experimentally that the prediction of the test error given by the span is very accurate and has direct application in model selection (choice of the optimal parameters of the SVM).

Proceedings ArticleDOI
26 Mar 2000
TL;DR: The potential of SVM on the Cambridge ORL face database, which consists of 400 images of 40 individuals, containing quite a high degree of variability in expression, pose, and facial details, is illustrated.
Abstract: Support vector machines (SVM) have been recently proposed as a new technique for pattern recognition. SVMs with a binary tree recognition strategy are used to tackle the face recognition problem. We illustrate the potential of SVM on the Cambridge ORL face database, which consists of 400 images of 40 individuals, containing quite a high degree of variability in expression, pose, and facial details. We also present the recognition experiment on a larger face database of 1079 images of 137 individuals. We compare the SVM-based recognition with the standard eigenface approach using the nearest center classification (NCC) criterion.


Journal ArticleDOI
TL;DR: A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily using a new kernel function derived from a generative statistical model for a protein family, in this case a hidden Markov model.
Abstract: A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines using a new kernel function. The kernel function is derived from a generative statistical model for a protein family, in this case a hidden Markov model. This general approach of combining generative models like HMMs with discriminative methods such as support vector machines may have applications in other areas of biosequence analysis as well.
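Fitting an HMM is beyond a short sketch, so the hedged toy below applies the same Fisher-kernel construction to a much simpler multinomial model: map each example x to its Fisher score U_x = grad_theta log P(x|theta) and use a linear kernel on the scores. Everything here is an illustrative assumption, not the paper's HMM setup:

```python
# Hedged toy Fisher kernel: the generative model is a multinomial over symbol
# counts (standing in for the paper's HMM). Fisher score of example x is
# d/d theta_i log P(x|theta) = n_i / theta_i; the kernel is U_x . U_y.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
counts = rng.integers(0, 20, size=(100, 6)).astype(float)  # per-"sequence" symbol counts
y = rng.integers(0, 2, size=100)                           # synthetic superfamily labels

theta = counts.sum(0) / counts.sum()     # ML multinomial parameters
U = counts / theta                       # Fisher scores, one row per example
K_train = U @ U.T                        # linear kernel on Fisher scores

clf = SVC(kernel="precomputed").fit(K_train, y)
print("train accuracy:", clf.score(K_train, y))
```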

Proceedings Article
29 Jun 2000
TL;DR: A new method to recognize and handle concept changes with support vector machines that maintains a window on the training data and can effectively select an appropriate window size in a robust way is proposed.
Abstract: For many learning tasks where data is collected over an extended period of time, its underlying distribution is likely to change. A typical example is information filtering, i.e. the adaptive classification of documents with respect to a particular user interest. Both the interest of the user and the document content change over time. A filtering system should be able to adapt to such concept changes. This paper proposes a new method to recognize and handle concept changes with support vector machines. The method maintains a window on the training data. The key idea is to automatically adjust the window size so that the estimated generalization error is minimized. The new approach is both theoretically well-founded as well as effective and efficient in practice. Since it does not require complicated parameterization, it is simpler to use and more robust than comparable heuristics. Experiments with simulated concept drift scenarios based on real-world text data compare the new method with other window management approaches. We show that it can effectively select an appropriate window size in a robust way.
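A hedged stand-in for the window-adjustment loop; the paper's error estimator is replaced here by ordinary cross-validation, and best_window, the window sizes, and the drift data are all illustrative:

```python
# Hedged sketch: train an SVM on several trailing windows of the stream and
# keep the window size with the lowest estimated generalization error.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def best_window(X_stream, y_stream, sizes=(100, 200, 400)):
    errs = {}
    for w in sizes:
        Xw, yw = X_stream[-w:], y_stream[-w:]     # most recent w examples
        errs[w] = 1 - cross_val_score(SVC(kernel="linear"), Xw, yw, cv=3).mean()
    return min(errs, key=errs.get)                # size minimizing estimated error

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = np.where(np.arange(600) < 400, X[:, 0] > 0, X[:, 1] > 0).astype(int)  # drift at t=400
print("chosen window size:", best_window(X, y))
```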

Journal ArticleDOI
01 Sep 2000
TL;DR: Zien et al. use support vector machines to recognize translation initiation sites (TIS), the points in a nucleotide sequence at which protein-coding regions start.
Abstract: Motivation: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called translation initiation sites (TIS). Results: The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines for this task, and show how to incorporate prior biological knowledge by engineering an appropriate kernel function. With the described techniques the recognition performance can be improved by 26% over leading existing approaches. We provide evidence that existing related methods (e.g. ESTScan) could profit from advanced TIS recognition. Contact: {Alexander.Zien,Gunnar.Raetsch,Sebastian.

Journal ArticleDOI
TL;DR: In this article, a discrete space vector modulation (DSVM) was proposed for direct torque control of induction machines in order to emphasize the effects produced by a given voltage vector on stator flux and torque variations.
Abstract: The basic concept of direct torque control of induction machines is investigated in order to emphasize the effects produced by a given voltage vector on stator flux and torque variations. The low number of voltage vectors which can be applied to the machine using the basic DTC scheme may cause undesired torque and current ripple. An improvement of the drive performance can be obtained using a new DTC algorithm based on the application of the space vector modulation (SVM) for prefixed time intervals. In this way a sort of discrete space vector modulation (DSVM) is introduced. Numerical simulations and experimental tests have been carried out to validate the proposed method.


Proceedings ArticleDOI
29 Jun 2000
TL;DR: Without any computation-intensive resampling, the new estimators developed here are computationally much more efficient than cross-validation or bootstrapping and address the special performance measures needed for evaluating text classifiers.

Journal ArticleDOI
TL;DR: Comparative computational evaluation of the new fast iterative algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that the algorithm is very competitive.
Abstract: In this paper we give a new fast iterative algorithm for support vector machine (SVM) classifier design. The basic problem treated is one that does not allow classification violations. The problem is converted to a problem of computing the nearest point between two convex polytopes. The suitability of two classical nearest point algorithms, due to Gilbert, and Mitchell et al., is studied. Ideas from both these algorithms are combined and modified to derive our fast algorithm. For problems which require classification violations to be allowed, the violations are quadratically penalized and an idea due to Cortes and Vapnik and Friess is used to convert it to a problem in which there are no classification violations. Comparative computational evaluation of our algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that our algorithm is very competitive.
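The geometric core lends itself to a sketch. Below is the classical Gilbert iteration for the minimum-norm point of the Minkowski difference of two point sets, which is equivalent to the nearest-point problem between their convex hulls; this is a baseline, not the paper's combined and modified algorithm:

```python
# Hedged sketch of Gilbert's algorithm: find the minimum-norm point in the
# convex hull of the difference set {u - v}, whose norm equals the distance
# between conv(U) and conv(V).
import numpy as np

def gilbert_min_norm(U, V, tol=1e-8, max_iter=1000):
    w = U[0] - V[0]                                  # any point of the difference hull
    for _ in range(max_iter):
        # support point of the difference set in direction -w
        z = U[np.argmin(U @ w)] - V[np.argmax(V @ w)]
        if w @ (w - z) <= tol * np.linalg.norm(w):   # optimality check
            break
        d = z - w
        t = np.clip(-(w @ d) / (d @ d), 0.0, 1.0)    # min-norm point on [w, z]
        w = w + t * d
    return w                                         # gap vector between the hulls

rng = np.random.default_rng(0)
U = rng.normal(size=(50, 2)) + [3, 3]                # two separated clouds
V = rng.normal(size=(50, 2)) - [3, 3]
print("distance between hulls:", np.linalg.norm(gilbert_min_norm(U, V)))
```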


Proceedings Article
30 Jun 2000
TL;DR: This paper shows how the RVM can be formulated and solved within a completely Bayesian paradigm through the use of variational inference, thereby giving a posterior distribution over both parameters and hyperparameters.
Abstract: The Support Vector Machine (SVM) of Vapnik [9] has become widely established as one of the leading approaches to pattern recognition and machine learning. It expresses predictions in terms of a linear combination of kernel functions centred on a subset of the training data, known as support vectors. Despite its widespread success, the SVM suffers from some important limitations, one of the most significant being that it makes point predictions rather than generating predictive distributions. Recently Tipping [8] has formulated the Relevance Vector Machine (RVM), a probabilistic model whose functional form is equivalent to the SVM. It achieves comparable recognition accuracy to the SVM, yet provides a full predictive distribution, and also requires substantially fewer kernel functions. The original treatment of the RVM relied on the use of type II maximum likelihood (the 'evidence framework') to provide point estimates of the hyperparameters which govern model sparsity. In this paper we show how the RVM can be formulated and solved within a completely Bayesian paradigm through the use of variational inference, thereby giving a posterior distribution over both parameters and hyperparameters. We demonstrate the practicality and performance of the variational RVM using both synthetic and real world examples.

Proceedings ArticleDOI
26 Mar 2000
TL;DR: Support vector machines (SVM) are investigated for visual gender classification with low-resolution "thumbnail" faces processed from 1755 images from the FERET face database, demonstrating robustness and relative scale invariance for visual classification.
Abstract: Support vector machines (SVM) are investigated for visual gender classification with low-resolution "thumbnail" faces (21-by-12 pixels) processed from 1755 images from the FERET face database. The performance of SVM (3.4% error) is shown to be superior to traditional pattern classifiers (linear, quadratic, Fisher linear discriminant, nearest-neighbor) as well as more modern techniques such as radial basis function (RBF) classifiers and large ensemble-RBF networks. SVM also out-performed human test subjects at the same task: in a perception study with 30 human test subjects, ranging in age from mid-20s to mid-40s, the average error rate was found to be 32% for the "thumbnails" and 6.7% with higher resolution images. The difference in performance between low- and high-resolution tests with SVM was only 1%, demonstrating robustness and relative scale invariance for visual classification.

Proceedings ArticleDOI
13 Sep 2000
TL;DR: This paper investigates how SVMs with a very large number of features perform on chunk identification, the CoNLL-2000 shared task.
Abstract: In this paper, we explore the use of Support Vector Machines (SVMs) for the CoNLL-2000 shared task, chunk identification. SVMs are so-called large margin classifiers and are well known for their good generalization performance. We investigate how SVMs with a very large number of features perform on the classification task of chunk labelling.

Journal ArticleDOI
TL;DR: This paper introduces SVM's within the context of recurrent neural networks and considers a least squares version of Vapnik's epsilon insensitive loss function related to a cost function with equality constraints for a recurrent network.
Abstract: The method of support vector machines (SVM's) has been developed for solving classification and static function approximation problems. In this paper we introduce SVM's within the context of recurrent neural networks. Instead of Vapnik's epsilon insensitive loss function, we consider a least squares version related to a cost function with equality constraints for a recurrent network. Essential features of SVM's remain, such as Mercer's condition and the fact that the output weights are a Lagrange multiplier weighted sum of the data points. The solution to recurrent least squares (LS-SVM's) is characterized by a set of nonlinear equations. Due to its high computational complexity, we focus on a limited case of assigning the squared error an infinitely large penalty factor with early stopping as a form of regularization. The effectiveness of the approach is demonstrated on trajectory learning of the double scroll attractor in Chua's circuit.

Proceedings ArticleDOI
28 May 2000
TL;DR: This paper investigates imposing sparseness by pruning support values from the sorted support value spectrum which results from the solution to the linear system.
Abstract: In least squares support vector machines (LS-SVMs) for function estimation Vapnik's ε-insensitive loss function has been replaced by a cost function which corresponds to a form of ridge regression. In this way nonlinear function estimation is done by solving a linear set of equations instead of solving a quadratic programming problem. The LS-SVM formulation also involves less tuning parameters. However, a drawback is that sparseness is lost in the LS-SVM case. In this paper we investigate imposing sparseness by pruning support values from the sorted support value spectrum which results from the solution to the linear system.
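A hedged numpy sketch of both steps the abstract describes, assuming the standard LS-SVM regression system; the RBF kernel width, gamma, and the 20%-per-pass pruning fraction are illustrative choices:

```python
# Hedged LS-SVM sketch: training solves one linear system
#   [ 0   1^T         ] [b]       [0]
#   [ 1   K + I/gamma ] [alpha] = [y]
# and sparseness is imposed by repeatedly dropping the points with the
# smallest |alpha| from the sorted support-value spectrum and re-solving.
import numpy as np
from scipy.spatial.distance import cdist

def lssvm_fit(X, y, gamma=10.0, sigma2=0.5):
    K = np.exp(-cdist(X, X, "sqeuclidean") / (2 * sigma2))
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                           # b, alpha (support values)

def lssvm_predict(Xtr, b, alpha, Xte, sigma2=0.5):
    K = np.exp(-cdist(Xte, Xtr, "sqeuclidean") / (2 * sigma2))
    return K @ alpha + b

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sinc(X).ravel() + 0.05 * rng.normal(size=100)

keep = np.arange(len(X))
for _ in range(3):                                   # three pruning passes
    b, alpha = lssvm_fit(X[keep], y[keep])
    keep = keep[np.argsort(np.abs(alpha))[len(alpha) // 5:]]  # drop smallest 20%
b, alpha = lssvm_fit(X[keep], y[keep])
print("support values kept:", len(keep))
print("fit at x=0:", lssvm_predict(X[keep], b, alpha, np.array([[0.0]]))[0])
```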