
Showing papers on "Statistical learning theory published in 2002"


Journal ArticleDOI
TL;DR: The results show that the prediction accuracy of SVM is at least as good as, and in some cases actually better than, that of ANN, while avoiding several of ANN's limitations, such as the need to find an optimal network architecture and to choose a useful training set.
Abstract: Machine learning techniques are finding more and more applications in the field of forecasting. A novel regression technique, called Support Vector Machine (SVM), based on statistical learning theory is explored in this study. SVM is based on the principle of Structural Risk Minimization, as opposed to the principle of Empirical Risk Minimization espoused by conventional regression techniques. The flood data at Dhaka, Bangladesh, are used in this study to demonstrate the forecasting capabilities of SVM. The results are compared with those of an Artificial Neural Network (ANN) based model for one-lead-day to seven-lead-day forecasting. The improvements in maximum predicted water level errors by SVM over ANN for four-lead-day to seven-lead-day forecasts are 9.6 cm, 22.6 cm, 4.9 cm and 15.7 cm, respectively. The results show that the prediction accuracy of SVM is at least as good as, and in some cases (particularly at higher lead days) actually better than, that of ANN, yet it avoids many of the limitations of ANN, for example the need to arrive at an optimal network architecture and to choose a useful training set. Thus, SVM appears to be a very promising prediction tool.

298 citations
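As a concrete illustration of the SVM regression principle named above, the sketch below contrasts Vapnik's epsilon-insensitive loss (the loss standard SVR minimizes, alongside a complexity penalty) with the squared loss of conventional Empirical Risk Minimization. The water-level numbers are made up for illustration, not taken from the paper:

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.5):
    """Vapnik's epsilon-insensitive loss: residuals inside the
    eps-tube cost nothing; larger errors grow only linearly."""
    return [max(0.0, abs(t - p) - eps) for t, p in zip(y_true, y_pred)]

def squared_loss(y_true, y_pred):
    """Conventional least-squares loss used by ERM-style regression."""
    return [(t - p) ** 2 for t, p in zip(y_true, y_pred)]

levels = [4.2, 5.1, 6.0]   # hypothetical observed water levels (m)
preds  = [4.4, 5.0, 7.0]   # hypothetical model forecasts

print(eps_insensitive_loss(levels, preds))  # small residuals are ignored
print(squared_loss(levels, preds))          # every residual is penalized
```

The flat zero region of the epsilon-tube is what makes the solution sparse: only points outside the tube become support vectors.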


Journal ArticleDOI
TL;DR: Support vector machines (SVM) as a recent approach to classification implement classifiers of an adjustable flexibility, which are automatically and in a principled way optimised on the training data for a good generalisation performance.

234 citations


Journal ArticleDOI
TL;DR: This review shows that the solution of the classical Tikhonov regularization problem can be derived from the regularized functional defined by a linear differential (integral) operator in the spatial (Fourier) domain.
Abstract: This review provides a comprehensive understanding of regularization theory from different perspectives, emphasizing smoothness and simplicity principles. Using the tools of operator theory and Fourier analysis, it is shown that the solution of the classical Tikhonov regularization problem can be derived from the regularized functional defined by a linear differential (integral) operator in the spatial (Fourier) domain. State-of-the-art research relevant to the regularization theory is reviewed, covering Occam's razor, minimum length description, Bayesian theory, pruning algorithms, informational (entropy) theory, statistical learning theory, and equivalent regularization. The universal principle of regularization in terms of Kolmogorov complexity is discussed. Finally, some prospective studies on regularization theory and beyond are suggested.

132 citations
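To make the Tikhonov trade-off concrete, here is a minimal sketch of its scalar special case: regularized least squares for a one-parameter model, where the penalty term plays the role of the smoothness functional discussed in the review. Data values are illustrative only:

```python
def ridge_1d(xs, ys, lam):
    """Tikhonov-regularized least squares for the model y ~ w*x:
    minimizes sum (y - w*x)^2 + lam * w^2, which has the closed form
    w = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]           # exact relation y = 2x
print(ridge_1d(xs, ys, 0.0))    # lam = 0 recovers w = 2.0
print(ridge_1d(xs, ys, 14.0))   # a heavier penalty shrinks w toward 0
```

The same structure scales to the general case: the regularization operator adds a term to the normal equations that biases the solution toward simpler (smoother) functions.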


Journal ArticleDOI
TL;DR: Experimental evidence highlights the gain in quality resulting from combining some of the most widely used prediction methods with the authors' SVMs rather than with the ensemble methods traditionally used in the field; the gain increases when the outputs of the combiners are post-processed with a DP algorithm.
Abstract: The idea of performing model combination, instead of model selection, has a long theoretical background in statistics. However, making use of theoretical results is ordinarily subject to the satisfaction of strong hypotheses (weak error correlation, availability of large training sets, possibility to rerun the training procedure an arbitrary number of times, etc.). In contrast, the practitioner is frequently faced with the problem of combining a given set of pre-trained classifiers, with highly correlated errors, using only a small training sample. Overfitting is then the main risk, which cannot be overcome but with a strict complexity control of the combiner selected. This suggests that SVMs should be well suited for these difficult situations. Investigating this idea, we introduce a family of multi-class SVMs and assess them as ensemble methods on a real-world problem. This task, protein secondary structure prediction, is an open problem in biocomputing for which model combination appears to be an issue of central importance. Experimental evidence highlights the gain in quality resulting from combining some of the most widely used prediction methods with our SVMs rather than with the ensemble methods traditionally used in the field. The gain increases when the outputs of the combiners are post-processed with a DP algorithm.

127 citations
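As a point of reference for the combination-versus-selection discussion, the following sketch shows the zero-parameter baseline combiner — majority voting over the outputs of pre-trained classifiers — which trained combiners such as the paper's multi-class SVMs aim to improve upon. The predictor outputs below are hypothetical:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the outputs of pre-trained classifiers for one example
    by majority vote: the simplest combiner, with zero parameters to
    overfit on a small training sample."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical secondary-structure labels (H=helix, E=strand, C=coil)
# emitted by three pre-trained predictors for four residues:
per_residue = [("H", "H", "C"), ("E", "E", "E"), ("C", "H", "C"), ("H", "C", "C")]
print([majority_vote(p) for p in per_residue])  # one combined label per residue
```

A trained combiner replaces the vote with a function of the base-classifier outputs fitted on held-out data, which is exactly where complexity control becomes the central concern.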


Proceedings Article
01 Jan 2002
TL;DR: A new computational model is proposed that is based on principles of high dimensional dynamical systems in combination with statistical learning theory and can be implemented on generic evolved or found recurrent circuitry.
Abstract: A key challenge for neural modeling is to explain how a continuous stream of multi-modal input from a rapidly changing environment can be processed by stereotypical recurrent circuits of integrate-and-fire neurons in real-time. We propose a new computational model that is based on principles of high dimensional dynamical systems in combination with statistical learning theory. It can be implemented on generic evolved or found recurrent circuitry.

97 citations


Journal ArticleDOI
TL;DR: Simulation results for both artificial and real data show that the generalization performance of the method is a good approximation of that of SVMs, and that the computational complexity is greatly reduced by the method.

95 citations


Journal ArticleDOI
TL;DR: Techniques such as support vector machines and regularization networks, which can be justified in this theoretical framework and have proved useful in a number of image analysis applications, are discussed.

75 citations


Proceedings ArticleDOI
07 Aug 2002
TL;DR: The experimental comparison between the support vector machine and the classical radial basis function (RBF) network demonstrates that the SVM is superior to conventional RBF in predicting air quality parameters with different time series.
Abstract: Forecasting of air quality parameters is an important topic of atmospheric and environmental research today due to the health impact caused by airborne pollutants existing in urban areas. The support vector machine (SVM), as a novel type of learning machine based on statistical learning theory, can be used for regression and time series prediction and has been reported to perform well with some promising results. The work presented examines the feasibility of applying SVM to predict pollutant concentrations. The functional characteristics of the SVM are also investigated. The experimental comparison between the SVM and the classical radial basis function (RBF) network demonstrates that the SVM is superior to conventional RBF in predicting air quality parameters with different time series.

62 citations


Journal ArticleDOI
TL;DR: Empirical comparisons of different methods for complexity control suggest practical advantages of using VC-based model selection in settings where VC generalization bounds can be rigorously applied, and it is argued that VC theory provides a methodological framework for complexity control even when its technical results cannot be directly applied.
Abstract: We discuss the problem of model complexity control, also known as model selection. This problem frequently arises in the context of predictive learning and adaptive estimation of dependencies from finite data. First we review the problem of predictive learning as it relates to model complexity control. Then we discuss several issues important for practical implementation of complexity control, using the framework provided by Statistical Learning Theory (or Vapnik-Chervonenkis theory). Finally, we show practical applications of Vapnik-Chervonenkis (VC) generalization bounds for model complexity control. Empirical comparisons of different methods for complexity control suggest practical advantages of using VC-based model selection in settings where VC generalization bounds can be rigorously applied. We also argue that VC theory provides a methodological framework for complexity control even when its technical results cannot be directly applied.

55 citations
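The idea of VC-based model selection can be sketched as follows: pick the complexity whose empirical risk plus VC confidence term is smallest. The bound below is one commonly quoted form (constants vary between references), and the empirical risks are invented for illustration:

```python
import math

def vc_bound(emp_risk, h, n, delta=0.05):
    """An illustrative VC generalization bound: with probability 1-delta,
    true risk <= empirical risk + sqrt((h*(ln(2n/h)+1) + ln(4/delta)) / n),
    where h is the VC dimension and n the sample size."""
    penalty = math.sqrt((h * (math.log(2 * n / h) + 1) + math.log(4 / delta)) / n)
    return emp_risk + penalty

# Model selection: choose the complexity h whose bound is smallest.
n = 1000
emp = {2: 0.20, 10: 0.12, 50: 0.05, 200: 0.01}   # toy empirical risks
bounds = {h: vc_bound(e, h, n) for h, e in emp.items()}
best_h = min(bounds, key=bounds.get)
print(best_h, round(bounds[best_h], 3))
```

Note how the confidence term grows with h: on this toy data the bound favors the simplest model even though richer models fit the sample better, which is the structural-risk-minimization trade-off in miniature.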


Journal Article
TL;DR: In this paper, a progressive transductive support vector machine was proposed to extend Joachims' transductive SVM to handle different class distributions. Transductive learning is expected to yield improvements when the training sets are small or when there is a significant deviation between the training and working-set subsamples of the total population, and the experimental results show that the proposed algorithm is very promising.
Abstract: Support vector machine (SVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. By taking a transductive approach instead of an inductive one in support vector classifiers, the working set can be used as an additional source of information about margins. Compared with traditional inductive support vector machines, transductive support vector machines are often more powerful and can give better performance. In transduction, one estimates the classification function at points within the working set using information from both the training and the working set data. This helps to improve the generalization performance of SVMs, especially when training data is inadequate. Intuitively, we would expect transductive learning to yield improvements when the training sets are small or when there is a significant deviation between the training and working set subsamples of the total population. In this paper, a progressive transductive support vector machine is proposed to extend Joachims' transductive SVM to handle different class distributions. It solves the problem of having to estimate the ratio of positive/negative examples from the working set. The experimental results show that the algorithm is very promising.

49 citations
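The progressive-labeling idea can be sketched with a deliberately trivial 1-D scorer standing in for the SVM margin: in each round, the most confidently scored working-set points are labeled and folded into the training set. Everything below (data, batch size, scorer) is illustrative, not the paper's algorithm:

```python
def progressive_transduction(labeled, working, rounds=3, batch=2):
    """Sketch of progressive transduction: each round, label the
    working-set points farthest from the current decision threshold
    (i.e. the most confident ones) and add them to the training set.
    `labeled` is a list of (x, y) with y in {-1, +1}; `working` is xs.
    """
    labeled = list(labeled)
    working = list(working)
    for _ in range(rounds):
        if not working:
            break
        pos = [x for x, y in labeled if y > 0]
        neg = [x for x, y in labeled if y < 0]
        thr = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        # most confident = largest distance from the threshold
        working.sort(key=lambda x: abs(x - thr), reverse=True)
        for x in working[:batch]:
            labeled.append((x, 1 if x > thr else -1))
        working = working[batch:]
    return labeled

seed = [(0.0, -1), (1.0, -1), (9.0, 1), (10.0, 1)]
unlab = [0.5, 4.9, 5.1, 9.5]
final = progressive_transduction(seed, unlab)
print(sorted(final))  # easy points labeled first, borderline ones last
```

The progressive schedule is the point: borderline points near 5.0 are only labeled after the confident ones have refined the threshold, which is how the real algorithm avoids committing early to a wrong positive/negative ratio.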


Proceedings ArticleDOI
07 Nov 2002
TL;DR: In the paper, the SVM nonlinear classification algorithm is reviewed and the SVM nonlinear classifier is applied to deal with fault diagnosis.
Abstract: Support vector machine (SVM) is a novel machine learning method based on statistical learning theory. SVM is a powerful tool for solving problems with small samples, nonlinearities and local minima, and offers excellent classification performance. In this paper, the SVM nonlinear classification algorithm is reviewed and the SVM nonlinear classifier is applied to fault diagnosis. SVM is easy to implement for fault diagnosis, and effective results are obtained.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: This paper examines SVM from the statistical learning theory perspective, and the convex hull problem and Gabriel's graph from the computational geometry perspective, to describe their theoretical connections and the implications for practical implementation.
Abstract: One of the major tasks in the support vector machine (SVM) algorithm is to locate the discriminant boundary in a classification task. It is crucial to understand various approaches to this particular task. In this paper, we survey several different methods of finding the boundary from different disciplines. In particular, we examine SVM from the statistical learning theory perspective, and the convex hull problem and Gabriel's graph from the computational geometry perspective, to describe their theoretical connections and the implications for practical implementation. Moreover, we implement these methods and demonstrate their respective results on classification accuracy and run-time complexity. Finally, we conclude with some discussion of these three different techniques.

Proceedings ArticleDOI
08 Jul 2002
TL;DR: This paper uses support vector machines, through statistical learning theory, to compress the probabilistic information available in the form of IID samples and apply it to solve the Bayesian data fusion problem.
Abstract: The basic quantity to be estimated in the Bayesian approach to data fusion is the conditional probability density function (CPDF). Computationally efficient particle filtering approaches are becoming more important in estimating these CPDFs. In this approach, IID samples are used to represent the conditional probability densities. However, their application in data fusion is severely limited due to the fact that the information is stored in the form of a large set of samples. In all practical data fusion systems that have limited communication bandwidth, broadcasting this probabilistic information, available as a set of samples, to the fusion center is impractical. Support vector machines, through statistical learning theory, provide a way of compressing information by generating optimal kernel-based representations. In this paper we use SVM to compress the probabilistic information available in the form of IID samples and apply it to solve the Bayesian data fusion problem. We demonstrate this technique on a multi-sensor tracking example.

Journal ArticleDOI
TL;DR: In this article, the authors present preliminary results for a new framework for the identification of predictor models for unknown systems, building on recent developments in statistical learning theory. The three key elements of their approach are: the unknown mechanism that generates the observed data (referred to as the remote data generation mechanism - DGM), a selected family of models with which they want to describe the observations (the data descriptor model - DDM), and a consistency criterion, which serves to assess whether a given observation is compatible with the selected model.

Journal Article
TL;DR: This is one of the first works proposing the use of SVM in keyword spotting, in order to improve recognition and rejection accuracy, and the results obtained are very promising.
Abstract: Support Vector Machines is a new and promising technique in statistical learning theory. Recently, this technique has produced very interesting results in pattern recognition [1,2,3]. In this paper, one of the first applications of the Support Vector Machines (SVM) technique to the problem of keyword spotting is presented. It classifies the correct and the incorrect keywords by using linear and Radial Basis Function kernels. This is one of the first works proposing the use of SVM in keyword spotting, in order to improve recognition and rejection accuracy. The results obtained are very promising.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: In this article, a new approach that combines the system identification technique and the SVM learning algorithm for fault detection and isolation in dynamic systems is presented, where a conventional heat exchanger dynamics is used to illustrate the technique.
Abstract: Support vector machines (SVMs), based on Vapnik's statistical learning theory, are a new tool that can be used for fault detection and isolation in dynamic systems. This paper presents a new approach that combines the system identification technique and the SVM learning algorithm for fault detection and isolation in dynamic systems. The dynamics of a conventional heat exchanger are used to illustrate the technique.

01 Jan 2002
TL;DR: In this contribution, connections between conventional techniques of pattern recognition, evolutionary approaches, and newer results from computational and statistical learning theory are brought together in the context of the automatic design of RBF regression networks.
Abstract: While amazing applications have been demonstrated in different science and engineering fields using neural networks and evolutionary approaches, one of the key elements of their further acceptance and proliferation is the study and provision of procedures for the automatic design of neural architectures and associated learning methods, i.e., in general, the study of the systematic and automatic design of artificial brains. In this contribution, connections between conventional techniques of pattern recognition, evolutionary approaches, and newer results from computational and statistical learning theory are brought together in the context of the automatic design of RBF regression networks.

Book ChapterDOI
09 Sep 2002
TL;DR: In this article, one of the first applications of the Support Vector Machines (SVM) technique to the problem of keyword spotting is presented, which classifies the correct and the incorrect keywords by using linear and Radial Basis Function kernels.
Abstract: Support Vector Machines is a new and promising technique in statistical learning theory. Recently, this technique has produced very interesting results in pattern recognition [1,2,3]. In this paper, one of the first applications of the Support Vector Machines (SVM) technique to the problem of keyword spotting is presented. It classifies the correct and the incorrect keywords by using linear and Radial Basis Function kernels. This is one of the first works proposing the use of SVM in keyword spotting, in order to improve recognition and rejection accuracy. The results obtained are very promising.

Journal Article
TL;DR: This article introduces the training algorithms for SVM (Support Vector Machine), the newest branch of statistical learning theory, which can be classified into three categories: decomposition algorithms, represented by SVMlight; sequential algorithms; and online training algorithms.
Abstract: This article introduces the training algorithms for SVM (Support Vector Machine), the newest branch of statistical learning theory. They can be classified into three categories: the first is the decomposition algorithm, represented by SVMlight; the second is the sequential algorithm; the third is the online training algorithm. The advantages and disadvantages of all three kinds of algorithms are analysed. Other algorithms, including multi-class algorithms, are introduced as well, along with future directions and applications of SVM in pattern recognition, data mining, and related fields.
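As an illustration of the "online training" category, here is a kernel perceptron — not one of the surveyed SVM trainers, but the simplest online kernel learner, and it shares the support-vector-style expansion f(x) = sum_i alpha_i K(x_i, x):

```python
import math

def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def kernel_perceptron(stream, epochs=5, gamma=1.0):
    """Minimal online kernel learner: on each mistake, store the point
    with a coefficient equal to its label, mirroring the kernel
    expansion that SVM training algorithms build up incrementally."""
    alphas = []  # list of (coefficient, stored_point)
    for _ in range(epochs):
        for x, y in stream:
            score = sum(c * rbf(sx, x, gamma) for c, sx in alphas)
            if y * score <= 0:          # mistake -> add this point
                alphas.append((y, x))
    return alphas

data = [((0.0, 0.0), -1), ((0.0, 1.0), -1), ((3.0, 3.0), 1), ((3.0, 4.0), 1)]
model = kernel_perceptron(data)
predict = lambda x: 1 if sum(c * rbf(sx, x) for c, sx in model) > 0 else -1
print([predict(x) for x, _ in data])  # all four training labels recovered
```

Unlike SVMlight-style decomposition, nothing here optimizes a margin; the point is only the shape of the update — one pass per example, one stored coefficient per mistake — that characterizes the online family.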

Journal Article
TL;DR: Three support vector machine models were established for the VCR stope, the carbon leader stope and the tunnel in deep gold mines, respectively, and they give accurate predictions for new conditions.
Abstract: A new method based on support vector machines for predicting rockbursts is proposed. It uses a support vector machine to represent the nonlinear relationship between a rockburst and its contributing factors. The model learns from case histories and can then quickly predict rockbursts for similar conditions. Three support vector machine models were established for the VCR stope, the carbon leader stope and the tunnel in deep gold mines, respectively. The models give accurate predictions for new conditions.

Proceedings Article
21 Jul 2002
TL;DR: An algorithm for learning stable machines is presented, motivated by recent results in statistical learning theory. The algorithm is similar to Breiman's bagging, despite some important differences: it computes an ensemble combination of machines trained on small random sub-samples of an initial training set.
Abstract: We present an algorithm for learning stable machines which is motivated by recent results in statistical learning theory. The algorithm is similar to Breiman's bagging despite some important differences in that it computes an ensemble combination of machines trained on small random sub-samples of an initial training set. A remarkable property is that it is often possible to just use the empirical error of these combinations of machines for model selection. We report experiments using support vector machines and neural networks validating the theory.

Proceedings ArticleDOI
Milind Naphade, Sankar Basu, John R. Smith, Ching-Yung Lin, Belle L. Tseng
10 Dec 2002
TL;DR: The use of active learning for the annotation engine that minimizes the number of training samples to be labeled for satisfactory performance is explored in the context of recent TREC Video benchmark exercise.
Abstract: Statistical modeling for content-based retrieval is examined in the context of the recent TREC Video benchmark exercise. The TREC Video exercise can be viewed as a test bed for evaluation and comparison of a variety of different algorithms on a set of high-level queries for multimedia retrieval. We report on the use of techniques adopted from statistical learning theory. Our method depends on training models on large data sets. In particular, we use statistical models such as Gaussian mixture models to build computational representations for a variety of semantic concepts including rocket-launch, outdoor greenery, sky, etc. Training requires a large amount of annotated (labeled) data. Thus, we explore the use of active learning for the annotation engine, which minimizes the number of training samples to be labeled for satisfactory performance.
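The active-learning step can be sketched with uncertainty sampling, one standard strategy for minimizing annotation effort. The paper does not specify this exact criterion, and the shot names and concept scores below are hypothetical:

```python
def uncertainty_sample(pool, scores, k=2):
    """Active learning by uncertainty sampling: ask the annotator to
    label only the k pool items whose concept score is closest to the
    0.5 decision boundary, instead of labeling everything."""
    ranked = sorted(pool, key=lambda item: abs(scores[item] - 0.5))
    return ranked[:k]

# Hypothetical model scores for the concept "rocket-launch":
scores = {"shot_a": 0.97, "shot_b": 0.52, "shot_c": 0.08, "shot_d": 0.45}
print(uncertainty_sample(list(scores), scores))  # the two borderline shots
```

Confident shots (scores near 0 or 1) are left unlabeled; annotation budget goes where the current model is least sure, which is what shrinks the labeled-set requirement.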

Journal Article
TL;DR: This paper introduces the principles of SLT and the SVM algorithm and surveys their prospective applications in the fields of chemistry and chemical industry.
Abstract: Great achievements have been made in the development of statistical learning theory (SLT) and support vector machines (SVM), as well as kernel techniques. This paper introduces the principles of SLT and the SVM algorithm and surveys their prospective applications in the fields of chemistry and chemical industry.

Proceedings ArticleDOI
09 Dec 2002
TL;DR: A progressive transductive support vector machine is proposed to extend Joachims' transductive SVM to handle different class distributions, solving the problem of having to estimate the ratio of positive/negative examples from the working set.
Abstract: Support Vector Machine (SVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. By taking a transductive approach instead of an inductive one in support vector classifiers, the test set can be used as an additional source of information about margins. Intuitively, we would expect transductive learning to yield improvements when the training sets are small or when there is a significant deviation between the training and working set subsamples of the total population. In this paper, a progressive transductive support vector machine is proposed to extend Joachims' transductive SVM to handle different class distributions. It solves the problem of having to estimate the ratio of positive/negative examples from the working set. The experimental results show that the algorithm is very promising.

Journal ArticleDOI
TL;DR: The problem studied is the behavior of a discrete classifier on a finite learning sample; the suggested approach frequently recommends a significantly smaller learning sample size.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: This work proposes to apply the Benders decomposition technique to the resulting LP for the regression case, and preliminary results show that this technique is much faster than the QP formulation.
Abstract: The theory of the support vector machine (SVM) algorithm is based on statistical learning theory. Training of SVMs leads to either a quadratic programming (QP) problem or a linear programming (LP) problem. This depends on the specific norm that is used when the distance between the convex hulls of the two classes is computed. The l1-norm distance leads to a large-scale linear programming problem in the case where the sample size is very large. We propose to apply the Benders decomposition technique to the resulting LP for the regression case. Preliminary results show that this technique is much faster than the QP formulation.
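For reference, one common way an l1-norm support vector regressor is posed as an LP (the notation here is illustrative, not necessarily the paper's): the weight vector is split as w = u - v with u, v >= 0, so that the l1 norm of w becomes a linear objective term:

```latex
\min_{u,v,b,\xi,\xi^*} \; \sum_{j}(u_j + v_j) \;+\; C\sum_{i}\left(\xi_i + \xi_i^*\right)
\quad \text{s.t.} \quad
\begin{aligned}
 y_i - (u - v)^\top x_i - b &\le \varepsilon + \xi_i, \\
 (u - v)^\top x_i + b - y_i &\le \varepsilon + \xi_i^*, \\
 u,\, v,\, \xi,\, \xi^* &\ge 0 .
\end{aligned}
```

Every constraint and the objective are linear, so the problem is a standard LP; its block structure (coupling variables u, v, b against per-sample slacks) is what makes it a natural candidate for Benders decomposition when the sample size is large.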

Book ChapterDOI
01 Jan 2002
TL;DR: Support vector machines (SVM) are a new machine learning approach based on statistical learning theory (Vapnik-Chervonenkis theory), which has a solid mathematical background for dependency estimation and predictive learning from finite data sets as discussed by the authors.
Abstract: Support vector machines (SVM) are a new machine learning approach based on statistical learning theory (Vapnik-Chervonenkis theory). VC theory has a solid mathematical background for dependency estimation and predictive learning from finite data sets. SVM is based on the structural risk minimisation principle, aiming to minimise both the empirical risk and the complexity of the model, thereby providing high generalisation abilities. SVM provides non-linear classification and regression by mapping the input space into a higher-dimensional feature space using kernel functions, where the optimal solutions are constructed. This paper presents a review of the use of SVM for the analysis and modelling of spatially distributed information. The methodology developed here combines the power of SVM with well-known geostatistical approaches such as exploratory data analysis and exploratory variography. A case study (classification and regression) based on reservoir data with 294 vertically averaged porosity values and 2D seismic velocity and amplitude is presented. The results are also compared with geostatistical models.

Book ChapterDOI
17 Jun 2002
TL;DR: The Lagrangian differential gradient method is applied to the training and pruning of an RBF network, which shows better generalization performance and is computationally faster than SVM with a Gaussian kernel, especially for large training data sets.
Abstract: The Support Vector Machine (SVM) has recently been introduced as a new learning technique, based on statistical learning theory, for solving a variety of real-world applications. The classical Radial Basis Function (RBF) network has a similar structure to SVM with a Gaussian kernel. In this paper we compare the generalization performance of the RBF network and SVM in classification problems. We applied the Lagrangian differential gradient method for training and pruning the RBF network. The RBF network shows better generalization performance and is computationally faster than SVM with a Gaussian kernel, especially for large training data sets.
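The structural similarity between the two models can be made explicit: both compute a weighted sum of Gaussian basis functions and differ in how the centers and weights are obtained (e.g. clustering plus least squares for the RBF network, margin maximization for the SVM). A minimal sketch with illustrative values:

```python
import math

def gaussian_kernel(x, c, width=1.0):
    """Gaussian basis/kernel shared by RBF networks and Gaussian-kernel SVMs."""
    return math.exp(-((x - c) ** 2) / (2 * width ** 2))

def rbf_output(x, centers, weights, width=1.0):
    """Both models compute f(x) = sum_i w_i * K(x, c_i) (plus a bias term
    for the SVM); they differ in how centers and weights are chosen,
    not in the functional form."""
    return sum(w * gaussian_kernel(x, c, width) for c, w in zip(centers, weights))

centers, weights = [0.0, 2.0], [1.0, -1.0]
print(round(rbf_output(0.0, centers, weights), 4))  # 1 - exp(-2), about 0.8647
```

For an SVM the centers are the support vectors selected by the optimizer; for an RBF network they are typically far fewer and chosen up front, which is one source of the speed difference the paper reports.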

01 Jan 2002
TL;DR: Extending several standard results, among them a famous theorem by Bartlett, distribution-free uniform strong laws of large numbers devoted to multi-class large margin discriminant models are derived.
Abstract: Vapnik's statistical learning theory has mainly been developed for two types of problems: pattern recognition (computation of dichotomies) and regression (estimation of real-valued functions). Only in recent years has multi-class discriminant analysis been studied independently. Extending several standard results, among them a famous theorem by Bartlett, we have derived distribution-free uniform strong laws of large numbers devoted to multi-class large margin discriminant models. This technical report deals with the computation of the capacity measures involved in these bounds on the expected risk. Straightforward extensions of results regarding large margin classifiers highlight the central role played by a new generalized VC dimension, which can be seen either as an extension of the fat-shattering dimension to the multivariate case, or as a scale-sensitive version of the graph dimension. The theorems derived are applied to the architecture shared by all the multi-class SVMs proposed so far, which provides us with a simple theoretical framework to study them, compare their performance and design new machines.

Journal Article
TL;DR: A soft sensor model based on the SVM is presented that features high learning speed, good approximation, good generalization ability, and little dependence on the sample set.
Abstract: Support vector machine (SVM) is a new learning machine based on statistical learning theory. This paper presents a soft sensor model based on the SVM. Theoretical and simulation analysis indicates that this method features high learning speed, good approximation, good generalization ability, and little dependence on the sample set. It performs better than soft sensor modeling based on the RBF neural network.