Topic

Statistical learning theory

About: Statistical learning theory is a research topic. Over its lifetime, 1,618 publications have been published within this topic, receiving 158,033 citations.


Papers
Journal ArticleDOI
TL;DR: Results suggest that the infinite ensemble approach provides a significant increase in classification accuracy compared with radial basis function kernel-based support vector machines.
Abstract: Much research effort in the past ten years has been devoted to analysis of the performance of artificial neural networks in image classification (Benediktsson et al., 1990; Heermann and Khazenie, 1992). The preferred algorithm is the feed-forward multi-layer perceptron trained with back-propagation, due to its ability to handle any kind of numerical data and its freedom from distributional assumptions. Although neural networks may generally be used to classify data at least as accurately as statistical classification approaches, a number of studies have reported that users of neural classifiers have problems in setting various parameters during training (Wilkinson, 1997). The choice of network architecture, the sample size for training, the learning algorithm, and the number of iterations required for training are some of these problems. A new classification system based on statistical learning theory (Vapnik, 1995), called the support vector machine, has recently been applied to the problem of remote sensing data classification (Huang et al., 2002; Zhu and Blumberg, 2002; Gualtieri and Cromp, 1998). This technique is said to be independent of the dimensionality of feature space, as the main idea behind it is to separate the classes with a surface that maximises the margin between them, using boundary pixels to create the decision surface. The data points that are closest to the hyperplane are termed "support vectors". The number of support vectors is thus small, as they are points close to the class boundaries (Vapnik, 1995). One major advantage of support vector classifiers is the use of quadratic programming, which provides a global minimum only. The absence of local minima is a significant difference from neural network classifiers.
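The margin-maximising classifier described in this abstract can be exercised with any off-the-shelf SVM implementation. Below is a minimal sketch using scikit-learn's SVC on synthetic two-class data; the RBF kernel, the synthetic 4-band features, and the parameter values are illustrative assumptions, not settings taken from the paper.

```python
# Minimal sketch: maximum-margin classification with a kernel SVM.
# Assumptions: scikit-learn's SVC and synthetic 2-class data standing in for
# remote-sensing pixels; the kernel and C value are illustrative only.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Two Gaussian "classes" in a 4-band feature space (e.g. spectral bands).
X = np.vstack([rng.normal(0.0, 1.0, (200, 4)),
               rng.normal(1.5, 1.0, (200, 4))])
y = np.repeat([0, 1], 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Training solves a quadratic programme whose solution is the
# maximum-margin decision surface; the training points with non-zero
# dual weights are the support vectors.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("support vectors per class:", clf.n_support_)
```

The small `n_support_` counts relative to the training set size illustrate the point made above: only points near the class boundary define the decision surface.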

89 citations

Proceedings ArticleDOI
10 Sep 2000
TL;DR: Empirical results show that the proposed wavelet thresholding for image denoising under the framework provided by statistical learning theory outperforms Donoho's level dependent thresholding techniques and the advantages become more significant under finite sample and non-Gaussian noise settings.
Abstract: This paper describes wavelet thresholding for image denoising under the framework provided by statistical learning theory, a.k.a. Vapnik-Chervonenkis (VC) theory. Under the framework of VC theory, wavelet thresholding amounts to ordering wavelet coefficients according to their relevance to accurate function estimation, followed by discarding insignificant coefficients. Existing wavelet thresholding methods specify an ordering based on the coefficient magnitude, and use threshold(s) derived under a Gaussian noise assumption and asymptotic settings. In contrast, the proposed approach uses orderings that better reflect the statistical properties of natural images, and VC-based thresholding developed for finite-sample settings under very general noise assumptions. A tree structure is proposed to order the wavelet coefficients based on their magnitude, scale and spatial location. The choice of a threshold is based on the general VC method for model complexity control. Empirical results show that the proposed method outperforms Donoho's (1992, 1995) level-dependent thresholding techniques, and the advantages become more significant under finite-sample and non-Gaussian noise settings.
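For context, the conventional baseline the paper argues against is magnitude-based thresholding of wavelet coefficients. The sketch below shows that baseline with PyWavelets; it does not reproduce the paper's VC-based ordering or threshold selection, and the wavelet, noise level, and threshold formula are illustrative assumptions.

```python
# Minimal sketch: conventional magnitude-based wavelet thresholding for
# image denoising (the baseline approach, NOT the paper's VC-based method).
# Assumptions: PyWavelets (pywt), a synthetic noisy image, and a
# universal-threshold-style value -- all illustrative.
import numpy as np
import pywt

rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[16:48, 16:48] = 1.0                       # simple block "image"
noisy = clean + rng.normal(0.0, 0.2, clean.shape)

# Decompose, shrink small detail coefficients towards zero, reconstruct.
coeffs = pywt.wavedec2(noisy, "db2", level=3)
threshold = 0.2 * np.sqrt(2.0 * np.log(noisy.size))   # sigma * sqrt(2 ln N)
denoised_coeffs = [coeffs[0]] + [
    tuple(pywt.threshold(d, threshold, mode="soft") for d in detail)
    for detail in coeffs[1:]
]
denoised = pywt.waverec2(denoised_coeffs, "db2")

print("noisy MSE:   ", np.mean((noisy - clean) ** 2))
print("denoised MSE:", np.mean((denoised[:64, :64] - clean) ** 2))
```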

89 citations

Journal Article
TL;DR: A general framework for sparse semi-supervised learning, which concerns using a small portion of unlabeled data and a few labeled data to represent target functions and thus has the merit of accelerating function evaluations when predicting the output of a new example.
Abstract: In this paper, we propose a general framework for sparse semi-supervised learning, which concerns using a small portion of unlabeled data and a few labeled data to represent target functions and thus has the merit of accelerating function evaluations when predicting the output of a new example. This framework makes use of Fenchel-Legendre conjugates to rewrite a convex insensitive loss involving a regularization with unlabeled data, and is applicable to a family of semi-supervised learning methods such as multi-view co-regularized least squares and single-view Laplacian support vector machines (SVMs). As an instantiation of this framework, we propose sparse multi-view SVMs which use a squared ε-insensitive loss. The resultant optimization is an inf-sup problem and the optimal solutions have arguably saddle-point properties. We present a globally optimal iterative algorithm to optimize the problem. We give the margin bound on the generalization error of the sparse multi-view SVMs, and derive the empirical Rademacher complexity for the induced function class. Experiments on artificial and real-world data show their effectiveness. We further give a sequential training approach to show their possibility and potential for uses in large-scale problems and provide encouraging experimental results indicating the efficacy of the margin bound and empirical Rademacher complexity on characterizing the roles of unlabeled data for semi-supervised learning.
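Two quantities named in this abstract have standard textbook definitions, reproduced below for reference; the abstract does not show the paper's exact notation or normalisation, so treat these as assumed conventions rather than quotations from the paper.

```latex
% Squared epsilon-insensitive loss on a residual r = y - f(x)
% (assumed standard form):
\ell_\varepsilon(r) = \bigl(\max\{0,\; |r| - \varepsilon\}\bigr)^2

% Empirical Rademacher complexity of a function class \mathcal{F} on a
% sample x_1, ..., x_n, with i.i.d. uniform sign variables
% \sigma_i \in \{-1, +1\}; some authors use a factor 2/n instead of 1/n:
\hat{\mathcal{R}}_n(\mathcal{F}) =
  \mathbb{E}_{\sigma}\!\left[\sup_{f \in \mathcal{F}}
  \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i)\right]
```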

88 citations

01 Jan 2001
TL;DR: This chapter contains sections titled: Data Representation and Similarity, A Simple Pattern Recognition Algorithm, Some Insights From Statistical Learning Theory, Hyperplane Classifiers, Support Vector Classification, Support Vector Regression, Kernel Principal Component Analysis, Empirical Results and Implementations.
Abstract: This chapter contains sections titled: Data Representation and Similarity, A Simple Pattern Recognition Algorithm, Some Insights From Statistical Learning Theory, Hyperplane Classifiers, Support Vector Classification, Support Vector Regression, Kernel Principal Component Analysis, Empirical Results and Implementations.

88 citations

Proceedings ArticleDOI
06 Aug 2001
TL;DR: Support vector regression techniques for black-box system identification are demonstrated; the theory underpinning SVR is described, and support vector methods are compared with other approaches using radial basis networks.
Abstract: We demonstrate the use of support vector regression (SVR) techniques for black-box system identification. These methods derive from statistical learning theory, and are of great theoretical and practical interest. We describe the theory underpinning SVR, and compare support vector methods with other approaches using radial basis networks. Finally, we apply SVR to modeling the behaviour of a hydraulic robot arm, and show that SVR improves on previously published results.
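A minimal sketch of how SVR can be applied to black-box system identification: regress the current output on lagged inputs and outputs (a NARX-style formulation) with scikit-learn's SVR. The synthetic toy system, lag orders, and hyperparameters below are illustrative assumptions and do not reproduce the paper's hydraulic robot arm experiment.

```python
# Minimal sketch: black-box system identification with support vector
# regression (SVR). A NARX-style regressor maps lagged inputs/outputs to
# the next output. Synthetic plant and hyperparameters are illustrative.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
T = 500
u = rng.uniform(-1.0, 1.0, T)           # excitation signal
y = np.zeros(T)
for t in range(2, T):                    # "unknown" nonlinear toy plant
    y[t] = (0.6 * y[t - 1] - 0.2 * y[t - 2]
            + np.tanh(u[t - 1]) + rng.normal(0.0, 0.02))

# Regression matrix of lagged outputs and inputs: predict y[t] from
# y[t-1], y[t-2], u[t-1], u[t-2].
lags = 2
X = np.column_stack([y[lags - 1:T - 1], y[lags - 2:T - 2],
                     u[lags - 1:T - 1], u[lags - 2:T - 2]])
target = y[lags:]

split = 350
model = SVR(kernel="rbf", C=10.0, epsilon=0.01)
model.fit(X[:split], target[:split])

pred = model.predict(X[split:])
print("one-step-ahead RMSE:", np.sqrt(np.mean((pred - target[split:]) ** 2)))
```

The same lagged-regressor construction carries over to real measured input/output data; only the data loading and the choice of lag orders and kernel parameters would change.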

87 citations


Network Information
Related Topics (5)
Artificial neural network: 207K papers, 4.5M citations, 86% related
Cluster analysis: 146.5K papers, 2.9M citations, 82% related
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Optimization problem: 96.4K papers, 2.1M citations, 80% related
Fuzzy logic: 151.2K papers, 2.3M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 9
2022: 19
2021: 59
2020: 69
2019: 72
2018: 47