SciSpace (formerly Typeset)
Topic

Statistical learning theory

About: Statistical learning theory is a research topic. Over its lifetime, 1,618 publications have been published on this topic, receiving 158,033 citations.


Papers
Journal ArticleDOI
TL;DR: In this article, the authors provide a tutorial overview of some aspects of statistical learning theory, which also goes by other names such as statistical pattern recognition, nonparametric classification and estimation, and supervised learning.
Abstract: In this article, we provide a tutorial overview of some aspects of statistical learning theory, which also goes by other names such as statistical pattern recognition, nonparametric classification and estimation, and supervised learning. We focus on the problem of two-class pattern classification for various reasons. This problem is rich enough to capture many of the interesting aspects that are present in the cases of more than two classes and in the problem of estimation, and many of the results can be extended to these cases. Focusing on two-class pattern classification simplifies our discussion, and yet it is directly applicable to a wide range of practical settings. We begin with a description of the two-class pattern recognition problem. We then discuss various classical and state-of-the-art approaches to this problem, with a focus on fundamental formulations, algorithms, and theoretical results. In particular, we describe nearest neighbor methods, kernel methods, multilayer perceptrons, Vapnik-Chervonenkis theory, support vector machines, and boosting. WIREs Comp Stat 2011 3 543-556 DOI: 10.1002/wics.179
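As a concrete illustration of one of the classical methods the tutorial surveys, here is a minimal k-nearest-neighbor two-class classifier in Python. This is an illustrative sketch, not code from the article; the Gaussian toy data and the choice k=3 are assumptions for the demo.

```python
# Minimal k-nearest-neighbor two-class classifier (labels in {-1, +1}).
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Predict by majority vote among the k nearest training points
    (Euclidean distance); odd k avoids ties for +/-1 labels."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
        nearest = np.argsort(dists)[:k]               # indices of the k closest
        preds.append(np.sign(y_train[nearest].sum())) # majority vote
    return np.array(preds)

# Toy usage: two Gaussian blobs, one per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
y = np.array([-1] * 50 + [+1] * 50)
print(knn_predict(X, y, np.array([[-1.0, -1.0], [1.0, 1.0]])))
```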

46 citations

Journal ArticleDOI
TL;DR: The work presented here examines the feasibility of applying SVMs to the field of high-angle-of-attack unsteady aerodynamic modeling and concludes that the least squares SVM (LS-SVM) models are in good agreement with the test data, indicating the satisfactory learning and generalization performance of LS-SVMs.
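The paper itself provides no code; the following is a minimal sketch of a generic least-squares SVM regressor with an RBF kernel, the model family it applies. The dual of an LS-SVM reduces to one linear system, which is what makes the method cheap to train. The hyperparameters gamma and sigma and the sine-curve data are illustrative assumptions, not the paper's setup.

```python
# Least-squares SVM regression: solve one (n+1)x(n+1) linear system
# [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y], then predict with
# f(x) = sum_i alpha_i * K(x, x_i) + b.
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = len(y)
    K = rbf(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma   # regularized kernel block
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]              # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    return rbf(X_new, X_train, sigma) @ alpha + b

# Toy usage: fit a noisy sine curve.
X = np.linspace(0, 6, 40)[:, None]
y = np.sin(X).ravel() + 0.1 * np.random.default_rng(1).normal(size=40)
b, alpha = lssvm_fit(X, y)
print(lssvm_predict(X, b, alpha, np.array([[1.5], [3.0]])))
```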

46 citations

Journal ArticleDOI
TL;DR: The results based on support vector regression machine learning confirm that this approach provides a framework for a general, accurate, and computationally acceptable multi-layer buildup factor model.
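As a hedged illustration of the model family only (not the paper's buildup-factor model or data), here is a generic support vector regression fit using scikit-learn's SVR; the synthetic two-feature inputs and smooth target are stand-ins for quantities like shield thickness and photon energy.

```python
# Generic support vector regression with an RBF kernel on synthetic data.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (200, 2))            # two stand-in physical features
y = np.exp(0.3 * X[:, 0]) / (1 + X[:, 1])   # synthetic smooth target
y += 0.05 * y * rng.normal(size=200)        # multiplicative noise

model = SVR(kernel="rbf", C=100.0, epsilon=0.01).fit(X, y)
print(model.predict(X[:3]), y[:3])          # predictions vs. noisy targets
```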

46 citations

Posted Content
TL;DR: A new framework, termed Bayes-Stability, is developed for proving algorithm-dependent generalization error bounds for learning general non-convex objectives and it is demonstrated that the data-dependent bounds can distinguish randomly labelled data from normal data.
Abstract: Generalization error (also known as the out-of-sample error) measures how well the hypothesis learned from training data generalizes to previously unseen data. Proving tight generalization error bounds is a central question in statistical learning theory. In this paper, we obtain generalization error bounds for learning general non-convex objectives, which has attracted significant attention in recent years. We develop a new framework, termed Bayes-Stability, for proving algorithm-dependent generalization error bounds. The new framework combines ideas from both the PAC-Bayesian theory and the notion of algorithmic stability. Applying the Bayes-Stability method, we obtain new data-dependent generalization bounds for stochastic gradient Langevin dynamics (SGLD) and several other noisy gradient methods (e.g., with momentum, mini-batch and acceleration, Entropy-SGD). Our result recovers (and is typically tighter than) a recent result in Mou et al. (2018) and improves upon the results in Pensia et al. (2018). Our experiments demonstrate that our data-dependent bounds can distinguish randomly labelled data from normal data, which provides an explanation for the intriguing phenomena observed in Zhang et al. (2017a). We also study the setting where the total loss is the sum of a bounded loss and an additional ℓ2 regularization term. We obtain new generalization bounds for the continuous Langevin dynamics in this setting by developing a new Log-Sobolev inequality for the parameter distribution at any time. Our new bounds are more desirable when the noise level of the process is not small, and do not become vacuous even when T tends to infinity.
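For orientation, SGLD is plain SGD plus Gaussian noise scaled by an inverse temperature beta: theta_{t+1} = theta_t - eta * g(theta_t) + sqrt(2*eta/beta) * N(0, I), where g is a mini-batch gradient. Below is a minimal sketch of that update loop; the least-squares toy objective and all constants are assumptions for illustration, not the paper's experiments.

```python
# Minimal stochastic gradient Langevin dynamics (SGLD) loop.
import numpy as np

def sgld(grad_fn, theta0, data, eta=1e-2, beta=1e4, steps=1000, batch=32, seed=0):
    """theta <- theta - eta * g_B(theta) + sqrt(2*eta/beta) * N(0, I),
    where g_B is a mini-batch gradient and beta the inverse temperature."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    for _ in range(steps):
        idx = rng.choice(len(data), size=batch, replace=False)
        g = grad_fn(theta, data[idx])
        theta = theta - eta * g + np.sqrt(2 * eta / beta) * rng.normal(size=theta.shape)
    return theta

# Toy usage: per-example loss 0.5 * ||x - theta||^2, so the gradient is
# mean(theta - x) and the iterates approach the data mean (~2).
data = np.random.default_rng(1).normal(2.0, 1.0, size=(500, 3))
grad = lambda th, batch: (th - batch).mean(axis=0)
print(sgld(grad, np.zeros(3), data))
```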

45 citations

Proceedings Article
03 Jan 2001
TL;DR: In contrast to standard statistical learning theory, which studies uniform bounds on the expected error, the authors present a framework that exploits the specific learning algorithm used. The main difference from previous approaches lies in the complexity measure: rather than covering all hypotheses in a given hypothesis space, it is only necessary to cover the functions which could have been learned using the fixed learning algorithm.
Abstract: In contrast to standard statistical learning theory, which studies uniform bounds on the expected error, we present a framework that exploits the specific learning algorithm used. Motivated by the luckiness framework [8], we are also able to exploit the serendipity of the training sample. The main difference from previous approaches lies in the complexity measure; rather than covering all hypotheses in a given hypothesis space, it is only necessary to cover the functions which could have been learned using the fixed learning algorithm. We show how the resulting framework relates to the VC, luckiness and compression frameworks. Finally, we present an application of this framework to the maximum margin algorithm for linear classifiers, which results in a bound that exploits both the margin and the distribution of the data in feature space.
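As a small illustration of the quantity the final bound exploits (not the paper's code), the geometric margin of a linear classifier on a sample is min_i y_i (w . x_i + b) / ||w||; it is positive exactly when the hyperplane separates the sample. The data and hyperplane below are toy assumptions.

```python
# Geometric margin of a linear classifier (w, b) on a labelled sample.
import numpy as np

def geometric_margin(w, b, X, y):
    """min_i y_i * (w . x_i + b) / ||w||; positive iff (w, b) separates
    the sample, and the quantity margin-based bounds depend on."""
    return np.min(y * (X @ w + b)) / np.linalg.norm(w)

# Toy usage: a separable two-class sample and a candidate hyperplane.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -1.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])
print(geometric_margin(np.array([1.0, 1.0]), 0.0, X, y))  # 3/sqrt(2) ~ 2.12
```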

45 citations


Network Information
Related Topics (5)

Topic                       Papers    Citations   Relatedness
Artificial neural network   207K      4.5M        86%
Cluster analysis            146.5K    2.9M        82%
Feature extraction          111.8K    2.1M        81%
Optimization problem        96.4K     2.1M        80%
Fuzzy logic                 151.2K    2.3M        79%
Performance Metrics
Number of papers in this topic in previous years:

Year   Papers
2023   9
2022   19
2021   59
2020   69
2019   72
2018   47