scispace - formally typeset
Topic

Statistical learning theory

About: Statistical learning theory is a research topic. Over the lifetime, 1618 publications have been published within this topic receiving 158033 citations.


Papers
Book Chapter
14 Sep 2020
TL;DR: In this article, the authors adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time, and show the presence of a low kinetic energy bias in the transport map of the network, and link this bias with generalization performance.
Abstract: Neural networks have been achieving high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behaviour, much effort has recently been focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and having a better control over the trained models. In this work, we adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time. We conduct a series of experiments and, by analyzing the network’s behaviour through its displacements, we show the presence of a low kinetic energy bias in the transport map of the network, and link this bias with generalization performance. From this observation, we reformulate the learning problem as follows: find neural networks that solve the task while transporting the data as efficiently as possible. This offers a novel formulation of the learning problem which allows us to provide regularity results for the solution network, based on Optimal Transport theory. From a practical viewpoint, this allows us to propose a new learning algorithm, which automatically adapts to the complexity of the task, and leads to networks with a high generalization ability even in low data regimes.

6 citations
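As a rough illustration of the quantity described in the abstract above (not the authors' implementation; all names, weights, and dimensions are hypothetical), the kinetic energy of a residual network's transport map can be sketched in pure Python by accumulating the squared displacement each layer applies to an input particle:

```python
import math
import random

def displacement(h, w, b):
    """Toy residual-layer displacement f(h) = tanh(W h + b), computed elementwise."""
    return [math.tanh(sum(wij * hj for wij, hj in zip(row, h)) + bi)
            for row, bi in zip(w, b)]

def transport(h0, layers):
    """Follow the particle h_{t+1} = h_t + f_t(h_t) and accumulate the
    kinetic energy sum_t ||f_t(h_t)||^2 of its trajectory."""
    h, energy = list(h0), 0.0
    for w, b in layers:
        d = displacement(h, w, b)
        energy += sum(di * di for di in d)       # squared displacement norm
        h = [hi + di for hi, di in zip(h, d)]    # move the particle one step
    return energy, h

random.seed(0)
dim, depth = 3, 4
layers = [([[random.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(dim)],
           [0.0] * dim) for _ in range(depth)]
energy, h_final = transport([1.0, -0.5, 0.2], layers)
```

In the paper's reformulation, a term like `energy` would be minimized alongside the task loss, biasing training toward networks that transport the data efficiently.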

Posted Content
TL;DR: From the theoretical formulation, the conditions under which Deep Neural Networks learn are shown, and another issue is pointed out: DL benchmarks may be strictly driven by empirical risks, disregarding the complexity of algorithm biases.
Abstract: Deep Learning (DL) is one of the most common subjects when Machine Learning and Data Science approaches are considered. There are clearly two movements related to DL: the first aggregates researchers in a quest to outperform other algorithms from the literature, trying to win contests on the strength of often small decreases in the empirical risk; the second investigates evidence of overfitting, questioning the learning capabilities of DL classifiers. Motivated by such opposed points of view, this paper employs Statistical Learning Theory (SLT) to study the convergence of Deep Neural Networks, with particular interest in Convolutional Neural Networks. In order to draw theoretical conclusions, we propose an approach to estimate the Shattering coefficient of those classification algorithms, providing a lower bound for the complexity of their space of admissible functions, a.k.a. the algorithm bias. Based on this estimator, we generalize the complexity of network biases and then study the AlexNet and VGG16 architectures from the point of view of their Shattering coefficients and the number of training examples required to provide theoretical learning guarantees. From our theoretical formulation, we show the conditions under which Deep Neural Networks learn, and point out another issue: DL benchmarks may be strictly driven by empirical risks, disregarding the complexity of algorithm biases.

6 citations
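The link the abstract draws between shattering coefficients and required sample sizes can be illustrated with a classical VC-style deviation bound (a generic textbook form with illustrative constants, not the paper's estimator): assuming a polynomial shattering coefficient S(2n) ≈ (2n)^d, search for the smallest n that drives the bound below a confidence level δ:

```python
import math

def sample_size_for_guarantee(d, eps, delta, n_max=10**7):
    """Smallest n (by doubling search) with 2 * (2n)^d * exp(-n * eps^2 / 4) <= delta,
    assuming a polynomial shattering coefficient S(2n) ~ (2n)^d.
    Constants in such bounds vary across texts; these are illustrative."""
    n = 1
    while n <= n_max:
        bound = 2 * (2 * n) ** d * math.exp(-n * eps * eps / 4)
        if bound <= delta:
            return n
        n *= 2  # doubling keeps the search fast; bisection could refine it
    raise ValueError("no n found below n_max")

# Hypothetical numbers: growth exponent d=10, precision 0.1, confidence 95%.
n = sample_size_for_guarantee(d=10, eps=0.1, delta=0.05)
```

The faster the shattering coefficient grows (larger d), the more training examples the bound demands before a learning guarantee holds, which is the trade-off the paper quantifies for AlexNet and VGG16.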

Proceedings Article
05 Dec 2005
TL;DR: A novel learning-based nonlinear classifier, the support vector machine (SVM), is presented for transient stability assessment (TSA); feature variables that describe the system state before and after the occurrence of a fault are selected for TSA.
Abstract: One of the most challenging problems in the real-time operation of a power system is the assessment of transient stability. Fast and accurate techniques are imperative to achieve on-line transient stability assessment (TSA). Based on statistical learning theory, a novel learning-based nonlinear classifier, the support vector machine (SVM), is presented here for TSA. In the approach, feature variables that describe the system state before and after the occurrence of a fault are selected for TSA. The abundant initial data were preprocessed by feature extraction to improve data quality. Using SVM training, models were built and used to predict whether a given operating state is stable or not. The validity of the approach was verified by simulation on the 4933-bus State Grid of China system.

6 citations
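A minimal sketch of the classification step the abstract describes, assuming a plain linear SVM trained by sub-gradient descent on the hinge loss and two hypothetical feature values per operating point (the paper's actual power-system features, kernel, and scale are not reproduced here):

```python
def train_linear_svm(data, epochs=200, lr=0.1, lam=0.01):
    """Sub-gradient descent on the regularized hinge loss: a minimal
    stand-in for full SVM training on power-system feature vectors."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # point violates the margin: push w toward it
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:           # point is safe: only apply weight decay
                w = [wi - lr * lam * wi for wi in w]
    return w, b

# Toy "stable (+1) vs unstable (-1)" operating points, two made-up features each.
data = [([2.0, 2.0], 1), ([3.0, 1.0], 1), ([-2.0, -1.0], -1), ([-1.0, -3.0], -1)]
w, b = train_linear_svm(data)
predict = lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

In the paper's setting, `data` would hold pre- and post-fault feature vectors extracted from grid measurements, and `predict` would flag an operating state as stable or unstable on-line.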

Book Chapter
01 Jan 2018
TL;DR: This chapter starts by describing the necessary concepts and assumptions to ensure supervised learning, and details the Empirical Risk Minimization (ERM) principle, which is the key point for the Statistical Learning Theory (SLT).
Abstract: This chapter starts by describing the necessary concepts and assumptions to ensure supervised learning. Later on, it details the Empirical Risk Minimization (ERM) principle, which is the key point for the Statistical Learning Theory (SLT). The ERM principle provides upper bounds to make the empirical risk a good estimator for the expected risk, given the bias of some learning algorithm. This bound is the main theoretical tool to provide learning guarantees for classification tasks. Afterwards, other useful tools and concepts are introduced.

6 citations
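The ERM principle summarized above can be sketched in a few lines: fix a (biased, finite) class of admissible functions and return the member with the lowest empirical risk on the sample. The 1-D threshold class and the data points below are purely hypothetical:

```python
def empirical_risk(h, data):
    """Fraction of training points the hypothesis h misclassifies."""
    return sum(1 for x, y in data if h(x) != y) / len(data)

def erm(hypotheses, data):
    """Empirical Risk Minimization: pick the hypothesis with the lowest
    empirical risk over the algorithm's class of admissible functions."""
    return min(hypotheses, key=lambda h: empirical_risk(h, data))

# Hypothetical bias: threshold classifiers h_t(x) = +1 if x >= t else -1.
thresholds = [t / 10 for t in range(-20, 21)]
hypotheses = [lambda x, t=t: 1 if x >= t else -1 for t in thresholds]
data = [(-1.5, -1), (-0.3, -1), (0.4, 1), (1.2, 1)]
best = erm(hypotheses, data)
```

The upper bounds discussed in the chapter are what justify this procedure: for a sufficiently restricted class and enough samples, the empirical risk of `best` is a good estimate of its expected risk.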

01 Jan 2002
TL;DR: The Multi-SVM Decision Model (MSDM), which consists of multiple SVMs and makes decisions by synthesizing information from them, somewhat improves the robustness of the decision system.
Abstract: The Support Vector Machine (SVM) is a powerful machine learning method developed from statistical learning theory and is currently an active field in artificial intelligence. Because an SVM is determined by only a few support vectors, it is sensitive to noise vectors near the hyperplane. In this paper, the Multi-SVM Decision Model (MSDM) is proposed. MSDM consists of multiple SVMs and makes decisions by synthesizing the information from them. MSDM is applied to heart disease diagnosis based on a UCI benchmark data set, and it somewhat improves the robustness of the decision system.

6 citations
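The voting idea behind MSDM can be sketched with stand-in classifiers (plain hand-written linear decision functions, not trained SVMs): each member votes and the majority decides, so a single noisy member cannot flip the decision:

```python
def majority_vote(classifiers, x):
    """MSDM-style decision: each member classifier votes +1/-1 and the
    majority wins (ties broken toward +1 in this toy version)."""
    votes = sum(clf(x) for clf in classifiers)
    return 1 if votes >= 0 else -1

# Three hypothetical decision functions; the third is deliberately broken.
clf_a = lambda x: 1 if x[0] + x[1] > 0 else -1
clf_b = lambda x: 1 if 2 * x[0] + x[1] > 0 else -1
clf_noisy = lambda x: -1  # always votes -1, regardless of the input

ensemble = [clf_a, clf_b, clf_noisy]
```

Even with `clf_noisy` always voting the same way, the two sane members outvote it, which is the robustness gain the abstract attributes to combining multiple SVMs.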


Network Information
Related Topics (5)
Artificial neural network
207K papers, 4.5M citations
86% related
Cluster analysis
146.5K papers, 2.9M citations
82% related
Feature extraction
111.8K papers, 2.1M citations
81% related
Optimization problem
96.4K papers, 2.1M citations
80% related
Fuzzy logic
151.2K papers, 2.3M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year	Papers
2023	9
2022	19
2021	59
2020	69
2019	72
2018	47