Topic

Statistical learning theory

About: Statistical learning theory is a research topic. Over its lifetime, 1618 publications have been published within this topic, receiving 158033 citations.

Papers

Book Chapter•DOI•

A Principle of Least Action for the Training of Neural Networks

Skander Karkar, Ibrahim Ayed¹, Emmanuel de Bézenac¹, Patrick Gallinari•Institutions (1)

University of Paris¹

14 Sep 2020

TL;DR: In this article, the authors adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time, and show the presence of a low kinetic energy bias in the transport map of the network, and link this bias with generalization performance.

Abstract: Neural networks have been achieving high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behaviour, much effort has recently been focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and having a better control over the trained models. In this work, we adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time. We conduct a series of experiments and, by analyzing the network’s behaviour through its displacements, we show the presence of a low kinetic energy bias in the transport map of the network, and link this bias with generalization performance. From this observation, we reformulate the learning problem as follows: find neural networks that solve the task while transporting the data as efficiently as possible. This offers a novel formulation of the learning problem which allows us to provide regularity results for the solution network, based on Optimal Transport theory. From a practical viewpoint, this allows us to propose a new learning algorithm, which automatically adapts to the complexity of the task, and leads to networks with a high generalization ability even in low data regimes.

6 citations

Posted Content•

Providing theoretical learning guarantees to Deep Learning Networks.

Rodrigo Fernandes de Mello, Martha Dais Ferreira, Moacir Antonelli Ponti

28 Nov 2017 - arXiv: Learning

TL;DR: From the theoretical formulation, the authors show the conditions under which Deep Neural Networks learn and point out another issue: DL benchmarks may be strictly driven by empirical risks, disregarding the complexity of algorithm biases.

Abstract: Deep Learning (DL) is one of the most common subjects when Machine Learning and Data Science approaches are considered. There are clearly two movements related to DL: the first gathers researchers seeking to outperform other algorithms from the literature, trying to win contests by considering often small decreases in the empirical risk; the second investigates evidence of overfitting, questioning the learning capabilities of DL classifiers. Motivated by these opposing points of view, this paper employs Statistical Learning Theory (SLT) to study the convergence of Deep Neural Networks, with particular interest in Convolutional Neural Networks. In order to draw theoretical conclusions, we propose an approach to estimate the Shattering coefficient of those classification algorithms, providing a lower bound for the complexity of their space of admissible functions, a.k.a. algorithm bias. Based on this estimator, we generalize the complexity of network biases, and, next, we study the AlexNet and VGG16 architectures from the point of view of their Shattering coefficients and the number of training examples required to provide theoretical learning guarantees. From our theoretical formulation, we show the conditions under which Deep Neural Networks learn and point out another issue: DL benchmarks may be strictly driven by empirical risks, disregarding the complexity of algorithm biases.
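As a rough illustration of how a Shattering coefficient translates into a required number of training examples, the sketch below searches for the smallest sample size at which a Vapnik-style bound, 2 · N(F, 2n) · exp(−n ε² / 4), falls below a chosen confidence level δ. This is not the estimator proposed in the paper: the function names `log_vapnik_bound` and `min_samples` are hypothetical, and the polynomial form `coeff * m**degree` for the Shattering coefficient is an assumption made only to keep the example self-contained.

```python
import math


def log_vapnik_bound(n, eps, degree, coeff=1.0):
    """Natural log of 2 * N(F, 2n) * exp(-n * eps**2 / 4).

    Assumption: the Shattering coefficient is N(F, m) = coeff * m**degree.
    Working in log space avoids overflow for large n or degree.
    """
    log_shatter = math.log(coeff) + degree * math.log(2 * n)
    return math.log(2.0) + log_shatter - n * eps ** 2 / 4.0


def min_samples(eps, delta, degree, coeff=1.0):
    """Smallest n, past the peak of the bound, at which the bound is <= delta."""
    target = math.log(delta)
    # The log-bound has derivative degree/n - eps**2/4, so it is
    # non-increasing for every n >= 4 * degree / eps**2.
    lo = max(1, math.ceil(4 * degree / eps ** 2))
    if log_vapnik_bound(lo, eps, degree, coeff) <= target:
        return lo
    # Double the upper end until the bound drops below delta.
    hi = lo * 2
    while log_vapnik_bound(hi, eps, degree, coeff) > target:
        hi *= 2
    # Bisection with invariant: bound(lo) > delta >= bound(hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if log_vapnik_bound(mid, eps, degree, coeff) <= target:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Hypothetical settings: eps = 0.1, delta = 0.05, quadratic growth.
    print(min_samples(eps=0.1, delta=0.05, degree=2))
```

Under this assumption, a faster-growing Shattering coefficient (a larger `degree`) pushes the required sample size up, which is the qualitative point the abstract makes about architecture complexity.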

6 citations

Proceedings Article•DOI•

Fast Transient Stability Assessment Based on Data Mining for Large-Scale Power System

Zhonghong Yu, Xiaoxin Zhou, Zhongxi Wu

05 Dec 2005

TL;DR: A novel learning-based nonlinear classifier, the support vector machine (SVM), is presented for transient stability assessment (TSA), using feature variables that describe the system state before and after the occurrence of a fault.

Abstract: One of the most challenging problems in the real-time operation of power systems is the assessment of transient stability. Fast and accurate techniques are imperative to achieve on-line transient stability assessment (TSA). Based on statistical learning theory, a novel learning-based nonlinear classifier, the support vector machine (SVM), was presented for TSA. In the approach, the feature variables, which describe the system state before and after the occurrence of a fault, were selected for TSA. The abundant initial data were preprocessed by feature extraction to improve data quality. Through SVM training, models were built and used to predict whether the operating state is stable for given operating data. The validity of the approach was verified by simulation on the 4933-bus state grid of China system.

6 citations

Book Chapter•DOI•

Statistical Learning Theory

Rodrigo Fernandes de Mello¹, Moacir Antonelli Ponti¹•Institutions (1)

University of São Paulo¹

01 Jan 2018

TL;DR: This chapter starts by describing the necessary concepts and assumptions to ensure supervised learning, and details the Empirical Risk Minimization (ERM) principle, which is the key point for the Statistical Learning Theory (SLT).

Abstract: This chapter starts by describing the necessary concepts and assumptions to ensure supervised learning. Later on, it details the Empirical Risk Minimization (ERM) principle, which is the key point for the Statistical Learning Theory (SLT). The ERM principle provides upper bounds to make the empirical risk a good estimator for the expected risk, given the bias of some learning algorithm. This bound is the main theoretical tool to provide learning guarantees for classification tasks. Afterwards, other useful tools and concepts are introduced.
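For orientation, the kind of upper bound this abstract refers to is commonly written in the following form. This is a sketch in standard textbook notation, not a quotation from the chapter; the symbols and constants are assumptions that follow the usual presentation of Vapnik's result.

```latex
P\left( \sup_{f \in \mathcal{F}} \left| R_{\mathrm{emp}}(f) - R(f) \right| > \epsilon \right)
\;\le\; 2\, \mathcal{N}(\mathcal{F}, 2n)\, \exp\!\left( -\frac{n \epsilon^{2}}{4} \right)
```

Here R(f) is the expected risk, R_emp(f) the empirical risk on n training examples, F the space of admissible functions (the algorithm bias), and N(F, 2n) the Shattering coefficient. When the Shattering coefficient grows only polynomially in n, the right-hand side tends to zero as n grows, which is what makes the empirical risk a reliable estimator of the expected risk.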

6 citations

Multi support vector machines decision model and its application

Yan Wei

01 Jan 2002

TL;DR: The multi-SVM decision model (MSDM), which consists of multiple SVMs and makes decisions by synthesizing information from them, somewhat improves the robustness of the decision system.

Abstract: The Support Vector Machine (SVM) is a powerful machine learning method developed from statistical learning theory and is currently an active field in artificial intelligence technology. The SVM is sensitive to noise vectors near the hyperplane, since it is determined by only a few support vectors. In this paper, a multi-SVM decision model (MSDM) is proposed. MSDM consists of multiple SVMs and makes decisions by synthesizing information from them. MSDM is applied to heart disease diagnosis based on a UCI benchmark data set. MSDM somewhat improves the robustness of the decision system.

6 citations

Network Information

Performance

Metrics

Papers: 1,647

Citations: 173,903

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	19
2021	59
2020	69
2019	72
2018	47
