Topic
Statistical learning theory
About: Statistical learning theory is a research topic. Over its lifetime, 1,618 publications have been published within this topic, receiving 158,033 citations.
Papers
14 Sep 2020
TL;DR: In this article, the authors adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time, and show the presence of a low kinetic energy bias in the transport map of the network, and link this bias with generalization performance.
Abstract: Neural networks have been achieving high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behaviour, much effort has recently been focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and having a better control over the trained models. In this work, we adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time. We conduct a series of experiments and, by analyzing the network’s behaviour through its displacements, we show the presence of a low kinetic energy bias in the transport map of the network, and link this bias with generalization performance. From this observation, we reformulate the learning problem as follows: find neural networks that solve the task while transporting the data as efficiently as possible. This offers a novel formulation of the learning problem which allows us to provide regularity results for the solution network, based on Optimal Transport theory. From a practical viewpoint, this allows us to propose a new learning algorithm, which automatically adapts to the complexity of the task, and leads to networks with a high generalization ability even in low data regimes.
6 citations
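The "kinetic energy" mentioned in the abstract above can be illustrated with a toy sketch. It assumes a residual-style network in which each block displaces a point by x_{k+1} = x_k + r_k(x_k), and it takes the energy of a trajectory to be the sum of squared displacement norms. Both the function name `residual_forward` and this exact definition are assumptions made for illustration, not necessarily the formulation used in the paper.

```python
def residual_forward(x, blocks):
    """Apply x_{k+1} = x_k + r_k(x_k) for each block and accumulate
    the sum of squared displacement norms (a discrete kinetic energy).

    `blocks` is a list of hypothetical displacement functions r_k.
    """
    energy = 0.0
    for r in blocks:
        d = r(x)                                # displacement applied by this block
        energy += sum(di * di for di in d)      # add ||r_k(x_k)||^2
        x = [xi + di for xi, di in zip(x, d)]   # move the point
    return x, energy


# Two toy "networks" that send the same input to the same output.
straight = [lambda x: [1.0, 0.0], lambda x: [1.0, 0.0]]
detour = [lambda x: [1.0, 1.0], lambda x: [1.0, -1.0]]

end_straight, energy_straight = residual_forward([0.0, 0.0], straight)
end_detour, energy_detour = residual_forward([0.0, 0.0], detour)
```

Both trajectories end at the same point, but the straight one has lower energy. In this toy setting, that is the sense in which a map can solve the task while transporting the data more efficiently.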
TL;DR: From its theoretical formulation, the paper shows the conditions under which Deep Neural Networks learn and points out another issue: DL benchmarks may be driven strictly by empirical risks, disregarding the complexity of algorithm biases.
Abstract: Deep Learning (DL) is one of the most common subjects when Machine Learning and Data Science approaches are considered. There are clearly two movements related to DL: the first gathers researchers seeking to outperform other algorithms from the literature, trying to win contests through often small decreases in the empirical risk; the second investigates evidence of overfitting, questioning the learning capabilities of DL classifiers. Motivated by these opposing points of view, this paper employs Statistical Learning Theory (SLT) to study the convergence of Deep Neural Networks, with particular interest in Convolutional Neural Networks. In order to draw theoretical conclusions, we propose an approach to estimate the Shattering coefficient of those classification algorithms, providing a lower bound for the complexity of their space of admissible functions, a.k.a. algorithm bias. Based on this estimator, we generalize the complexity of network biases and then study the AlexNet and VGG16 architectures from the point of view of their Shattering coefficients and the number of training examples required to provide theoretical learning guarantees. From our theoretical formulation, we show the conditions under which Deep Neural Networks learn and point out another issue: DL benchmarks may be driven strictly by empirical risks, disregarding the complexity of algorithm biases.
6 citations
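For context, the kind of guarantee this abstract alludes to can be written with the standard uniform-convergence bound of statistical learning theory. This is the textbook form, not necessarily the exact expression used in the paper. The notation is assumed here: $\mathcal{N}(\mathcal{F}, 2n)$ is the shattering coefficient of the function class $\mathcal{F}$ on $2n$ points, $n$ is the number of training examples, $R_{\mathrm{emp}}$ and $R$ are the empirical and expected risks, and $\varepsilon$, $\delta$ are the tolerance and failure probability.

```latex
\[
P\Big(\sup_{f \in \mathcal{F}} \big| R_{\mathrm{emp}}(f) - R(f) \big| > \varepsilon\Big)
\;\le\; 2\,\mathcal{N}(\mathcal{F}, 2n)\,\exp\!\Big(-\frac{n\,\varepsilon^{2}}{4}\Big)
\]
% Setting the right-hand side equal to \delta and solving for \varepsilon gives,
% with probability at least 1 - \delta, for every f in \mathcal{F}:
\[
R(f) \;\le\; R_{\mathrm{emp}}(f)
+ \sqrt{\frac{4}{n}\Big(\ln\big(2\,\mathcal{N}(\mathcal{F}, 2n)\big) - \ln \delta\Big)}
\]
```

The second line is informative only when $\ln \mathcal{N}(\mathcal{F}, 2n)$ grows more slowly than $n$, which is why a larger shattering coefficient translates into more training examples being required.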
05 Dec 2005
TL;DR: A novel learning-based nonlinear classifier, i.e., the support vector machine (SVM), is presented for transient stability assessment (TSA), and the feature variables, which describe the system state before and after the occurrence of a fault, are selected for TSA.
Abstract: One of the most challenging problems in the real-time operation of power systems is the assessment of transient stability. Fast and accurate techniques are imperative to achieve on-line transient stability assessment (TSA). Based on statistical learning theory, a novel learning-based nonlinear classifier, i.e., the support vector machine (SVM), is presented here for TSA. In this approach, the feature variables, which describe the system state before and after the occurrence of a fault, were selected for TSA. The large volume of initial data was preprocessed by feature extraction to improve data quality. Using SVM training, models were built and used to predict whether the operating state is stable for given operation data. The validity of the approach was verified by simulation on the 4933-bus State Grid of China system.
6 citations
01 Jan 2018
TL;DR: This chapter starts by describing the necessary concepts and assumptions to ensure supervised learning, and details the Empirical Risk Minimization (ERM) principle, which is the key point for the Statistical Learning Theory (SLT).
Abstract: This chapter starts by describing the necessary concepts and assumptions to ensure supervised learning. Later on, it details the Empirical Risk Minimization (ERM) principle, which is the key point for the Statistical Learning Theory (SLT). The ERM principle provides upper bounds to make the empirical risk a good estimator for the expected risk, given the bias of some learning algorithm. This bound is the main theoretical tool to provide learning guarantees for classification tasks. Afterwards, other useful tools and concepts are introduced.
6 citations
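As a rough illustration of how such an upper bound turns into a learning guarantee, the sketch below uses the simplest textbook case: a finite class of hypotheses, Hoeffding's inequality, and a union bound. This is a swapped-in simplification, not the chapter's own derivation, and the function names are hypothetical.

```python
import math


def uniform_deviation_bound(num_hypotheses, n, epsilon):
    """Upper bound on P(sup_f |R_emp(f) - R(f)| > epsilon) for a finite
    class with a loss in [0, 1]: 2 * |F| * exp(-2 * n * epsilon^2)."""
    bound = 2.0 * num_hypotheses * math.exp(-2.0 * n * epsilon ** 2)
    return min(1.0, bound)  # a probability bound above 1 is vacuous


def samples_needed(num_hypotheses, epsilon, delta):
    """Smallest n for which the bound above is at most delta."""
    return math.ceil(math.log(2.0 * num_hypotheses / delta) / (2.0 * epsilon ** 2))


# Example: 1000 candidate classifiers, tolerance 0.05, failure probability 0.05.
n_required = samples_needed(1000, epsilon=0.05, delta=0.05)
```

With `n_required` examples, the empirical risk of every hypothesis in the class is within 0.05 of its expected risk with probability at least 0.95, which is the sense in which the empirical risk becomes a good estimator.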
01 Jan 2002
TL;DR: The Multi SVM decision model (MSDM), which consists of multiple SVMs and makes decisions by synthesizing information from them, somewhat improves the robustness of the decision system.
Abstract: The Support Vector Machine (SVM) is a powerful machine learning method developed from statistical learning theory and is currently an active field in artificial intelligence technology. SVM is sensitive to noise vectors near the hyperplane since it is determined by only a few support vectors. In this paper, the Multi SVM decision model (MSDM) is proposed. MSDM consists of multiple SVMs and makes decisions by synthesizing information from the multiple SVMs. MSDM is applied to heart disease diagnosis based on a UCI benchmark data set. MSDM somewhat improves the robustness of the decision system.
6 citations