scispace - formally typeset
Open Access · Posted Content

Quantitative Weak Convergence for Discrete Stochastic Processes

TLDR
This work shows that the iterates of a family of Langevin-like stochastic processes, including stochastic gradient descent, converge to an invariant distribution in $W_2$ at a rate of $\tilde{O}(1/\sqrt{k})$, where $k$ is the number of steps; this rate is provably tight up to log factors.
Abstract
In this paper, we prove quantitative convergence in $W_2$ for a family of Langevin-like stochastic processes that includes stochastic gradient descent and related gradient-based algorithms. Under certain regularity assumptions, we show that the iterates of these stochastic processes converge to an invariant distribution at a rate of $\tilde{O}(1/\sqrt{k})$, where $k$ is the number of steps; this rate is provably tight up to log factors. Our result reduces to a quantitative form of the classical Central Limit Theorem in the special case when the potential is quadratic.
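As an illustrative sketch (not code from the paper), constant-step-size SGD on the quadratic potential $f(x) = x^2/2$ with additive Gaussian gradient noise forms a Langevin-like Markov chain whose law settles into a Gaussian invariant distribution; the quadratic case is exactly where the abstract's result reduces to a quantitative CLT. All parameter values below are arbitrary choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
eta = 0.1          # constant step size
sigma = 1.0        # gradient-noise scale
n_chains = 20000   # independent chains, to estimate the law of x_k
n_steps = 500

# Start far from stationarity so the convergence is visible.
x = rng.normal(5.0, 1.0, size=n_chains)
for _ in range(n_steps):
    grad = x + sigma * rng.normal(size=n_chains)  # noisy gradient of x^2 / 2
    x = x - eta * grad

# For this linear recursion x' = (1 - eta) x - eta * sigma * xi, the invariant
# law is Gaussian with mean 0 and variance eta * sigma^2 / (2 - eta).
target_var = eta * sigma**2 / (2 - eta)
print(abs(x.mean()), abs(x.var() - target_var))
```

After 500 steps the empirical mean and variance across chains match the stationary Gaussian closely; tracking these errors as a function of $k$ gives a hands-on feel for the $\tilde{O}(1/\sqrt{k})$ rate the paper proves.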


Citations
Posted Content

Where is the Information in a Deep Neural Network

TL;DR: A novel notion of effective information in the activations of a deep network is established, which is used to show that models with low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.
Posted Content

Quantitative $W_1$ Convergence of Langevin-Like Stochastic Processes with Non-Convex Potential State-Dependent Noise

TL;DR: In this article, the authors prove quantitative convergence rates at which discrete Langevin-like processes converge to the invariant distribution of a related stochastic differential equation and apply their theoretical findings to studying the convergence of Stochastic Gradient Descent (SGD) for non-convex problems.
Posted Content

Analytic expressions for the output evolution of a deep neural network

TL;DR: A novel methodology based on a Taylor expansion of the network output is presented for obtaining analytical expressions for the expected value of the network weights and output under stochastic training.
References
Journal ArticleDOI

Acceleration of stochastic approximation by averaging

TL;DR: Convergence with probability one is proved for a variety of classical optimization and identification problems and it is demonstrated for these problems that the proposed algorithm achieves the highest possible rate of convergence.
Journal ArticleDOI

Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality

TL;DR: In this paper, it was shown that transport inequalities, similar to the one derived by M. Talagrand (1996, Geom. Funct. Anal. 6, 587-600) for the Gaussian measure, are implied by logarithmic Sobolev inequalities.
Journal ArticleDOI

Stochastic Gradient Descent as Approximate Bayesian Inference

TL;DR: It is demonstrated that constant SGD gives rise to a new variational EM algorithm that optimizes hyperparameters in complex probabilistic models and a scalable approximate MCMC algorithm, the Averaged Stochastic Gradient Sampler is proposed.
Posted Content

Theoretical guarantees for approximate sampling from smooth and log-concave densities

TL;DR: This work establishes non-asymptotic bounds for the error of approximating the target distribution by the distribution obtained by the Langevin Monte Carlo method and its variants, and illustrates the effectiveness of the established guarantees.
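The Langevin Monte Carlo method in the reference above can be sketched in a few lines. This is a hedged illustration under assumptions of my own choosing (a standard Gaussian target, arbitrary step size and chain counts), not the cited paper's experimental setup: the unadjusted Langevin algorithm samples from a smooth log-concave density $p(x) \propto e^{-U(x)}$ via the update $x_{k+1} = x_k - \eta \nabla U(x_k) + \sqrt{2\eta}\,\xi_k$.

```python
import numpy as np

rng = np.random.default_rng(1)
eta = 0.01                      # step size; the bias of ULA is O(eta)
n_chains, n_steps = 10000, 2000

# Target U(x) = x^2 / 2, so p is the standard Gaussian and grad_U(x) = x.
x = rng.normal(4.0, 1.0, size=n_chains)  # initialized away from the target
for _ in range(n_steps):
    x = x - eta * x + np.sqrt(2 * eta) * rng.normal(size=n_chains)

# Empirical moments should be close to N(0, 1), up to the O(eta) bias.
print(x.mean(), x.var())
```

The discretization never exactly hits the target (the stationary variance here is $1/(1 - \eta/2)$ rather than $1$), which is precisely the kind of approximation error the non-asymptotic bounds in the reference quantify.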