
Ashia C. Wilson

Researcher at Microsoft

Publications -  22
Citations -  2187

Ashia C. Wilson is an academic researcher at Microsoft. The author has contributed to research on the topics of gradient descent and rate of convergence, has an h-index of 14, and has co-authored 22 publications receiving 1765 citations. Previous affiliations of Ashia C. Wilson include the University of California, Berkeley.

Papers
Proceedings Article

The Marginal Value of Adaptive Gradient Methods in Machine Learning

TL;DR: It is observed that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance, suggesting that practitioners should reconsider the use of adaptive methods to train neural networks.
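The contrast the TL;DR draws can be seen directly in the update rules. Below is a minimal, illustrative sketch (the toy quadratic objective and all hyperparameters are my own assumptions, not the paper's experiments) of a plain SGD step versus Adam's per-coordinate rescaled step:

```python
import numpy as np

# Toy objective f(w) = 0.5 * ||w||^2, whose gradient is simply w.

def sgd_step(w, grad, lr=0.1):
    # Plain (stochastic) gradient descent update.
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Adam keeps exponential moving averages of the gradient (m) and its
    # square (v), then rescales each coordinate by 1 / sqrt(v).
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)   # bias correction for the moving averages
    v_hat = v / (1 - b2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w_sgd = np.array([1.0, -2.0])
w_adam = np.array([1.0, -2.0])
m = np.zeros(2)
v = np.zeros(2)
for t in range(1, 101):
    w_sgd = sgd_step(w_sgd, w_sgd)              # grad of 0.5||w||^2 is w
    w_adam, m, v = adam_step(w_adam, w_adam, m, v, t)

print(np.linalg.norm(w_sgd), np.linalg.norm(w_adam))
```

Both iterates shrink toward the minimizer, but Adam's per-coordinate rescaling means it follows a different trajectory than SGD even on this toy problem, which is the kind of divergence in solutions the paper studies on overparameterized models.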
Journal Article

A variational perspective on accelerated methods in optimization

TL;DR: In this article, the authors show that there is a Lagrangian functional that can generate a large class of accelerated methods in continuous time, including (but not limited to) accelerated gradient descent, its non-Euclidean extension, and accelerated higher-order gradient methods.
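Concretely, the generating object in this line of work is the Bregman Lagrangian; the form below follows the standard presentation (notation may differ slightly from the article's):

```latex
\mathcal{L}(X, V, t)
  = e^{\alpha_t + \gamma_t}
    \left( D_h\!\left(X + e^{-\alpha_t} V,\; X\right) - e^{\beta_t} f(X) \right),
\qquad
D_h(y, x) = h(y) - h(x) - \langle \nabla h(x),\, y - x \rangle,
```

where $D_h$ is the Bregman divergence of a distance-generating function $h$; choosing the scaling functions $\alpha_t, \beta_t, \gamma_t$ recovers different accelerated dynamics via the Euler--Lagrange equations.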
Posted Content

Streaming Variational Bayes

TL;DR: SDA-Bayes is a framework for streaming and distributed computation of a Bayesian posterior that makes streaming updates to the estimated posterior according to a user-specified approximate batch primitive.
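The core streaming idea can be illustrated with a conjugate toy model (this Beta-Bernoulli example is my own sketch, not the paper's framework; its exact conjugate update stands in for the user-specified approximate batch primitive): the posterior after each minibatch is reused as the prior for the next.

```python
import numpy as np

def batch_update(alpha, beta, minibatch):
    # Exact Beta-Bernoulli conjugate update: count heads and tails.
    heads = int(sum(minibatch))
    return alpha + heads, beta + len(minibatch) - heads

rng = np.random.default_rng(0)
alpha, beta = 1.0, 1.0                        # uniform prior on the coin bias
for _ in range(50):                           # stream of minibatches
    minibatch = rng.binomial(1, 0.7, size=20)
    alpha, beta = batch_update(alpha, beta, minibatch)

posterior_mean = alpha / (alpha + beta)
print(posterior_mean)
```

After streaming 1000 observations from a coin with bias 0.7, the posterior mean concentrates near 0.7; in the general framework the exact update is replaced by an approximate (e.g. variational) one, enabling asynchronous and distributed computation.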
Posted Content

The Marginal Value of Adaptive Gradient Methods in Machine Learning

TL;DR: This article showed that adaptive methods often find drastically different solutions than gradient descent or stochastic gradient descent (SGD) for simple overparameterized problems, and that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance.
Posted Content

A Lyapunov Analysis of Momentum Methods in Optimization

TL;DR: There is an equivalence between the technique of estimate sequences and a family of Lyapunov functions in both continuous and discrete time, which allows for a simple and unified analysis of many existing momentum algorithms.
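As an illustration of the Lyapunov technique, a standard energy function for the continuous-time limit of Nesterov's method, $\ddot X + \tfrac{3}{t}\dot X + \nabla f(X) = 0$, is the following (this particular function comes from the related continuous-time literature and is shown here as a representative example):

```latex
\mathcal{E}(t) = t^2 \bigl( f(X(t)) - f(x^*) \bigr)
  + 2 \left\| X(t) + \tfrac{t}{2} \dot X(t) - x^* \right\|^2 .
```

Showing $\dot{\mathcal{E}}(t) \le 0$ along trajectories immediately gives the accelerated rate $f(X(t)) - f(x^*) \le \mathcal{E}(0)/t^2$; the paper's contribution is relating such Lyapunov functions to Nesterov's estimate-sequence technique in both continuous and discrete time.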