Pratik Chaudhari

Researcher at University of Pennsylvania

Publications - 84
Citations - 2435

Pratik Chaudhari is an academic researcher at the University of Pennsylvania. His research spans topics including computer science and stochastic gradient descent. He has an h-index of 18 and has co-authored 65 publications that have received 1730 citations. His previous affiliations include the Indian Institute of Technology Bombay and the Massachusetts Institute of Technology.

Papers
Posted Content

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

TL;DR: This paper proposes Entropy-SGD, a new optimization algorithm for training deep neural networks that is motivated by the local geometry of the energy landscape; it compares favorably to state-of-the-art techniques in terms of generalization error and training time.
Proceedings Article

A Baseline for Few-Shot Image Classification

TL;DR: This work performs extensive studies on benchmark datasets to propose a metric that quantifies the "hardness" of a few-shot episode, and finds that using a large number of meta-training classes results in high few-shot accuracies even for a large number of few-shot classes.
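As a concrete illustration of the episodic few-shot setting this paper evaluates, here is a minimal sketch of N-way K-shot evaluation using a nearest-centroid classifier over precomputed embeddings. The function names, the nearest-centroid baseline, and all sizes are illustrative assumptions, not the paper's method or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(features, labels, n_way=5, k_shot=1, n_query=15):
    # Draw an N-way K-shot episode from precomputed embeddings.
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    support, query = [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(labels == c))
        support += [(features[i], new_label) for i in idx[:k_shot]]
        query += [(features[i], new_label) for i in idx[k_shot:k_shot + n_query]]
    return support, query

def nearest_centroid_accuracy(support, query):
    # Classify each query embedding by its nearest class centroid.
    xs = np.stack([x for x, _ in support])
    ys = np.array([y for _, y in support])
    centroids = np.stack([xs[ys == c].mean(axis=0) for c in np.unique(ys)])
    hits = sum(int(np.argmin(np.linalg.norm(centroids - x, axis=1)) == y)
               for x, y in query)
    return hits / len(query)

# Toy usage: random "embeddings" for 20 classes, 30 examples each.
feats = rng.standard_normal((600, 64))
labs = np.repeat(np.arange(20), 30)
s, q = sample_episode(feats, labs)
print(nearest_centroid_accuracy(s, q))
```

With informative embeddings from a pre-trained backbone, the same loop measures few-shot accuracy; with the random features above it sits near chance.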
Proceedings Article

Stochastic Gradient Descent Performs Variational Inference, Converges to Limit Cycles for Deep Networks

TL;DR: The authors showed that SGD does not converge in the classical sense: the most likely trajectories of SGD for deep networks do not behave like Brownian motion around critical points; instead, they resemble closed loops with deterministic components.
Proceedings Article

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

TL;DR: In this article, a local-entropy-based objective function, motivated by the local geometry of the energy landscape, is proposed for training deep neural networks; the gradient of the local entropy is computed before each update of the weights.
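To make that update concrete, the following is a minimal sketch of one Entropy-SGD step: a short SGLD inner loop estimates the mean <x'> of the local Gibbs measure, and the weights then descend along gamma * (x - <x'>), the gradient of the negative local entropy. The hyperparameter names and defaults are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def entropy_sgd_step(x, grad_f, gamma=1e-3, eta=0.1, eta_inner=0.1,
                     n_inner=20, eps=1e-4, rng=None):
    # One Entropy-SGD step (sketch). Local entropy:
    #   F(x) = log \int exp(-f(x') - gamma/2 ||x - x'||^2) dx'
    # Its gradient is gamma * (<x'> - x), where <x'> is the mean of
    # the Gibbs measure above, estimated here by a short SGLD loop.
    if rng is None:
        rng = np.random.default_rng()
    x_prime = x.copy()
    mu = x.copy()  # running average approximating <x'>
    for _ in range(n_inner):
        # SGLD on the modified loss f(x') + gamma/2 ||x - x'||^2
        g = grad_f(x_prime) + gamma * (x_prime - x)
        noise = np.sqrt(eta_inner) * eps * rng.standard_normal(x.shape)
        x_prime = x_prime - eta_inner * g + noise
        mu = 0.75 * mu + 0.25 * x_prime  # exponential moving average
    # Maximize local entropy: step along -gamma * (x - <x'>)
    return x - eta * gamma * (x - mu)
```

In the paper's language, the inner loop "scopes" the energy landscape; here a generic grad_f callable stands in for the minibatch gradients an actual implementation would use.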
Posted Content

Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks

TL;DR: This paper showed that the most likely trajectories of SGD for deep networks do not behave like Brownian motion around critical points; instead, they resemble closed loops with deterministic components. It showed that such "out-of-equilibrium" behavior is a consequence of the highly non-isotropic gradient noise in SGD.
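The non-isotropy of the gradient noise is easy to probe empirically, even on a toy problem. Below is a small sketch (the least-squares model, sizes, and seed are assumptions for illustration, not the paper's experiments) that estimates the minibatch gradient-noise covariance and reports its eigenvalue spread.

```python
import numpy as np

# Estimate the minibatch gradient-noise covariance for a toy
# least-squares problem and inspect how anisotropic it is.
rng = np.random.default_rng(0)
n, d, b = 2000, 10, 32          # samples, dimensions, minibatch size
A = rng.standard_normal((n, d)) * np.linspace(0.1, 3.0, d)  # skewed features
y = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
w = np.zeros(d)

def minibatch_grad(w):
    # Gradient of the mean squared error on a random minibatch.
    idx = rng.choice(n, size=b, replace=False)
    Ab, yb = A[idx], y[idx]
    return Ab.T @ (Ab @ w - yb) / b

grads = np.stack([minibatch_grad(w) for _ in range(5000)])
C = np.cov(grads.T)             # empirical gradient-noise covariance
eig = np.sort(np.linalg.eigvalsh(C))[::-1]
print("eigenvalue spread (max/min):", eig[0] / eig[-1])
```

A large max/min eigenvalue ratio means the noise covariance is far from a multiple of the identity, which is the ingredient behind the "out-of-equilibrium" behavior described above.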