Jeffrey Pennington

Researcher at Google

Publications - 84
Citations - 37425

Jeffrey Pennington is an academic researcher at Google. The author has contributed to research in topics including Artificial neural network and Deep learning, has an h-index of 32, and has co-authored 75 publications receiving 28787 citations. Previous affiliations of Jeffrey Pennington include the University of Southern California and Princeton University.

Papers
Proceedings Article

The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network

TL;DR: This work extends a recently developed framework for studying the spectra of nonlinear random matrices to characterize an important measure of curvature, namely the eigenvalues of the Fisher information matrix. It finds that linear networks suffer worse conditioning than nonlinear networks and that nonlinear networks are generically non-degenerate.
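
Below is a small, self-contained numerical sketch of the object this summary refers to: the eigenvalue spectrum of the empirical Fisher information matrix of a single-hidden-layer network. The architecture, sizes, tanh nonlinearity, and the max/min eigenvalue ratio used as a conditioning proxy are illustrative assumptions, not the paper's random-matrix-theory calculation.

```python
import jax
import jax.numpy as jnp

def forward(params, x, nonlinear=True):
    """Single-hidden-layer network with a scalar output."""
    W1, w2 = params
    h = x @ W1
    if nonlinear:
        h = jnp.tanh(h)  # illustrative choice of nonlinearity
    return h @ w2

def fisher_eigenvalues(params, X, nonlinear=True):
    """Eigenvalues of the empirical Fisher F = (1/n) * J^T J, where row i of J
    is the gradient of the network output at X[i] w.r.t. all parameters."""
    def flat_grad(x):
        g = jax.grad(forward)(params, x, nonlinear)
        return jnp.concatenate([gi.ravel() for gi in g])
    J = jax.vmap(flat_grad)(X)              # (n_samples, n_params)
    F = J.T @ J / X.shape[0]
    return jnp.linalg.eigvalsh(F)

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
d, m, n = 10, 20, 1000
params = (jax.random.normal(k1, (d, m)) / jnp.sqrt(d),
          jax.random.normal(k2, (m,)) / jnp.sqrt(m))
X = jax.random.normal(k3, (n, d))

for nl in (False, True):
    ev = fisher_eigenvalues(params, X, nonlinear=nl)
    nonzero = ev[ev > 1e-5 * ev.max()]      # drop numerically zero directions
    print("nonlinear" if nl else "linear   ",
          "max/min retained eigenvalue:", float(nonzero.max() / nonzero.min()))
```
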
Proceedings Article

Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition

TL;DR: This work describes an interpretable, symmetric decomposition of the variance into terms associated with the randomness from sampling, initialization, and the labels, computes the high-dimensional asymptotic behavior of this decomposition for random feature kernel regression, and analyzes the strikingly rich phenomenology that arises.
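
As a rough companion to this summary, here is a toy Monte Carlo illustration of attributing the variance of a random-feature ridge-regression prediction to the three sources of randomness named above (the training sample, the random features standing in for the initialization, and the label noise). The model, sizes, and the simple hold-others-fixed variance estimates are my own assumptions; they do not reproduce the paper's symmetric decomposition, which also accounts for interaction terms, nor its closed-form asymptotics.

```python
import jax
import jax.numpy as jnp

def rf_predict(key_data, key_feat, key_noise, x_test,
               n=100, d=10, p=50, sigma=0.1, ridge=1e-3):
    """Random-feature ridge regression fit on a fresh draw of data,
    features, and label noise; returns the prediction at x_test."""
    w_star = jnp.ones(d) / jnp.sqrt(d)                     # fixed linear teacher
    X = jax.random.normal(key_data, (n, d))
    noise = sigma * jax.random.normal(key_noise, (n,))
    y = X @ w_star + noise
    W = jax.random.normal(key_feat, (d, p)) / jnp.sqrt(d)  # random features
    Phi, phi_test = jax.nn.relu(X @ W), jax.nn.relu(x_test @ W)
    beta = jnp.linalg.solve(Phi.T @ Phi + ridge * jnp.eye(p), Phi.T @ y)
    return phi_test @ beta

x_test = jax.random.normal(jax.random.PRNGKey(123), (10,))

# Total variance over all three sources of randomness.
keys = jax.random.split(jax.random.PRNGKey(0), 3 * 200).reshape(200, 3, 2)
preds = jnp.array([rf_predict(k[0], k[1], k[2], x_test) for k in keys])

# Crude per-source estimate: vary one source, hold the other two fixed.
def var_from(source):
    fixed = list(jax.random.split(jax.random.PRNGKey(7), 3))
    draws = []
    for i in range(200):
        ks = list(fixed)
        ks[source] = jax.random.fold_in(jax.random.PRNGKey(11), i)
        draws.append(rf_predict(ks[0], ks[1], ks[2], x_test))
    return jnp.array(draws).var()

for name, s in zip(["sampling", "features/init", "label noise"], range(3)):
    print(f"variance from {name}: {float(var_from(s)):.5f}")
print(f"total variance: {float(preds.var()):.5f}")
```
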
Posted Content

The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization

TL;DR: This work provides a precise high-dimensional asymptotic analysis of generalization under kernel regression with the Neural Tangent Kernel, which characterizes the behavior of wide neural networks optimized with gradient descent.
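
For readers unfamiliar with the setup, the following sketch builds an empirical neural tangent kernel for a small finite-width network at initialization and runs kernel ridge regression with it. The network, data, and ridge parameter are arbitrary assumptions used only to illustrate the objects involved; the paper's contribution is the precise high-dimensional asymptotic analysis of this kind of predictor, which the snippet does not attempt.

```python
import jax
import jax.numpy as jnp

def f(params, x):
    """Small one-hidden-layer network with a scalar output."""
    W1, w2 = params
    return jnp.tanh(x @ W1) @ w2

def param_grad(params, x):
    g = jax.grad(f)(params, x)
    return jnp.concatenate([gi.ravel() for gi in g])

def empirical_ntk(params, X1, X2):
    """K[i, j] = <grad_theta f(X1[i]), grad_theta f(X2[j])>."""
    G1 = jax.vmap(lambda x: param_grad(params, x))(X1)
    G2 = jax.vmap(lambda x: param_grad(params, x))(X2)
    return G1 @ G2.T

d, m = 5, 64
k1, k2, k3, k4, k5 = jax.random.split(jax.random.PRNGKey(0), 5)
params = (jax.random.normal(k1, (d, m)) / jnp.sqrt(d),
          jax.random.normal(k2, (m,)) / jnp.sqrt(m))
X_train = jax.random.normal(k3, (80, d))
y_train = jnp.sin(X_train.sum(axis=1)) + 0.1 * jax.random.normal(k4, (80,))
X_test = jax.random.normal(k5, (20, d))

# Kernel ridge regression with the empirical NTK at initialization.
ridge = 1e-3
K = empirical_ntk(params, X_train, X_train)
alpha = jnp.linalg.solve(K + ridge * jnp.eye(80), y_train)
y_pred = empirical_ntk(params, X_test, X_train) @ alpha
print("first few test predictions:", y_pred[:5])
```
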
Journal Article

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

TL;DR: This work shows that for wide neural networks the learning dynamics simplify considerably and that, in the infinite width limit, they are governed by a linear model obtained from the first-order Taylor expansion of the network around its initial parameters.
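
The "linear model obtained from the first-order Taylor expansion" can be written down directly: f_lin(theta, x) = f(theta_0, x) + grad_theta f(theta_0, x) . (theta - theta_0). The sketch below builds it with jax.jvp and takes one gradient step on each model from the same initialization; the width, toy data, and learning rate are my own assumptions, so this only illustrates the construction, not the paper's infinite-width statement.

```python
import jax
import jax.numpy as jnp

def f(params, x):
    """Wide one-hidden-layer network with NTK-style 1/sqrt(width) scaling."""
    W1, w2 = params
    return jnp.tanh(x @ W1) @ w2 / jnp.sqrt(W1.shape[1])

def make_linearized(net, params0):
    """First-order Taylor expansion of `net` around params0, via jax.jvp."""
    def net_lin(params, x):
        dparams = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
        y0, tangent = jax.jvp(lambda p: net(p, x), (params0,), (dparams,))
        return y0 + tangent
    return net_lin

def loss(net, params, X, y):
    preds = jax.vmap(lambda x: net(params, x))(X)
    return 0.5 * jnp.mean((preds - y) ** 2)

d, m = 4, 4096                                   # a wide hidden layer
k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
params0 = (jax.random.normal(k1, (d, m)), jax.random.normal(k2, (m,)))
X = jax.random.normal(k3, (32, d))
y = jnp.sin(X.sum(axis=1))
f_lin = make_linearized(f, params0)

# One full-batch gradient step on the network and on its linearization.
lr = 1.0
def step(net, params):
    grads = jax.grad(loss, argnums=1)(net, params, X, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

params_net, params_lin = step(f, params0), step(f_lin, params0)
x_probe = jax.random.normal(k4, (d,))
print("network      after one step:", float(f(params_net, x_probe)))
print("linear model after one step:", float(f_lin(params_lin, x_probe)))
```
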
Posted Content

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks

TL;DR: The results demonstrate how the benefits of a good initialization can persist throughout learning, suggesting an explanation for the recent empirical successes obtained by initializing very deep nonlinear networks according to the principle of dynamical isometry.
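
The "dynamical isometry" property mentioned here can be seen at initialization with a few lines of code: for a deep linear network, a product of orthogonal weight matrices has all singular values equal to one, while a product of i.i.d. Gaussian matrices develops a rapidly widening singular-value spread as depth grows. Depth and width below are arbitrary assumptions, and the snippet only illustrates the initial condition, not the paper's guarantees about the optimization trajectory.

```python
import jax
import jax.numpy as jnp

def random_orthogonal(key, n):
    """Orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    Q, _ = jnp.linalg.qr(jax.random.normal(key, (n, n)))
    return Q

def end_to_end(weights):
    """Input-output map W_L ... W_2 W_1 of a deep linear network."""
    prod = weights[0]
    for W in weights[1:]:
        prod = W @ prod
    return prod

def singular_value_spread(weights):
    s = jnp.linalg.svd(end_to_end(weights), compute_uv=False)
    return float(s.max() / s.min())

depth, width = 20, 64
keys = jax.random.split(jax.random.PRNGKey(0), depth)
gaussian_init = [jax.random.normal(k, (width, width)) / jnp.sqrt(width)
                 for k in keys]
orthogonal_init = [random_orthogonal(k, width) for k in keys]

print("Gaussian init,   sigma_max / sigma_min:", singular_value_spread(gaussian_init))
print("Orthogonal init, sigma_max / sigma_min:", singular_value_spread(orthogonal_init))
```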