Jeffrey Pennington
Researcher at Google
Publications - 84
Citations - 37425
Jeffrey Pennington is a researcher at Google whose work focuses on artificial neural networks and deep learning. He has an h-index of 32 and has co-authored 75 publications receiving 28,787 citations. His previous affiliations include the University of Southern California and Princeton University.
Papers
Proceedings Article
The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network
Jeffrey Pennington, Pratik Worah, et al.
TL;DR: This work extends a recently developed framework for studying the spectra of nonlinear random matrices to characterize an important measure of curvature: the eigenvalues of the Fisher information matrix. It finds that linear networks suffer worse conditioning than nonlinear networks, and that nonlinear networks are generically non-degenerate.
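The paper characterizes this spectrum analytically with random-matrix tools. As a rough numerical illustration only (not the paper's method), one can form the empirical Fisher matrix of a small single-hidden-layer network from per-example output gradients and inspect its eigenvalues; all sizes here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 200, 10, 20            # samples, input dim, hidden width

X = rng.standard_normal((n, d))
W1 = rng.standard_normal((h, d)) / np.sqrt(d)
w2 = rng.standard_normal(h) / np.sqrt(h)

# Single-hidden-layer network: f(x) = w2 . relu(W1 x)
Z = X @ W1.T                      # pre-activations, shape (n, h)
A = np.maximum(Z, 0.0)            # ReLU activations

# Per-example gradient of the scalar output w.r.t. all parameters:
# df/dW1[i, :] = w2[i] * 1[z_i > 0] * x,   df/dw2 = relu(W1 x)
mask = (Z > 0).astype(float)                          # (n, h)
grad_W1 = (mask * w2)[:, :, None] * X[:, None, :]     # (n, h, d)
J = np.concatenate([grad_W1.reshape(n, -1), A], axis=1)

# Empirical Fisher for squared loss, evaluated at the model's own outputs
F = J.T @ J / n
eigs = np.linalg.eigvalsh(F)
```

With more parameters than samples the empirical Fisher is rank-deficient, so part of the spectrum sits at zero; the ratio of the largest to the smallest nonzero eigenvalue is a crude proxy for the conditioning the paper studies.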
Proceedings Article
Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition
Ben Adlam, Jeffrey Pennington, et al.
TL;DR: This work describes an interpretable, symmetric decomposition of the variance into terms associated with randomness from sampling, initialization, and label noise; computes the high-dimensional asymptotic behavior of this decomposition for random-feature kernel regression; and analyzes the strikingly rich phenomenology that arises.
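The paper's symmetric decomposition handles three randomness sources jointly and asymptotically. As a much simpler Monte Carlo illustration, here is a two-source law-of-total-variance split for random-feature ridge regression, separating the variance of a test prediction into an initialization part and a data-sampling part (all sizes and the toy linear target are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
d, h, n, k, ridge = 5, 30, 40, 20, 1e-3
x_test = rng.standard_normal(d)
beta = rng.standard_normal(d) / np.sqrt(d)       # ground-truth linear target

def fit_predict(w_seed, d_seed):
    rw = np.random.default_rng(w_seed)           # feature/init randomness
    rd = np.random.default_rng(1000 + d_seed)    # training-sample randomness
    W = rw.standard_normal((h, d)) / np.sqrt(d)  # random ReLU features
    X = rd.standard_normal((n, d))
    y = X @ beta + 0.1 * rd.standard_normal(n)   # noisy labels
    Phi = np.maximum(X @ W.T, 0)
    a = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(h), Phi.T @ y)
    return float(np.maximum(W @ x_test, 0) @ a)

# k x k grid: rows fix the initialization, columns fix the training set
M = np.array([[fit_predict(i, j) for j in range(k)] for i in range(k)])

var_total = M.var()
var_init = M.mean(axis=1).var()        # variance explained by init alone
var_data = M.var(axis=1).mean()        # residual data-sampling variance
bias2 = (M.mean() - x_test @ beta) ** 2
```

By the law of total variance the two terms sum exactly to the total; the paper's contribution is a *symmetric* version of such a split over three sources, computed in closed form in the high-dimensional limit rather than by simulation.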
Posted Content
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
Ben Adlam, Jeffrey Pennington, et al.
TL;DR: This work provides a precise high-dimensional asymptotic analysis of generalization under kernel regression with the Neural Tangent Kernel, which characterizes the behavior of wide neural networks optimized with gradient descent.
Journal ArticleDOI
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
Jaehoon Lee, Lechao Xiao, Samuel S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Sohl-Dickstein, Jeffrey Pennington, et al.
TL;DR: This work shows that for wide neural networks the learning dynamics simplify considerably and that, in the infinite-width limit, they are governed by a linear model obtained from the first-order Taylor expansion of the network around its initial parameters.
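The claim can be checked numerically: train a wide one-hidden-layer network by gradient descent, train its first-order Taylor expansion around initialization with the same schedule, and compare predictions. This sketch (sizes and learning rate are arbitrary; it is not the paper's experimental setup) uses the NTK-style 1/sqrt(width) output scaling under which the linearization becomes accurate as width grows:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, h, lr, steps = 10, 3, 2048, 0.05, 200

X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def forward(params):
    W1, w2 = params
    # 1/sqrt(h) output scaling keeps the NTK well-defined as h grows
    return np.maximum(X @ W1.T, 0) @ w2 / np.sqrt(h)

def jacobian(params):
    W1, w2 = params
    Z = X @ W1.T
    mask = (Z > 0).astype(float)
    gW1 = (mask * w2)[:, :, None] * X[:, None, :] / np.sqrt(h)
    gw2 = np.maximum(Z, 0) / np.sqrt(h)
    return np.concatenate([gW1.reshape(n, -1), gw2], axis=1)

W1 = rng.standard_normal((h, d))
w2 = rng.standard_normal(h)
theta0 = np.concatenate([W1.ravel(), w2])
f0, J0 = forward((W1, w2)), jacobian((W1, w2))

theta = theta0.copy()       # full nonlinear network
theta_lin = theta0.copy()   # linear model: f0 + J0 (theta - theta0)
for _ in range(steps):
    params = (theta[: h * d].reshape(h, d), theta[h * d :])
    theta -= lr * jacobian(params).T @ (forward(params) - y) / n
    theta_lin -= lr * J0.T @ (f0 + J0 @ (theta_lin - theta0) - y) / n

pred = forward((theta[: h * d].reshape(h, d), theta[h * d :]))
pred_lin = f0 + J0 @ (theta_lin - theta0)
gap = np.max(np.abs(pred - pred_lin))   # shrinks as h grows
```

Both models use the same gradient-descent updates on squared loss; the gap between their final predictions is small at this width and tends to zero in the infinite-width limit, which is the paper's statement.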
Posted Content
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
TL;DR: The results demonstrate how the benefits of a good initialization can persist throughout learning, suggesting an explanation for the recent empirical successes found by initializing very deep nonlinear networks according to the principle of dynamical isometry.
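A minimal illustration of why orthogonal initialization helps deep linear networks (a toy sketch, not the paper's analysis): orthogonal weight matrices preserve the norm of a propagated signal exactly at any depth, whereas i.i.d. Gaussian layers with matched variance let the norm drift multiplicatively.

```python
import numpy as np

rng = np.random.default_rng(2)
depth, width = 50, 64

def orthogonal(n):
    # QR of a Gaussian matrix gives a random orthogonal matrix;
    # the sign fix makes the distribution Haar-uniform
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

x = rng.standard_normal(width)
x_orth, x_gauss = x.copy(), x.copy()
for _ in range(depth):
    x_orth = orthogonal(width) @ x_orth
    # Gaussian init with variance 1/width matches the mean-square gain
    x_gauss = (rng.standard_normal((width, width)) / np.sqrt(width)) @ x_gauss

ratio_orth = np.linalg.norm(x_orth) / np.linalg.norm(x)    # exactly 1
ratio_gauss = np.linalg.norm(x_gauss) / np.linalg.norm(x)  # drifts with depth
```

Norm preservation at every layer is the simplest consequence of dynamical isometry (all singular values of the input-output Jacobian equal to 1); the paper proves that this benefit persists during training, not just at initialization.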