
Jeffrey Pennington

Researcher at Google

Publications: 84
Citations: 37,425

Jeffrey Pennington is an academic researcher at Google. He has contributed to research on topics including artificial neural networks and deep learning. He has an h-index of 32 and has co-authored 75 publications receiving 28,787 citations. His previous affiliations include the University of Southern California and Princeton University.

Papers
Journal Article

The six-point remainder function to all loop orders in the multi-Regge limit

TL;DR: In this article, the authors present an all-orders formula for the six-point amplitude of planar maximally supersymmetric N=4 Yang-Mills theory in the leading-logarithmic approximation of multi-Regge kinematics.
Posted Content

Disentangling Trainability and Generalization in Deep Learning

TL;DR: This paper studies the trainability and generalization of wide neural networks at large depth and finds that there are large regions of hyperparameter space where networks can only memorize the training set, in the sense that they reach perfect training accuracy but completely fail to generalize outside the training set.
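
A minimal sketch of the memorization regime described above, assuming a toy setup that is not the paper's (plain NumPy, a small two-layer tanh network, squared loss, and randomly generated labels): the network reaches perfect training accuracy while test accuracy on independently drawn random labels stays near chance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not from the paper): 40 points in 20 dimensions and a
# wide hidden layer, so the network can easily memorize random labels.
n, d, width, steps, lr = 40, 20, 512, 3000, 0.05

X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)            # random labels: nothing to generalize
X_test = rng.standard_normal((n, d))
y_test = rng.choice([-1.0, 1.0], size=n)

W1 = rng.standard_normal((d, width)) / np.sqrt(d)
W2 = rng.standard_normal((width, 1)) / np.sqrt(width)

def forward(A):
    h = np.tanh(A @ W1)
    return h, (h @ W2).ravel()

for _ in range(steps):                          # full-batch gradient descent, squared loss
    h, pred = forward(X)
    err = pred - y
    gW2 = h.T @ err[:, None] / n
    gh = (err[:, None] @ W2.T) * (1 - h ** 2)   # backprop through tanh
    gW1 = X.T @ gh / n
    W1 -= lr * gW1
    W2 -= lr * gW2

train_acc = np.mean(np.sign(forward(X)[1]) == y)
test_acc = np.mean(np.sign(forward(X_test)[1]) == y_test)
print(f"train accuracy {train_acc:.2f}, test accuracy {test_acc:.2f}")
# Expected: perfect (or near-perfect) training accuracy, test accuracy near 0.50.
```
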
Posted Content

Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs

TL;DR: This work develops a mean field theory of signal propagation in LSTMs and GRUs that enables the calculation of the time scales for signal propagation as well as the spectral properties of the state-to-state Jacobians, and derives a novel initialization scheme that eliminates or reduces training instabilities.
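
As a rough numerical illustration of the central object in that analysis, the sketch below estimates the state-to-state Jacobian dh_t/dh_{t-1} of a randomly initialized GRU cell by finite differences and prints its extreme singular values; the GRU equations are standard, and the plain Gaussian initialization used here is an illustrative assumption, not the initialization scheme derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 32                                          # hidden size (illustrative)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Standard GRU cell with Gaussian weights; the critical initialization derived
# in the paper is NOT reproduced here -- this only shows the object it studies.
Wz, Wr, Wh, Uz, Ur, Uh = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(6))

def gru_step(h, x):
    z = sigmoid(x @ Wz + h @ Uz)                # update gate
    r = sigmoid(x @ Wr + h @ Ur)                # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)    # candidate state
    return (1 - z) * h + z * h_tilde

def state_jacobian(h, x, eps=1e-5):
    """Finite-difference estimate of d h_t / d h_{t-1}."""
    base = gru_step(h, x)
    J = np.zeros((d, d))
    for j in range(d):
        hp = h.copy()
        hp[j] += eps
        J[:, j] = (gru_step(hp, x) - base) / eps
    return J

h = rng.standard_normal(d)
x = rng.standard_normal(d)
svals = np.linalg.svd(state_jacobian(h, x), compute_uv=False)
print("largest / smallest singular value:", svals[0], svals[-1])
```
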
Posted Content

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks

TL;DR: It is formally proved that, for a class of well-behaved input distributions, the early-time learning dynamics of a two-layer fully-connected neural network can be mimicked by training a simple linear model on the inputs.
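
In the spirit of that result, the sketch below trains a wide two-layer tanh network and a plain linear model on the same inputs with full-batch gradient descent and prints both losses over the first few steps; the data, width, learning rate, and the subtraction of the network's initial prediction are all illustrative assumptions, and the comparison is qualitative rather than a reproduction of the paper's precise statement.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, width, lr, steps = 200, 10, 1024, 0.1, 50   # illustrative, not from the paper

X = rng.standard_normal((n, d))
y = np.sin(X @ rng.standard_normal(d))            # some nonlinear target

# Two-layer tanh network with 1/sqrt(fan-in) Gaussian initialization.
W1 = rng.standard_normal((d, width)) / np.sqrt(d)
w2 = rng.standard_normal(width) / np.sqrt(width)
f0 = np.tanh(X @ W1) @ w2                         # initial prediction, subtracted below so
                                                  # both models start from the zero function

beta = np.zeros(d)                                # linear model on the raw inputs

for t in range(steps):
    # network: full-batch gradient descent on the squared loss
    h = np.tanh(X @ W1)
    err_net = (h @ w2 - f0) - y
    gw2 = h.T @ err_net / n
    gW1 = X.T @ (np.outer(err_net, w2) * (1 - h ** 2)) / n
    W1 -= lr * gW1
    w2 -= lr * gw2

    # linear model on the same inputs, same learning rate
    err_lin = X @ beta - y
    beta -= lr * (X.T @ err_lin / n)

    if t % 10 == 0:
        print(f"step {t:3d}  network loss {np.mean(err_net ** 2):.4f}  "
              f"linear loss {np.mean(err_lin ** 2):.4f}")
```
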
Posted Content

Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice

TL;DR: In this paper, the authors explore how the singular value distribution of a deep network's input-output Jacobian depends on the depth of the network, the weight initialization, and the choice of nonlinearity, and show that properly initialized deep sigmoidal networks consistently outperform deep ReLU networks.
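
The object studied here is the network's input-output Jacobian, J = D_L W_L ... D_1 W_1, where D_l is the diagonal matrix of activation derivatives at layer l. Below is a small sketch, assuming an arbitrary depth, width, and tanh nonlinearity, that compares the spread of its singular values under Gaussian versus orthogonal weight initialization.

```python
import numpy as np

rng = np.random.default_rng(3)
depth, width = 50, 200                          # illustrative depth and width

def orthogonal(n):
    # Random orthogonal matrix via QR decomposition.
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))              # sign fix for a uniform distribution

def jacobian_singular_values(init):
    x = rng.standard_normal(width)
    J = np.eye(width)
    for _ in range(depth):
        if init == "orthogonal":
            W = orthogonal(width)
        else:
            W = rng.standard_normal((width, width)) / np.sqrt(width)
        h = W @ x
        D = np.diag(1.0 - np.tanh(h) ** 2)      # tanh' at the preactivations
        J = D @ W @ J                           # chain rule across layers
        x = np.tanh(h)
    return np.linalg.svd(J, compute_uv=False)

for init in ("gaussian", "orthogonal"):
    s = jacobian_singular_values(init)
    print(f"{init:>10}: max {s[0]:.3e}  median {np.median(s):.3e}  min {s[-1]:.3e}")
```
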