Roman Novak

Researcher at Google

Publications: 24
Citations: 3332

Roman Novak is an academic researcher at Google. He has contributed to research on topics including artificial neural networks and Gaussian processes, has an h-index of 16, and has co-authored 20 publications receiving 2106 citations.

Papers
Proceedings Article

Deep Neural Networks as Gaussian Processes

TL;DR: The exact equivalence between infinitely wide deep networks and Gaussian processes (GPs) is derived, and test performance is found to increase as finite-width trained networks are made wider and more similar to a GP, so that GP predictions typically outperform those of finite-width networks.
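
As a rough illustration of this equivalence (not part of the paper listing): the open-source neural_tangents library provides closed-form infinite-width kernels, and a minimal sketch of GP posterior-mean prediction with the NNGP kernel of a fully-connected ReLU network, assuming synthetic toy data, could look like this:

```python
# Minimal sketch: exact GP inference with the NNGP kernel of an
# infinitely wide fully-connected ReLU network (neural_tangents, JAX).
import jax.numpy as jnp
from neural_tangents import stax

# Architecture whose infinite-width GP kernel we query; the widths
# passed to Dense affect only finite-width sampling, not kernel_fn.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

# Synthetic toy data (an assumption for illustration only).
x_train = jnp.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y_train = jnp.sin(3.0 * x_train)
x_test = jnp.linspace(-1.0, 1.0, 50).reshape(-1, 1)

# GP posterior mean under the NNGP kernel ('nngp' selects the Bayesian
# infinite-width kernel rather than the NTK); jitter for stability.
k_tt = kernel_fn(x_train, x_train, 'nngp')
k_st = kernel_fn(x_test, x_train, 'nngp')
mean = k_st @ jnp.linalg.solve(k_tt + 1e-6 * jnp.eye(len(x_train)), y_train)
```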
Journal Article

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

TL;DR: In this article, the authors show that for wide neural networks the learning dynamics simplify considerably and that, in the infinite-width limit, they are governed by a linear model obtained from the first-order Taylor expansion of the network around its initial parameters.
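
A hedged sketch of that Taylor expansion in plain JAX (the two-layer model f below is a hypothetical stand-in, not the paper's architecture): the linearized function is affine in the parameters but keeps the network's features at initialization.

```python
# Minimal sketch: first-order Taylor expansion of a network around its
# initial parameters, f_lin(p, x) = f(p0, x) + J(p0, x) @ (p - p0).
import jax
import jax.numpy as jnp

def f(params, x):
    # Hypothetical two-layer model; any differentiable network works.
    w1, w2 = params
    return jnp.tanh(x @ w1) @ w2

def linearize(f, params0):
    def f_lin(params, x):
        dp = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
        # jvp returns f(params0, x) and the Jacobian-vector product together.
        out0, jvp_out = jax.jvp(lambda p: f(p, x), (params0,), (dp,))
        return out0 + jvp_out
    return f_lin

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params0 = (jax.random.normal(k1, (3, 16)), jax.random.normal(k2, (16, 1)))
f_lin = linearize(f, params0)  # train f_lin in place of f at large width
```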
Journal Article

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava, +439 more
09 Jun 2022
TL;DR: Evaluation of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters, finds that model performance and calibration both improve with scale but are poor in absolute terms.
Proceedings Article

Sensitivity and Generalization in Neural Networks: an Empirical Study

TL;DR: In this article, the authors investigate the tension between complexity and generalization through an extensive empirical exploration of two natural metrics of complexity related to sensitivity to input perturbations, and demonstrate how the input-output Jacobian norm can be predictive of generalization at the level of individual test points.
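
As a hedged illustration of one such sensitivity metric (the model net below is hypothetical, not the paper's), the input-output Jacobian norm at a test point can be computed in a few lines of JAX:

```python
# Minimal sketch: Frobenius norm of the input-output Jacobian df/dx,
# used as a per-point sensitivity metric.
import jax
import jax.numpy as jnp

def net(params, x):
    # Hypothetical trained model; stands in for any network.
    w1, w2 = params
    return jax.nn.relu(x @ w1) @ w2

def jacobian_norm(params, x):
    # Jacobian with respect to the *input* x, not the weights.
    jac = jax.jacobian(lambda x_: net(params, x_))(x)
    return jnp.linalg.norm(jac)  # Frobenius norm for a 2D Jacobian

k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params = (jax.random.normal(k1, (8, 32)), jax.random.normal(k2, (32, 10)))
x = jax.random.normal(k3, (8,))
print(jacobian_norm(params, x))  # larger value -> higher sensitivity
```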
Proceedings Article

Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes

TL;DR: This work derives an analogous equivalence for multi-layer convolutional neural networks (CNNs) both with and without pooling layers, and introduces a Monte Carlo method to estimate the GP corresponding to a given neural network architecture, even in cases where the analytic form has too many terms to be computationally feasible.
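
A hedged sketch of such a Monte Carlo kernel estimate in plain JAX, shown for a one-hidden-layer fully-connected network rather than a CNN for brevity: average the output covariance over many random initializations.

```python
# Minimal sketch: Monte Carlo estimate of the GP kernel of a network
# by averaging output covariances over random initializations.
import jax
import jax.numpy as jnp

def init_and_apply(key, x, width=256):
    # Hypothetical one-hidden-layer ReLU network with 1/sqrt(fan_in)
    # scaling, so the infinite-width limit is well defined.
    d = x.shape[-1]
    k1, k2 = jax.random.split(key)
    w1 = jax.random.normal(k1, (d, width)) / jnp.sqrt(d)
    w2 = jax.random.normal(k2, (width, 1)) / jnp.sqrt(width)
    return (jax.nn.relu(x @ w1) @ w2).squeeze(-1)

def mc_kernel(key, x1, x2, n_samples=1000):
    keys = jax.random.split(key, n_samples)
    f1 = jax.vmap(lambda k: init_and_apply(k, x1))(keys)  # (n_samples, n1)
    f2 = jax.vmap(lambda k: init_and_apply(k, x2))(keys)  # (n_samples, n2)
    return f1.T @ f2 / n_samples  # empirical E[f(x1) f(x2)^T]

x = jnp.linspace(-1.0, 1.0, 5).reshape(-1, 1)
print(mc_kernel(jax.random.PRNGKey(0), x, x))  # approaches the GP kernel
```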