Thomas Unterthiner

Researcher at Google

Publications - 51
Citations - 31,915

Thomas Unterthiner is an academic researcher at Google. The author has contributed to research on topics including convolutional neural networks and deep learning. The author has an h-index of 26 and has co-authored 47 publications receiving 15,696 citations. Previous affiliations of Thomas Unterthiner include Johannes Kepler University Linz and the University of Göttingen.

Papers
Posted Content

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
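
The core idea behind the title can be illustrated with a short sketch (not the authors' implementation): the image is cut into non-overlapping 16x16 patches, each patch is flattened and linearly projected, and the resulting sequence of patch tokens is what the Transformer consumes. The function name, the 224x224 input size, and the 768-dimensional embedding below are illustrative assumptions.

```python
import numpy as np

def image_to_patch_tokens(image, patch=16, embed_dim=768):
    """Split an H x W x C image into non-overlapping patches and project each
    flattened patch to an embedding vector -- one 'word' per patch."""
    H, W, C = image.shape
    assert H % patch == 0 and W % patch == 0
    # Rearrange into (num_patches, patch*patch*C): one row per 16x16 patch.
    patches = (image
               .reshape(H // patch, patch, W // patch, patch, C)
               .transpose(0, 2, 1, 3, 4)
               .reshape(-1, patch * patch * C))
    # Linear projection (random here; learned during training in ViT).
    rng = np.random.default_rng(0)
    w_proj = rng.standard_normal((patch * patch * C, embed_dim)) * 0.02
    return patches @ w_proj  # (num_patches, embed_dim), fed to a Transformer

tokens = image_to_patch_tokens(np.zeros((224, 224, 3)), patch=16)
print(tokens.shape)  # (196, 768): 14 x 14 patch tokens
```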
Posted Content

GANs Trained by a Two Time-Scale Update Rule Converge to a Nash Equilibrium

TL;DR: In this article, a two time-scale update rule (TTUR) was proposed for training GANs with stochastic gradient descent on arbitrary GAN loss functions; it uses separate learning rates for the discriminator and the generator.
Proceedings Article

GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

TL;DR: In this paper, a two time-scale update rule (TTUR) was proposed for training GANs with stochastic gradient descent on arbitrary GAN loss functions; it uses separate learning rates for the discriminator and the generator.
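
Both TTUR entries above describe the same idea: give the discriminator and the generator their own learning rates (two time scales). Below is a minimal PyTorch-style sketch of such a training loop; the toy linear models, the non-saturating loss, and the concrete learning rates are illustrative assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

generator = torch.nn.Linear(64, 128)      # stand-in generator
discriminator = torch.nn.Linear(128, 1)   # stand-in discriminator

# Two time scales: a faster learning rate for the discriminator,
# a slower one for the generator (the concrete values are illustrative).
opt_d = torch.optim.Adam(discriminator.parameters(), lr=4e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)

for step in range(100):
    real = torch.randn(32, 128)           # stand-in batch of "real" data
    fake = generator(torch.randn(32, 64))

    # Discriminator update with its own optimizer / learning rate.
    d_loss = (F.softplus(-discriminator(real)) +
              F.softplus(discriminator(fake.detach()))).mean()
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update with a separate, slower learning rate.
    g_loss = F.softplus(-discriminator(fake)).mean()
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```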
Posted Content

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)

TL;DR: The Exponential Linear Unit (ELU) was proposed to alleviate the vanishing gradient problem via the identity for positive values; it shows improved learning characteristics compared to units with other activation functions.
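
A minimal sketch of the activation described above, assuming the usual saturation parameter alpha = 1.0: the identity for positive inputs and a smooth exponential saturation towards -alpha for negative inputs.

```python
import numpy as np

def elu(x, alpha=1.0):
    # Identity for x > 0; smooth saturation towards -alpha for x <= 0.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

print(elu(np.array([-2.0, -0.5, 0.0, 1.5])))
```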
Posted Content

Self-Normalizing Neural Networks

TL;DR: Self-normalizing neural networks (SNNs) are introduced to enable high-level abstract representations, and it is proved that activations close to zero mean and unit variance that are propagated through many network layers will converge towards zero mean and unit variance -- even in the presence of noise and perturbations.
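
The self-normalizing property can be sketched as follows; the depth, width, and random inputs are arbitrary illustrative choices, while the two fixed constants are the ones commonly quoted for the SELU activation. With standardized inputs and weights drawn with variance 1/fan_in, the activation statistics stay close to zero mean and unit variance across many layers.

```python
import numpy as np

ALPHA, LAMBDA = 1.6732632423543772, 1.0507009873554805  # fixed SELU constants

def selu(x):
    return LAMBDA * np.where(x > 0, x, ALPHA * (np.exp(x) - 1.0))

rng = np.random.default_rng(0)
x = rng.standard_normal((1024, 256))                    # standardized inputs
for _ in range(16):                                     # deep stack of dense layers
    w = rng.standard_normal((256, 256)) / np.sqrt(256)  # weight variance 1/fan_in
    x = selu(x @ w)
print(x.mean(), x.std())  # stays close to 0 and 1 even after many layers
```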