Lukasz Kaiser

Researcher at Google

Publications - 67

Citations - 76504

Lukasz Kaiser is an academic researcher at Google. The author has contributed to research on artificial neural networks and machine translation. The author has an h-index of 28 and has co-authored 67 publications receiving 43227 citations. Previous affiliations of Lukasz Kaiser include the University of Paris and Paris Diderot University.

Papers

Proceedings Article

Attention Is All You Need

Ashish Vaswani, +7 more

TL;DR: This paper proposes a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.

Posted Content

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Martín Abadi, +39 more

- 01 Jan 2015 -

arXiv: Distributed, Parallel, and Cluste...

TL;DR: This paper describes the TensorFlow interface and an implementation of that interface built at Google, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.

Posted Content

Attention Is All You Need

Ashish Vaswani, +7 more

- 12 Jun 2017 -

arXiv: Computation and Language

TL;DR: This paper proposes the Transformer, a new simple network architecture based solely on attention mechanisms that dispenses with recurrence and convolutions entirely, and shows that it generalizes well to other tasks by applying it successfully to English constituency parsing with both large and limited training data.

Proceedings Article

Grammar as a Foreign Language

Oriol Vinyals, +5 more

TL;DR: A domain-agnostic, attention-enhanced sequence-to-sequence model achieves state-of-the-art results on the most widely used syntactic constituency parsing dataset when trained on a large synthetic corpus annotated using existing parsers.

Posted Content

Rethinking Attention with Performers

Krzysztof Choromanski, +12 more

- 30 Sep 2020 -

arXiv: Learning

TL;DR: This paper introduces Performers, Transformer architectures that can estimate regular (softmax) full-rank-attention Transformers with provable accuracy while using only linear space and time complexity, without relying on any priors such as sparsity or low-rankness.
