Ashish Vaswani
Researcher at Google
Publications - 73
Citations - 70493
Ashish Vaswani is an academic researcher at Google. He has contributed to research in topics including machine translation and the Transformer (machine learning model). He has an h-index of 34 and has co-authored 70 publications receiving 35,599 citations. His previous affiliations include the Information Sciences Institute and the University of Southern California.
Papers
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
Posted Content
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
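The core computation in both listings above is scaled dot-product attention: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Below is a minimal NumPy sketch of that formula; the shapes, names, and toy usage are illustrative assumptions, not the paper's released code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -- illustrative shapes.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarity logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

# Toy usage: 4 query positions attending over 6 key/value positions.
rng = np.random.default_rng(0)
Q, K = rng.normal(size=(4, 8)), rng.normal(size=(6, 8))
V = rng.normal(size=(6, 16))
out = scaled_dot_product_attention(Q, K, V)         # shape (4, 16)
```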
Posted Content
Relational inductive biases, deep learning, and graph networks
Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, H. Francis Song, Andrew J. Ballard, Justin Gilmer, George E. Dahl, Ashish Vaswani, Kelsey R. Allen, Charlie Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matthew Botvinick, Oriol Vinyals, Yujia Li, Razvan Pascanu
TL;DR: It is argued that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective.
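As background for the graph network formalism the paper surveys, the sketch below shows one illustrative message-passing step over a directed graph: each edge computes a message from its sender's and its own features, and each node sums its incoming messages and updates itself. The sum aggregation, tanh nonlinearities, and every name here are assumptions for illustration, not the paper's specification.

```python
import numpy as np

def message_passing_step(node_feats, edges, edge_feats, W_msg, W_upd):
    """One illustrative message-passing step: each edge computes a message
    from (sender, edge) features; each node aggregates incoming messages
    by summation and updates its own features from (self, aggregate)."""
    n, d = node_feats.shape
    agg = np.zeros((n, d))
    for (src, dst), e in zip(edges, edge_feats):
        msg = np.tanh(np.concatenate([node_feats[src], e]) @ W_msg)
        agg[dst] += msg                           # permutation-invariant sum
    return np.tanh(np.concatenate([node_feats, agg], axis=1) @ W_upd)

# Toy usage on a 3-node chain graph with random weights.
rng = np.random.default_rng(0)
nodes = rng.normal(size=(3, 4))                   # 3 nodes, 4 features each
edges = [(0, 1), (1, 2)]                          # directed sender -> receiver
edge_feats = rng.normal(size=(2, 2))              # 2 edges, 2 features each
W_msg = rng.normal(size=(6, 4))                   # (4 node + 2 edge) -> 4
W_upd = rng.normal(size=(8, 4))                   # (4 node + 4 agg) -> 4
new_nodes = message_passing_step(nodes, edges, edge_feats, W_msg, W_upd)
```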
Proceedings Article
Self-Attention with Relative Position Representations
Peter Shaw, Jakob Uszkoreit, Ashish Vaswani
TL;DR: This article extends the self-attention mechanism to consider representations of the relative positions, or distances, between sequence elements, improving translation quality; combining relative and absolute position representations yields no further improvement.
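A rough NumPy sketch of that idea, in the spirit of the paper: each attention logit gains a term pairing the query with a learned embedding of the clipped relative distance between query and key positions. Function names, the variable names in the clipping scheme, and the toy usage are illustrative, not the authors' implementation.

```python
import numpy as np

def relative_attention_logits(Q, K, rel_emb, max_dist):
    """Attention logits with relative position representations:
    logit(i, j) = (q_i . k_j + q_i . a[clip(j - i)]) / sqrt(d_k),
    where `rel_emb` holds one learned vector per clipped relative
    distance in [-max_dist, max_dist]."""
    n, d_k = Q.shape
    logits = Q @ K.T
    for i in range(n):
        for j in range(n):
            rel = np.clip(j - i, -max_dist, max_dist) + max_dist
            logits[i, j] += Q[i] @ rel_emb[rel]   # relative-position term
    return logits / np.sqrt(d_k)

# Toy usage: 5 positions, distances clipped to +/- 2.
rng = np.random.default_rng(0)
n, d_k, max_dist = 5, 8, 2
Q, K = rng.normal(size=(n, d_k)), rng.normal(size=(n, d_k))
rel_emb = rng.normal(size=(2 * max_dist + 1, d_k))  # one vector per distance
logits = relative_attention_logits(Q, K, rel_emb, max_dist)  # shape (5, 5)
```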
Posted Content
Image Transformer
Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Łukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran
TL;DR: In this article, the self-attention mechanism is restricted to attending to local neighborhoods, significantly increasing the size of images the model can process in practice while maintaining significantly larger receptive fields per layer than typical CNNs.
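The sketch below is a 1D stand-in for the paper's local 2D neighborhoods, showing how restricting each query to a fixed window cuts attention cost from O(n^2) to O(n * window). The window size and all names are illustrative assumptions, not the paper's released code.

```python
import numpy as np

def local_self_attention(Q, K, V, window):
    """Each position attends only to keys within `window` steps of itself,
    a 1D stand-in for local 2D neighborhoods over image pixels."""
    n, d_k = Q.shape
    out = np.empty_like(V)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = Q[i] @ K[lo:hi].T / np.sqrt(d_k)
        w = np.exp(scores - scores.max())
        out[i] = (w / w.sum()) @ V[lo:hi]         # attend within the window
    return out

# Toy usage: 16 positions, each attending to a 7-position neighborhood.
rng = np.random.default_rng(0)
Q = rng.normal(size=(16, 8))
K, V = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
out = local_self_attention(Q, K, V, window=3)     # shape (16, 8)
```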