Niki Parmar
Researcher at Google
Publications - 41
Citations - 66,115
Niki Parmar is an academic researcher at Google. The author has contributed to research in topics: Transformer (machine learning model) & Machine translation. The author has an h-index of 22 and has co-authored 39 publications receiving 31,763 citations. Previous affiliations of Niki Parmar include University of Southern California.
Papers
Posted Content
Mesh-TensorFlow: Deep Learning for Supercomputers
Noam Shazeer,Youlong Cheng,Niki Parmar,Dustin Tran,Ashish Vaswani,Penporn Koanantakool,Peter Hawkins,HyoukJoong Lee,Mingsheng Hong,Cliff Young,Ryan Sepassi,Blake A. Hechtman +11 more
TL;DR: Mesh-TensorFlow is introduced, a language for specifying a general class of distributed tensor computations, and is used to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model, surpassing state-of-the-art results on the WMT'14 English-to-French translation task and the one-billion-word language modeling benchmark.
Journal Article
Purity homophily in social networks.
Morteza Dehghani,Kate M. Johnson,Joe Hoover,Eyal Sagi,Justin Garten,Niki Parmar,Stephen Vaisey,Rumen Iliev,Jesse Graham +8 more
TL;DR: Results indicate that social network processes reflect moral selection, and that, both online and offline, differences in moral purity concerns are particularly predictive of social distance.
Proceedings Article
Mesh-TensorFlow: Deep Learning for Supercomputers
Noam Shazeer,Youlong Cheng,Niki Parmar,Dustin Tran,Ashish Vaswani,Penporn Koanantakool,Peter Hawkins,HyoukJoong Lee,Mingsheng Hong,Cliff Young,Ryan Sepassi,Blake A. Hechtman +11 more
TL;DR: Mesh-TensorFlow is a language for specifying a general class of distributed tensor computations, in which the user can specify any tensor dimensions to be split across any dimensions of a multi-dimensional mesh of processors.
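A minimal NumPy sketch of the idea described in the summary above, not the Mesh-TensorFlow API: names such as `mesh_shape`, `layout`, and `split_tensor` are hypothetical and illustrative only. It shows how named tensor dimensions can be mapped onto dimensions of a processor mesh and split accordingly, so the batch dimension gives data parallelism and a model dimension gives model parallelism.

```python
import numpy as np

# Illustrative sketch (not the library's API): a tensor with named dimensions,
# a mesh of processors with named dimensions, and a layout mapping tensor
# dimensions onto mesh dimensions.

mesh_shape = {"rows": 2, "cols": 2}            # 2x2 mesh of 4 simulated processors
layout = {"batch": "rows", "d_model": "cols"}  # which tensor dim is split over which mesh dim

def split_tensor(tensor, dim_names, layout, mesh_shape):
    """Return a dict mapping mesh coordinates to each processor's local slice."""
    n_procs = int(np.prod(list(mesh_shape.values())))
    shards = {np.unravel_index(p, tuple(mesh_shape.values())): tensor
              for p in range(n_procs)}
    for axis, dim in enumerate(dim_names):
        mesh_dim = layout.get(dim)
        if mesh_dim is None:                   # unsplit dimension: replicated everywhere
            continue
        mesh_axis = list(mesh_shape).index(mesh_dim)
        parts = mesh_shape[mesh_dim]
        shards = {coords: np.array_split(shard, parts, axis=axis)[coords[mesh_axis]]
                  for coords, shard in shards.items()}
    return shards

# A "batch x d_model" activation tensor: batch is split over mesh rows
# (data parallelism), d_model over mesh columns (model parallelism).
activations = np.arange(8 * 6, dtype=np.float32).reshape(8, 6)
shards = split_tensor(activations, ["batch", "d_model"], layout, mesh_shape)
for coords, shard in sorted(shards.items()):
    print(f"processor {coords}: local shape {shard.shape}")  # each holds a (4, 3) slice
```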
Patent
Fast decoding in sequence models using discrete latent variables
Posted Content
Theory and Experiments on Vector Quantized Autoencoders
TL;DR: This work investigates an alternate training technique for VQ-VAE, inspired by its connection to the Expectation Maximization (EM) algorithm, and develops a non-autoregressive machine translation model whose accuracy almost matches a strong greedy autoregressive baseline Transformer, while being 3.3 times faster at inference.
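A hedged NumPy sketch of the EM view of vector quantization referenced in the summary: the E-step assigns each encoder output to its nearest codebook vector and the M-step re-estimates each code as the mean of its assigned vectors. The paper investigates an EM-inspired training technique; this sketch shows only the hard-EM (k-means-style) variant for concreteness, and names such as `em_step` and `codebook` are illustrative rather than the paper's implementation.

```python
import numpy as np

def em_step(latents, codebook):
    """One hard-EM update of a vector-quantization codebook.

    E-step: assign each latent vector to its nearest code (the quantization).
    M-step: move each code to the mean of the latents assigned to it.
    """
    # E-step: pairwise squared distances, then nearest-code assignment.
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assignments = dists.argmin(axis=1)

    # M-step: re-estimate each code; keep the old code if nothing was assigned.
    new_codebook = codebook.copy()
    for k in range(codebook.shape[0]):
        members = latents[assignments == k]
        if len(members) > 0:
            new_codebook[k] = members.mean(axis=0)
    return new_codebook, assignments

rng = np.random.default_rng(0)
latents = rng.normal(size=(256, 8))   # stand-in for encoder outputs
codebook = rng.normal(size=(16, 8))   # 16 discrete codes of dimension 8
for _ in range(10):
    codebook, assignments = em_step(latents, codebook)
print("codes in use:", len(np.unique(assignments)))
```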