Azalia Mirhoseini
Researcher at Google
Publications - 78
Citations - 3351
Azalia Mirhoseini is an academic researcher at Google. She has contributed to research on topics including reinforcement learning and computer science. She has an h-index of 19 and has co-authored 67 publications receiving 2118 citations. Her previous affiliations include Microsoft and Rice University.
Papers
Posted Content
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc V. Le, Geoffrey E. Hinton, Jeffrey Dean +6 more
TL;DR: This work introduces a Sparsely-Gated Mixture-of-Experts layer (MoE), consisting of up to thousands of feed-forward sub-networks, and applies the MoE to the tasks of language modeling and machine translation, where model capacity is critical for absorbing the vast quantities of knowledge available in the training corpora.
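The core mechanism lends itself to a short sketch: a trainable gating network scores a bank of feed-forward experts and routes each input only to its top-k experts, so total capacity grows with the expert count while per-example computation stays roughly constant. Below is a minimal PyTorch sketch of such a layer; the layer sizes, expert count, and the omission of the paper's noisy gating and load-balancing loss are simplifying assumptions.

```python
# Minimal sketch of a sparsely-gated mixture-of-experts layer (assumes PyTorch).
# Sizes are illustrative; the paper's noisy gating and auxiliary load-balancing
# loss are omitted for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, n_experts=8, k=2):
        super().__init__()
        # Each expert is a small feed-forward sub-network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts))
        self.gate = nn.Linear(d_model, n_experts)  # trainable gating network
        self.k = k

    def forward(self, x):  # x: (batch, d_model)
        logits = self.gate(x)
        # Keep only the top-k gate values per example and renormalize them.
        top_vals, top_idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)
        out = torch.zeros_like(x)
        # Only the k selected experts run for each example -- the sparsity
        # that lets capacity grow without proportional computation.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask][:, slot:slot+1] * expert(x[mask])
        return out

y = SparseMoE()(torch.randn(4, 64))  # usage: (4, 64) -> (4, 64)
```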
Posted Content
Device Placement Optimization with Reinforcement Learning
Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, R. M. Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, Jeffrey Dean +9 more
TL;DR: A method that learns to optimize device placement for TensorFlow computational graphs using a sequence-to-sequence model; it finds non-trivial device placements that outperform hand-crafted heuristics and traditional algorithmic methods.
Proceedings Article
Device placement optimization with reinforcement learning
Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, R. M. Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, Jeffrey Dean +9 more
TL;DR: In this paper, a sequence-to-sequence model is trained to predict which subsets of operations in a TensorFlow graph should run on which of the available devices; the execution time of the resulting placements is then used as the reward signal to optimize the model's parameters.
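The training signal is simple to state: sample a placement from the policy, run the graph, and use the measured runtime as a negative reward. The sketch below illustrates that loop with REINFORCE in PyTorch; the per-op categorical policy and the simulated_runtime cost model are hypothetical stand-ins for the paper's attentional sequence-to-sequence policy and real TensorFlow runtime measurements.

```python
# Minimal REINFORCE sketch of RL-based device placement (assumes PyTorch).
# `simulated_runtime` is a hypothetical stand-in for measuring real graph
# execution time; the paper uses an attentional seq-to-seq policy rather
# than this per-op categorical one.
import torch
import torch.nn as nn

N_OPS, N_DEVICES = 10, 4
policy_logits = nn.Parameter(torch.zeros(N_OPS, N_DEVICES))
opt = torch.optim.Adam([policy_logits], lr=0.1)

def simulated_runtime(placement):
    # Hypothetical cost model: balanced placements run faster, so runtime
    # is proxied by the load on the busiest device.
    counts = torch.bincount(placement, minlength=N_DEVICES).float()
    return counts.max().item()

baseline = None
for step in range(200):
    dist = torch.distributions.Categorical(logits=policy_logits)
    placement = dist.sample()            # one device index per op
    runtime = simulated_runtime(placement)
    # Measured runtime, against a moving baseline, is the reward signal.
    baseline = runtime if baseline is None else 0.9 * baseline + 0.1 * runtime
    loss = (runtime - baseline) * dist.log_prob(placement).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```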
Proceedings Article
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc V. Le, Geoffrey E. Hinton, Jeffrey Dean +6 more
TL;DR: In this paper, a sparsely-gated mixture-of-experts (MoE) layer is proposed to increase the capacity of a neural network to absorb information without a proportional increase in computation.
Posted Content
Chip Placement with Deep Reinforcement Learning
Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Jiang, Ebrahim M. Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Sungmin Bae, Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, William Hang, Emre Tuncer, Anand Babu, Quoc V. Le, James Laudon, C. Richard Ho, Roger Carpenter, Jeffrey Dean +21 more
TL;DR: This work presents a learning-based approach to chip placement and shows that, in under 6 hours, the method can generate placements that are superhuman or comparable to human experts on modern accelerator netlists, whereas existing baselines require human experts in the loop and take several weeks.
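At its core, the placement problem scores an assignment of macros to cells of a coarse grid with cheap proxies such as half-perimeter wirelength (HPWL). The sketch below computes that proxy and searches over placements; the tiny netlist and the random-search loop are illustrative assumptions standing in for the paper's learned policy and its fuller reward, which also accounts for congestion and density.

```python
# Minimal sketch of the chip-placement setup: macros go on a coarse grid and
# are scored by half-perimeter wirelength (HPWL). The netlist and the
# random-search loop are illustrative assumptions, not the paper's policy.
import random

GRID = 16
netlist = [(0, 1), (1, 2), (0, 3), (2, 3)]  # hypothetical 4-macro netlist
n_macros = 4

def hpwl(positions):
    # Half-perimeter wirelength: per net, bounding-box width + height.
    total = 0
    for a, b in netlist:
        (xa, ya), (xb, yb) = positions[a], positions[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total

best, best_cost = None, float("inf")
for _ in range(10_000):  # random search stands in for the learned policy
    cells = random.sample(range(GRID * GRID), n_macros)  # no overlaps
    positions = [(c % GRID, c // GRID) for c in cells]
    cost = hpwl(positions)
    if cost < best_cost:
        best, best_cost = positions, cost
print(best_cost, best)
```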