Anmol Gulati
Researcher at Google
Publications - 21
Citations - 2514
Anmol Gulati is an academic researcher at Google. The author has contributed to research on the topics of computer science and word error rate. The author has an h-index of 9 and has co-authored 19 publications receiving 605 citations.
Papers
Posted Content
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang
TL;DR: This work proposes Conformer, a convolution-augmented Transformer for speech recognition, which significantly outperforms previous Transformer-based and CNN-based models and achieves state-of-the-art accuracies.
Proceedings ArticleDOI
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang
TL;DR: Conformer combines convolutional neural networks and Transformers to model both local and global dependencies of an audio sequence in a parameter-efficient way, achieving state-of-the-art accuracies.
Posted Content
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu
TL;DR: This paper proposes a simple method for scaling the widths of ContextNet that achieves a good trade-off between computation and accuracy, and demonstrates that ContextNet achieves a word error rate of 2.1%/4.6% on the widely used LibriSpeech benchmark.
Proceedings ArticleDOI
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu
TL;DR: ContextNet incorporates global context information into convolution layers by adding squeeze-and-excitation modules. The paper also proposes a simple method for scaling the widths of ContextNet that achieves a good trade-off between computation and accuracy.
Posted Content
A Better and Faster End-to-End Model for Streaming ASR
Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu
TL;DR: The Conformer RNN-T with Cascaded Encoders offers a better quality and latency tradeoff for streaming ASR.