Rohit Prabhavalkar
Researcher at Google
Publications - 105
Citations - 5764
Rohit Prabhavalkar is an academic researcher at Google. The author has contributed to research on topics including word error rate and computer science. The author has an h-index of 31 and has co-authored 86 publications receiving 3931 citations. Previous affiliations of Rohit Prabhavalkar include Ohio State University.
Papers
Proceedings ArticleDOI
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, Michiel Bacchiani +13 more
TL;DR: The authors explore a variety of structural and optimization improvements to the Listen, Attend, and Spell (LAS) encoder-decoder architecture that significantly improve performance.
Proceedings ArticleDOI
Streaming End-to-end Speech Recognition for Mobile Devices
Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-Yiin Chang, Kanishka Rao, Alexander H. Gruenstein +19 more
TL;DR: This work describes the authors' efforts to build an end-to-end (E2E) speech recognizer using a recurrent neural network transducer and finds that the proposed approach can outperform a conventional CTC-based model in terms of both latency and accuracy.
Proceedings ArticleDOI
A Comparison of Sequence-to-Sequence Models for Speech Recognition
TL;DR: It is found that the sequence-to-sequence models are competitive with traditional state-of-the-art approaches on dictation test sets, although the baseline, which uses a separate pronunciation and language model, outperforms these models on voice-search test sets.
Proceedings ArticleDOI
Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer
TL;DR: This article investigates the recurrent neural network transducer (RNN-T), which jointly learns acoustic and language model components from transcribed acoustic data and achieves state-of-the-art performance for end-to-end speech recognition.
Posted Content
Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer
TL;DR: This work investigates training end-to-end speech recognition models with the recurrent neural network transducer (RNN-T) and finds that performance can be improved further through the use of sub-word units ('wordpieces') which capture longer context and significantly reduce substitution errors.