Yanzhang He
Researcher at Google
Publications - 54
Citations - 2161
Yanzhang He is an academic researcher from Google. The author has contributed to research in topics: Computer science & Language model. The author has an h-index of 14 and has co-authored 48 publications receiving 1248 citations. Previous affiliations of Yanzhang He include Ohio State University.
Papers
Proceedings Article
Streaming End-to-end Speech Recognition for Mobile Devices
Yanzhang He,Tara N. Sainath,Rohit Prabhavalkar,Ian McGraw,Raziel Alvarez,Ding Zhao,David Rybach,Anjuli Kannan,Yonghui Wu,Ruoming Pang,Qiao Liang,Deepti Bhatia,Yuan Shangguan,Bo Li,Golan Pundak,Khe Chai Sim,Tom Bagby,Shuo-Yiin Chang,Kanishka Rao,Alexander H. Gruenstein +19 more
TL;DR: This work describes its efforts at building an E2E speech recognizer using a recurrent neural network transducer and finds that the proposed approach can outperform a conventional CTC-based model in terms of both latency and accuracy.
Posted Content
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Jonathan Shen,Patrick Nguyen,Yonghui Wu,Zhifeng Chen,Mia Xu Chen,Ye Jia,Anjuli Kannan,Tara N. Sainath,Yuan Cao,Chung-Cheng Chiu,Yanzhang He,Jan Chorowski,Smit Hinsu,Stella Marie Laurenzo,James Qin,Orhan Firat,Wolfgang Macherey,Suyog Gupta,Ankur Bapna,Shuyuan Zhang,Ruoming Pang,Ron Weiss,Rohit Prabhavalkar,Qiao Liang,Benoit Jacob,Bowen Liang,HyoukJoong Lee,Ciprian Chelba,Sébastien Jean,Bo Li,Melvin Johnson,Rohan Anil,Rajat Tibrewal,Xiaobing Liu,Akiko Eriguchi,Navdeep Jaitly,Naveen Ari,Colin Cherry,Parisa Haghani,Otavio Good,Youlong Cheng,Raziel Alvarez,Isaac Caswell,Wei-Ning Hsu,Zongheng Yang,Kuan-Chieh Wang,Ekaterina Gonina,Katrin Tomanek,Ben Vanik,Zelin Wu,Llion Jones,Mike Schuster,Yanping Huang,Dehao Chen,Kazuki Irie,George Foster,John Richardson,Klaus Macherey,Antoine Bruguier,Heiga Zen,Colin Raffel,Shankar Kumar,Kanishka Rao,David Rybach,Matthew Murray,Vijayaditya Peddinti,Maxim Krikun,Michiel Bacchiani,Thomas B. Jablin,Robert Suderman,Ian Williams,Benjamin N. Lee,Deepti Bhatia,Justin Carlson,Semih Yavuz,Yu Zhang,Ian McGraw,Max Galkin,Qi Ge,Golan Pundak,Chad Whipkey,Todd Wang,Uri Alon,Dmitry Lepikhin,Ye Tian,Sara Sabour,William Chan,Shubham Toshniwal,Baohua Liao,Michael Nirschl,Pat Rondon +90 more
TL;DR: This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.
Proceedings Article
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency
Tara N. Sainath,Yanzhang He,Bo Li,Arun Narayanan,Ruoming Pang,Antoine Bruguier,Shuo-Yiin Chang,Wei Li,Raziel Alvarez,Zhifeng Chen,Chung-Cheng Chiu,David Garcia,Alex Gruenstein,Ke Hu,Anjuli Kannan,Qiao Liang,Ian McGraw,Cal Peyser,Rohit Prabhavalkar,Golan Pundak,David Rybach,Yuan Shangguan,Yash Sheth,Trevor Strohman,Mirko Visontai,Yonghui Wu,Yu Zhang,Ding Zhao +27 more
TL;DR: In this article, a first-pass Recurrent Neural Network Transducer (RNN-T) model and a second-pass Listen, Attend, Spell (LAS) rescorer were developed.
Proceedings Article
Two-Pass End-to-End Speech Recognition
Tara N. Sainath,Ruoming Pang,David Rybach,Yanzhang He,Rohit Prabhavalkar,Wei Li,Mirko Visontai,Qiao Liang,Trevor Strohman,Yonghui Wu,Ian McGraw,Chung-Cheng Chiu +11 more
TL;DR: In this paper, two-pass automatic speech recognition (ASR) models are used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data.
Proceedings Article
Towards Fast and Accurate Streaming End-To-End ASR
TL;DR: This work proposes to reduce the E2E model’s latency by extending the RNN-T endpointer (RNN-T EP) model with additional early and late penalties and achieves 8.0% relative word error rate (WER) reduction and 130ms 90-percentile latency reduction over [2] on a Voice Search test set.