Ryan Prenger
Researcher at Nvidia
Publications - 36
Citations - 7639
Ryan Prenger is an academic researcher from Nvidia. The author has contributed to research in topics: Deep learning & Computer science. The author has an h-index of 18 and has co-authored 32 publications receiving 6277 citations. Previous affiliations of Ryan Prenger include University of California & Baidu.
Papers
End to end speech recognition in English and Mandarin
Dario Amodei,Rishita Anubhai,Eric Battenberg,Carl Case,Jared Casper,Bryan Catanzaro,Jingdong Chen,Mike Chrzanowski,Adam Coates,Greg Diamos,Erich Elsen,Jesse Engel,Linxi Fan,Christopher Fougner,Tony X. Han,Awni Hannun,Billy Jun,Patrick LeGresley,Libby Lin,Sharan Narang,Andrew Y. Ng,Sherjil Ozair,Ryan Prenger,Jonathan Raiman,Sanjeev Satheesh,David Seetapun,Shubho Sengupta,Yi Wang,Zhiqian Wang,Chong Wang,Bo Xiao,Dani Yogatama,Jun Zhan,Zhenyao Zhu
TL;DR: It is shown that an end-to-end deep learning approach can recognize either English or Mandarin Chinese speech, two vastly different languages, and that it is competitive with the transcription of human workers when benchmarked on standard datasets.
Posted Content
Deep Speech: Scaling up end-to-end speech recognition
Awni Hannun,Carl Case,Jared Casper,Bryan Catanzaro,Greg Diamos,Erich Elsen,Ryan Prenger,Sanjeev Satheesh,Shubho Sengupta,Adam Coates,Andrew Y. Ng
TL;DR: Deep Speech, a state-of-the-art speech recognition system developed using end-to-end deep learning, outperforms previously published results on the widely studied Switchboard Hub5'00, achieving 16.0% error on the full test set.
Proceedings Article
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
Dario Amodei,Sundaram Ananthanarayanan,Rishita Anubhai,Jingliang Bai,Eric Battenberg,Carl Case,Jared Casper,Bryan Catanzaro,Qiang Cheng,Guoliang Chen,Jie Chen,Jingdong Chen,Zhijie Chen,Mike Chrzanowski,Adam Coates,Greg Diamos,Ke Ding,Niandong Du,Erich Elsen,Jesse Engel,Weiwei Fang,Linxi Fan,Christopher Fougner,Liang Gao,Caixia Gong,Awni Hannun,Tony X. Han,Lappi Vaino Johannes,Bing Jiang,Cai Ju,Billy Jun,Patrick LeGresley,Libby Lin,Junjie Liu,Yang Liu,Weigao Li,Xiangang Li,Dongpeng Ma,Sharan Narang,Andrew Y. Ng,Sherjil Ozair,Yiping Peng,Ryan Prenger,Sheng Qian,Zongfeng Quan,Jonathan Raiman,Vinay Rao,Sanjeev Satheesh,David Seetapun,Shubho Sengupta,Kavya Srinet,Anuroop Sriram,Haiyuan Tang,Liliang Tang,Chong Wang,Jidong Wang,Kaifu Wang,Yi Wang,Zhijian Wang,Zhiqian Wang,Shuang Wu,Likai Wei,Bo Xiao,Wen Xie,Yan Xie,Dani Yogatama,Bin Yuan,Jun Zhan,Zhenyao Zhu
TL;DR: An end-to-end deep learning approach is used to recognize either English or Mandarin Chinese speech, two vastly different languages. The application of HPC techniques enables experiments that previously took weeks to run in days.
Proceedings ArticleDOI
WaveGlow: A Flow-based Generative Network for Speech Synthesis
TL;DR: WaveGlow is a flow-based network capable of generating high quality speech from mel-spectrograms without the need for auto-regression. It is implemented using only a single network, trained using a single cost function: maximizing the likelihood of the training data.
Posted Content
WaveGlow: A Flow-based Generative Network for Speech Synthesis
TL;DR: WaveGlow is a flow-based network capable of generating high quality speech from mel-spectrograms, implemented using only a single network, trained using a single cost function: maximizing the likelihood of the training data, which makes the training procedure simple and stable.