Ryan Prenger

End to end speech recognition in English and Mandarin

TL;DR: An end-to-end deep learning approach can recognize either English or Mandarin Chinese speech, two vastly different languages, and is competitive with the transcription of human workers when benchmarked on standard datasets.


Posted Content

Deep Speech: Scaling up end-to-end speech recognition

Awni Hannun, +10 more

- 17 Dec 2014 -

arXiv: Computation and Language

TL;DR: Deep Speech, a state-of-the-art speech recognition system developed using end-to-end deep learning, outperforms previously published results on the widely studied Switchboard Hub5'00, achieving 16.0% error on the full test set.


Proceedings Article

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Dario Amodei, +68 more

TL;DR: An end-to-end deep learning approach is used to recognize either English or Mandarin Chinese speech, two vastly different languages. HPC techniques enable experiments that previously took weeks to run in days.


Proceedings ArticleDOI

WaveGlow: A Flow-based Generative Network for Speech Synthesis

Ryan Prenger, +2 more

TL;DR: WaveGlow is a flow-based network capable of generating high-quality speech from mel-spectrograms without the need for auto-regression. It is implemented using only a single network, trained using a single cost function: maximizing the likelihood of the training data.


Posted Content

WaveGlow: A Flow-based Generative Network for Speech Synthesis

Ryan Prenger, +2 more

- 31 Oct 2018 -

arXiv: Sound

TL;DR: WaveGlow is a flow-based network capable of generating high-quality speech from mel-spectrograms. It is implemented using only a single network, trained using a single cost function: maximizing the likelihood of the training data, which makes the training procedure simple and stable.

