Yannis Agiomyrgiannakis
Researcher at Google
Publications - 28
Citations - 3222
Yannis Agiomyrgiannakis is an academic researcher at Google who has contributed to research on speech synthesis and speech coding. The author has an h-index of 15 and has co-authored 28 publications receiving 2538 citations. Previous affiliations of Yannis Agiomyrgiannakis include the University of Crete.
Papers
Proceedings Article
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, Robert A. J. Clark, Rif A. Saurous
TL;DR: Tacotron is an end-to-end generative text-to-speech model that synthesizes speech directly from characters. Given <text, audio> pairs, the model can be trained completely from scratch with random initialization.
Posted Content
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen, Ruoming Pang, Ron Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu
TL;DR: Tacotron 2 uses a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize time-domain waveforms from those spectrograms.
Posted Content
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, Robert A. J. Clark, Rif A. Saurous
TL;DR: Presents Tacotron, an end-to-end generative text-to-speech model that synthesizes speech directly from characters. It achieves a mean opinion score of 3.82 on a 5-point subjective scale for US English, outperforming a production parametric system in terms of naturalness.
Posted Content
Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.
Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, Robert A. J. Clark, Rif A. Saurous
TL;DR: This paper presents Tacotron, an end-to-end generative text-to-speech model that synthesizes speech directly from characters, along with several key techniques that make the sequence-to-sequence framework perform well for this challenging task.
Proceedings Article
Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices
TL;DR: Describes further optimizations of LSTM-RNN-based SPSS for deployment on mobile devices: weight quantization, multi-frame inference, and robust inference using an ε-contaminated Gaussian loss function.