Yan Deng
Researcher at Microsoft
Publications - 22
Citations - 199
Yan Deng is an academic researcher at Microsoft. The author has contributed to research on the topics of computer science and prosody, has an h-index of 5, and has co-authored 19 publications receiving 142 citations. Previous affiliations of Yan Deng include Tsinghua University.
Papers
Proceedings Article
Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS
Mutian He,Yan Deng,Lei He +2 more
TL;DR: The experimental results show that the proposed stepwise monotonic attention method could achieve significant improvements in robustness on out-of-domain scenarios for phoneme-based models, without any regression on the in-domain naturalness test.
Posted Content
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice
Yan Deng,Lei He,Frank K. Soong +2 more
TL;DR: The multi-speaker latent space is investigated to improve neural TTS for adapting the system to new speakers with only several minutes of speech or enhancing a premium voice by utilizing the data from other speakers for richer contextual coverage and better generalization.
Journal Article
Time–Frequency Cepstral Features and Heteroscedastic Linear Discriminant Analysis for Language Recognition
TL;DR: Experiments on NIST 2003 and 2007 LRE evaluation corpora show that TFC is more effective than SDC, and that the GMM-based BDHLDA results in lower equal error rate (EER) and minimum average cost (Cavg) than either TFC or SDC approaches.
Proceedings Article
Speech BERT Embedding for Improving Prosody in Neural TTS
TL;DR: In this article, a speech BERT model was proposed to extract embedded prosody information in speech segments for improving the prosody of synthesized speech in neural text-to-speech (TTS).
Proceedings Article
Automatic language identification using support vector machines and phonetic N-gram
TL;DR: A new normalization method is proposed for language identification using support vector machines (SVM) and phonetic n-grams; it yields a relative reduction in equal error rate (EER) compared with the traditional normalization method.