Dong Yu

Researcher at Tencent

Publications -  389
Citations -  45733

Dong Yu is an academic researcher from Tencent. The author has contributed to research in topics: Artificial neural network & Word error rate. The author has an h-index of 72 and has co-authored 339 publications receiving 39098 citations. Previous affiliations of Dong Yu include Peking University & Microsoft.

Papers
Proceedings Article

Deep bi-directional recurrent networks over spectral windows

TL;DR: This paper applies a windowed (truncated) LSTM to conversational speech transcription and finds that a limited context is adequate: it is not necessary to scan the entire utterance.
Proceedings Article

Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis

TL;DR: The proposed method can significantly improve the naturalness of the conventional HMM-based speech synthesis system using mel-cepstra and is able to model the distribution of the spectral envelopes with better accuracy and generalization ability than the Gaussian mixture model.
Proceedings Article

Language recognition using deep-structured conditional random fields

TL;DR: This paper proposes an unsupervised algorithm that pre-trains the intermediate layers by casting pre-training as a multi-objective programming problem, which aims to minimize the average frame-level conditional entropy while maximizing the state occupation entropy.
Proceedings Article

Deep segmental neural networks for speech recognition

TL;DR: This paper proposes the deep segmental neural network (DSNN), a segmental model that uses DNNs to estimate the acoustic scores of phonemic or sub-phonemic segments with variable lengths. This allows the DSNN to represent each segment as a single unit in which frames are made dependent on each other.
Proceedings Article

A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition

TL;DR: This paper proposes a novel deep architecture, the Tensor Deep Stacking Network (T-DSN), in which multiple blocks are stacked one on top of another and each block uses a bilinear mapping from hidden representations to the output to incorporate higher-order statistics of the input features.