Dong Yu
Researcher at Tencent
Publications - 389
Citations - 45733
Dong Yu is an academic researcher from Tencent. The author has contributed to research in topics: Artificial neural network & Word error rate. The author has an h-index of 72 and has co-authored 339 publications receiving 39098 citations. Previous affiliations of Dong Yu include Peking University & Microsoft.
Papers
Journal ArticleDOI
Single-channel multi-talker speech recognition with permutation invariant training
TL;DR: In this article, the authors extend permutation invariant training (PIT) by introducing a front-end feature separation module trained with the minimum mean square error (MSE) criterion and a back-end recognition module trained with the minimum cross entropy (CE) criterion.
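As an illustration of the permutation invariant idea named in this summary, the following is a minimal sketch of an utterance-level PIT loss with an MSE criterion. It is not the authors' implementation: the function names `mse` and `pit_mse`, the plain-list feature representation, and the toy values are assumptions made for this example only.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_mse(estimates, targets):
    """Hypothetical helper: try every assignment of targets to estimates
    and keep the one with the lowest average MSE.

    Returns (best_loss, best_perm), where best_perm[i] is the index of
    the target assigned to estimate i.
    """
    n = len(estimates)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy example: the two output streams come out in swapped order.
estimates = [[0.9, 0.1, 0.8], [0.0, 1.0, 0.1]]
targets = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
loss, perm = pit_mse(estimates, targets)
```

Because the loss is taken over the best assignment, the ordering of the output streams does not need to match the ordering of the reference speakers. The exhaustive search over permutations is only practical for a small number of speakers.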
Proceedings ArticleDOI
Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning
TL;DR: This work proposes an integrated architecture for learning spatial features directly from multi-channel speech waveforms within an end-to-end speech separation framework using a 2D convolution layer. A conv2d kernel is designed to compute the inter-channel convolution differences (ICDs), which are expected to provide the spatial cues that help to distinguish the directional sources.
Patent
Incrementally regulated discriminative margins in mce training for speech recognition
TL;DR: In this paper, a method and apparatus for training an acoustic model are disclosed, where a training corpus is accessed and converted into an initial acoustic model, and scores are calculated for a correct class and competitive classes, respectively, for each token given the acoustic model.
Journal ArticleDOI
Sequential Labeling Using Deep-Structured Conditional Random Fields
Dong Yu,Shizhen Wang,Li Deng +2 more
TL;DR: The experimental results demonstrate that the deep-structured CRF achieves word labeling accuracies that are significantly higher than the best results reported on these tasks using the same labeled training set.
Proceedings ArticleDOI
Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition
TL;DR: The earlier version of the hidden trajectory model (HTM) for speech dynamics, which predicts the "static" cepstra as the observed acoustic feature, is generalized to one that predicts joint static and delta cepstra, enabling efficient computation of the joint likelihood for both static and delta cepstral sequences as the acoustic features given the model.
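For readers unfamiliar with delta cepstra, the sketch below computes them with one common definition, the linear-regression formula used in many speech front ends, and appends them to the static features. This is an assumption for illustration: the paper's own definition of differential cepstra may differ, and the function names `delta` and `joint_features`, the window size, and the edge handling are hypothetical choices.

```python
def delta(frames, window=2):
    """Regression-based delta features (assumed definition):

        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)

    frames is a list of per-frame coefficient lists. Frames beyond the
    edges are replaced by the first or last frame.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            num = 0.0
            for n in range(1, window + 1):
                nxt = frames[min(t + n, num_frames - 1)][k]
                prv = frames[max(t - n, 0)][k]
                num += n * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out


def joint_features(frames, window=2):
    """Concatenate static and delta coefficients frame by frame."""
    return [s + d for s, d in zip(frames, delta(frames, window))]


# Toy example: one coefficient per frame, rising linearly over time.
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
deltas = delta(frames)
joint = joint_features(frames)
```

For a linearly rising coefficient, the interior delta equals the slope, while the replicated edge frames give smaller values at the boundaries.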