Dong Yu

Researcher at Tencent

Publications -  389
Citations -  45733

Dong Yu is an academic researcher at Tencent. The author has contributed to research on topics including artificial neural networks and word error rate. The author has an h-index of 72 and has co-authored 339 publications receiving 39098 citations. Previous affiliations of Dong Yu include Peking University and Microsoft.

Papers
Journal Article

Single-channel multi-talker speech recognition with permutation invariant training

TL;DR: In this article, the authors extend permutation invariant training (PIT) by introducing a front-end feature separation module trained with the minimum mean square error (MSE) criterion and a back-end recognition module trained with the minimum cross entropy (CE) criterion.
Proceedings Article

Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning

TL;DR: This work proposes an integrated architecture that learns spatial features directly from multi-channel speech waveforms within an end-to-end speech separation framework using a 2D convolution layer. It also designs a conv2d kernel to compute inter-channel convolution differences (ICDs), which are expected to provide spatial cues that help distinguish directional sources.
Patent

Incrementally regulated discriminative margins in MCE training for speech recognition

TL;DR: In this patent, a method and apparatus for training an acoustic model are disclosed, in which a training corpus is accessed and converted into an initial acoustic model, and scores are calculated for the correct class and for competing classes for each token given the acoustic model.
Journal Article

Sequential Labeling Using Deep-Structured Conditional Random Fields

TL;DR: The experimental results demonstrate that the deep-structured CRF achieves word labeling accuracies that are significantly higher than the best results reported on these tasks using the same labeled training set.
Proceedings Article

Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition

Li Deng, +1 more
TL;DR: The earlier version of the hidden trajectory model (HTM) for speech dynamics, which predicts the "static" cepstra as the observed acoustic feature, is generalized to a joint static/delta-cepstra HTM. The generalized model enables efficient computation of the joint likelihood of the static and delta cepstral sequences, used as the acoustic features, given the model.