
Dong Yu

Researcher at Tencent

Publications -  389
Citations -  45733

Dong Yu is an academic researcher at Tencent. He has contributed to research topics including artificial neural networks and word error rate, has an h-index of 72, and has co-authored 339 publications receiving 39,098 citations. Previous affiliations of Dong Yu include Peking University and Microsoft.

Papers
Proceedings ArticleDOI

A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models.

TL;DR: Robustness remains a major challenge for deep-learning acoustic models; speech enhancement, channel normalization, and speaking-rate compensation are important research areas for further improving the accuracy of DNN models.
Proceedings ArticleDOI

Seq2Seq Attentional Siamese Neural Networks for Text-dependent Speaker Verification

TL;DR: Experimental results show that the proposed model outperforms various baseline methods, including the traditional i-vector/PLDA method, multi-enrollment end-to-end speaker verification models, d-vector approaches, and a self-attention model, for text-dependent speaker verification on a Tencent internal voice wake-up dataset.
Patent

Multilingual deep neural network

TL;DR: In this article, various technologies pertaining to a multilingual deep neural network (MDNN) are described, in which the weight parameters of the shared hidden layers are learned during a training phase from raw acoustic features for multiple languages.
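The core idea of a shared-hidden-layer multilingual DNN can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the layer widths, the senone counts per language, and the use of two hidden layers are all assumptions for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hidden layers shared across languages, trained on pooled multilingual
# data (assumption: two layers of width 8; real systems are far larger).
W_shared = [rng.standard_normal((40, 8)), rng.standard_normal((8, 8))]

# One softmax output layer per language (hypothetical senone counts).
W_out = {"en": rng.standard_normal((8, 5)),
         "zh": rng.standard_normal((8, 6))}

def forward(features, lang):
    h = features
    for W in W_shared:                   # language-independent layers
        h = relu(h @ W)
    return softmax(h @ W_out[lang])      # language-specific posteriors

# A batch of one 40-dimensional acoustic feature vector.
probs = forward(rng.standard_normal((1, 40)), "en")
```

The shared layers act as a language-independent feature extractor, so adding a new language only requires training a new output layer on top of them.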
Proceedings ArticleDOI

Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust Asr

TL;DR: The complex ratio mask (CRM) is proposed to estimate the covariance matrix for the beamformer, and a long short-term memory (LSTM) based language model is utilized to rescore hypotheses, which further improves the overall performance.
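Mask-based covariance estimation for a beamformer can be sketched in a few lines. This is a minimal NumPy illustration under assumptions, not the paper's joint-training setup: the mask is applied identically to all channels, and the array shapes are hypothetical.

```python
import numpy as np

def crm_covariance(stft, mask):
    """Estimate per-frequency spatial covariance from a masked STFT.

    stft: complex array of shape (channels, freq, time)
    mask: complex ratio mask of shape (freq, time), shared across channels
    returns: covariance matrices of shape (freq, channels, channels)
    """
    masked = stft * mask[None, :, :]  # select the target component
    # Average outer products across time frames for each frequency bin.
    return np.einsum('cft,dft->fcd', masked, np.conj(masked)) / stft.shape[-1]

rng = np.random.default_rng(0)
C, F, T = 2, 257, 100  # hypothetical: 2 mics, 257 bins, 100 frames
stft = rng.standard_normal((C, F, T)) + 1j * rng.standard_normal((C, F, T))
mask = rng.standard_normal((F, T)) + 1j * rng.standard_normal((F, T))
cov = crm_covariance(stft, mask)
```

The resulting covariance matrices would then feed a beamformer such as MVDR; because the CRM is complex-valued, it can correct phase as well as magnitude, unlike a real-valued ratio mask.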
Proceedings ArticleDOI

Single-channel mixed speech recognition using deep neural networks

TL;DR: This work investigates several different training setups that enable the DNN to generalize to corresponding similar patterns in the test data, and introduces a WFST-based two-talker decoder to work with the trained DNNs.