Yanmin Qian
Researcher at Shanghai Jiao Tong University
Publications - 222
Citations - 10610
Yanmin Qian is an academic researcher at Shanghai Jiao Tong University. The author has contributed to research on topics including computer science and word error rate, has an h-index of 33, and has co-authored 172 publications receiving 8419 citations. Previous affiliations of Yanmin Qian include Tencent and the Chinese Academy of Sciences.
Papers
Proceedings Article
The Kaldi Speech Recognition Toolkit
Daniel Povey,Arnab Ghoshal,Gilles Boulianne,Lukas Burget,Ondrej Glembek,Nagendra Kumar Goel,Mirko Hannemann,Petr Motlicek,Yanmin Qian,Petr Schwarz,Jan Silovsky,Georg Stemmer,Karel Vesely +12 more
TL;DR: Describes the design of Kaldi, a free, open-source toolkit for speech recognition research that provides a speech recognition system based on weighted finite-state transducers, together with detailed documentation and a comprehensive set of scripts for building complete recognition systems.
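As a rough, hedged illustration of the WFST-style decoding idea that Kaldi builds on (this is not Kaldi's API; the graph, words, and costs below are invented for illustration), a toy Viterbi search over a hand-built decoding graph might look like this in Python:

```python
# Illustrative sketch only: Viterbi search over a tiny hand-built decoding
# graph, to convey the WFST-style decoding idea. This is NOT Kaldi code.
import math

# Toy decoding graph: state -> list of (next_state, word, graph_cost).
GRAPH = {
    0: [(1, "turn", 0.5), (1, "learn", 0.9)],
    1: [(2, "left", 0.3), (2, "right", 0.3)],
    2: [],  # final state
}

# Toy per-frame acoustic costs for each word hypothesis (lower is better).
ACOUSTIC = [
    {"turn": 0.2, "learn": 1.0, "left": 2.0, "right": 2.0},
    {"turn": 2.0, "learn": 2.0, "left": 0.4, "right": 1.1},
]

def viterbi(graph, acoustic, start=0, final=2):
    """Return (best_total_cost, best_word_sequence) via dynamic programming."""
    best = {start: (0.0, [])}                 # state -> (cost so far, words)
    for frame_costs in acoustic:
        new_best = {}
        for state, (cost, words) in best.items():
            for nxt, word, graph_cost in graph[state]:
                total = cost + graph_cost + frame_costs.get(word, math.inf)
                if nxt not in new_best or total < new_best[nxt][0]:
                    new_best[nxt] = (total, words + [word])
        best = new_best
    return best.get(final, (math.inf, []))

print(viterbi(GRAPH, ACOUSTIC))               # ~ (1.4, ['turn', 'left'])
```

The real system composes transducers for HMM topology, context, lexicon, and grammar into one search graph; the toy above only shows the cost-accumulating search over such a graph.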
Journal Article
Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition
TL;DR: The proposed very deep CNNs significantly reduce word error rate (WER) for noise-robust speech recognition and are competitive with a long short-term memory recurrent neural network (LSTM-RNN) acoustic model.
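As a minimal sketch of the idea, not the paper's exact configuration, a "very deep" stack of small 3x3 convolutions over log-Mel features could be written in PyTorch as follows; the channel widths, context-window size, and senone count are placeholder assumptions:

```python
# Minimal sketch (not the paper's exact architecture): a deep stack of small
# 3x3 convolutions over log-Mel features, followed by a senone classifier.
import torch
import torch.nn as nn

class VeryDeepCNNAcousticModel(nn.Module):
    def __init__(self, n_senones=3000, channels=(64, 64, 128, 128, 256, 256)):
        super().__init__()
        layers, in_ch = [], 1
        for out_ch in channels:
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
            in_ch = out_ch
        self.convs = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d((1, 1))   # collapse time x frequency
        self.classifier = nn.Linear(in_ch, n_senones)

    def forward(self, feats):
        # feats: (batch, 1, time, n_mels) log-Mel context windows
        x = self.convs(feats)
        x = self.pool(x).flatten(1)
        return self.classifier(x)                  # per-window senone logits

model = VeryDeepCNNAcousticModel()
dummy = torch.randn(2, 1, 11, 40)                  # 2 windows, 11 frames x 40 mels
print(model(dummy).shape)                          # torch.Size([2, 3000])
```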
Journal Article
Deep feature for text-dependent speaker verification
TL;DR: Experiments showed that deep-feature-based methods obtain significant performance improvements over the traditional baselines, whether the features are applied directly in the GMM-UBM system or used as identity vectors.
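A hedged sketch of the deep-feature scoring idea: average frame-level activations from a bottleneck network into an utterance-level embedding, then compare enrollment and test utterances with cosine similarity. The randomly initialized network below is only a stand-in for a trained model, and the shapes are assumptions, not the paper's system:

```python
# Illustrative sketch of deep-feature scoring for speaker verification.
import torch
import torch.nn as nn
import torch.nn.functional as F

frame_encoder = nn.Sequential(            # stand-in for a trained bottleneck DNN
    nn.Linear(40, 256), nn.ReLU(),
    nn.Linear(256, 128),
)

def utterance_embedding(frames):
    """frames: (num_frames, 40) acoustic features -> (128,) deep feature."""
    with torch.no_grad():
        return frame_encoder(frames).mean(dim=0)

enroll = utterance_embedding(torch.randn(300, 40))   # enrollment utterance
test = utterance_embedding(torch.randn(250, 40))     # test utterance
score = F.cosine_similarity(enroll, test, dim=0)     # accept if score > threshold
print(float(score))
```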
Proceedings Article
Generating exact lattices in the WFST framework
Daniel Povey,Mirko Hannemann,Gilles Boulianne,Lukas Burget,Arnab Ghoshal,Milos Janda,Martin Karafiat,Stefan Kombrink,Petr Motlicek,Yanmin Qian,Korbinian Riedhammer,Karel Vesely,Ngoc Thang Vu +12 more
TL;DR: A lattice generation method that is exact, i.e., it satisfies all the natural properties the authors would want from a lattice of alternative transcriptions of an utterance, and that does not introduce substantial overhead above one-best decoding.
Posted Content
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen,Chengyi Wang,Zhengyang Chen,Yu Wu,Shujie Liu,Zhuo Chen,Jinyu Li,Naoyuki Kanda,Takuya Yoshioka,Xiong Xiao,Jian Wu,Long Zhou,Shuo Ren,Yanmin Qian,Yao Qian,Michael Zeng,Furu Wei +16 more
TL;DR: WavLM is a large-scale self-supervised pre-trained model designed to solve full-stack downstream speech tasks; it achieves state-of-the-art performance on the SUPERB benchmark.
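As a hedged usage sketch, assuming the Hugging Face transformers WavLM integration and the microsoft/wavlm-base-plus checkpoint (neither is part of this summary, and loading the model requires a network download), frozen WavLM representations can be extracted like this and then fed to lightweight downstream heads:

```python
# Sketch: extract frozen WavLM hidden states for a downstream task.
# Assumes the transformers library and the microsoft/wavlm-base-plus checkpoint.
import torch
from transformers import AutoFeatureExtractor, WavLMModel

extractor = AutoFeatureExtractor.from_pretrained("microsoft/wavlm-base-plus")
model = WavLMModel.from_pretrained("microsoft/wavlm-base-plus")
model.eval()

waveform = torch.randn(16000)  # 1 second of placeholder 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# (batch, frames, hidden_size) features for a downstream classifier or decoder.
print(outputs.last_hidden_state.shape)
```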