Yanmin Qian
Researcher at Shanghai Jiao Tong University
Publications - 222
Citations - 10610
Yanmin Qian is an academic researcher at Shanghai Jiao Tong University. The author has contributed to research on topics including computer science and word error rate, has an h-index of 33, and has co-authored 172 publications receiving 8419 citations. Previous affiliations of Yanmin Qian include Tencent and the Chinese Academy of Sciences.
Papers
Proceedings Article
The Kaldi Speech Recognition Toolkit
Daniel Povey,Arnab Ghoshal,Gilles Boulianne,Lukas Burget,Ondrej Glembek,Nagendra Kumar Goel,Mirko Hannemann,Petr Motlicek,Yanmin Qian,Petr Schwarz,Jan Silovsky,Georg Stemmer,Karel Vesely +12 more
TL;DR: Describes the design of Kaldi, a free, open-source toolkit for speech recognition research that provides a speech recognition system based on weighted finite-state transducers, together with detailed documentation and a comprehensive set of scripts for building complete recognition systems.
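As a rough, hedged illustration of the WFST-style decoding idea that Kaldi builds on (this is not Kaldi's API; the graph, words, and costs below are invented for illustration), a toy Viterbi search over a hand-built decoding graph might look like this in Python:

```python
# Illustrative sketch only: Viterbi search over a tiny hand-built decoding
# graph, to convey the WFST-style decoding idea. This is NOT Kaldi code.
import math

# Toy decoding graph: state -> list of (next_state, word, graph_cost).
GRAPH = {
    0: [(1, "turn", 0.5), (1, "learn", 0.9)],
    1: [(2, "left", 0.3), (2, "right", 0.3)],
    2: [],  # final state
}

# Toy per-frame acoustic costs for each word hypothesis (lower is better).
ACOUSTIC = [
    {"turn": 0.2, "learn": 1.0, "left": 2.0, "right": 2.0},
    {"turn": 2.0, "learn": 2.0, "left": 0.4, "right": 1.1},
]

def viterbi(graph, acoustic, start=0, final=2):
    """Return (best_total_cost, best_word_sequence) via dynamic programming."""
    best = {start: (0.0, [])}                 # state -> (cost so far, words)
    for frame_costs in acoustic:
        new_best = {}
        for state, (cost, words) in best.items():
            for nxt, word, graph_cost in graph[state]:
                total = cost + graph_cost + frame_costs.get(word, math.inf)
                if nxt not in new_best or total < new_best[nxt][0]:
                    new_best[nxt] = (total, words + [word])
        best = new_best
    return best.get(final, (math.inf, []))

print(viterbi(GRAPH, ACOUSTIC))               # ~ (1.4, ['turn', 'left'])
```

The real system composes transducers for HMM topology, context, lexicon, and grammar into one search graph; the toy above only shows the cost-accumulating search over such a graph.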
Journal Article
Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition
TL;DR: The proposed very deep CNNs significantly reduce word error rate (WER) for noise-robust speech recognition and are competitive with a long short-term memory recurrent neural network (LSTM-RNN) acoustic model.
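As a minimal sketch of the idea, not the paper's exact configuration, a "very deep" stack of small 3x3 convolutions over log-Mel features could be written in PyTorch as follows; the channel widths, context-window size, and senone count are placeholder assumptions:

```python
# Minimal sketch (not the paper's exact architecture): a deep stack of small
# 3x3 convolutions over log-Mel features, followed by a senone classifier.
import torch
import torch.nn as nn

class VeryDeepCNNAcousticModel(nn.Module):
    def __init__(self, n_senones=3000, channels=(64, 64, 128, 128, 256, 256)):
        super().__init__()
        layers, in_ch = [], 1
        for out_ch in channels:
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
            in_ch = out_ch
        self.convs = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d((1, 1))   # collapse time x frequency
        self.classifier = nn.Linear(in_ch, n_senones)

    def forward(self, feats):
        # feats: (batch, 1, time, n_mels) log-Mel context windows
        x = self.convs(feats)
        x = self.pool(x).flatten(1)
        return self.classifier(x)                  # per-window senone logits

model = VeryDeepCNNAcousticModel()
dummy = torch.randn(2, 1, 11, 40)                  # 2 windows, 11 frames x 40 mels
print(model(dummy).shape)                          # torch.Size([2, 3000])
```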
Journal Article
Deep feature for text-dependent speaker verification
TL;DR: Experiments showed that deep-feature-based methods obtain significant performance improvements over the traditional baselines, whether the features are applied directly in the GMM-UBM system or used as identity vectors.
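A hedged sketch of the deep-feature scoring idea: average frame-level activations from a bottleneck network into an utterance-level embedding, then compare enrollment and test utterances with cosine similarity. The randomly initialized network below is only a stand-in for a trained model, and the shapes are assumptions, not the paper's system:

```python
# Illustrative sketch of deep-feature scoring for speaker verification.
import torch
import torch.nn as nn
import torch.nn.functional as F

frame_encoder = nn.Sequential(            # stand-in for a trained bottleneck DNN
    nn.Linear(40, 256), nn.ReLU(),
    nn.Linear(256, 128),
)

def utterance_embedding(frames):
    """frames: (num_frames, 40) acoustic features -> (128,) deep feature."""
    with torch.no_grad():
        return frame_encoder(frames).mean(dim=0)

enroll = utterance_embedding(torch.randn(300, 40))   # enrollment utterance
test = utterance_embedding(torch.randn(250, 40))     # test utterance
score = F.cosine_similarity(enroll, test, dim=0)     # accept if score > threshold
print(float(score))
```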
Proceedings Article
Generating exact lattices in the WFST framework
Daniel Povey,Mirko Hannemann,Gilles Boulianne,Lukas Burget,Arnab Ghoshal,Milos Janda,Martin Karafiat,Stefan Kombrink,Petr Motlicek,Yanmin Qian,Korbinian Riedhammer,Karel Vesely,Ngoc Thang Vu +12 more
TL;DR: A lattice generation method that is exact, i.e., it satisfies all the natural properties the authors would want from a lattice of alternative transcriptions of an utterance, and that does not introduce substantial overhead above one-best decoding.
Posted Content
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen,Chengyi Wang,Zhengyang Chen,Yu Wu,Shujie Liu,Zhuo Chen,Jinyu Li,Naoyuki Kanda,Takuya Yoshioka,Xiong Xiao,Jian Wu,Long Zhou,Shuo Ren,Yanmin Qian,Yao Qian,Michael Zeng,Furu Wei +16 more
TL;DR: WavLM is a large-scale self-supervised pre-trained model designed to solve full-stack downstream speech tasks; it achieves state-of-the-art performance on the SUPERB benchmark.
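As a hedged usage sketch, assuming the Hugging Face transformers WavLM integration and the microsoft/wavlm-base-plus checkpoint (neither is part of this summary, and loading the model requires a network download), frozen WavLM representations can be extracted like this and then fed to lightweight downstream heads:

```python
# Sketch: extract frozen WavLM hidden states for a downstream task.
# Assumes the transformers library and the microsoft/wavlm-base-plus checkpoint.
import torch
from transformers import AutoFeatureExtractor, WavLMModel

extractor = AutoFeatureExtractor.from_pretrained("microsoft/wavlm-base-plus")
model = WavLMModel.from_pretrained("microsoft/wavlm-base-plus")
model.eval()

waveform = torch.randn(16000)  # 1 second of placeholder 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# (batch, frames, hidden_size) features for a downstream classifier or decoder.
print(outputs.last_hidden_state.shape)
```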