Zhiying Huang

Researcher at University of Science and Technology of China

Publications -  5
Citations -  40

Zhiying Huang is an academic researcher from the University of Science and Technology of China. The author has contributed to research in topics: Speaker recognition & Engineering. The author has an h-index of 1, co-authored 3 publications receiving 24 citations.

Papers
Proceedings Article

Speaker adaptation OF RNN-BLSTM for speech recognition based on speaker code

TL;DR: This paper studies how to conduct effective speaker code based speaker adaptation on RNN-BLSTM and demonstrates that the speaker code based adaptation method is also a valid adaptation method for RNN-BLSTM.
Proceedings Article

ProsoSpeech: Enhancing Prosody with Quantized Vector Pre-Training in Text-To-Speech

TL;DR: This paper proposes ProsoSpeech, which enhances prosody using quantized latent vectors pre-trained on large-scale unpaired and low-quality text and speech data, and which can generate expressive speech conditioned on the predicted latent prosody vector (LPV).
Journal Article

PolyVoice: Language Models for Speech to Speech Translation

TL;DR: PolyVoice is a language model-based framework for speech-to-speech translation (S2ST) that consists of two language models: a translation language model and a speech synthesis language model.
Proceedings Article

Unsupervised speaker adaptation of BLSTM-RNN for LVCSR based on speaker code

TL;DR: Speaker code based adaptation, combined with a singular value decomposition (SVD) method and an error normalization method that balances the back-propagation errors derived from different layers for speaker codes, shows better recognition performance than i-vector based speaker adaptation of the same dimension.
Proceedings Article

Rapid speaker adaptation based on D-code extracted from BLSTM-RNN in LVCSR

TL;DR: This paper proposes an alternative d-code extraction method to replace the speaker code (SC). The method models speaker information with a BLSTM-RNN, which makes one-pass decoding possible. A speaker clustering approach is also introduced to decrease the target number of the speaker-BLSTM, which accelerates training and improves ASR performance at the same time.