Vector-quantization-based speech recognition and speaker recognition techniques
S. Furui
- pp 954-958
Abstract:
The author reviews major methods of applying the vector quantization (VQ) technique to speech and speaker recognition. These include speech recognition based on the combination of VQ and the DTW/HMM (dynamic time warping/hidden Markov model) technique, VQ-distortion-based recognition, learning VQ algorithms, speaker adaptation by VQ-codebook mapping, and VQ-distortion-based speaker recognition. It is concluded that not only has the VQ technique reduced the amount of computation and storage, but it has also created new ideas for solving various problems in speech/speaker recognition.
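As a rough illustration of the VQ-distortion-based speaker recognition mentioned in the abstract, the sketch below trains one codebook per speaker and assigns a test utterance to the speaker whose codebook gives the lowest average quantization distortion. It is not taken from the paper: the plain k-means training (a stand-in for whatever codebook design the reviewed systems use), the squared Euclidean distortion, the 2-D synthetic "features", and all function names are assumptions made only for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, size, iters=20, seed=0):
    """Build a codebook with plain k-means (assumed stand-in for LBG-style design)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        # Assign every training vector to its nearest codeword.
        cells = [[] for _ in range(size)]
        for v in vectors:
            idx = min(range(size), key=lambda i: sq_dist(v, codebook[i]))
            cells[idx].append(v)
        # Move each codeword to the centroid of its cell (empty cells stay put).
        for i, cell in enumerate(cells):
            if cell:
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def avg_distortion(vectors, codebook):
    """Average distortion of a vector sequence quantized with one codebook."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(test_vectors, codebooks):
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(test_vectors, codebooks[name]))


def toy_features(center, n, rng):
    """Hypothetical stand-in for real spectral features (e.g. cepstral vectors)."""
    return [[rng.gauss(c, 0.3) for c in center] for _ in range(n)]


rng = random.Random(42)
train = {
    "speaker_a": toy_features([0.0, 0.0], 200, rng),
    "speaker_b": toy_features([3.0, 3.0], 200, rng),
}
codebooks = {name: train_codebook(vecs, size=4) for name, vecs in train.items()}

test_utterance = toy_features([3.0, 3.0], 50, rng)
print(identify(test_utterance, codebooks))
```

Because each speaker is represented only by a small codebook rather than by full training data, storage and matching cost scale with the codebook size, which is the kind of saving the abstract attributes to VQ.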
Citations
Proceedings Article
Real-time speaker identification.
TL;DR: To optimize vector quantization (VQ) based speaker identification, the number of test vectors is reduced by pre-quantizing the test sequence prior to matching, and the number of speakers is reduced by pruning out unlikely speakers during the identification process.
Quranic Verse Recitation Feature Extraction using Mel-Frequency Cepstral Coefficient (MFCC)
Noor Jamaliah Ibrahim,Zaidi Razak,Emran Mohd Tamil,Mohd Yamani Idna Idris,Zulkifli Mohd Yusoff +4 more
TL;DR: This paper explores the viability of the Mel-Frequency Cepstral Coefficient (MFCC) technique, one of the most popular feature extraction techniques used in speech recognition, for extracting features from Quranic verse recitation.
Toward Constructing A Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin
TL;DR: The Formosa speech database (ForSDat) is a multilingual speech corpus collected at Chang Gung University and sponsored by the National Science Council of Taiwan. The first version of this corpus, containing speech of 600 speakers of Taiwanese and Mandarin, is finished and ready to be released.
Multiband Approach to Robust Text-Independent Speaker Identification
TL;DR: Experimental results show that both proposed methods achieve better performance than GMM using full-band LPCCs and mel-frequency cepstral coefficients (MFCCs) when speaker identification is evaluated in both clean and noisy environments.
Book Chapter
Learning Intrinsic Video Content Using Levenshtein Distance in Graph Partitioning
Jeffrey Ng,Shaogang Gong +1 more
TL;DR: The graph partitioning method, in particular the Normalised Cut model originally introduced for static image segmentation, is extended to unsupervised clustering of temporal trajectories with fully automated model order selection.
References
Book
Self Organization And Associative Memory
TL;DR: The purpose and nature of biological memory, as well as some aspects of memory, are explained.
Journal Article
Hidden Markov models for speech recognition
TL;DR: The role of statistical methods in this powerful technology as applied to speech recognition is addressed and a range of theoretical and practical issues that are as yet unsolved in terms of their importance and their effect on performance for different system implementations are discussed.
Book
Hidden Markov Models for Speech Recognition
TL;DR: In this article, the authors unified theory with semi-continuous models using hidden Markov models for speech recognition experimental examples, using vector quantization and mixture densities hidden Markov models.
Proceedings Article
Statistical pattern recognition with neural networks: benchmarking studies
TL;DR: Three basic types of neural-like networks, backpropagation network, Boltzmann machine, and learning vector quantization, were applied to two representative artificial statistical pattern recognition tasks, each with varying dimensionality.
Related Papers (5)
Robust text-independent speaker identification using Gaussian mixture speaker models
Douglas A. Reynolds,Richard Rose +1 more