Proceedings Article

Speaker adaptation through vector quantization

TLDR
Speaker adaptation algorithms based on vector quantization (VQ) codebooks are proposed to improve speaker-independent recognition; VQ itself drastically reduces computation and memory requirements.
Abstract
Vector quantization (VQ) is a technique that drastically reduces computation and memory requirements. In this paper, speaker adaptation algorithms based on VQ are proposed in order to improve speaker-independent recognition. The algorithms use the VQ codebooks of a reference speaker and an input speaker. Speaker adaptation is performed by substituting vectors in the reference speaker's codebook for vectors in the input speaker's codebook, or vice versa. To confirm the effectiveness of these algorithms, word recognition experiments are carried out using the IBM office correspondence task uttered by 11 speakers. Each speaker uttered 1174 words in total, covering 422 distinct words. With speaker adaptation, the average word recognition rate using a different speaker's reference is 80.9%, and the rate at which the correct word falls within the top two choices is 92.0%.
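
The codebook substitution described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how codewords of the two speakers are put into correspondence, so the `mapping` table below is a hypothetical, hand-written assumption, as are the function names, the squared Euclidean distance, and the toy 2-D vectors. The sketch quantizes each input-speaker frame against the input speaker's codebook and then substitutes the corresponding reference-speaker codeword.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frame, codebook):
    """Return the index of the codeword in `codebook` nearest to `frame`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(frame, codebook[i]))


def adapt_frames(frames, input_codebook, reference_codebook, mapping):
    """Replace each input-speaker frame with a reference-speaker codeword.

    `mapping` (hypothetical) sends an input-codebook index to a
    reference-codebook index; how it is obtained is outside this sketch.
    """
    adapted = []
    for frame in frames:
        idx = quantize(frame, input_codebook)
        adapted.append(reference_codebook[mapping[idx]])
    return adapted


# Toy data, invented for illustration only.
input_codebook = [(0.0, 0.0), (4.0, 4.0)]
reference_codebook = [(1.0, 1.0), (5.0, 5.0)]
mapping = {0: 0, 1: 1}  # assumed correspondence between codewords
frames = [(0.2, -0.1), (3.8, 4.1)]

adapted = adapt_frames(frames, input_codebook, reference_codebook, mapping)
print(adapted)
```

Because every frame is reduced to a codeword index, the substitution is a table lookup per frame, which is consistent with the abstract's point that VQ keeps computation and memory small. The "vice versa" direction would swap the roles of the two codebooks and invert the mapping.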

