Patent

Method and apparatus for speaker identification using mixture discriminant analysis to develop speaker models

TLDR
A speaker identification system constructs speaker models using a discriminant analysis technique in which the data in each class is modeled by Gaussian mixtures. Feature vectors computed from the utterance are transformed, and the likelihood scores of the transformed vectors are computed using speaker models trained with mixture discriminant analysis.
Abstract
A speaker identification system is provided that constructs speaker models using a discriminant analysis technique in which the data in each class is modeled by Gaussian mixtures. The speaker identification method and apparatus determine the identity of a speaker, as one of a small group, based on a sentence-length password utterance. A speaker's utterance is received, and a sequence of a first set of feature vectors is computed based on the received utterance. The first set of feature vectors is then transformed into a second set of feature vectors using transformations specific to a particular segmentation unit, and likelihood scores of the second set of feature vectors are computed using speaker models trained using mixture discriminant analysis. The likelihood scores are then combined to determine an utterance score, and the speaker's identity is validated based on the utterance score. The speaker identification method and apparatus also include training and enrollment phases. In the enrollment phase, the speaker's password utterance is received multiple times. A transcription of the password utterance as a sequence of phones is obtained, and the phone string is stored in a database containing phone strings of other speakers in the group. In the training phase, the first set of feature vectors is extracted from each password utterance, and the phone boundaries for each phone in the password transcription are obtained using a speaker-independent phone recognizer. A mixture model is developed for each phone of a given speaker's password. Then, using the feature vectors from the password utterances of all of the speakers in the group, transformation parameters and transformed models are generated for each phone and speaker, using mixture discriminant analysis.
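The scoring path described in the abstract (transform the features per segmentation unit, score them against Gaussian mixture speaker models, combine the scores into an utterance score) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the utterance has already been segmented into phones for each candidate speaker, that each phone model is a diagonal-covariance Gaussian mixture, that the per-phone transformation is a plain matrix, and that frame log-likelihoods are combined by averaging. All names (`diag_gmm_loglik`, `utterance_score`, `identify`, the `"transform"` and `"gmm"` keys) are hypothetical, and training the transformations and models with mixture discriminant analysis is not shown.

```python
import numpy as np


def diag_gmm_loglik(x, weights, means, variances):
    """Per-frame log-likelihood of x (T, D) under a diagonal-covariance GMM.

    weights: (K,), means: (K, D), variances: (K, D).
    """
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                             # (T, K)
    log_weighted = log_comp + np.log(weights)
    # Numerically stable log-sum-exp over the mixture components.
    m = log_weighted.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_weighted - m).sum(axis=1, keepdims=True))).ravel()


def utterance_score(segments, speaker_model):
    """Average frame log-likelihood over all phone segments (assumed combination rule).

    segments: list of (phone, frames) pairs, frames shaped (T, D).
    speaker_model: phone -> {"transform": (D_out, D) matrix,
                             "gmm": (weights, means, variances)}.
    """
    total, count = 0.0, 0
    for phone, frames in segments:
        unit = speaker_model[phone]
        # "Second set" of feature vectors: phone-specific linear transformation.
        z = frames @ unit["transform"].T
        ll = diag_gmm_loglik(z, *unit["gmm"])
        total += ll.sum()
        count += ll.size
    return total / count


def identify(segments_by_speaker, models):
    """Return the best-scoring speaker and the score of every candidate."""
    scores = {
        spk: utterance_score(segments_by_speaker[spk], models[spk])
        for spk in models
    }
    return max(scores, key=scores.get), scores
```

Averaging over frames keeps scores comparable between passwords of different lengths. The abstract says only that the likelihood scores are combined, so any other combination rule could be substituted in `utterance_score`.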


References
Journal Article

Discriminant Analysis by Gaussian Mixtures

TL;DR: This paper fits Gaussian mixtures to each class to facilitate effective classification in non-normal settings, especially when the classes are clustered.
Journal Article

Flexible Discriminant Analysis by Optimal Scoring

TL;DR: Nonparametric versions of discriminant analysis are obtained by replacing linear regression by any nonparametric regression method so that any multiresponse regression technique can be postprocessed to improve its classification performance.
Patent

Sequential, nonparametric speech recognition and speaker identification

TL;DR: A speech sample is received and speech recognition is performed on it to produce recognition results, which are evaluated in view of the training data and the identification of the speech elements to which the portions of training data are related.
Patent

Signal pattern recognition apparatus comprising parameter training controller for training feature conversion parameters and discriminant functions

TL;DR: In a signal pattern recognition apparatus, a plurality of feature transformation sections respectively transform an input signal pattern into vectors in feature spaces corresponding to predetermined classes, using a predetermined transformation parameter for each class so as to emphasize the features of that class.
Proceedings Article

Sub-word unit talker verification using hidden Markov models

TL;DR: A talker verification system based on characterizing talker utterances as sequences of subword units represented by hidden Markov models (HMMs) was implemented and tested; the results confirm that excellent verification performance can be obtained using HMMs of subword units.