Topic

Speaker recognition

About: Speaker recognition is a research topic. Over its lifetime, 14,990 publications have been published within this topic, receiving 310,061 citations.


Papers
Patent
31 Mar 1998
TL;DR: In this patent, a speech sample is received and speech recognition is performed on it to produce recognition results, which are then evaluated in view of the training data and of the labels identifying the speech elements to which portions of the training data relate.
Abstract: A speech sample is evaluated using a computer. Training data that include samples of speech are received and stored along with identification of speech elements to which portions of the training data are related. A speech sample is received and speech recognition is performed on the speech sample to produce recognition results. Finally, the recognition results are evaluated in view of the training data and the identification of the speech elements to which the portions of the training data are related. The technique may be used to perform tasks such as speech recognition, speaker identification, and language identification.

118 citations
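As an illustration of the general idea in the patent above, the hypothetical sketch below stores training samples labeled by speech element, runs a stubbed recognizer on a test sample, and evaluates the sample against each enrolled speaker's training data for the recognized element. All names, vectors, and the nearest-neighbor scoring are illustrative assumptions, not the patent's actual implementation.

# Hypothetical sketch: evaluate a test sample against stored training data
# labeled by speech element, to decide which enrolled speaker it most
# resembles. Feature extraction and recognition are stubbed out.
import numpy as np

# Toy training store: {speaker: {speech_element: [feature vectors]}}
training_data = {
    "alice": {"hello": [np.array([1.0, 0.2]), np.array([0.9, 0.3])]},
    "bob":   {"hello": [np.array([0.1, 1.1]), np.array([0.2, 0.9])]},
}

def recognize(sample):
    # Stand-in for a speech recognizer; returns the recognized speech element.
    return "hello"

def identify_speaker(sample_features, sample):
    element = recognize(sample)                      # recognition results
    scores = {}
    for speaker, elements in training_data.items():
        refs = elements.get(element, [])
        if not refs:
            continue
        # Evaluate the sample against training data for the same element.
        dists = [np.linalg.norm(sample_features - r) for r in refs]
        scores[speaker] = min(dists)
    return min(scores, key=scores.get)               # closest speaker wins

print(identify_speaker(np.array([0.95, 0.25]), sample="audio placeholder"))  # -> "alice"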

Patent
28 Apr 2004
TL;DR: In this patent, a speech feature vector for the voice associated with the source of a text message is determined and compared to speaker models, and one speaker model is selected as the preferred match for the voice based on the comparison.
Abstract: A method of generating speech from text messages includes determining a speech feature vector for a voice associated with a source of a text message, and comparing the speech feature vector to speaker models. The method also includes selecting one of the speaker models as a preferred match for the voice based on the comparison, and generating speech from the text message based on the selected speaker model.

118 citations
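A minimal, hypothetical sketch of the selection step described above: compute a feature vector for the sender's voice, compare it to stored speaker models with cosine similarity, and hand the best match to a stubbed synthesizer. The model names, vectors, and similarity measure are assumptions, not taken from the patent.

# Hypothetical sketch of speaker-model selection for text-to-speech.
import numpy as np

speaker_models = {                     # enrolled voices -> reference vectors (illustrative)
    "voice_a": np.array([0.9, 0.1, 0.3]),
    "voice_b": np.array([0.2, 0.8, 0.5]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def select_speaker_model(feature_vector):
    # Pick the stored model most similar to the sender's voice vector.
    return max(speaker_models, key=lambda s: cosine(feature_vector, speaker_models[s]))

def synthesize(text, model_name):
    # Placeholder for a text-to-speech engine conditioned on the chosen model.
    return f"<audio for '{text}' in {model_name}>"

sender_vector = np.array([0.85, 0.15, 0.25])   # assumed vector for the sender's voice
print(synthesize("See you at 6", select_speaker_model(sender_vector)))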

Proceedings ArticleDOI
26 Apr 2012
TL;DR: A fast and accurate automatic voice recognition algorithm that uses Mel-frequency cepstral coefficients (MFCCs) to extract features from the voice signal and vector quantization to identify the speaker.
Abstract: This paper presents a fast and accurate automatic voice recognition algorithm. Mel-frequency cepstral coefficients (MFCCs) are used to extract features from the voice signal, and vector quantization is used to identify the speaker. Vector quantization, a technique usually used in data compression, allows a probability function to be modeled by the distribution of different vectors. The results achieved were 100% precision on a database of 10 speakers.

118 citations
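A minimal sketch of the MFCC-plus-vector-quantization approach described above, assuming librosa and SciPy are available: each enrolled speaker gets a codebook trained on their MFCC frames with k-means, and a test utterance is assigned to the speaker whose codebook quantizes it with the lowest average distortion. The file paths, codebook size, and number of coefficients are assumptions, not the paper's exact settings.

# Sketch of VQ-based speaker identification on MFCC features.
import librosa
import numpy as np
from scipy.cluster.vq import kmeans, vq

def mfcc_frames(path, n_mfcc=13):
    # Load audio at its native rate and return MFCC frames as (frames, n_mfcc).
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_codebook(path, codebook_size=32):
    # k-means codebook over one speaker's MFCC frames.
    codebook, _ = kmeans(mfcc_frames(path), codebook_size)
    return codebook

def identify(test_path, codebooks):
    frames = mfcc_frames(test_path)
    # Average quantization distortion per enrolled speaker; lowest wins.
    distortions = {spk: vq(frames, cb)[1].mean() for spk, cb in codebooks.items()}
    return min(distortions, key=distortions.get)

# Hypothetical enrollment and test files.
codebooks = {spk: train_codebook(f"enroll/{spk}.wav") for spk in ["spk01", "spk02"]}
print(identify("test/unknown.wav", codebooks))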

Patent
20 Jan 2014
TL;DR: In this patent, an application detects speaker locations and prompts the user to input rough room boundaries and a desired listener location in the room; from these, optimum speaker locations, frequency assignations, and speaker parameters are determined.
Abstract: In an audio speaker network, setup of speaker location, sound track or channel assignation, and speaker parameters is facilitated by an application detecting speaker locations and prompting a user to input rough room boundaries and a desired listener location in the room. Based on this, optimum speaker locations/frequency assignations/speaker parameters may be determined and output.

117 citations
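The patent does not spell out the optimization itself, but one routine such a setup application might run is deriving per-speaker delay and level trims from the detected speaker positions and the chosen listener location. The sketch below is a hedged illustration under that assumption; the positions, reference choice, and distance-based rules are not taken from the source.

# Hedged illustration: per-speaker delay and level trims from geometry.
import math

SPEED_OF_SOUND = 343.0   # m/s at room temperature

def speaker_parameters(speaker_positions, listener, ref=None):
    dists = {name: math.dist(pos, listener) for name, pos in speaker_positions.items()}
    ref = ref or max(dists.values())          # align everything to the farthest speaker
    params = {}
    for name, d in dists.items():
        delay_ms = (ref - d) / SPEED_OF_SOUND * 1000.0      # extra delay for nearer speakers
        trim_db = 20.0 * math.log10(d / ref)                # attenuate nearer (louder) speakers
        params[name] = {"delay_ms": round(delay_ms, 2), "trim_db": round(trim_db, 2)}
    return params

# Hypothetical detected positions (meters) and user-chosen listener location.
speakers = {"front_left": (0.0, 0.0), "front_right": (3.0, 0.0), "sub": (1.5, -0.5)}
print(speaker_parameters(speakers, listener=(1.5, 2.5)))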

Journal ArticleDOI
TL;DR: A method based on the vowel onset point (VOP) is proposed for locating the end-points of an utterance, and combining the evidence from suprasegmental, source, and spectral features is found to improve the performance of the system significantly.
Abstract: This paper proposes a text-dependent (fixed-text) speaker verification system which uses different types of information for making a decision regarding the identity claim of a speaker. The baseline system uses the dynamic time warping (DTW) technique for matching. Detection of the end-points of an utterance is crucial for the performance of DTW-based template matching. A method based on the vowel onset point (VOP) is proposed for locating the end-points of an utterance. The proposed method for speaker verification uses suprasegmental and source features, besides spectral features. The suprasegmental features such as pitch and duration are extracted using the warping path information in the DTW algorithm. Features of the excitation source, extracted using neural network models, are also used in the text-dependent speaker verification system. Although the suprasegmental and source features individually may not yield good performance, combining the evidence from these features seems to improve the performance of the system significantly. Neural network models are used to combine the evidence from multiple sources of information.

117 citations
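Since the baseline system above relies on dynamic time warping for template matching, the sketch below shows a minimal DTW distance between two feature sequences. The Euclidean local distance and the toy sequences are assumptions; the paper's VOP end-point detection and evidence combination are not reproduced here.

# Minimal DTW distance between two sequences of feature vectors.
import numpy as np

def dtw_distance(a, b):
    """Return the cumulative DTW distance between sequences of feature vectors."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])           # local distance
            cost[i, j] = d + min(cost[i - 1, j],               # insertion
                                 cost[i, j - 1],               # deletion
                                 cost[i - 1, j - 1])           # match
    return cost[n, m]

template = np.array([[0.0], [1.0], [2.0], [1.0]])        # enrolled utterance features (toy)
test     = np.array([[0.1], [0.9], [0.9], [2.1], [1.0]]) # claimed utterance features (toy)
print(dtw_distance(template, test))   # small value -> sequences align well

The warping path itself, which the paper uses to extract duration and pitch alignment information, can be recovered by backtracking through the cost matrix.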


Network Information
Related Topics (5)
Feature vector: 48.8K papers, 954.4K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 82% related
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Signal processing: 73.4K papers, 983.5K citations, 81% related
Decoding methods: 65.7K papers, 900K citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    165
2022    468
2021    283
2020    475
2019    484
2018    420