Topic

Speaker recognition

About: Speaker recognition is a research topic. Over its lifetime, 14,990 publications have appeared on this topic, receiving 310,061 citations.


Papers
Proceedings Article
John S. Bridle, Stephen Cox
01 Oct 1990
TL;DR: The paper outlines a method of training a neural network to "tune in" its speaker parameters to a particular speaker, based on a trick for converting a supervised network to an unsupervised mode. The results indicate an improvement over speaker-independent performance and, for unlabelled data, performance close to that achieved on labelled data.
Abstract: A particular form of neural network is described, which has terminals for acoustic patterns, class labels and speaker parameters. A method of training this network to "tune in" the speaker parameters to a particular speaker is outlined, based on a trick for converting a supervised network to an unsupervised mode. We describe experiments using this approach in isolated word recognition based on whole-word hidden Markov models. The results indicate an improvement over speaker-independent performance and, for unlabelled data, a performance close to that achieved on labelled data.

65 citations

Patent
TL;DR: An apparatus identifies the speech signal of an unknown speaker as one of a finite number of speakers. Each speaker is modeled and recognized with any example of their speech, and the output is a list of scores that measure how similar the input speaker is to each speaker whose model is stored in the system.
Abstract: An apparatus operates to identify the speech signal of an unknown speaker as one of a finite number of speakers. Each speaker is modeled and recognized with any example of their speech. The input to the system is analog speech and the output is a list of scores that measure how similar the input speaker is to each of the speakers whose models are stored in the system. The system includes front end processing means which is responsive to the speech signal to provide digitized samples of the speech signal at an output which are stored in a memory. The stored digitized samples are then retrieved and divided into frames. The frames are processed to provide a series of speech parameters indicative of the nature of the speech content in each of the frames. The processor for producing the speech parameters is coupled to either a speaker modeling means, whereby a model for each speaker is provided and consequently stored, or a speaker recognition mode, whereby the speech parameters are again processed with current parameters and compared with the stored parameters during each speech frame. The comparison is accomplished over a predetermined number of frames whereby a favorable comparison is indicative of a known speaker for which a model is stored.

65 citations
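
The frame-based flow in the patent abstract above (digitized samples, division into frames, per-frame speech parameters, comparison against stored speaker models over a fixed number of frames, and a list of similarity scores) can be illustrated with a minimal sketch. The abstract does not say which speech parameters or which comparison measure are used, so everything specific here is an assumption for illustration: log energy and zero-crossing rate as the parameters, a mean parameter vector as the speaker model, negative average Euclidean distance as the score, and all function names.

```python
import math

def split_into_frames(samples, frame_len):
    """Divide digitized samples into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def frame_parameters(frame):
    """Toy per-frame parameters (assumed, not from the patent):
    log energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (math.log(energy + 1e-12), crossings / (len(frame) - 1))

def build_model(samples, frame_len):
    """Modeling mode: store the mean parameter vector over all frames."""
    params = [frame_parameters(f) for f in split_into_frames(samples, frame_len)]
    return tuple(sum(p[k] for p in params) / len(params) for k in range(2))

def score_speakers(samples, models, frame_len, num_frames):
    """Recognition mode: compare the first num_frames frames with every
    stored model and return {speaker: score}; higher means more similar."""
    frames = split_into_frames(samples, frame_len)[:num_frames]
    params = [frame_parameters(f) for f in frames]
    return {name: -sum(math.dist(p, model) for p in params) / len(params)
            for name, model in models.items()}

def tone(freq, n=800, rate=8000):
    """Synthetic stand-in for a speaker's signal (hypothetical test data)."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

# Enroll two "speakers", then score an unknown input against both models.
models = {"a": build_model(tone(200), 80), "b": build_model(tone(1500), 80)}
scores = score_speakers(tone(210), models, frame_len=80, num_frames=5)
best_match = max(scores, key=scores.get)
```

A real system would use richer spectral parameters and statistical speaker models; the sketch only mirrors the modeling/recognition split and the per-frame comparison that the abstract describes.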

Patent
Tomohiro Koganei
26 Sep 2013
TL;DR: A speech recognition apparatus acquires speech uttered by a user and the result of recognition performed on it. When the result includes a keyword and a selection command, the apparatus extracts the selection candidates that include the keyword. When more than one candidate is extracted, it switches from a first selection mode (selection among all selectable information items) to a second selection mode (selection among the candidates) and changes the display manner accordingly.
Abstract: A speech recognition apparatus includes: a speech acquisition unit which acquires speech uttered by a user; a recognition result acquisition unit which acquires a result of recognition performed on the acquired speech; an extraction unit which, when the recognition result includes a keyword and a selection command that is used for selecting one of selectable information items, extracts a selection candidate that includes the keyword; a selection mode switching unit which, when more than one selection candidate is extracted, switches a selection mode from a first selection mode that allows selection among the selectable information items to a second selection mode that allows selection among the selection candidates; a display control unit which changes a display manner of the display information, according to the second selection mode switched from the first selection mode; and a selection unit which selects one of the selection candidates, according to an entry from the user.

65 citations

Proceedings Article
06 Apr 2003
TL;DR: A new audio segmentation algorithm that is both accurate and uses fewer computational resources than other approaches is developed, along with a speaker clustering module based on a modified BIC algorithm that performs substantially better than the standard symmetric Kullback-Leibler (KL2) distance and is much faster than the full BIC.
Abstract: The paper describes our work on the development of an audio segmentation, classification and clustering system applied to a broadcast news task for the European Portuguese language. We developed a new algorithm for audio segmentation that is both accurate and uses fewer computational resources than other approaches. Our speaker clustering module uses a modified BIC (Bayesian information criterion) algorithm which performs substantially better than the standard symmetric Kullback-Leibler (KL2) distance and is much faster than the full BIC. Finally, we developed a scheme for tagging certain speaker clusters (anchors) using trained cluster models. A series of tests was conducted showing the advantage of the new algorithms. This system is part of a prototype system that processes the main news show of the national Portuguese broadcaster daily.

64 citations
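
For readers unfamiliar with the two baselines named in the abstract above, the following is a minimal sketch of the textbook symmetric Kullback-Leibler (KL2) distance and the standard delta-BIC merge test, each with a single Gaussian per cluster. It is not the paper's modified BIC algorithm, whose details the abstract does not give. One-dimensional features, the function names, and the penalty weight `lam` are assumptions made to keep the example short.

```python
import math

def mean_var(xs):
    """Maximum-likelihood mean and variance, with a small variance floor."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, max(v, 1e-12)

def kl2(xs, ys):
    """Symmetric KL distance, KL(p||q) + KL(q||p), between 1-D Gaussians
    fitted to the two clusters; the log-variance terms cancel."""
    m1, v1 = mean_var(xs)
    m2, v2 = mean_var(ys)
    d2 = (m1 - m2) ** 2
    return (v1 + d2) / (2 * v2) + (v2 + d2) / (2 * v1) - 1.0

def delta_bic(xs, ys, lam=1.0):
    """Standard delta-BIC for one pooled Gaussian versus two separate ones.
    Positive: keep the clusters apart. Negative: merging is preferred."""
    n1, n2 = len(xs), len(ys)
    n = n1 + n2
    _, v = mean_var(xs + ys)
    _, v1 = mean_var(xs)
    _, v2 = mean_var(ys)
    gain = 0.5 * (n * math.log(v) - n1 * math.log(v1) - n2 * math.log(v2))
    # A 1-D Gaussian has 2 free parameters (mean and variance).
    penalty = lam * 0.5 * 2 * math.log(n)
    return gain - penalty

# Hypothetical clusters: two from the same source, one clearly shifted.
same_a = [0.0, 0.1, -0.1, 0.05, -0.05] * 10
same_b = list(same_a)
other = [x + 5.0 for x in same_a]

merge_same = delta_bic(same_a, same_b) < 0    # merge is preferred
merge_other = delta_bic(same_a, other) < 0    # clusters stay separate
```

The full BIC test refits Gaussians for every candidate pair, which is part of why cheaper distances such as KL2 are commonly used as a first pass in speaker clustering.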

Journal Article
TL;DR: Experimental results show that inclusion of lip motion modality provides further performance gains over those which are obtained by fusion of audio and lip texture alone, in both speaker identification and isolated word recognition scenarios.

64 citations


Network Information
Related Topics (5)
Feature vector: 48.8K papers, 954.4K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 82% related
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Signal processing: 73.4K papers, 983.5K citations, 81% related
Decoding methods: 65.7K papers, 900K citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year  Papers
2023  165
2022  468
2021  283
2020  475
2019  484
2018  420