Topic

Speaker recognition

About: Speaker recognition is a research topic. Over its lifetime, 14,990 publications have appeared within this topic, receiving 310,061 citations.


Papers
Journal ArticleDOI
TL;DR: This paper extracts information about cell phones from their speech recordings using mel-frequency cepstrum coefficients and identifies their brands and models using vector quantization and support vector machine classifiers.
Abstract: Speech signals convey various pieces of information such as the identity of its speaker, the language spoken, and the linguistic information about the text being spoken, etc. In this paper, we extract information about the cell phones from their speech records by using mel-frequency cepstrum coefficients and identify their brands and models. Closed-set identification rates of 92.56% and 96.42% have been obtained on a set of 14 different cell phones in the experiments using vector quantization and support vector machine classifiers, respectively.

70 citations
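
As a rough illustration of the vector quantization step mentioned in the abstract above, the following minimal sketch performs closed-set identification by picking the class whose codebook gives the lowest average distortion. It is not the authors' implementation: the 2-D vectors stand in for MFCC frames, the labels `phone_A` and `phone_B` are hypothetical, and the codebooks are assumed to have been trained beforehand (for example with k-means).

```python
def nearest_distortion(vec, codebook):
    """Squared Euclidean distance from vec to its closest codeword."""
    return min(sum((a - b) ** 2 for a, b in zip(vec, cw)) for cw in codebook)


def average_distortion(frames, codebook):
    """Mean quantization error of all frames against one codebook."""
    return sum(nearest_distortion(f, codebook) for f in frames) / len(frames)


def identify(frames, codebooks):
    """Closed-set decision: label of the codebook with the lowest distortion."""
    return min(codebooks, key=lambda label: average_distortion(frames, codebooks[label]))


# Toy 2-D vectors standing in for MFCC frames; labels are hypothetical.
codebooks = {
    "phone_A": [(0.0, 0.0), (1.0, 1.0)],
    "phone_B": [(5.0, 5.0), (6.0, 4.0)],
}
test_frames = [(0.9, 1.2), (0.1, -0.2), (1.1, 0.8)]
predicted = identify(test_frames, codebooks)
print(predicted)
```

Because the task is closed-set, the decision always returns one of the enrolled labels; an open-set variant would additionally need a rejection threshold on the distortion.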

Journal ArticleDOI
TL;DR: Perception by normal-hearing subjects of the gender and identity of a talker as a function of the number of channels in spectrally reduced speech was examined. Results showed that gender and talker identification were better for the sine-wave processor, and that performance through the noise-band processor was more sensitive to the number of channels.
Abstract: Considerable research on speech intelligibility for cochlear-implant users has been conducted using acoustic simulations with normal-hearing subjects. However, some relevant topics about perception through cochlear implants remain scantly explored. The present study examined the perception by normal-hearing subjects of gender and identity of a talker as a function of the number of channels in spectrally reduced speech. Two simulation strategies were compared. They were implemented by two different processors that presented signals as either the sum of sine waves at the center of the channels or as the sum of noise bands. In Experiment 1, 15 subjects determined the gender of 40 talkers (20 males + 20 females) from a natural utterance processed through 3, 4, 5, 6, 8, 10, 12, and 16 channels with both processors. In Experiment 2, 56 subjects matched a natural sentence uttered by 10 talkers with the corresponding simulation replicas processed through 3, 4, 8, and 16 channels for each processor. In Experiment 3, 72 subjects performed the same task but different sentences were used for natural and processed stimuli. A control Experiment 4 was conducted to equate the processing steps between the two simulation strategies. Results showed that gender and talker identification was better for the sine-wave processor, and that performance through the noise-band processor was more sensitive to the number of channels. Implications and possible explanations for the superiority of sine-wave simulations are discussed.

70 citations

Proceedings ArticleDOI
14 Sep 2014
TL;DR: This paper applies a convolutional neural network trained for automatic speech recognition (ASR) to the task of speaker identification (SID), and in the CNN/i-vector front end, the sufficient statistics are collected based on the outputs of the CNN as opposed to the traditional universal background model (UBM).
Abstract: This paper applies a convolutional neural network (CNN) trained for automatic speech recognition (ASR) to the task of speaker identification (SID). In the CNN/i-vector front end, the sufficient statistics are collected based on the outputs of the CNN as opposed to the traditional universal background model (UBM). Evaluated on heavily degraded speech data, the CNN/i-vector front end provides performance comparable to the UBM/i-vector baseline. The combination of these approaches, however, is shown to provide improvements of 26% in miss rate to considerably outperform the fusion of two different features in the traditional UBM/i-vectors approach. An analysis of the language- and channel-dependency of the CNN/i-vector approach is also provided to highlight future research directions. Index Terms: Deep neural networks, Convolutional neural networks, Speaker recognition, i-vectors, noisy speech

70 citations
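
To make the idea of collecting sufficient statistics from network outputs more concrete, here is a minimal sketch of the zeroth- and first-order statistics commonly used in i-vector extraction, with the per-frame class posteriors supplied from outside. In a UBM front end those posteriors would come from a Gaussian mixture model; in the setting described above they would be taken from the CNN outputs. The function name `sufficient_stats` and the toy inputs are assumptions for illustration, not the paper's code.

```python
def sufficient_stats(frames, posteriors):
    """Accumulate zeroth-order (N_c) and first-order (F_c) statistics.

    frames:     list of feature vectors x_t
    posteriors: list of per-frame class posteriors gamma_t(c),
                e.g. GMM responsibilities or network outputs
    """
    num_classes = len(posteriors[0])
    dim = len(frames[0])
    zeroth = [0.0] * num_classes
    first = [[0.0] * dim for _ in range(num_classes)]
    for x, gamma in zip(frames, posteriors):
        for c, g in enumerate(gamma):
            zeroth[c] += g                 # N_c = sum_t gamma_t(c)
            for d in range(dim):
                first[c][d] += g * x[d]    # F_c = sum_t gamma_t(c) * x_t
    return zeroth, first


# Toy example: two 2-D frames and two classes.
frames = [(1.0, 2.0), (3.0, 4.0)]
posteriors = [(1.0, 0.0), (0.5, 0.5)]
zeroth, first = sufficient_stats(frames, posteriors)
print(zeroth, first)
```

The accumulation itself does not depend on where the posteriors come from, which is why swapping the UBM for a network only changes the `posteriors` input in this sketch.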

Journal ArticleDOI
TL;DR: This work combines the decisions of two classifiers as an alternative means of improving the performance of a speaker recognition system in adverse environments. It shows that some information is not captured by the popular mel-frequency cepstral coefficients (MFCC), and that parametric feature sets (PFS) can add further information for improved performance.

70 citations
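
One simple way to combine two classifiers is score-level fusion, sketched below. The summary above does not state which combination rule the paper uses, so the weighted sum, the `weight` parameter, and all scores and speaker labels here are assumptions chosen purely for illustration.

```python
def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted sum of per-speaker scores from two classifiers."""
    return {
        spk: weight * scores_a[spk] + (1.0 - weight) * scores_b[spk]
        for spk in scores_a
    }


def decide(scores):
    """Return the speaker label with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical normalized scores from an MFCC-based and a PFS-based system.
mfcc_scores = {"spk1": 0.40, "spk2": 0.45, "spk3": 0.15}
pfs_scores = {"spk1": 0.70, "spk2": 0.20, "spk3": 0.10}

fused = fuse_scores(mfcc_scores, pfs_scores)
winner = decide(fused)
print(winner)
```

In this toy case the MFCC-based scores alone would favor `spk2`, while the fused scores favor `spk1`, which shows how a second feature set can change the decision.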

Proceedings ArticleDOI
22 May 2011
TL;DR: An audio/video database, especially built for the speaker diarization task, based on different video genres, is described, which highlights the difficulties encountered in this context, mainly linked to the database heterogeneity.
Abstract: In the last ten years, the internet and its applications have changed significantly, mainly thanks to the rise in available personal resources. Concerning multimedia, the most impressive evolution is the continuously growing success of video-sharing websites. But with this success comes the difficulty of efficiently searching, indexing, and accessing relevant information about these documents. Speaker diarization is an important task in the overall information retrieval process. This paper describes an audio/video database, especially built for the speaker diarization task, based on different video genres. Through some preliminary experiments, it highlights the difficulties encountered in this context, mainly linked to the database heterogeneity.

69 citations


Network Information
Related Topics (5)
Feature vector: 48.8K papers, 954.4K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 82% related
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Signal processing: 73.4K papers, 983.5K citations, 81% related
Decoding methods: 65.7K papers, 900K citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    165
2022    468
2021    283
2020    475
2019    484
2018    420