Topic
Speaker recognition
About: Speaker recognition is a research topic. Over its lifetime, 14,990 publications have been published within this topic, receiving 310,061 citations.
Papers published on a yearly basis
Papers
TL;DR: This paper extracts information about cell phones from their speech recordings using mel-frequency cepstrum coefficients, and identifies their brands and models with vector quantization and support vector machine classifiers.
Abstract: Speech signals convey various pieces of information, such as the identity of the speaker, the language spoken, and the linguistic content of the text being spoken. In this paper, we extract information about the cell phones from their speech records by using mel-frequency cepstrum coefficients and identify their brands and models. Closed-set identification rates of 92.56% and 96.42% have been obtained on a set of 14 different cell phones in the experiments using vector quantization and support vector machine classifiers, respectively.
70 citations
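The pipeline described above can be sketched as follows. This is an illustrative sketch, not the authors' code: it uses a simplified log-spectrum cepstrum in place of full mel-frequency cepstra, synthetic filtered noise in place of real cell-phone recordings (the two hypothetical "devices" are modeled as different channel filters), and scikit-learn's SVC for the support vector machine classifier.

```python
import numpy as np
from scipy.fftpack import dct
from scipy.signal import butter, lfilter
from sklearn.svm import SVC

def cepstral_features(signal, frame_len=400, hop=160, n_coeffs=13):
    """Frame the signal and take a DCT of the log power spectrum --
    a simplified stand-in for full mel-frequency cepstral coefficients."""
    frames = np.stack([signal[i:i + frame_len] * np.hanning(frame_len)
                       for i in range(0, len(signal) - frame_len, hop)])
    log_spec = np.log(np.abs(np.fft.rfft(frames, axis=1)) ** 2 + 1e-10)
    return dct(log_spec, axis=1, norm='ortho')[:, :n_coeffs]

# Synthetic "recordings": two hypothetical devices modeled as different
# channel responses (low-pass vs. high-pass) applied to a noise source.
rng = np.random.default_rng(0)
b_lo, a_lo = butter(4, 0.2)                      # device A channel
b_hi, a_hi = butter(4, 0.5, btype='high')        # device B channel
X, y = [], []
for _ in range(20):
    src = rng.standard_normal(8000)
    X.append(cepstral_features(lfilter(b_lo, a_lo, src)).mean(axis=0))
    y.append(0)
    X.append(cepstral_features(lfilter(b_hi, a_hi, src)).mean(axis=0))
    y.append(1)
X, y = np.array(X), np.array(y)

clf = SVC().fit(X, y)                            # SVM device classifier
train_acc = clf.score(X, y)
```

The mean cepstral vector per recording plays the role of a crude utterance-level embedding; the paper's vector-quantization variant would instead compare frame-level codebooks.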
TL;DR: This study examined how normal-hearing subjects perceive the gender and identity of a talker as a function of the number of channels in spectrally reduced speech. Results showed that gender and talker identification were better with the sine-wave processor, and that performance with the noise-band processor was more sensitive to the number of channels.
Abstract: Considerable research on speech intelligibility for cochlear-implant users has been conducted using acoustic simulations with normal-hearing subjects. However, some relevant topics about perception through cochlear implants remain scarcely explored. The present study examined the perception by normal-hearing subjects of gender and identity of a talker as a function of the number of channels in spectrally reduced speech. Two simulation strategies were compared. They were implemented by two different processors that presented signals as either the sum of sine waves at the centers of the channels or as the sum of noise bands. In Experiment 1, 15 subjects determined the gender of 40 talkers (20 males + 20 females) from a natural utterance processed through 3, 4, 5, 6, 8, 10, 12, and 16 channels with both processors. In Experiment 2, 56 subjects matched a natural sentence uttered by 10 talkers with the corresponding simulation replicas processed through 3, 4, 8, and 16 channels for each processor. In Experiment 3, 72 subjects performed the same task, but different sentences were used for the natural and processed stimuli. A control Experiment 4 was conducted to equate the processing steps between the two simulation strategies. Results showed that gender and talker identification were better for the sine-wave processor, and that performance through the noise-band processor was more sensitive to the number of channels. Implications and possible explanations for the superiority of sine-wave simulations are discussed.
70 citations
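The two simulation strategies amount to a channel vocoder: split the signal into frequency bands, extract each band's temporal envelope, and use it to modulate either a sine wave at the band center or a band of noise. A minimal sketch follows; the band edges (100-5000 Hz, geometric spacing), the 50 Hz envelope cutoff, and the filter orders are assumptions for illustration, not the paper's exact parameters.

```python
import numpy as np
from scipy.signal import butter, lfilter

def vocode(signal, sr, n_channels, carrier='sine'):
    """Spectrally reduce `signal` to `n_channels` channels using either
    sine-wave or noise-band carriers (cochlear-implant simulation)."""
    nyq = sr / 2
    edges = np.geomspace(100, 5000, n_channels + 1)   # assumed band edges
    b_env, a_env = butter(2, 50 / nyq)                # 50 Hz envelope smoother
    rng = np.random.default_rng(0)
    t = np.arange(len(signal)) / sr
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        b, a = butter(2, [lo / nyq, hi / nyq], btype='band')
        band = lfilter(b, a, signal)                  # analysis band
        env = lfilter(b_env, a_env, np.abs(band))     # rectified envelope
        if carrier == 'sine':
            car = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)  # sine at band center
        else:
            car = lfilter(b, a, rng.standard_normal(len(signal)))  # noise band
        out += env * car
    return out

sr = 16000
speech = np.random.default_rng(1).standard_normal(sr)  # 1 s placeholder signal
sine_out = vocode(speech, sr, 4, carrier='sine')
noise_out = vocode(speech, sr, 4, carrier='noise')
```

With a real utterance in place of the placeholder signal, increasing `n_channels` preserves more spectral detail, which is the independent variable manipulated across the experiments above.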
14 Sep 2014
TL;DR: This paper applies a convolutional neural network (CNN) trained for automatic speech recognition (ASR) to the task of speaker identification (SID). In the CNN/i-vector front end, the sufficient statistics are collected from the outputs of the CNN rather than from the traditional universal background model (UBM).
Abstract: This paper applies a convolutional neural network (CNN) trained for automatic speech recognition (ASR) to the task of speaker identification (SID). In the CNN/i-vector front end, the sufficient statistics are collected based on the outputs of the CNN as opposed to the traditional universal background model (UBM). Evaluated on heavily degraded speech data, the CNN/i-vector front end provides performance comparable to the UBM/i-vector baseline. The combination of these approaches, however, is shown to provide improvements of 26% in miss rate, considerably outperforming the fusion of two different features in the traditional UBM/i-vector approach. An analysis of the language- and channel-dependency of the CNN/i-vector approach is also provided to highlight future research directions.
Index Terms: Deep neural networks, Convolutional neural networks, Speaker recognition, i-vectors, noisy speech
70 citations
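The key substitution in the CNN/i-vector front end is where the frame posteriors come from when accumulating the Baum-Welch sufficient statistics: the CNN's output layer stands in for UBM component posteriors. A minimal sketch of the accumulation step, with random placeholder posteriors in place of real CNN outputs:

```python
import numpy as np

def sufficient_stats(features, posteriors):
    """Zeroth- and first-order Baum-Welch statistics for i-vector extraction.

    features:   (T, D) acoustic features for T frames
    posteriors: (T, C) per-frame posteriors over C classes -- UBM Gaussians
                in the baseline, CNN senone outputs in the CNN/i-vector front end
    """
    N = posteriors.sum(axis=0)     # zeroth order: soft frame counts, shape (C,)
    F = posteriors.T @ features    # first order: weighted feature sums, (C, D)
    return N, F

rng = np.random.default_rng(0)
feats = rng.standard_normal((200, 20))    # 200 frames, 20-dim features
logits = rng.standard_normal((200, 64))   # placeholder for CNN frame outputs
post = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax
N, F = sufficient_stats(feats, post)
```

Because each frame's posteriors sum to one, the zeroth-order counts sum to the number of frames; downstream, the i-vector extractor consumes `N` and `F` identically whether they came from a UBM or a CNN.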
TL;DR: This work combines the decisions of two classifiers as an alternative means of improving the performance of a speaker recognition system in adverse environments. It shows that there is information not captured by the popular mel-frequency cepstral coefficients (MFCC), and that the parametric feature sets (PFS) can contribute this additional information for improved performance.
70 citations
22 May 2011
TL;DR: This paper describes an audio/video database built specifically for the speaker diarization task and based on different video genres. Preliminary experiments highlight the difficulties encountered in this context, mainly linked to the heterogeneity of the database.
Abstract: In the last ten years, the internet and its applications have changed significantly, mainly owing to the growth of available personal resources. In multimedia, the most striking evolution is the continuously growing success of video-sharing websites. With this success, however, come difficulties in efficiently searching, indexing, and accessing relevant information about these documents. Speaker diarization is an important task in the overall information retrieval process. This paper describes an audio/video database, built specifically for the speaker diarization task, based on different video genres. Through some preliminary experiments, it highlights the difficulties encountered in this context, mainly linked to the heterogeneity of the database.
69 citations