Speaker recognition

About: Speaker recognition is a research topic. Over its lifetime, 14,990 publications have been published within this topic, receiving 310,061 citations.

Papers published on a yearly basis (chart, 1969 to 2023)

Papers

Proceedings Article

Generalized Linear Kernels for One-Versus-All Classification: Application to Speaker Recognition

Andrew O. Hatch, Andreas Stolcke

14 May 2006

TL;DR: This paper examines the problem of kernel selection for one-versus-all (OVA) classification of multiclass data with support vector machines (SVMs) and focuses specifically on generalized linear kernels of the form $k(x_1, x_2) = x_1^{T} R x_2$, where $R$ is a positive semidefinite matrix.

Abstract: In this paper we examine the problem of kernel selection for one-versus-all (OVA) classification of multiclass data with support vector machines (SVMs). We focus specifically on the problem of training what we refer to as generalized linear kernels, that is, kernels of the form $k(x_1, x_2) = x_1^{T} R x_2$, where $R$ is a positive semidefinite matrix. Our approach for training $k(x_1, x_2)$ involves first constructing a set of upper bounds on the rates of false positives and false negatives at a given score threshold. Under various conditions, minimizing these bounds leads to the closed-form solution $R = W^{-1}$, where $W$ is the expected within-class covariance matrix of the data. We tested various parameterizations of $R$, including a diagonal parameterization that simply performs per-feature variance normalization, on the 1-conversation training condition of the SRE-2003 and SRE-2004 speaker recognition tasks. In experiments on a state-of-the-art MLLR-SVM speaker recognition system [1], the parameterization $R$ = [see above equation in pdf file], where [see above equation in pdf file] is a smoothed estimate of $W$, achieves relative reductions in the minimum decision cost function (DCF) [2] of up to 22% below the results obtained when $R$ does per-feature variance normalization.

68 citations
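
To make the kernel form in the Hatch and Stolcke abstract above concrete, here is a minimal Python sketch of a generalized linear kernel with a diagonal $R$. It is an illustration under stated assumptions, not the authors' implementation: the diagonal of $R$ is taken to be the inverse of the per-feature within-class variance (a diagonal stand-in for $R = W^{-1}$), class priors are estimated from class counts, and the function names and the `eps` variance floor are hypothetical.

```python
from collections import defaultdict


def within_class_variances(features, labels):
    """Per-feature within-class variance, pooled over classes.

    This is the diagonal of the expected within-class covariance W,
    with class priors estimated as class frequency (an assumption).
    """
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)

    dim = len(features[0])
    squared_dev = [0.0] * dim
    for rows in by_class.values():
        # Class mean for each feature.
        means = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
        # Accumulate squared deviations from the class mean.
        for r in rows:
            for j in range(dim):
                squared_dev[j] += (r[j] - means[j]) ** 2
    return [s / len(features) for s in squared_dev]


def diagonal_kernel(x1, x2, variances, eps=1e-12):
    """k(x1, x2) = x1^T R x2 with R = diag(1 / variance).

    eps is a hypothetical floor that guards against zero variance.
    """
    return sum(a * b / max(v, eps) for a, b, v in zip(x1, x2, variances))


# Toy data: two speakers, two features per utterance.
features = [[1.0, 10.0], [3.0, 14.0], [5.0, 20.0], [7.0, 24.0]]
labels = ["a", "a", "b", "b"]
variances = within_class_variances(features, labels)
score = diagonal_kernel([2.0, 4.0], [3.0, 8.0], variances)
```

With a full (non-diagonal) $R$, the same kernel would need a matrix inverse of the within-class covariance estimate; the diagonal case is shown only because it needs nothing beyond the standard library.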

Journal Article

An improved method for voice pathology detection by means of a HMM-based feature space transformation

Julián D. Arias-Londoño¹, Juan Ignacio Godino-Llorente¹, Nicolás Sáenz-Lechón¹, Víctor Osma-Ruiz¹, Germán Castellanos-Domínguez²

Technical University of Madrid¹, National University of Colombia²

01 Sep 2010-Pattern Recognition

TL;DR: The proposed feature space transformation technique demonstrates a significant improvement of the performance with no addition of new features to the original input space and it is expected that this technique could provide good results in other areas such as speaker verification and/or identification.

68 citations

Patent

Method and apparatus for word counting in continuous speech recognition useful for reliable barge-in and early end of speech detection

Anand Rangaswamy Setlur¹, Rafid Antoon Aurora Sukkar¹

Alcatel-Lucent¹

31 Jul 1998

TL;DR: This patent uses the rapidly available speech recognition results to provide intelligent barge-in for voice-response systems and to count words, outputting sub-sequences so that tasks related to the entire word sequence can be parallelized and/or pipelined to increase processing throughput.

Abstract: Speech recognition technology has attained maturity such that the most likely speech recognition result has been reached and is available before an energy-based termination of speech has been made. The present invention uses the rapidly available speech recognition results to provide intelligent barge-in for voice-response systems and to count words, outputting sub-sequences so that tasks related to the entire word sequence can be parallelized and/or pipelined to increase processing throughput.

68 citations

Journal Article

Speech Enhancement and Recognition in Meetings With an Audio–Visual Sensor Array

H.K. Maganti¹, Daniel Gatica-Perez², Iain McCowan³

University of Ulm¹, École Polytechnique Fédérale de Lausanne², Queensland University of Technology³

01 Nov 2007-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: The accurate speaker tracking provided by the audio-visual sensor array proved beneficial to improve the recognition performance in a microphone array-based speech recognition system, both in terms of enhancement and recognition.

Abstract: This paper addresses the problem of distant speech acquisition in multiparty meetings, using multiple microphones and cameras. Microphone array beamforming techniques present a potential alternative to close-talking microphones by providing speech enhancement through spatial filtering. Beamforming techniques, however, rely on knowledge of the speaker location. In this paper, we present an integrated approach, in which an audio-visual multiperson tracker is used to track active speakers with high accuracy. Speech enhancement is then achieved using microphone array beamforming followed by a novel postfiltering stage. Finally, speech recognition is performed to evaluate the quality of the enhanced speech signal. The approach is evaluated on data recorded in a real meeting room for stationary speaker, moving speaker, and overlapping speech scenarios. The results show that the speech enhancement and recognition performance achieved using our approach is significantly better than that of a single table-top microphone and is comparable to that of a lapel microphone for some of the scenarios. The results also indicate that the audio-visual-based system performs significantly better than an audio-only system, both in terms of enhancement and recognition. This shows that the accurate speaker tracking provided by the audio-visual sensor array improves recognition performance in a microphone array-based speech recognition system.

68 citations

Patent

Individual verification apparatus

Sadakazu Watanabe, Hidenori Shinoda

27 Jan 1983

TL;DR: This patent describes an individual verification apparatus consisting of a verification data file (20), a speech input section (10), a data memory (30), a speech recognition unit (40), and a speaker verification unit (50).

Abstract: An individual verification apparatus comprises a verification data file (20), a speech input section (10), a data memory (30), a speech recognition unit (40), and a speaker verification unit (50). Key codes set by customers and the corresponding reference data for individual verification are registered in the verification data file. Speech of the key code spoken by a customer is processed by the speech input section (10) and the result is stored in the data memory (30). The speech recognition unit (40) recognizes the input key code based on the key code data stored in the data memory (30). The speaker verification unit (50) verifies the customer by comparing the key code data with speech reference data of customers having the recognized key code.

68 citations
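
The two-stage flow in the patent abstract above (recognize the spoken key code, then compare the speech only against customers registered under that key code) can be sketched in Python as follows. This is a rough illustration, not the patented apparatus: the feature-vector representation, the Euclidean distance comparison, the `threshold` decision rule, and all names (`verify_customer`, `recognize_key_code`, `data_file`) are assumptions introduced here.

```python
from math import dist


def verify_customer(utterance, recognize_key_code, data_file, threshold):
    """Return the verified customer ID, or None if verification fails.

    utterance: feature vector of the spoken key code (assumed representation)
    recognize_key_code: callable mapping an utterance to a key code string;
        it stands in for the speech recognition unit (40)
    data_file: dict mapping key code -> {customer_id: reference vector};
        it stands in for the verification data file (20)
    threshold: largest accepted distance (an assumed decision rule)
    """
    # Stage 1: recognize which key code was spoken.
    key_code = recognize_key_code(utterance)

    # Stage 2: compare only against customers registered with that key code.
    candidates = data_file.get(key_code, {})
    best_id, best_distance = None, float("inf")
    for customer_id, reference in candidates.items():
        d = dist(utterance, reference)
        if d < best_distance:
            best_id, best_distance = customer_id, d

    return best_id if best_distance <= threshold else None


# Toy usage with a stubbed recognizer that always hears "1234".
data_file = {"1234": {"alice": [0.0, 0.0], "bob": [5.0, 5.0]}}
accepted = verify_customer([0.1, 0.0], lambda u: "1234", data_file, threshold=1.0)
rejected = verify_customer([10.0, 10.0], lambda u: "1234", data_file, threshold=1.0)
```

Restricting the comparison to customers who share the recognized key code keeps the candidate set small, which is the design point the abstract describes.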


Metrics

Papers: 15,632
Citations: 337,766

No. of papers in the topic in previous years
Year	Papers
2023	165
2022	468
2021	283
2020	475
2019	484
2018	420
