Topic

Speaker recognition

About: Speaker recognition is a research topic. Over its lifetime, 14,990 publications have been published within this topic, receiving 310,061 citations.

Papers published on a yearly basis: chart not reproduced; the year axis runs from 1969 to 2023.

Papers

Proceedings Article

Data driven example based continuous speech recognition.

Mathias De Wachter¹, Kris Demuynck¹, Dirk Van Compernolle¹, Patrick Wambacq¹

¹ Katholieke Universiteit Leuven

01 Sep 2003

TL;DR: Large-vocabulary recognition causes an explosion of the search space in DTW-based, example-based recognizers. This paper shows how that problem can be tackled with a data-driven approach that selects appropriate speech examples as candidates for DTW alignment.

Abstract: The dominant acoustic modeling methodology based on Hidden Markov Models is known to have certain weaknesses. Partial solutions to these flaws have been presented, but the fundamental problem remains: compression of the data to a compact HMM discards useful information such as time dependencies and speaker information. In this paper, we look at pure example based recognition as a solution to this problem. By replacing the HMM with the underlying examples, all information in the training data is retained. We show how information about speaker and environment can be used, introducing a new interpretation of adaptation. The basis for the recognizer is the well-known DTW algorithm, which has often been used for small tasks. However, large vocabulary speech recognition introduces new demands, resulting in an explosion of the search space. We show how this problem can be tackled using a data driven approach which selects appropriate speech examples as candidates for DTW-alignment.

66 citations

Natural statistical models for automatic speech recognition

Jeff A. Bilmes, Nelson Morgan

01 Jan 1999

TL;DR: A new method for automatic speech recognition is developed where the natural statistical properties of speech are used to determine the probabilistic model, and can be seen as a general discriminative structure-learning procedure for Bayesian networks.

Abstract: The performance of state-of-the-art speech recognition systems is still far worse than that of humans. This is partly caused by the use of poor statistical models. In a general statistical pattern classification task, the probabilistic models should represent the statistical structure unique to and distinguishing those objects to be classified. In many cases, however, model families are selected without verification of their ability to represent vital discriminative properties. For example, Hidden Markov Models (HMMs) are frequently used in automatic speech recognition systems even though they possess conditional independence properties that might cause inaccuracies when modeling and classifying speech signals. In this work, a new method for automatic speech recognition is developed where the natural statistical properties of speech are used to determine the probabilistic model. Starting from an HMM, new models are created by adding dependencies only if they are not already well captured by the HMM, and only if they increase the model's ability to distinguish one object from another. Based on conditional mutual information, a new measure is developed and used for dependency selection. If dependencies are selected to maximize this measure, then the class posterior probability is better approximated leading to a lower Bayes classification error. The method can be seen as a general discriminative structure-learning procedure for Bayesian networks. In a large-vocabulary isolated-word speech recognition task, test results have shown that the new models can result in an appreciable word-error reduction relative to comparable HMM systems.

66 citations

Patent

Selection of superwords based on criteria relevant to both speech recognition and understanding

Allen Louis Gorin¹, Giuseppe Riccardi¹, Jeremy H. Wright¹

¹ AT&T

23 Oct 1998

TL;DR: This patent covers the selection of superwords based on a criterion relevant to speech recognition and understanding. Superwords are word combinations that are spoken so often that they are recognized as units or should have models to reflect them in the language model.

Abstract: This invention is directed to the selection of superwords based on a criterion relevant to speech recognition and understanding. Superwords are used to refer to those word combinations which are so often spoken that they are recognized as units or should have models to reflect them in the language model. The selected superwords are placed in a lexicon along with selected meaningful phrases. The lexicon is then used by a speech recognizer to improve recognition of input speech utterances for the proper routing of a user's task objectives.

66 citations

Proceedings Article

Filterbank Design for End-to-end Speech Separation

Manuel Pariente¹, Samuele Cornell², Antoine Deleforge¹, Emmanuel Vincent¹

¹ University of Lorraine, ² Marche Polytechnic University

04 May 2020

TL;DR: In this paper, the authors extend real-valued learned and parameterized filterbanks into complex-valued analytic filterbanks and define a set of corresponding representations and masking strategies, and evaluate these filterbanks on a newly released noisy speech separation dataset.

Abstract: Single-channel speech separation has recently made great progress thanks to learned filterbanks as used in ConvTasNet. In parallel, parameterized filterbanks have been proposed for speaker recognition where only center frequencies and bandwidths are learned. In this work, we extend real-valued learned and parameterized filterbanks into complex-valued analytic filterbanks and define a set of corresponding representations and masking strategies. We evaluate these filterbanks on a newly released noisy speech separation dataset (WHAM). The results show that the proposed analytic learned filterbank consistently outperforms the real-valued filterbank of ConvTasNet. Also, we validate the use of parameterized filterbanks and show that complex-valued representations and masks are beneficial in all conditions. Finally, we show that the STFT achieves its best performance for 2 ms windows.

66 citations

Proceedings Article

A high-level approach to confidence estimation in speech recognition.

Stephen Cox, Srinandan Dasmahapatra

01 Jan 1999

TL;DR: A method for constructing "semantic similarities" between words and hence estimating a confidence is described, based on the construction of "metamodels," which generate alternative word hypotheses for an utterance.

66 citations

Performance

Metrics

Papers: 15,632
Citations: 337,766

No. of papers in the topic in previous years
Year	Papers
2023	165
2022	468
2021	283
2020	475
2019	484
2018	420
