Topic
Speaker recognition
About: Speaker recognition is a research topic. Over its lifetime, 14990 publications have been published within this topic, receiving 310061 citations.
Papers published on a yearly basis
Papers
•
31 Mar 1998
TL;DR: In this article, speech recognition is performed on a received speech sample to produce recognition results, and those results are evaluated against stored training data and the speech elements to which portions of that data are related.
Abstract: A speech sample is evaluated using a computer. Training data that include samples of speech are received and stored along with identification of speech elements to which portions of the training data are related. A speech sample is received and speech recognition is performed on the speech sample to produce recognition results. Finally, the recognition results are evaluated in view of the training data and the identification of the speech elements to which the portions of the training data are related. The technique may be used to perform tasks such as speech recognition, speaker identification, and language identification.
118 citations
•
28 Apr 2004
TL;DR: In this article, a speech feature vector for a voice associated with a source of a text message was determined and compared to speaker models, and a speaker model was selected as a preferred match for the voice based on the comparison.
Abstract: A method of generating speech from text messages includes determining a speech feature vector for a voice associated with a source of a text message, and comparing the speech feature vector to speaker models. The method also includes selecting one of the speaker models as a preferred match for the voice based on the comparison, and generating speech from the text message based on the selected speaker model.
118 citations
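The model-selection step this entry describes can be sketched as a nearest-model search. Below is a minimal sketch, assuming cosine similarity over fixed-length feature vectors; the speaker names and vectors are hypothetical and stand in for whatever embedding the real system computes:

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

# Hypothetical speaker models: one summary feature vector per enrolled voice.
speaker_models = {
    "alice": (0.9, 0.1, 0.2),
    "bob":   (0.1, 0.8, 0.3),
}

def best_match(feature_vector):
    """Select the speaker model most similar to the incoming voice."""
    return max(speaker_models, key=lambda s: cosine(feature_vector, speaker_models[s]))

print(best_match((0.85, 0.15, 0.25)))  # -> alice
```

The selected model would then drive the text-to-speech voice, so the synthesized reply resembles the original sender's voice.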
•
26 Apr 2012
TL;DR: A fast and accurate automatic voice recognition algorithm using Mel frequency cepstral coefficients (MFCC) to extract features from the voice and the vector quantization technique to identify the speaker.
Abstract: This paper presents a fast and accurate automatic voice recognition algorithm. We use Mel frequency cepstral coefficients (MFCC) to extract features from the voice and the vector quantization technique to identify the speaker. This technique is usually used in data compression; it models a probability function by the distribution of different vectors. The results we achieved were 100% precision on a database of 10 speakers.
118 citations
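The MFCC-plus-vector-quantization pipeline above can be illustrated with a toy sketch. Real systems extract MFCC frames with a DSP library, and VQ codebooks are usually trained with the LBG algorithm; here synthetic 2-D "feature vectors" and a plain k-means stand in for both, and identification picks the speaker whose codebook gives the lowest average quantization distortion:

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20, seed=0):
    """Train a k-entry VQ codebook with plain k-means (LBG stand-in)."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            i = min(range(k), key=lambda c: dist2(v, centroids[c]))
            clusters[i].append(v)
        for i, cl in enumerate(clusters):
            if cl:
                centroids[i] = tuple(sum(x) / len(cl) for x in zip(*cl))
    return centroids

def distortion(vectors, codebook):
    """Average distance from each frame to its nearest codeword."""
    return sum(min(dist2(v, c) for c in codebook) for v in vectors) / len(vectors)

# Synthetic per-frame features standing in for MFCCs (hypothetical data).
rng = random.Random(1)
speaker_a = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(200)]
speaker_b = [(rng.gauss(3, 0.3), rng.gauss(3, 0.3)) for _ in range(200)]

codebooks = {"A": kmeans(speaker_a, 4), "B": kmeans(speaker_b, 4)}

def identify(utterance):
    """Identify the speaker whose codebook best fits the utterance."""
    return min(codebooks, key=lambda s: distortion(utterance, codebooks[s]))

test_utterance = [(rng.gauss(3, 0.3), rng.gauss(3, 0.3)) for _ in range(50)]
print(identify(test_utterance))  # -> B
```

A separate codebook per enrolled speaker is the key design choice: each codebook compresses that speaker's feature distribution, so a test utterance quantizes with low distortion only against the matching speaker's codebook.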
•
20 Jan 2014
TL;DR: In this paper, an application detects speaker locations and prompts a user to input rough room boundaries and a desired listener location in the room, which are used to determine optimum speaker locations, frequency assignments, and speaker parameters.
Abstract: In an audio speaker network, setup of speaker locations, soundtrack or channel assignments, and speaker parameters is facilitated by an application that detects speaker locations and prompts a user to input rough room boundaries and a desired listener location in the room. Based on this, optimum speaker locations, frequency assignments, and speaker parameters may be determined and output.
117 citations
•
TL;DR: A method based on the vowel onset point (VOP) is proposed for locating the end-points of an utterance, and combining the evidence from suprasegmental, source, and spectral features seems to improve the performance of the system significantly.
Abstract: This paper proposes a text-dependent (fixed-text) speaker verification system which uses different types of information for making a decision regarding the identity claim of a speaker. The baseline system uses the dynamic time warping (DTW) technique for matching. Detection of the end-points of an utterance is crucial for the performance of DTW-based template matching. A method based on the vowel onset point (VOP) is proposed for locating the end-points of an utterance. The proposed method for speaker verification uses suprasegmental and source features, besides spectral features. The suprasegmental features, such as pitch and duration, are extracted using the warping-path information in the DTW algorithm. Features of the excitation source, extracted using neural network models, are also used in the text-dependent speaker verification system. Although the suprasegmental and source features individually may not yield good performance, combining the evidence from these features seems to improve the performance of the system significantly. Neural network models are used to combine the evidence from multiple sources of information.
117 citations
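The DTW template matching at the core of the baseline system above can be sketched with the standard dynamic-programming recurrence. This is a minimal version over 1-D sequences; real systems warp sequences of spectral feature vectors, and the VOP-based end-pointing described in the paper is omitted here:

```python
def dtw(a, b):
    """Dynamic time warping distance between two feature sequences,
    using the standard match/insert/delete recurrence."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = cost of the best alignment of a[:i] with b[:j].
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # local frame distance
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

template = [0, 1, 2, 3, 2, 1, 0]
slow = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]  # same contour, half speed
other = [3, 3, 2, 1, 0, 1, 2, 3]                    # different contour

print(dtw(template, slow))   # -> 0.0 (aligns perfectly despite tempo change)
print(dtw(template, other))  # positive: contours differ
```

Verification would compare the DTW distance between the claimant's utterance and the enrolled template against a threshold; the warping path recovered from D is also what gives access to the duration and pitch alignment information the paper uses as suprasegmental evidence.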