Topic
Spectrogram
About: Spectrogram is a research topic. Over its lifetime, 5,813 publications have been published on this topic, receiving 81,547 citations.
Papers
14 May 2006
TL;DR: This feature describes the amplitude modulation spectrum of each subband and yields a single feature vector per utterance, which is used directly as the speaker's modulation-frequency template, eliminating the need for a separate training phase.
Abstract: We propose a method for computing a joint acoustic-modulation frequency feature for speaker recognition. This feature describes the amplitude modulation spectrum of each subband and results in a single feature vector per utterance. This vector is used directly as the speaker's modulation-frequency template, eliminating the need for a separate training phase. The effects of analysis parameters and pattern matching are studied using the NIST 2001 corpus. When the proposed feature is fused with the baseline MFCC/GMM system, the EER is reduced from 18.2% to 16.7%.
41 citations
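The feature above can be sketched in a few lines: a short-time Fourier transform gives each subband's envelope over time, and a second FFT along the time axis of each envelope gives that subband's amplitude modulation spectrum. The following is a minimal NumPy illustration; the function name and the window, hop, and modulation-FFT sizes are placeholders, not the paper's settings.

```python
import numpy as np

def modulation_spectrum(signal, n_fft=256, hop=128, n_mod=64):
    """Sketch of a joint acoustic-modulation frequency feature:
    (1) an STFT gives each subband's magnitude envelope over time;
    (2) an FFT along the time axis of each envelope gives its
        amplitude modulation spectrum.
    All parameter values are illustrative, not the paper's."""
    # Frame the signal and compute the magnitude spectrogram (acoustic axis).
    win = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i*hop : i*hop + n_fft] * win
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))       # (frames, subbands)
    # Modulation spectrum: FFT along time for each subband envelope,
    # after removing the per-subband mean (DC).
    env = spec - spec.mean(axis=0)
    mod = np.abs(np.fft.rfft(env, n=n_mod, axis=0))  # (mod bins, subbands)
    # Flatten to the single per-utterance vector used as the template.
    return mod.flatten()

sr = 8000
t = np.arange(sr) / sr
# 1 kHz carrier, amplitude-modulated at 4 Hz.
x = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
feat = modulation_spectrum(x)
```

Flattening the joint acoustic-modulation matrix yields the single per-utterance vector that serves as the speaker template, so no separate model training is needed before matching.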
TL;DR: Using the learned mappings in the generalized cross-correlation framework, improved localization performance is demonstrated and the resulting mappings exhibit behavior consistent with the well-known precedence effect from psychoacoustic studies.
Abstract: Speech source localization in reverberant environments has proved difficult for automated microphone array systems. Because of its nonstationary nature, certain features observable in the reverberant speech signal, such as sudden increases in audio energy, provide cues to indicate time-frequency regions that are particularly useful for audio localization. We exploit these cues by learning a mapping from reverberated signal spectrograms to localization precision using ridge regression. Using the learned mappings in the generalized cross-correlation framework, we demonstrate improved localization performance. Additionally, the resulting mappings exhibit behavior consistent with the well-known precedence effect from psychoacoustic studies.
41 citations
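The generalized cross-correlation framework the paper builds on weights the cross-power spectrum of two microphone signals before inverse-transforming and picking the delay peak. Below is a minimal sketch using the classical PHAT weighting as the default; the paper's learned, ridge-regression-derived weights would instead be supplied through the `weights` argument, which is a hypothetical hook for illustration.

```python
import numpy as np

def gcc(x1, x2, weights=None):
    """Generalized cross-correlation delay estimate between two mics.
    With weights = 1/|X1* X2| this is GCC-PHAT; the paper instead
    learns per-frequency weights from spectrogram features."""
    n = len(x1) + len(x2)                    # zero-pad to avoid wrap-around
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cross = np.conj(X1) * X2                 # cross-power spectrum
    if weights is None:                      # default: PHAT weighting
        weights = 1.0 / np.maximum(np.abs(cross), 1e-12)
    cc = np.fft.irfft(cross * weights, n)
    # Peak index (mod n) gives the delay of x2 relative to x1 in samples.
    lag = np.argmax(cc)
    return lag if lag < n // 2 else lag - n

rng = np.random.default_rng(0)
s = rng.standard_normal(4096)
delay = 7
x1 = s
x2 = np.concatenate([np.zeros(delay), s[:-delay]])  # x2 lags x1 by 7 samples
print(gcc(x1, x2))
```

Learning which time-frequency regions to trust, as the paper does, amounts to replacing the fixed PHAT weights with data-driven ones that emphasize onset-dominated regions.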
TL;DR: Speech reconstruction tests reveal that the combination of robust fundamental frequency and voicing estimation with spectral subtraction in the integrated front-end leads to intelligible and relatively noise-free speech.
41 citations
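Spectral subtraction, one component of the front-end above, can be illustrated on its own: estimate an average noise magnitude spectrum from a noise-only excerpt, subtract it from each noisy frame's magnitude, keep the noisy phase, and resynthesize by overlap-add. This is a textbook sketch, not the paper's integrated front-end; all parameter values are illustrative.

```python
import numpy as np

def spectral_subtraction(noisy, noise_est, n_fft=256, hop=128, floor=0.02):
    """Textbook spectral-subtraction sketch: magnitude subtraction with
    a spectral floor, noisy phase reuse, and overlap-add resynthesis."""
    win = np.hanning(n_fft)

    def frame(x):
        n = 1 + (len(x) - n_fft) // hop
        return np.stack([x[i*hop : i*hop + n_fft] * win for i in range(n)])

    # Average noise magnitude over the noise-only frames.
    noise_mag = np.abs(np.fft.rfft(frame(noise_est), axis=1)).mean(axis=0)
    spec = np.fft.rfft(frame(noisy), axis=1)
    # Subtract the noise estimate, flooring to avoid negative magnitudes.
    mag = np.maximum(np.abs(spec) - noise_mag, floor * np.abs(spec))
    frames = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n_fft, axis=1)
    # Overlap-add with window-squared normalization.
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i, f in enumerate(frames):
        out[i*hop : i*hop + n_fft] += f * win
        norm[i*hop : i*hop + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-8)
```

In the paper's front-end this kind of noise reduction is combined with robust fundamental-frequency and voicing estimation before the speech is reconstructed.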
TL;DR: Wang et al. propose a multiscale deep convolutional long short-term memory (LSTM) framework for spontaneous speech emotion recognition, in which a deep CNN learns segment-level features from image-like three-channel spectrograms.
Abstract: Recently, emotion recognition in real-world settings such as in the wild has attracted extensive attention in affective computing, because spontaneous emotions in such settings are more difficult to identify than other emotions. Motivated by the diverse effects of different lengths of audio spectrograms on emotion identification, this paper proposes a multiscale deep convolutional long short-term memory (LSTM) framework for spontaneous speech emotion recognition. Initially, a deep convolutional neural network (CNN) model is used to learn deep segment-level features from the created image-like three-channel spectrograms. Then, a deep LSTM model is applied to the learned segment-level CNN features to capture the temporal dependency among all divided segments in an utterance for utterance-level emotion recognition. Finally, the emotion recognition results obtained by combining CNN with LSTM at multiple segment-level spectrogram lengths are integrated using a score-level fusion strategy. Experimental results on two challenging spontaneous emotional datasets, the AFEW5.0 and BAUM-1s databases, demonstrate the promising performance of the proposed method, outperforming state-of-the-art methods.
41 citations
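The final score-level fusion step can be illustrated independently of the CNN-LSTM models: each segment-length-specific model produces a class-score vector for the utterance, and a weighted average of those vectors gives the fused decision. The scores and uniform weights below are hypothetical; the paper's exact fusion rule may differ.

```python
import numpy as np

def fuse_scores(score_list, weights=None):
    """Score-level fusion sketch: weighted average of per-model class
    scores, followed by an argmax decision."""
    # Each row: one model's class-score vector for the utterance
    # (here, the CNN-LSTM run at one segment-level spectrogram length).
    scores = np.asarray(score_list, dtype=float)   # (n_scales, n_classes)
    if weights is None:                            # default: uniform weights
        weights = np.full(len(scores), 1.0 / len(scores))
    fused = weights @ scores                       # (n_classes,)
    return fused, int(np.argmax(fused))

# Hypothetical posteriors over four emotion classes at three segment lengths.
fused, label = fuse_scores([[0.10, 0.60, 0.20, 0.10],
                            [0.05, 0.40, 0.45, 0.10],
                            [0.10, 0.55, 0.25, 0.10]])
```

Fusing across segment lengths lets scales that happen to capture the emotion-bearing portion of an utterance outvote scales that miss it.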
26 May 2011
TL;DR: This book surveys methods for speech spectrum analysis, covering the Fourier power spectrum and spectrogram, wavelet and other time-frequency representations, reassigned spectrograms, linear prediction, the cepstrum, and formant tracking.
Abstract: Contents: Introduction; Historical perspective on speech spectrum analysis; The Fourier power spectrum and spectrogram; Other time-frequency and wavelet representations; The new frontier: reassigned spectrograms and power spectra; Linear prediction of the speech spectrum; Homomorphic analysis and the cepstrum; Formant tracking methods.
41 citations
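The book's starting point, the short-time Fourier power spectrogram, reduces to three operations: window the signal into overlapping frames, FFT each frame, and take the squared magnitude. A minimal sketch with illustrative parameter values:

```python
import numpy as np

def power_spectrogram(x, n_fft=256, hop=64):
    """Textbook short-time Fourier power spectrogram:
    window -> FFT -> squared magnitude, frame by frame."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i*hop : i*hop + n_fft] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (frames, freq bins)

sr = 8000
t = np.arange(sr) / sr
S = power_spectrogram(np.sin(2 * np.pi * 1000 * t))
# A 1 kHz tone concentrates energy near bin 1000 / (sr / n_fft) = 32.
```

The later chapters (wavelets, reassignment, linear prediction, the cepstrum) are all refinements of, or alternatives to, this basic time-frequency picture.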