Topic
TIMIT
About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.
Papers published on a yearly basis (chart omitted)
Papers
01 Feb 2014
TL;DR: A method is described that factorizes the spectral magnitude matrix obtained from the group delay function (GDF), showing reasonable improvements over conventional methods in subjective evaluation, objective evaluation, and multi-speaker speech recognition on the TIMIT and GRID corpora.
Abstract: Non-negative matrix factorization (NMF) methods have been widely used in single-channel speaker separation. NMF methods use the magnitude of the Fourier transform for training the basis vectors. In this paper, a method that factorizes the spectral magnitude matrix obtained from the group delay function (GDF) is described. During training, pre-learning is applied on a training set of original sources, and the bases are trained iteratively to minimize the approximation error. Separation of the mixed speech signal involves factorizing the non-negative group delay spectral matrix using the fixed stacked bases computed during training: the matrix is decomposed into a linear combination of trained bases for each individual speaker contained in the mixture. The estimated spectral magnitude for each speaker is then combined with the phase of the mixed signal to reconstruct each speaker's signal, and the separated signals are further refined using a min-max masking method. Experiments on subjective evaluation, objective evaluation, and multi-speaker speech recognition on the TIMIT and GRID corpora indicate reasonable improvements over other conventional methods.
6 citations
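The train-then-separate scheme the abstract describes can be sketched with plain NMF multiplicative updates: learn per-speaker bases from each source, then fix the stacked bases and fit only the activations of the mixture. This is a toy numpy sketch on random non-negative matrices, not the paper's actual GDF-based system; all dimensions and data here are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(V, r, iters=200):
    """Factorize non-negative V ~= W @ H with multiplicative updates."""
    n, m = V.shape
    W = rng.random((n, r)) + 1e-3
    H = rng.random((r, m)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)   # update bases
    return W, H

# Toy "magnitude spectrograms" for two sources (stand-ins for the
# group-delay spectral matrices used in the paper).
V1 = rng.random((64, 40))
V2 = rng.random((64, 40))
W1, _ = nmf(V1, 8)   # per-speaker bases learned during training
W2, _ = nmf(V2, 8)

# Separation: fix the stacked bases, learn only activations for the mix.
Vmix = V1[:, :20] + V2[:, :20]
W = np.hstack([W1, W2])
H = rng.random((16, 20)) + 1e-3
for _ in range(200):
    H *= (W.T @ Vmix) / (W.T @ W @ H + 1e-9)

S1 = W1 @ H[:8]      # estimated magnitude for speaker 1
S2 = W2 @ H[8:]      # estimated magnitude for speaker 2
```

In the paper each estimated magnitude would then be paired with the mixture phase to resynthesize a waveform; here the sketch stops at the magnitude estimates.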
TL;DR: This work proposes an effective combination of features, PFG (Pitch Feature for Gender), for gender identification from speech using classical machine learning algorithms.
6 citations
01 Jan 1999
TL;DR: A recurrent neural network is trained to estimate 'velum height' during continuous speech, and the best-performing network is evaluated by analysing its output for each phonetic segment in 50 hand-labelled utterances set aside for testing.
Abstract: This paper reports on present work, in which a recurrent neural network is trained to estimate 'velum height' during continuous speech. Parallel acoustic-articulatory data comprising more than 400 read TIMIT sentences is obtained using electromagnetic articulography (EMA). This data is processed and used as training data for a range of neural network sizes. The network demonstrating the highest accuracy is identified, and its performance is then evaluated in detail by analysing the network's output for each phonetic segment contained in 50 hand-labelled utterances set aside for testing purposes.
6 citations
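The acoustic-to-articulatory mapping described above (frame-by-frame acoustic features in, one articulatory value out) fits naturally into a simple recurrent network. Below is a minimal Elman-style forward pass in numpy; the 13-dimensional frames, hidden size, and sigmoid output range are assumptions for illustration, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

class ElmanRNN:
    """Minimal Elman recurrent net: maps a sequence of acoustic
    feature frames to one scalar per frame (e.g. estimated velum height)."""
    def __init__(self, n_in, n_hid):
        s = 0.1
        self.Wxh = rng.normal(0, s, (n_hid, n_in))
        self.Whh = rng.normal(0, s, (n_hid, n_hid))
        self.Why = rng.normal(0, s, (1, n_hid))
        self.bh = np.zeros(n_hid)
        self.by = np.zeros(1)

    def forward(self, X):
        h = np.zeros(self.Whh.shape[0])
        ys = []
        for x in X:  # one acoustic frame at a time; h carries context
            h = np.tanh(self.Wxh @ x + self.Whh @ h + self.bh)
            ys.append(1.0 / (1.0 + np.exp(-(self.Why @ h + self.by))))
        return np.concatenate(ys)  # one height estimate per frame, in (0, 1)

net = ElmanRNN(n_in=13, n_hid=20)    # 13 = MFCC-like frame size (assumed)
frames = rng.normal(size=(100, 13))  # 100 frames of a synthetic utterance
velum = net.forward(frames)
```

Training against the EMA-derived targets (omitted here) would fit the weights by backpropagation through time; the sketch only shows the recurrent inference step.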
01 Sep 2016
TL;DR: A novel DNN architecture for monaural speech enhancement is presented that takes into account the masking properties of the human auditory system, using a piecewise gain function to reduce noise and render the residual noise perceptually inaudible.
Abstract: Monaural speech enhancement is a key yet challenging problem for many important real-world applications. Recently, deep neural network (DNN)-based speech enhancement methods, which extract useful features from complex inputs, have demonstrated remarkable performance improvements. In this paper, we present a novel DNN architecture for monaural speech enhancement. Taking into account the masking properties of the human auditory system, a piecewise gain function is applied in the proposed DNN architecture to reduce the noise and make the residual noise perceptually inaudible. The proposed architecture jointly optimizes the piecewise gain function and the DNN. Systematic experiments on the TIMIT corpus with 20 noise types at various signal-to-noise ratio (SNR) conditions demonstrate the superiority of the proposed DNN over the reference speech enhancement methods, in both matched and unmatched noise conditions.
6 citations
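The core idea of a piecewise gain function can be illustrated with a hand-built stand-in: a gain that floors at a small value (so residual noise is attenuated but not distorted into musical noise), is unity at high SNR, and ramps linearly in between. The breakpoints and floor below are assumptions for illustration; the paper learns the gain jointly with the DNN rather than fixing it.

```python
import numpy as np

def piecewise_gain(snr_db, floor=0.1, lo=-5.0, hi=15.0):
    """Toy piecewise gain: `floor` below `lo` dB SNR, unity above `hi` dB,
    linear ramp in between (all breakpoints assumed, not from the paper)."""
    g = np.clip((snr_db - lo) / (hi - lo), 0.0, 1.0)
    return floor + (1.0 - floor) * g

# Apply the gain per time-frequency bin of a noisy magnitude spectrum.
snr_db = np.array([-10.0, 0.0, 5.0, 20.0])      # estimated SNR per bin
noisy_mag = np.array([0.8, 1.2, 0.5, 2.0])      # noisy magnitudes (toy)
enhanced_mag = piecewise_gain(snr_db) * noisy_mag
```

Low-SNR bins are suppressed toward the floor while high-SNR (speech-dominated) bins pass through unchanged, which is the masking-motivated behaviour the abstract describes.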
01 Jan 2001
TL;DR: It is shown that some non-parametric classifiers have considerable advantages over traditional hidden Markov models; among them, support vector machines were found the most suitable and the easiest to tune.
Abstract: This paper addresses the problem of classifying speech transition sounds. A number of non-parametric classifiers are compared, and it is shown that some non-parametric classifiers have considerable advantages over traditional hidden Markov models. Among the non-parametric classifiers, support vector machines were found the most suitable and the easiest to tune. Some of the reasons for the superiority of non-parametric classifiers are discussed. The algorithm was tested on the voiced stop consonant phones extracted from the TIMIT corpus and resulted in very low error rates.
6 citations
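The SVM classification step can be sketched with the Pegasos sub-gradient algorithm for a linear SVM on two toy "phone classes"; this is an illustrative stand-in in numpy, not the toolkit or features the paper actually used, and the Gaussian-cluster data is invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def pegasos(X, y, lam=0.01, epochs=50):
    """Train a linear SVM (hinge loss + L2) with Pegasos sub-gradient steps.
    y must be in {-1, +1}. Returns weight vector w and bias b."""
    n, d = X.shape
    w, b, t = np.zeros(d), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)                 # decaying step size
            if y[i] * (X[i] @ w + b) < 1:         # margin violated
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                                 # only shrink w
                w = (1 - eta * lam) * w
    return w, b

# Two separable Gaussian clusters as stand-ins for fixed-length feature
# vectors extracted from voiced stop-consonant segments.
X = np.vstack([rng.normal(-2, 0.5, (40, 2)), rng.normal(2, 0.5, (40, 2))])
y = np.array([-1] * 40 + [1] * 40)
w, b = pegasos(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```

A multi-class phone classifier would combine several such binary machines (one-vs-one or one-vs-rest); the sketch shows only the binary building block.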