Topic
TIMIT
About: TIMIT is a research topic. Over its lifetime, 1401 publications have been published within this topic, receiving 59888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.
Papers published on a yearly basis
Papers
22 Sep 2008
TL;DR: A set of novel duration features for detecting pitch accent and phrase boundaries, depending on articulatory timing rather than segmental duration information, is presented.
Abstract: This paper presents a set of novel duration features for detecting pitch accent and phrase boundaries, which depend on articulatory timing rather than segmental duration information. The features are computed from the detected syllable nuclei and boundaries, using peaks and valleys in an energy contour, but also leveraging information from a simple HMM phone-manner-class recognizer to increase recall. In experiments on the hand-segmented TIMIT corpus, we obtain greater than 90% F-measure for vowel detection. In prosody detection experiments on the BU Radio News corpus, compared to a segmental-feature baseline, the new features yield similar performance for pitch accent detection and slightly worse boundary detection, without the need for phonetic alignments.
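The syllable-nucleus detection via peaks and valleys in an energy contour can be sketched as below; the function name and the `min_height` threshold are illustrative assumptions, not parameters from the paper.

```python
def energy_peaks(energy, min_height):
    """Indices of local maxima in a per-frame energy contour that exceed
    min_height -- candidate syllable nuclei. Local minima (valleys) between
    successive peaks would then serve as syllable boundary candidates."""
    peaks = []
    for i in range(1, len(energy) - 1):
        if (energy[i] > energy[i - 1]
                and energy[i] >= energy[i + 1]
                and energy[i] >= min_height):
            peaks.append(i)
    return peaks
```

A usable system would compute `energy` from short-time frames of the waveform and tune the threshold on held-out data.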
3 citations
01 Nov 2018
TL;DR: For the task of unsupervised query-by-example spoken term detection (QbE-STD), features extracted by a Self-Organizing Map (SOM) are concatenated with features learned by an unsupervised GMM-based model at the feature level to enhance performance.
Abstract: In the task of unsupervised query-by-example spoken term detection (QbE-STD), we concatenate the features extracted by a Self-Organizing Map (SOM) with features learned by an unsupervised GMM-based model at the feature level to enhance performance. More specifically, the SOM features are represented by the distances between the current feature vector and the weight vectors of SOM neurons learned in an unsupervised manner. After extracting these features, we apply sub-sequence Dynamic Time Warping (S-DTW) to detect the occurrences of keywords in the test data. We evaluate the performance of these features on the TIMIT English database. After concatenating the SOM features and the GMM-based features, we achieve average improvements of 7.77% and 7.74% in Mean Average Precision (MAP) and P@10, respectively.
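Sub-sequence DTW lets a short query match anywhere inside a longer test utterance by allowing the alignment to start and end at any reference position. A minimal sketch over 1-D sequences follows; the paper operates on multidimensional feature vectors with an appropriate frame distance, so the scalar `dist` here is an illustrative simplification.

```python
def subsequence_dtw(query, ref, dist=lambda a, b: abs(a - b)):
    """Return (best cost, end index in ref) of the cheapest warping path
    aligning the whole query against any contiguous subsequence of ref."""
    n, m = len(query), len(ref)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    # Subsequence matching: the path may start at any column for free.
    for j in range(m + 1):
        D[0][j] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(query[i - 1], ref[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # The path may also end at any column: take the cheapest last-row cell.
    end = min(range(1, m + 1), key=lambda j: D[n][j])
    return D[n][end], end - 1
```

Sliding this over each test utterance and ranking the resulting costs yields the keyword detections that MAP and P@10 are computed from.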
3 citations
TL;DR: The experimental results show that this algorithm can effectively filter noise from speech and significantly improve the performance of an automatic speech recognition system, and it is shown to be robust under various noisy environments and Signal-to-Noise Ratio (SNR) conditions.
Abstract: As many traditional de-noising methods fail in intense noise and do not adapt to varying noisy environments, a speech enhancement method based on dynamic Fractional Fourier Transform (FRFT) filtering is proposed. The acoustic signal is framed, a method for updating the optimal FRFT dispersion order of the noisy speech is presented, and the approach is implemented in detail. In experiments with TIMIT reference speech and Noisex-92 noise, the results show that the algorithm can effectively filter noise from speech and significantly improve the performance of an automatic speech recognition system. It is shown to be robust under various noisy environments and Signal-to-Noise Ratio (SNR) conditions, has low computational complexity, and is simple to implement. http://dx.doi.org/10.11591/telkomnika.v12i12.6694
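The per-frame pipeline (frame the signal, filter each frame in the FRFT domain, reconstruct) begins with a framing step like the sketch below. The Python standard library has no FRFT, so only the framing stage is shown; the function name and parameters are assumptions for illustration.

```python
def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames of frame_len samples,
    advancing by hop samples and zero-padding a short final frame."""
    frames = []
    for start in range(0, max(len(x) - frame_len, 0) + 1, hop):
        f = list(x[start:start + frame_len])
        frames.append(f + [0.0] * (frame_len - len(f)))
    return frames
```

Each frame would then be transformed at the current optimal FRFT order, filtered, inverse-transformed, and overlap-added back into an enhanced signal.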
3 citations
12 Oct 1998
TL;DR: The multinet phone classifier architecture is a framework for combining specialised phone detection networks into a posterior probability estimator for all phones; it is compared with a standard mixture-of-Gaussians HMM classifier.
Abstract: The multinet phone classifier architecture is a framework for combining specialised phone detection networks into a posterior probability estimator for all phones. In this paper we give results obtained for the architecture on TIMIT phone classification tasks. We compare it with a standard mixture-of-Gaussians HMM classifier.
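One simple way to turn per-phone detector scores into a posterior-style estimate over the full phone set is to normalize them to sum to one. This is an illustrative sketch under that assumption, not the paper's actual combination rule.

```python
def to_posteriors(detector_scores):
    """Normalize non-negative per-phone detector scores so they sum to 1,
    giving a posterior-like distribution over all phones."""
    total = sum(detector_scores.values())
    if total == 0:
        # Fall back to a uniform distribution when every detector is silent.
        n = len(detector_scores)
        return {ph: 1.0 / n for ph in detector_scores}
    return {ph: s / total for ph, s in detector_scores.items()}
```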
3 citations
TL;DR: In this paper, the authors combine complex Gabor filters with complex-valued deep neural networks, replacing the usual CNN weight kernels, to take full advantage of the Gabor filter's optimal time-frequency resolution and of the complex domain.
Abstract: Convolutional Neural Networks (CNNs) have been used in Automatic Speech Recognition (ASR) to learn representations directly from the raw signal instead of hand-crafted acoustic features, providing a richer, lossless input signal. Recent research proposes to inject prior acoustic knowledge into the first convolutional layer by integrating the shape of the impulse responses, in order to increase both the interpretability of the learnt acoustic model and its performance. We propose to combine the complex Gabor filter with complex-valued deep neural networks to replace the usual CNN weight kernels, to take full advantage of its optimal time-frequency resolution and of the complex domain. Experiments on the TIMIT phoneme recognition task show that the proposed approach reaches top-of-the-line performance while remaining interpretable.
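A complex Gabor kernel is a Gaussian envelope modulating a complex exponential; a minimal sketch of generating one tap vector follows. Parameter names and the (unnormalized) parameterization are illustrative assumptions, not the paper's exact formulation.

```python
import cmath
import math

def gabor_kernel(length, center_freq, sigma):
    """Complex Gabor filter taps: a Gaussian envelope of width sigma
    (in samples) times a complex exponential at center_freq
    (cycles/sample), both centred on the kernel midpoint."""
    mid = (length - 1) / 2.0
    return [
        math.exp(-((n - mid) ** 2) / (2.0 * sigma ** 2))
        * cmath.exp(2j * math.pi * center_freq * (n - mid))
        for n in range(length)
    ]
```

In the approach described above, parameters like `center_freq` and `sigma` would be learned jointly with the rest of the complex-valued network rather than fixed.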
3 citations