scispace - formally typeset
Topic

TIMIT

About: TIMIT is a research topic. Over the lifetime, 1401 publications have been published within this topic receiving 59888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Proceedings ArticleDOI
01 Feb 2014
TL;DR: A method is described that factorizes the spectral magnitude matrix obtained from the group delay function (GDF), showing reasonable improvements over conventional methods in subjective evaluation, objective evaluation, and multi-speaker speech recognition on the TIMIT and GRID corpora.
Abstract: Non-negative matrix factorization (NMF) methods have been widely used in single-channel speaker separation. NMF methods use the magnitude of the Fourier transform for training the basis vectors. In this paper, a method is described that factorizes the spectral magnitude matrix obtained from the group delay function (GDF). During training, pre-learning is applied on a training set of original sources, and the bases are trained iteratively to minimize the approximation error. Separation of the mixed speech signal involves factorizing the non-negative group delay spectral matrix using the fixed stacked bases computed during training: the matrix is decomposed into a linear combination of trained bases for each individual speaker contained in the mixture. The estimated spectral magnitude for each speaker is then combined with the phase of the mixed signal to reconstruct each speaker's signal, and the separated signals are further refined with a min-max masking method. Experiments on subjective evaluation, objective evaluation, and multi-speaker speech recognition on the TIMIT and GRID corpora indicate reasonable improvements over other conventional methods.
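The core pipeline in this abstract — learn per-speaker bases offline, then decompose the mixture over the fixed stacked bases and mask — can be sketched with standard magnitude-spectrogram NMF. This is a minimal illustration only: it uses plain Euclidean multiplicative updates and a soft (Wiener-style) mask rather than the paper's group-delay spectra and min-max mask, and the function names (`nmf`, `separate`) are hypothetical.

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V ~ W @ H with multiplicative
    updates minimizing squared Euclidean error (Lee-Seung rules)."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W1, W2, iters=200, eps=1e-9, seed=0):
    """Decompose a mixture magnitude spectrogram over stacked, fixed
    per-speaker bases; return two soft-masked magnitude estimates."""
    W = np.hstack([W1, W2])            # stacked trained bases, held fixed
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_mix.shape[1])) + eps
    for _ in range(iters):             # update activations only
        H *= (W.T @ V_mix) / (W.T @ W @ H + eps)
    k = W1.shape[1]
    V1 = W1 @ H[:k]                    # speaker-1 reconstruction
    V2 = W2 @ H[k:]                    # speaker-2 reconstruction
    mask1 = V1 / (V1 + V2 + eps)       # soft mask applied to the mixture
    return mask1 * V_mix, (1 - mask1) * V_mix
```

Masking the mixture (rather than using `V1`, `V2` directly) guarantees the two estimates sum exactly back to the observed magnitude, which is why masking-based refinement is a common final step.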

6 citations

Journal ArticleDOI
G.U. Shagi1, S. Aji1
TL;DR: This work proposes PFG (Pitch Feature for Gender), an effective combination of features for gender identification from speech using classical machine learning algorithms.

6 citations

Proceedings Article
01 Jan 1999
TL;DR: A recurrent neural network is trained to estimate ‘velum height’ during continuous speech; the best-performing network is evaluated by analysing its output for each phonetic segment in 50 hand-labelled utterances set aside for testing.
Abstract: This paper reports on present work, in which a recurrent neural network is trained to estimate ‘velum height’ during continuous speech. Parallel acoustic-articulatory data comprising more than 400 read TIMIT sentences is obtained using electromagnetic articulography (EMA). This data is processed and used as training data for a range of neural network sizes. The network demonstrating the highest accuracy is identified. This performance is then evaluated in detail by analysing the network’s output for each phonetic segment contained in 50 hand-labelled utterances set aside for testing purposes.
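The regression task described here — a recurrent network mapping a sequence of acoustic frames to a continuous articulatory trajectory — can be illustrated with a single-layer Elman RNN forward pass. This is a minimal sketch, not the paper's architecture: the weights are untrained, the input dimension (13, MFCC-like) is an assumption, and `elman_forward` is a hypothetical name.

```python
import numpy as np

def elman_forward(X, W_xh, W_hh, b_h, w_out, b_out):
    """Run a single-layer Elman RNN over a frame sequence X (T x D),
    emitting one bounded scalar estimate (e.g. velum height) per frame."""
    T, _ = X.shape
    h = np.zeros(W_hh.shape[0])
    y = np.empty(T)
    for t in range(T):
        # recurrent state carries context from previous acoustic frames
        h = np.tanh(X[t] @ W_xh + h @ W_hh + b_h)
        # sigmoid keeps the articulatory estimate in (0, 1)
        y[t] = 1.0 / (1.0 + np.exp(-(h @ w_out + b_out)))
    return y
```

The recurrence is what lets the estimate at frame t reflect coarticulation from earlier frames, which is why an RNN is a natural fit for EMA-derived trajectories.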

6 citations

Proceedings ArticleDOI
01 Sep 2016
TL;DR: A novel DNN architecture for monaural speech enhancement is presented that incorporates a piecewise gain function, motivated by the masking properties of the human auditory system, to reduce noise while making the residual noise perceptually inaudible.
Abstract: Monaural speech enhancement is a key yet challenging problem for many important real-world applications. Recently, deep neural network (DNN)-based speech enhancement methods, which extract useful features from complex inputs, have demonstrated remarkable performance improvements. In this paper, we present a novel DNN architecture for monaural speech enhancement. Taking into account the masking properties of the human auditory system, a piecewise gain function is applied in the proposed architecture to reduce the noise while keeping the residual noise perceptually inaudible. The architecture jointly optimizes the piecewise gain function and the DNN. Systematic experiments on the TIMIT corpus with 20 noise types at various signal-to-noise ratio (SNR) conditions demonstrate the superiority of the proposed DNN over the reference speech enhancement methods, in both matched and unmatched noise conditions.
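The abstract does not give the exact piecewise gain, but the underlying idea — suppress noise without driving the gain to zero, so the residual noise stays smooth and perceptually unobtrusive — is commonly realized by clamping a Wiener-style gain to a spectral floor. The sketch below shows that classical (non-DNN) form; `piecewise_gain` and the `floor` parameter are illustrative assumptions, not the paper's learned function.

```python
import numpy as np

def piecewise_gain(noisy_mag, noise_psd, floor=0.1, eps=1e-9):
    """Wiener-style spectral gain clamped to a floor: low gains are
    raised to `floor`, so noise is attenuated but never fully zeroed,
    avoiding musical-noise artifacts in the residual."""
    # a priori SNR estimate from the posterior SNR (decision-directed style)
    snr = np.maximum(noisy_mag**2 / (noise_psd + eps) - 1.0, 0.0)
    g = snr / (snr + 1.0)              # Wiener gain in [0, 1)
    return np.maximum(g, floor)        # piecewise branch: clamp low gains
```

In the paper's setting the gain is produced and jointly optimized by the DNN; the clamp here stands in for the "make residual noise inaudible" branch of the piecewise function.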

6 citations

Proceedings Article
01 Jan 2001
TL;DR: It is shown that some non-parametric classifiers have considerable advantages over traditional hidden Markov models; among them, support vector machines were found the most suitable and the easiest to tune.
Abstract: This paper addresses the problem of classifying speech transition sounds. A number of non-parametric classifiers are compared, and it is shown that some non-parametric classifiers have considerable advantages over traditional hidden Markov models. Among the non-parametric classifiers, support vector machines were found the most suitable and the easiest to tune. Some of the reasons for the superiority of non-parametric classifiers are discussed. The algorithm was tested on the voiced stop consonant phones extracted from the TIMIT corpus and resulted in very low error rates.
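To make the SVM approach concrete, here is a minimal linear SVM trained with the Pegasos sub-gradient method on fixed-length feature vectors, the setting in which SVMs are "easy to tune" (essentially one regularization constant). This is an illustrative sketch with toy 2-D data, not the paper's features or kernel; `pegasos_train` and `predict` are hypothetical names.

```python
import numpy as np

def pegasos_train(X, y, lam=0.01, epochs=50, seed=0):
    """Train a linear SVM via Pegasos sub-gradient descent.
    X: (n, d) feature vectors; y: labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b, t = np.zeros(d), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y[i] * (X[i] @ w + b)
            w *= (1.0 - eta * lam)         # shrink (regularization term)
            if margin < 1.0:               # hinge-loss sub-gradient step
                w += eta * y[i] * X[i]
                b += eta * y[i]
    return w, b

def predict(X, w, b):
    """Classify by the sign of the decision function."""
    return np.where(X @ w + b >= 0, 1, -1)
```

For phone classification the inputs would be fixed-length acoustic features per segment (e.g. frames around the transition), with one binary SVM per phone pair or a one-vs-rest scheme.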

6 citations


Network Information
Related Topics (5)
Recurrent neural network
29.2K papers, 890K citations
76% related
Feature (machine learning)
33.9K papers, 798.7K citations
75% related
Feature vector
48.8K papers, 954.4K citations
74% related
Natural language
31.1K papers, 806.8K citations
73% related
Deep learning
79.8K papers, 2.1M citations
72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95