Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have appeared within this topic, receiving 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
01 Jan 2009
TL;DR: Techniques that allow the effects of a variety of telephone channels to be simulated, given wideband speech recordings, indicate that the noise model is the major factor leading to an increased accuracy from a basic bandpass channel approximation.
Abstract: The paper presents techniques that allow the effects of a variety of telephone channels to be simulated, given wideband speech recordings. By comparing corresponding utterances from the TIMIT and NTIMIT corpora, both a channel and a noise model were derived. These models were shown to closely mimic the spectral effects of the NTIMIT telephone network. The application of the techniques to the development of ASR systems indicates that the noise model is the major factor leading to an increased accuracy from a basic bandpass channel approximation.

2 citations

Proceedings ArticleDOI
01 Dec 2011
TL;DR: A new discriminative training criterion for subword unit detectors, based on the Minimum Phone Error framework, is proposed; it can optimize the F-score or any other detection performance metric.
Abstract: This paper presents methods and results for optimizing subword detectors in continuous speech. Speech detectors are useful within areas like detection-based ASR, pronunciation training, phonetic analysis, word spotting, etc. We propose a new discriminative training criterion for subword unit detectors that is based on the Minimum Phone Error framework. The criterion can optimize the F-score or any other detection performance metric. The method is applied to the optimization of HMMs and MFCC filterbanks in phone detectors. The resulting filterbanks differ from each other and reflect acoustic properties of the corresponding detection classes. For the experiments in TIMIT, the best optimized detectors had a relative accuracy improvement of 31.3% over baseline and 18.2% over our previous MCE-based method.
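
For readers unfamiliar with the F-score named in the abstract above, the following is a minimal sketch of how the metric is computed from detection counts. It illustrates only the evaluation metric, not the paper's Minimum Phone Error training criterion; the function name `f_score` and its parameters are hypothetical and chosen for illustration.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from detection counts (illustrative helper, not from the paper).

    beta=1.0 gives the usual F1 score; beta > 1 weights recall more heavily.
    Returns 0.0 when precision or recall is undefined or both are zero.
    """
    if min(true_positives, false_positives, false_negatives) < 0:
        raise ValueError("counts must be non-negative")

    predicted = true_positives + false_positives  # detections the system fired
    actual = true_positives + false_negatives     # events present in the reference
    if predicted == 0 or actual == 0:
        return 0.0

    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0

    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With 8 correct detections, 2 false alarms and 2 misses, precision and recall are both 0.8, so `f_score(8, 2, 2)` is also 0.8.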

2 citations

01 Jan 2018
TL;DR: In this article, a DNN-based preprocessing method for speech coding and automatic speech recognition applications is proposed, which maps noisy log power spectra to clean smoothed log power spectral envelopes using DNN prediction.
Abstract: This paper proposes a DNN-based preprocessing method for speech coding and automatic speech recognition applications. The method proposed here maps noisy log power spectra to “clean” smoothed log power spectral envelopes using DNN prediction. The proposed method has the advantage of combining feature extraction with DNN-based enhancement, thus reducing computational time and resources. The TIMIT speech database with various additive noise types was used to train the DNN, and the NN prediction results are compared to the target clean log power spectral envelopes using log spectral distortion. The proposed method is found to have lower log spectral distortion measurements compared to similar neural networks that map noisy power spectra to clean power spectra.
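
The abstract above evaluates predictions with log spectral distortion but does not state the exact formula used. The sketch below assumes one common definition: the per-frame root-mean-square difference between two power spectra in dB, averaged over frames. The function name, the `eps` floor, and the list-of-lists input layout are assumptions made for illustration.

```python
import math


def log_spectral_distortion(ref_power, est_power, eps=1e-12):
    """Mean per-frame RMS difference (in dB) between two power spectrograms.

    ref_power, est_power: lists of frames, each frame a list of non-negative
    power values per frequency bin. eps (an assumed floor) avoids log(0).
    """
    if len(ref_power) != len(est_power) or not ref_power:
        raise ValueError("spectrograms must be non-empty with equal frame counts")

    per_frame = []
    for ref_frame, est_frame in zip(ref_power, est_power):
        if len(ref_frame) != len(est_frame) or not ref_frame:
            raise ValueError("frames must be non-empty with equal bin counts")
        squared_diffs = [
            (10.0 * math.log10(r + eps) - 10.0 * math.log10(e + eps)) ** 2
            for r, e in zip(ref_frame, est_frame)
        ]
        per_frame.append(math.sqrt(sum(squared_diffs) / len(squared_diffs)))

    return sum(per_frame) / len(per_frame)
```

Under this definition, identical spectra give a distortion of zero, and an estimate that is uniformly ten times the reference power gives a distortion of about 10 dB.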

2 citations

Proceedings ArticleDOI
Young-Sun Yun, Yung-Hwan Oh
05 Jun 2000
TL;DR: A parametric trajectory model for characterizing segmental features and their interaction within the segmental HMMs is presented and performance is shown to improve significantly over that of the conventional HMM.
Abstract: We present a parametric trajectory model for characterizing segmental features and their interaction within the segmental HMMs. The trajectory is obtained by applying a design matrix that includes transitional information on contiguous frames, and it is characterized as a polynomial regression function. To apply the trajectory to the segmental HMM, the extra- and intra-segmental variations are modified to contain the trajectory information. We conducted experiments to examine the characteristics of variances and the variabilities in a segment. The experimental results are reported on the TIMIT corpus and performance is shown to improve significantly over that of the conventional HMM.
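
The idea of describing a segment's feature trajectory by polynomial regression over a design matrix can be sketched as follows. This is a generic least-squares fit for a single feature dimension, assuming time is normalised to [0, 1] within the segment; it is not the authors' model, and all function names here are hypothetical.

```python
def design_matrix(num_frames, degree):
    """Rows are [1, t, t^2, ...] with t normalised to [0, 1] across the segment."""
    if num_frames == 1:
        times = [0.0]
    else:
        times = [i / (num_frames - 1) for i in range(num_frames)]
    return [[t ** d for d in range(degree + 1)] for t in times]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few frames for this degree")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_trajectory(frames, degree=2):
    """Least-squares polynomial coefficients via the normal equations Z'Z b = Z'y."""
    z = design_matrix(len(frames), degree)
    k = degree + 1
    ztz = [[sum(row[i] * row[j] for row in z) for j in range(k)] for i in range(k)]
    zty = [sum(row[i] * y for row, y in zip(z, frames)) for i in range(k)]
    return solve(ztz, zty)


def evaluate_trajectory(coeffs, num_frames):
    """Evaluate the fitted polynomial at each (normalised) frame time."""
    z = design_matrix(num_frames, len(coeffs) - 1)
    return [sum(c * v for c, v in zip(coeffs, row)) for row in z]
```

Normalising time means segments of different durations share the same coefficient space, which is one common motivation for this kind of parameterisation; whether the paper normalises time in this way is not stated in the abstract.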

2 citations

Proceedings ArticleDOI
01 Nov 2019
TL;DR: Results show that the proposed approach outperforms a state of the art approach in computation time and has a comparable separation performance.
Abstract: In this paper, a novel system for the separation of two moving audio sources is presented and evaluated in experiments. The proposed approach works in the time domain, estimating coefficients of IIR filters derived from attenuation factors and fractional delays between microphone signals to minimize cross-talk. A novel objective function, derived from the Kullback-Leibler divergence as a substitute for mutual information between the resulting separated signals, is used. For optimization we utilize a novel algorithm of "Random Directions", without the need for gradients, which is very fast and robust. The proposed approach is evaluated on convolutive mixtures generated from speech signals taken from the TIMIT data-set using a room impulse response simulator. Results show that our proposed approach outperforms a state-of-the-art approach in computation time and has a comparable separation performance.
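
The abstract does not specify how the "Random Directions" algorithm works, so the sketch below is not that algorithm. It shows only the general idea of gradient-free optimisation by random perturbation: propose a random step, keep it if the objective improves, otherwise shrink the step size. The function name, the Gaussian proposal, and the `step`, `shrink`, and `seed` parameters are all assumptions made for illustration.

```python
import random


def random_direction_search(objective, start, step=0.5, iterations=2000,
                            shrink=0.99, seed=0):
    """Generic gradient-free random search (illustrative, not the paper's method).

    objective: callable taking a list of floats and returning a float to minimise.
    start: initial parameter list.
    Returns (best_parameters, best_value); best_value never exceeds objective(start).
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    best = list(start)
    best_value = objective(best)
    for _ in range(iterations):
        # Perturb every coordinate with Gaussian noise scaled by the step size.
        candidate = [x + step * rng.gauss(0.0, 1.0) for x in best]
        value = objective(candidate)
        if value < best_value:
            best, best_value = candidate, value  # keep only improvements
        else:
            step *= shrink  # assumed heuristic: narrow the search after a failure
    return best, best_value
```

A usage example would be `random_direction_search(lambda p: p[0] ** 2 + p[1] ** 2, [3.0, -2.0])`, which searches for the minimum of a simple quadratic without ever evaluating a gradient.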

2 citations


Network Information
Related Topics (5)
Recurrent neural network
29.2K papers, 890K citations
76% related
Feature (machine learning)
33.9K papers, 798.7K citations
75% related
Feature vector
48.8K papers, 954.4K citations
74% related
Natural language
31.1K papers, 806.8K citations
73% related
Deep learning
79.8K papers, 2.1M citations
72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95