Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Proceedings ArticleDOI
20 Nov 1995
TL;DR: A relational database management system has been developed to house the speech data; it provides much more usability, flexibility and expandability than file-based speech corpora such as TIMIT.
Abstract: Digits and words, spoken with a New Zealand English accent, have been systematically and formally collected. This collection, along with the beginning and end points of the realised phonemes within the words, comprises the Otago Speech Corpora. A relational database management system has been developed to house the speech data. This system provides much more usability, flexibility and expandability than file-based speech corpora such as TIMIT.

31 citations
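
A minimal sketch of how such a phoneme-annotated corpus might be organised relationally (the table and column names below are illustrative assumptions for this sketch, not the actual Otago Speech Corpora schema):

```python
import sqlite3

# Illustrative schema for a phoneme-annotated speech corpus.
# Table and column names are assumptions of this sketch, not
# the Otago Speech Corpora design.
conn = sqlite3.connect("speech_corpus.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS utterance (
    utterance_id INTEGER PRIMARY KEY,
    speaker      TEXT NOT NULL,
    word         TEXT NOT NULL,
    wav_path     TEXT NOT NULL           -- location of the raw audio
);
CREATE TABLE IF NOT EXISTS phoneme (
    phoneme_id   INTEGER PRIMARY KEY,
    utterance_id INTEGER REFERENCES utterance(utterance_id),
    label        TEXT NOT NULL,          -- realised phoneme symbol
    start_sample INTEGER NOT NULL,       -- beginning point in samples
    end_sample   INTEGER NOT NULL        -- end point in samples
);
""")

# Unlike file-based corpora such as TIMIT, annotations can be
# queried directly, e.g. all realisations of a phone by one speaker:
rows = conn.execute("""
    SELECT u.wav_path, p.start_sample, p.end_sample
    FROM phoneme p JOIN utterance u USING (utterance_id)
    WHERE p.label = ? AND u.speaker = ?
""", ("iy", "speaker01")).fetchall()
```

This is the flexibility the TL;DR alludes to: a new query replaces the ad-hoc file parsing that file-based corpora require.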

Journal ArticleDOI
TL;DR: A novel discriminative objective function for the estimation of hidden Markov model (HMM) parameters, based on the calculation of overall risk, minimises the risk of misclassification on the training database and thus maximises recognition accuracy.

31 citations
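
The TL;DR gives no formula; for orientation, a closely related minimum-classification-error style objective (a hedged sketch of the general form, not necessarily the paper's exact criterion) smooths the empirical misclassification risk so it can be minimised by gradient methods:

```latex
% Smoothed empirical risk over training utterances O_1..O_N with
% labels c_i and HMM discriminant scores g_j(O; \Lambda).
% An MCE-style sketch, not the paper's exact criterion.
R(\Lambda) = \frac{1}{N} \sum_{i=1}^{N} \ell\big( d_i(O_i; \Lambda) \big),
\qquad
\ell(d) = \frac{1}{1 + e^{-\gamma d}},
\qquad
d_i(O; \Lambda) = -\, g_{c_i}(O; \Lambda) + \max_{j \neq c_i} g_j(O; \Lambda)
```

Minimising such a risk directly targets recognition errors, unlike maximum-likelihood training, which only fits each class's model to its own data.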

Journal ArticleDOI
TL;DR: An energy-constrained signal subspace (ECSS) method is proposed for speech enhancement and automatic speech recognition under additive noise conditions; the ECSS method achieves very high word recognition accuracy (WRA) on the digit set under low-SNR conditions.

30 citations
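
The TL;DR does not spell out the algorithm; the NumPy sketch below shows a generic eigen-domain signal-subspace enhancement step, with the energy constraint only approximated by clamping the estimated clean energy to be non-negative (an assumption of this sketch, not the paper's exact ECSS procedure):

```python
import numpy as np

def subspace_enhance(frames: np.ndarray, noise_var: float) -> np.ndarray:
    """Generic signal-subspace enhancement of framed speech.

    frames: (n_frames, frame_len) array of noisy speech frames.
    noise_var: noise variance estimated from speech-free segments.
    A textbook-style sketch, not the paper's exact ECSS algorithm.
    """
    # Empirical covariance of the noisy frames.
    cov = frames.T @ frames / len(frames)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Estimate the clean-signal energy per eigen-direction; the
    # non-negativity clamp plays the role of an energy constraint.
    clean_energy = np.maximum(eigvals - noise_var, 0.0)

    # Wiener-like gain in the eigen (KLT) domain; a zero gain
    # removes the noise-only subspace entirely.
    gains = clean_energy / np.maximum(eigvals, 1e-12)

    # Project, weight, and reconstruct each frame.
    return (frames @ eigvecs) * gains @ eigvecs.T
```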

Proceedings ArticleDOI
05 Mar 2017
TL;DR: A recently developed deep learning model, the recurrent convolutional neural network (RCNN), is proposed for speech processing; it inherits some merits of recurrent neural networks (RNN) and convolutional neural networks (CNN), and is competitive with previous methods in terms of accuracy and efficiency.
Abstract: Different neural networks have exhibited excellent performance on various speech processing tasks, and they usually have specific advantages and disadvantages. We propose to use a recently developed deep learning model, the recurrent convolutional neural network (RCNN), for speech processing; it inherits some merits of recurrent neural networks (RNN) and convolutional neural networks (CNN). The core module can be viewed as a convolutional layer embedded with an RNN, which enables the model to capture both temporal and frequency dependence in the spectrogram of the speech in an efficient way. The model is tested on the TIMIT corpus for phoneme recognition and on IEMOCAP for emotion recognition. Experimental results show that the model is competitive with previous methods in terms of accuracy and efficiency.

30 citations
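
A minimal PyTorch sketch of the core idea, a convolutional layer combined with a recurrence over time, so the model sees both local time-frequency patterns and longer temporal context (layer sizes and the exact wiring here are assumptions of this sketch, not the paper's architecture):

```python
import torch
import torch.nn as nn

class RCNNBlock(nn.Module):
    """Convolution over the spectrogram followed by a recurrence
    over time. Sizes are illustrative; this is a sketch of the
    idea, not the paper's exact module."""

    def __init__(self, n_mels: int = 40, channels: int = 32,
                 hidden: int = 128, n_classes: int = 39):
        super().__init__()
        # 2-D convolution captures local time-frequency patterns.
        self.conv = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        # GRU over time captures longer temporal dependence.
        self.rnn = nn.GRU(channels * n_mels, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)  # e.g. 39 TIMIT phones

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, 1, n_mels, n_frames)
        h = torch.relu(self.conv(spec))                  # (B, C, F, T)
        b, c, f, t = h.shape
        h = h.permute(0, 3, 1, 2).reshape(b, t, c * f)   # (B, T, C*F)
        h, _ = self.rnn(h)                               # (B, T, hidden)
        return self.out(h)                # per-frame phone logits

# Example: a batch of 8 spectrograms, 40 mel bands, 200 frames.
logits = RCNNBlock()(torch.randn(8, 1, 40, 200))
```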

Posted Content
TL;DR: The proposed model is a convolutional neural network that operates directly on the raw waveform that is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle and reaches state-of-the-art performance on both data sets.
Abstract: We propose a self-supervised representation learning model for the task of unsupervised phoneme boundary detection. The model is a convolutional neural network that operates directly on the raw waveform. It is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle. At test time, a peak detection algorithm is applied over the model outputs to produce the final boundaries. As such, the proposed model is trained in a fully unsupervised manner with no manual annotations in the form of target boundaries or phonetic transcriptions. We compare the proposed approach to several unsupervised baselines using both the TIMIT and Buckeye corpora. Results suggest that our approach surpasses the baseline models and reaches state-of-the-art performance on both data sets. Furthermore, we experimented with expanding the training set with additional examples from the Librispeech corpus. We evaluated the resulting model on distributions and languages that were not seen during the training phase (English, Hebrew and German) and showed that utilizing additional untranscribed data is beneficial for model performance.

30 citations
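
The test-time procedure described in the abstract (score spectral change, then pick peaks) can be sketched as follows; scoring adjacent encoder frames by cosine dissimilarity, and the prominence threshold, are assumptions of this sketch rather than the paper's exact configuration:

```python
import numpy as np
from scipy.signal import find_peaks

def boundary_candidates(embeddings: np.ndarray, prominence: float = 0.1):
    """Peak-pick phoneme boundary candidates from frame embeddings.

    embeddings: (n_frames, dim) outputs of a waveform encoder.
    Cosine dissimilarity between neighbours is an assumption of
    this sketch; the prominence threshold would be tuned on a
    validation set.
    """
    # Normalise frames, then score dissimilarity between neighbours:
    # a high score indicates a spectral change, i.e. a likely boundary.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    score = 1.0 - np.sum(unit[:-1] * unit[1:], axis=1)

    # A peak in the dissimilarity curve marks a candidate boundary.
    peaks, _ = find_peaks(score, prominence=prominence)
    return peaks + 1  # boundary lies between frame i and i+1
```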


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations (76% related)
Feature (machine learning): 33.9K papers, 798.7K citations (75% related)
Feature vector: 48.8K papers, 954.4K citations (74% related)
Natural language: 31.1K papers, 806.8K citations (73% related)
Deep learning: 79.8K papers, 2.1M citations (72% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95