Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Book Chapter
10 Dec 2009
TL;DR: It turns out that OPCA can be used for blindly separating temporal signals from their linear mixtures without the need for a pre-whitening step.
Abstract: In this paper, we report the results of a comparative study on blind speech signal separation approaches. Three algorithms, Oriented Principal Component Analysis (OPCA), High Order Statistics (HOS), and Fast Independent Component Analysis (Fast-ICA), are objectively compared in terms of signal-to-interference ratio criteria. The results of experiments carried out using the TIMIT and AURORA speech databases show that OPCA outperforms the other techniques. It turns out that OPCA can be used for blindly separating temporal signals from their linear mixtures without the need for a pre-whitening step.

3 citations
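
The paper above scores separation quality with a signal-to-interference ratio (SIR). As a rough illustration of that evaluation setup, the sketch below mixes two synthetic signals, unmixes them with scikit-learn's FastICA (one of the compared baselines), and computes a simple SIR; the synthetic signals, mixing matrix, and SIR definition are illustrative assumptions, not the paper's TIMIT/AURORA setup, and OPCA itself is not implemented here.

```python
# Minimal sketch: separate two synthetic "speech-like" signals with FastICA
# and score the result with a simple signal-to-interference ratio (SIR).
# The signals and mixing matrix are illustrative, not from TIMIT/AURORA.
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 1, 16000)
s1 = np.sin(2 * np.pi * 220 * t)                 # stand-in for source 1
s2 = np.sign(np.sin(2 * np.pi * 330 * t))        # stand-in for source 2
S = np.c_[s1, s2]

A = np.array([[1.0, 0.6], [0.4, 1.0]])           # "unknown" mixing matrix
X = S @ A.T                                      # observed mixtures

est = FastICA(n_components=2, whiten="unit-variance",
              random_state=0).fit_transform(X)

def sir_db(reference, estimate):
    """SIR of one estimated source against a reference (scale/sign invariant)."""
    ref = reference / np.linalg.norm(reference)
    proj = np.dot(estimate, ref) * ref           # part of the estimate explained by the target
    interference = estimate - proj
    return 10 * np.log10(np.sum(proj**2) / np.sum(interference**2))

# ICA permutes and rescales sources, so match each estimate to its best reference.
for i in range(2):
    best = max(sir_db(S[:, j], est[:, i]) for j in range(2))
    print(f"estimated source {i}: SIR = {best:.1f} dB")
```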

Proceedings Article
06 Nov 2014
TL;DR: Recognition results show that histogram mapping combined with neural-network filtering in the cepstral-coefficient domain does improve recognition rates.
Abstract: One of the biggest problems of a speech recognition system is signal degradation due to adverse conditions. Such situations usually lead to a mismatch between the test conditions and the training data, caused by non-linear distortion. The authors propose histogram mapping followed by neural-network filtering (a feature-compensation approach) in order to minimize the mismatch caused by noise in the speech signal. The proposed method has been evaluated using the TIMIT and Noisex-92 databases. Recognition results show that histogram mapping combined with neural-network filtering in the cepstral-coefficient domain does improve recognition rates.

3 citations
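
As a rough sketch of the histogram-mapping half of this idea (feature compensation by matching the empirical distribution of noisy cepstral features to a clean reference), one common formulation remaps each coefficient through its quantiles. The data and function below are illustrative assumptions, and the neural-network filtering stage from the paper is not reproduced.

```python
# Illustrative sketch of histogram mapping (histogram equalization) for
# feature compensation: each noisy cepstral coefficient is remapped so that
# its empirical distribution matches a clean reference distribution.
import numpy as np

def histogram_map(noisy, clean_ref):
    """Map each column (cepstral coefficient) of `noisy` onto the empirical
    distribution of the corresponding column of `clean_ref`."""
    mapped = np.empty_like(noisy)
    for d in range(noisy.shape[1]):
        # rank of each noisy value -> quantile -> value of the clean distribution there
        ranks = np.argsort(np.argsort(noisy[:, d]))
        quantiles = (ranks + 0.5) / len(ranks)
        mapped[:, d] = np.quantile(clean_ref[:, d], quantiles)
    return mapped

# Toy usage with random "MFCC-like" matrices (frames x coefficients)
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(500, 13))
noisy = 0.7 * clean + rng.normal(0.5, 0.3, size=clean.shape)   # simulated mismatch
compensated = histogram_map(noisy, clean)
print(np.mean(clean, axis=0)[:3], np.mean(compensated, axis=0)[:3])
```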

Journal Article
TL;DR: In this paper, a pitch determination algorithm based on the harmonic differences method (HDM) is proposed; it is designed for both wideband and exclusively narrowband (telephone) speech and tries to find the most frequently repeating difference between the harmonics of the speech signal.
Abstract: In this article, a novel pitch determination algorithm based on the harmonic differences method (HDM) is proposed. Most algorithms today rely on autocorrelation, the cepstrum, or, more recently, convolutional neural networks, and they suffer from limitations (small datasets, wideband-only or narrowband-only operation, musical sounds, temporal smoothing, etc.) as well as accuracy and speed problems. Very few works exploit the spacing between the harmonics. HDM is designed for both wideband and exclusively narrowband (telephone) speech and tries to find the most frequently repeating difference between the harmonics of the speech signal. We use three vowel databases in our experiments, namely the Hillenbrand Vowel Database, the Texas Vowel Database, and vowels from the TIMIT corpus. We compare HDM with the autocorrelation, cepstrum, YIN, YAAPT, CREPE, and FCN algorithms. Results show that harmonic differences are a reliable and fast choice for robust pitch detection, and that HDM is superior to the other methods in most cases.

3 citations
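
A minimal sketch of the harmonic-differences idea, assuming a simple formulation (FFT peak picking followed by a histogram of pairwise peak-frequency differences), is given below; the window, thresholds, and bin width are arbitrary illustrations, not the HDM algorithm as published.

```python
# Illustrative sketch of the harmonic-differences idea: pick spectral peaks,
# take pairwise differences between their frequencies, and treat the most
# frequent difference as the pitch estimate.
import numpy as np
from scipy.signal import find_peaks

def pitch_by_harmonic_differences(frame, sr, fmin=60.0, fmax=400.0):
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    peaks, _ = find_peaks(spectrum, height=0.05 * spectrum.max())
    peak_freqs = freqs[peaks]
    # all positive pairwise differences between detected peaks
    diffs = np.abs(peak_freqs[:, None] - peak_freqs[None, :]).ravel()
    diffs = diffs[(diffs >= fmin) & (diffs <= fmax)]
    if diffs.size == 0:
        return 0.0  # unvoiced / no estimate
    hist, edges = np.histogram(diffs, bins=np.arange(fmin, fmax, 5.0))
    best = np.argmax(hist)
    return 0.5 * (edges[best] + edges[best + 1])

# Toy check: a 200 Hz harmonic series should come back close to 200 Hz.
sr = 16000
t = np.arange(0, 0.04, 1.0 / sr)
frame = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))
print(f"estimated F0 ~ {pitch_by_harmonic_differences(frame, sr):.1f} Hz")
```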

Journal Article
TL;DR: This paper proposes to combine CNN, GRU-RNN and DNN in a single deep architecture called Convolutional Gated Recurrent Unit, Deep Neural Network (CGDNN).
Abstract: In recent years, many researchers have worked on improving accuracy on the Automatic Speech Recognition (ASR) task by using deep learning. In state-of-the-art speech recognizers, both Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) based Recurrent Neural Networks (RNN) have achieved better performance than Convolutional Neural Networks (CNN) and Deep Neural Networks (DNN). Due to the strong complementarity of CNN, LSTM-RNN and DNN, they may be combined in one architecture called Convolutional Long Short-Term Memory, Deep Neural Network (CLDNN). Similarly, we propose to combine CNN, GRU-RNN and DNN in a single deep architecture called Convolutional Gated Recurrent Unit, Deep Neural Network (CGDNN). In this paper, we present our experiments on the phoneme recognition task using the TIMIT data set. A phone error rate of 15.72% has been reached using the proposed CGDNN model. The achieved result confirms the superiority of CGDNN over each of its baseline networks used alone, as well as over the CLDNN architecture.

3 citations
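
For readers unfamiliar with the CNN-then-RNN-then-DNN stacking pattern behind CLDNN/CGDNN, here is a minimal PyTorch sketch of such an architecture; the layer sizes, the bidirectional GRU, and the 61-phone output are illustrative assumptions rather than the CGDNN configuration reported in the paper.

```python
# Illustrative PyTorch sketch of a CNN -> GRU -> DNN stack in the spirit of
# CLDNN/CGDNN. Hyperparameters are illustrative choices, not the paper's.
import torch
import torch.nn as nn

class CGDNNSketch(nn.Module):
    def __init__(self, n_feats=40, n_phones=61):
        super().__init__()
        # convolution over (time, frequency) patches of the filterbank features
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=(3, 3), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),   # pool along frequency only
        )
        self.gru = nn.GRU(input_size=32 * (n_feats // 2), hidden_size=256,
                          num_layers=2, batch_first=True, bidirectional=True)
        self.dnn = nn.Sequential(
            nn.Linear(2 * 256, 512), nn.ReLU(),
            nn.Linear(512, n_phones),           # per-frame phone logits
        )

    def forward(self, x):                        # x: (batch, time, n_feats)
        b, t, f = x.shape
        h = self.cnn(x.unsqueeze(1))             # (batch, 32, time, n_feats // 2)
        h = h.permute(0, 2, 1, 3).reshape(b, t, -1)
        h, _ = self.gru(h)
        return self.dnn(h)                       # (batch, time, n_phones)

logits = CGDNNSketch()(torch.randn(4, 100, 40))
print(logits.shape)                              # torch.Size([4, 100, 61])
```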

Proceedings Article
13 May 2002
TL;DR: Two measures, phoneme error rate (PER) and phoneme confidence score (PCS), are investigated; experiments show that both can help identify where degradation from noise occurs and give a useful indication of how an NM algorithm may impact ASR performance.
Abstract: A common approach to measuring the impact of noise and the effectiveness of noise mitigation (NM) algorithms for Automatic Speech Recognition (ASR) systems is to compare the word error rates (WERs). However, the WER measure does not give much insight into how an NM algorithm affects phoneme-level acoustic characteristics. Such insight can help in tuning the NM parameters and may also lead to reduced research time because the impact of an NM algorithm on ASR can first be investigated on smaller corpora. In this paper, two measures, phoneme error rate (PER) and phoneme confidence score (PCS), are investigated to assess the impact of NM algorithms on the ASR performance. Experimental results using the TIMIT corpus show that both PER and PCS can help identify where the degradation from noise occurs as well as give a useful indication of how an NM algorithm may impact ASR performance. A diagnostic method based on these two measures is also proposed to assess the NM impact on ASR and help improve the NM algorithm performance.

3 citations
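
Phoneme error rate is computed like word error rate, but over phone sequences: the Levenshtein (edit) distance between the reference and hypothesis phone strings, divided by the reference length. The sketch below shows that computation with toy TIMIT-style labels; the phoneme confidence score (PCS) from the paper is not reproduced.

```python
# Generic sketch of phoneme error rate (PER): edit distance between the
# reference and hypothesis phone sequences, divided by the reference length.
def phoneme_error_rate(reference, hypothesis):
    ref, hyp = list(reference), list(hypothesis)
    # dynamic-programming edit distance (substitutions, insertions, deletions)
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Toy example with TIMIT-style phone labels
ref = ["sh", "ix", "hh", "eh", "dcl", "jh", "ih", "dcl"]
hyp = ["sh", "ix", "hh", "ah", "jh", "ih", "dcl", "d"]
print(f"PER = {phoneme_error_rate(ref, hyp):.2%}")   # 1 sub + 1 del + 1 ins
```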


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 76% related
Feature (machine learning): 33.9K papers, 798.7K citations, 75% related
Feature vector: 48.8K papers, 954.4K citations, 74% related
Natural language: 31.1K papers, 806.8K citations, 73% related
Deep learning: 79.8K papers, 2.1M citations, 72% related
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95