scispace - formally typeset
Search or ask a question
Topic

TIMIT

About: TIMIT is a research topic. Over the lifetime, 1401 publications have been published within this topic receiving 59888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
More filters
Journal ArticleDOI
TL;DR: A new phoneme classifier is proposed consisting of a modular arrangement of experts, with one expert assigned to each BPG and focused on discriminating between phonemes within that BPG.
Abstract: In phoneme recognition experiments, it was found that approximately 75% of misclassified frames were assigned labels within the same broad phonetic group (BPG). While the phoneme can be described as the smallest distinguishable unit of speech, phonemes within BPGs contain very similar characteristics and can be easily confused. However, different BPGs, such as vowels and stops, possess very different spectral and temporal characteristics. In order to accommodate the full range of phonemes, acoustic models of speech recognition systems calculate input features from all frequencies over a large temporal context window. A new phoneme classifier is proposed consisting of a modular arrangement of experts, with one expert assigned to each BPG and focused on discriminating between phonemes within that BPG. Due to the different temporal and spectral structure of each BPG, novel feature sets are extracted using mutual information, to select a relevant time-frequency (TF) feature set for each expert. To construct a phone recognition system, the output of each expert is combined with a baseline classifier under the guidance of a separate BPG detector. Considering phoneme recognition experiments using the TIMIT continuous speech corpus, the proposed architecture afforded significant error rate reductions up to 5% relative

61 citations

Proceedings ArticleDOI
12 May 2008
TL;DR: A novel speech feature analysis technique based on localized spectro- temporal cepstral analysis of speech that is more robust to noise, and better capture temporal modulations important for recognizing plosive sounds is presented.
Abstract: Drawing on recent progress in auditory neuroscience, we present a novel speech feature analysis technique based on localized spectro- temporal cepstral analysis of speech. We proceed by extracting localized 2D patches from the spectrogram and project onto a 2D discrete cosine (2D-DCT) basis. For each time frame, a speech feature vector is then formed by concatenating low-order 2D- DCT coefficients from the set of corresponding patches. We argue that our framework has significant advantages over standard one- dimensional MFCC features. In particular, we find that our features are more robust to noise, and better capture temporal modulations important for recognizing plosive sounds. We evaluate the performance of the proposed features on a TIMIT classification task in clean, pink, and babble noise conditions, and show that our feature analysis outperforms traditional features based on MFCCs.

59 citations

Proceedings ArticleDOI
06 Dec 2010
TL;DR: The results obtained show that approach PSO-SVM gives a better classification in terms of accuracy even though the execution time is increased, which makes it possible to optimize the performance of classifier SVM (Separating with Vast Margin).
Abstract: In many problems of classification, the performances of a classifier are often evaluated by a factor (rate of error).the factor is not well adapted for the complex real problems, in particular the problems multiclass. Our contribution consists in adapting an evolutionary method for optimization of this factor. Among the methods of optimization used we chose the method PSO (Particle Swarm Optimization) which makes it possible to optimize the performance of classifier SVM (Separating with Vast Margin). The experiments are carried out on corpus TIMIT. The results obtained show that approach PSO-SVM gives a better classification in terms of accuracy even though the execution time is increased.

58 citations

Proceedings ArticleDOI
27 Apr 1993
TL;DR: Research on speaker-independent continuous phone recognition for both French and English is presented and it is found that French is easier to recognize at the phone level (the phone error for BREF is 23.6% vs. 30.1% for WSJ), but harder to recognizing at the lexical level due to the larger number of homophones.
Abstract: Research on speaker-independent continuous phone recognition for both French and English is presented. The phone accuracy is assessed on the BREF corpus for French, and on the Wall Street Journal (WSJ) and TIMIT corpora for English. Cross-language differences concerning language properties are presented. It is found that French is easier to recognize at the phone level (the phone error for BREF is 23.6% vs. 30.1% for WSJ), but harder to recognize at the lexical level due to the larger number of homophones. Experiments with signal analysis indicate that a 4 kHz signal bandwidth is sufficient for French, whereas 8 kHz is needed for English. Phone recognition is a powerful technique for language, sex, and speaker identification. With 2 s of speech, the language can be identified with better than 99% accuracy. Sex-identification for BREF and WSJ is error-free. Speaker identification accuracies of 98.2% on TIMIT (462 speakers) and 99.1% on BREF (57 speakers) were obtained with one utterance per speaker. 100% accuracies were obtained with two utterances per speaker. >

58 citations

Journal ArticleDOI
TL;DR: A novel technique is proposed that exploits LSTM in combination with Connectionist Temporal Classification in order to improve performance by using a self-learned amount of contextual information.

58 citations


Network Information
Related Topics (5)
Recurrent neural network
29.2K papers, 890K citations
76% related
Feature (machine learning)
33.9K papers, 798.7K citations
75% related
Feature vector
48.8K papers, 954.4K citations
74% related
Natural language
31.1K papers, 806.8K citations
73% related
Deep learning
79.8K papers, 2.1M citations
72% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202324
202262
202167
202086
201977
201895