
TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Book ChapterDOI
30 May 2002
TL;DR: A real-time wideband speech codec based on a wavelet packet methodology, which adapts the probability model of the quantized coefficients frame by frame with a competitive neural network to better model the speech characteristics of the current speaker.
Abstract: We developed a real-time wideband speech codec adopting a wavelet packet based methodology. The transform-domain coefficients were first quantized by means of a mid-tread uniform quantizer and then encoded with arithmetic coding. In the first step, the wavelet coefficients were quantized using a psycho-acoustic model. The second step was carried out by adapting the probability model of the quantized coefficients frame by frame by means of a competitive neural network. The neural network was trained on the TIMIT corpus, and its weights were updated in real time during compression to better model the speech characteristics of the current speaker. The coding/decoding algorithm was first written in C and then optimised on the TMS320C6000 DSP platform.
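For reference, the mid-tread uniform quantizer used in the first coding step is simple to sketch. Below is a minimal NumPy version; the fixed step size is an illustrative assumption, since the paper derives quantization precision from a psycho-acoustic model.

```python
import numpy as np

def mid_tread_quantize(coeffs, step):
    """Mid-tread uniform quantizer: zero is a reconstruction level,
    so near-zero coefficients quantize exactly to 0."""
    return np.round(coeffs / step).astype(int)

def mid_tread_dequantize(indices, step):
    """Map quantizer indices back to the coefficient domain."""
    return indices * step

# Illustrative round trip on random stand-ins for wavelet coefficients
rng = np.random.default_rng(0)
coeffs = rng.normal(scale=0.5, size=16)
step = 0.1  # assumed; the paper sets precision psycho-acoustically
indices = mid_tread_quantize(coeffs, step)
rec = mid_tread_dequantize(indices, step)
assert np.max(np.abs(coeffs - rec)) <= step / 2  # error bounded by step/2
```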

1 citations

Proceedings ArticleDOI
23 May 2022
TL;DR: diSpeech, a corpus of vowels synthesized with the Klatt synthesizer under controlled generative factors, is introduced to benchmark speech disentanglement methods and disentanglement metrics, and to enable objective evaluation of disentanglement on real speech such as TIMIT's isolated vowels.
Abstract: Recently, a growing interest in unsupervised learning of disentangled representations has been observed, with successful applications to both synthetic and real data. In speech processing, such methods have been able to disentangle speakers' attributes from verbal content. To better understand disentanglement, synthetic data is necessary, as it provides a controllable framework in which to train models and evaluate disentanglement. Thus, we introduce diSpeech, a corpus of speech synthesized with the Klatt synthesizer. Its first version is constrained to vowels synthesized with 5 generative factors relying on pitch and formants. Experiments show the ability of variational autoencoders to disentangle these generative factors and assess the reliability of disentanglement metrics. In addition to providing support for benchmarking speech disentanglement methods, diSpeech also enables the objective evaluation of disentanglement on real speech, which is to our knowledge unprecedented. To illustrate this methodology, we apply it to TIMIT's isolated vowels.
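As a rough illustration of the kind of formant-based synthesis behind diSpeech, here is a toy source-filter vowel generator: an impulse train at the pitch period fed through cascaded second-order resonators. This is not the Klatt synthesizer itself; all frequencies, bandwidths, and the gain normalisation are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def resonator(freq, bw, fs):
    """Two-pole resonator (the building block of Klatt-style cascade
    synthesis); freq and bw are centre frequency and bandwidth in Hz."""
    r = np.exp(-np.pi * bw / fs)
    theta = 2 * np.pi * freq / fs
    b = [1.0 - r]                      # rough gain scaling, illustrative
    a = [1.0, -2.0 * r * np.cos(theta), r * r]
    return b, a

def synth_vowel(f0, formants, bandwidths, fs=16000, dur=0.3):
    """Impulse-train source filtered through cascaded formant resonators."""
    n = int(dur * fs)
    source = np.zeros(n)
    source[::int(fs / f0)] = 1.0       # glottal impulse train at pitch f0
    out = source
    for f, bw in zip(formants, bandwidths):
        b, a = resonator(f, bw, fs)
        out = lfilter(b, a, out)
    return out / np.max(np.abs(out))

# An /a/-like vowel; formant values are textbook approximations
vowel = synth_vowel(f0=120, formants=[700, 1200, 2600],
                    bandwidths=[80, 90, 120])
```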

1 citations

Proceedings ArticleDOI
23 Sep 2020
TL;DR: It is shown that careful selection of traditional techniques can yield very satisfying EER values.
Abstract: The aim of this paper is to present research on a speaker verification system based on the Gaussian Mixture Model-Universal Background Model (GMM-UBM) approach. All tests were done on the TIMIT corpus. Performance for the standard Mel-Frequency Cepstral Coefficients (MFCC) and dynamic delta features is shown. The influence of feature dimensionality and model complexity on Equal Error Rate (EER) is presented. Additionally, the impact of Voice Activity Detection (VAD) and of normalization techniques such as Cepstral Mean and Variance Normalization (CMVN) and RelAtive SpecTrA (RASTA) filtering is covered. Each combination of factors was examined. It is shown that careful selection of traditional techniques can lead to very satisfying EER values.
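A minimal sketch of this kind of pipeline, assuming librosa and scikit-learn: MFCCs plus deltas with CMVN feed a diagonal-covariance UBM, and verification is scored as a log-likelihood ratio. The file paths and model sizes are placeholders, and independently trained speaker models stand in for the MAP adaptation of the UBM used in standard GMM-UBM systems.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def features(wav_path, sr=16000, n_mfcc=13):
    """MFCCs plus delta features, with cepstral mean and variance
    normalisation (CMVN) applied per utterance."""
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    feats = np.vstack([mfcc, librosa.feature.delta(mfcc)]).T
    return (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)

# Placeholder paths; in the paper, all data come from TIMIT
background_wavs = ["bg_spk1.wav", "bg_spk2.wav"]

# Universal Background Model over pooled background speakers
ubm = GaussianMixture(n_components=64, covariance_type="diag").fit(
    np.vstack([features(p) for p in background_wavs]))

# Target speaker model (standard systems MAP-adapt the UBM instead)
spk = GaussianMixture(n_components=64, covariance_type="diag").fit(
    features("target_enroll.wav"))

# Verification score: per-frame average log-likelihood ratio
test = features("test_utterance.wav")
score = spk.score(test) - ubm.score(test)  # accept if score > threshold
```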

1 citations

Proceedings ArticleDOI
23 Aug 2010
TL;DR: A rule-weight learning algorithm for fuzzy rule-based classifiers, with a variant that minimizes the sum of misclassification costs, considerably improves the prediction ability of the classifier.
Abstract: Our aim in this paper is to propose a rule-weight learning algorithm for fuzzy rule-based classifiers. The proposed algorithm is presented in two modes: in the first, all training examples are assumed to be equally important, and the algorithm attempts to minimize the error rate of the classifier on the training data by adjusting the weight of each fuzzy rule in the rule base; in the second, a weight is assigned to each training example as the cost of its misclassification, derived from the class distribution of its neighbors. Then, instead of minimizing the error rate, the learning algorithm is modified to minimize the sum of costs for misclassified examples. Using six data sets from the UCI-ML repository and the TIMIT speech corpus for frame-wise phone classification, we show that our proposed algorithm considerably improves the prediction ability of the classifier.
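The neighbor-based cost assignment in the second mode could look like the following sketch, where the cost of misclassifying an example is taken to be the fraction of its k nearest neighbors sharing its class label, so core examples of a class cost more to misclassify than likely-noisy boundary examples. This particular mapping is an illustrative assumption, not the paper's formula.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def example_costs(X, y, k=5):
    """Misclassification cost per training example, taken here as the
    fraction of its k nearest neighbours that share its class label."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)          # column 0 is the point itself
    neigh_labels = y[idx[:, 1:]]
    return (neigh_labels == y[:, None]).mean(axis=1)

# Synthetic demo: the modified learner minimises the summed cost of
# misclassified examples instead of the raw error rate
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] > 0).astype(int)
costs = example_costs(X, y)
y_pred = rng.integers(0, 2, size=100)  # stand-in classifier predictions
objective = costs[y_pred != y].sum()
```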

1 citations

Journal ArticleDOI
TL;DR: The method of Evers et al. (1998) for automatically distinguishing sibilant fricatives, based on the slope of regression lines over separate frequency ranges within a DFT spectrum, is tested on the TIMIT and Kiel corpora.
Abstract: Acoustic cues to the distinction between sibilant fricatives are claimed to be invariant across languages. Evers et al. (1998) present a method for distinguishing automatically between [s] and [ʃ], using the slope of regression lines over separate frequency ranges within a DFT spectrum. They report accuracy rates in excess of 90% for fricatives extracted from recordings of minimal pairs in English, Dutch and Bengali. These findings are broadly replicated by Maniwa et al. (2009), using VCV tokens recorded in the lab. We tested the algorithm from Evers et al. (1998) against tokens of fricatives extracted from the TIMIT corpus of American English read speech, and the Kiel corpora of German. We were able to achieve similar accuracy rates to those reported in previous studies, with the following caveats: (1) the measure relies on being able to perform a DFT for frequencies from 0 to 8 kHz, so that a minimum sampling rate of 16 kHz is necessary for it to be effective, and (2) although the measure draws a simila...
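A sketch of the slope measure being tested, assuming a single windowed fricative frame: fit least-squares lines to the log-magnitude spectrum over a low and a high band and compare their slopes. The 2.5 kHz split point is an illustrative assumption; the 8 kHz ceiling is what forces the ≥16 kHz sampling-rate caveat noted above.

```python
import numpy as np

def sibilant_slopes(frame, fs, split_hz=2500, fmax_hz=8000):
    """Regression-line slopes (dB per Hz) of the log-magnitude DFT
    spectrum over a low and a high band, in the spirit of Evers et
    al. (1998). Requires fs >= 2 * fmax_hz, i.e. at least 16 kHz."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    logmag = 20.0 * np.log10(spec + 1e-12)
    lo = (freqs > 0) & (freqs <= split_hz)
    hi = (freqs > split_hz) & (freqs <= fmax_hz)
    slope_lo = np.polyfit(freqs[lo], logmag[lo], 1)[0]
    slope_hi = np.polyfit(freqs[hi], logmag[hi], 1)[0]
    return slope_lo, slope_hi

# Demo on a noise frame; real input is a windowed fricative segment
fs = 16000
frame = np.random.default_rng(0).normal(size=1024)
lo_slope, hi_slope = sibilant_slopes(frame, fs)
```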

1 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations (76% related)
Feature (machine learning): 33.9K papers, 798.7K citations (75% related)
Feature vector: 48.8K papers, 954.4K citations (74% related)
Natural language: 31.1K papers, 806.8K citations (73% related)
Deep learning: 79.8K papers, 2.1M citations (72% related)
Performance Metrics
No. of papers in the topic in previous years:

Year   Papers
2023   24
2022   62
2021   67
2020   86
2019   77
2018   95