Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have appeared within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Proceedings Article
08 Dec 2009
TL;DR: A new method is proposed to determine the threshold value based on the symmetric Kullback-Leibler divergence between the probability distributions of noisy speech and noise wavelet coefficients using segmental SNR.
Abstract: The performance of wavelet thresholding methods for speech enhancement depends on estimating an accurate threshold value in the wavelet sub-bands. In this paper, we propose a new method for estimating the threshold value more accurately. We propose to determine the threshold value based on the symmetric Kullback-Leibler divergence between the probability distributions of noisy speech and noise wavelet coefficients. In the next step, we improve this value using segmental SNR. We use some TIMIT utterances to assess the performance of the proposed threshold. The algorithm is evaluated using the PESQ score and the SNR improvement. On average, we obtain a 2 dB SNR improvement and a PESQ score increase of up to 0.7 in comparison to conventional wavelet thresholding approaches.

20 citations
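
The symmetric Kullback-Leibler divergence mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the two distributions are given as histograms over the same bins and uses the unscaled form D(p||q) + D(q||p); some authors include a factor of 1/2. The function name `symmetric_kl` and the `eps` guard are illustrative choices, and the paper's mapping from divergence to threshold is not reproduced here.

```python
import math


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence D(p||q) + D(q||p) for two discrete distributions.

    p and q may be raw histogram counts over the same bins; they are
    normalized to sum to 1 before the divergence is computed.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("each distribution must have positive total mass")
    total = 0.0
    for count_p, count_q in zip(p, q):
        # eps (an assumption of this sketch) avoids log(0) for empty bins.
        pi = max(count_p / sum_p, eps)
        qi = max(count_q / sum_q, eps)
        # p*log(p/q) + q*log(q/p) simplifies to (p - q)*log(p/q).
        total += (pi - qi) * math.log(pi / qi)
    return total


# Hypothetical histograms of noisy-speech and noise wavelet coefficients.
noisy_hist = [5, 20, 50, 20, 5]
noise_hist = [1, 10, 78, 10, 1]
divergence = symmetric_kl(noisy_hist, noise_hist)
```

A value near zero indicates that the two coefficient distributions are nearly identical; larger values indicate that they differ more.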

Proceedings Article
08 May 2019
TL;DR: A unified DNN architecture to predict both height and age of a speaker for short durations of speech is proposed, and a novel initialization scheme for the deep neural architecture is introduced, that avoids the requirement for a large training dataset.
Abstract: Automatic height and age prediction of a speaker has a wide variety of applications in speaker profiling, forensics, etc. Often in such applications only a few seconds of speech data are available to reliably estimate the speaker parameters. Traditionally, age and height were predicted separately using different estimation algorithms. In this work, we propose a unified DNN architecture to predict both height and age of a speaker for short durations of speech. A novel initialization scheme for the deep neural architecture is introduced that avoids the requirement for a large training dataset. We evaluate the system on the TIMIT dataset, where the mean duration of speech segments is around 2.5 s. The DNN system is able to improve the age RMSE by at least 0.6 years as compared to a conventional support vector regression system trained on Gaussian Mixture Model mean supervectors. The system achieves RMSEs of 6.85 cm and 6.29 cm for male and female height prediction, respectively. For age estimation, the RMSEs are 7.60 and 8.63 years for male and female speakers, respectively. Analysis of shorter speech segments reveals that even with 1 second of speech input the performance degradation is at most 3% compared to the full-duration speech files.

20 citations
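
The RMSE figures quoted in the abstract above follow the standard root-mean-square error definition, sketched below for reference. The function name `rmse` and the sample heights are made up for illustration; they are not data from the paper.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predicted and true speaker heights in centimetres.
predicted_cm = [172.0, 181.0, 165.0]
actual_cm = [170.0, 185.0, 165.0]
height_rmse = rmse(predicted_cm, actual_cm)
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.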

Journal Article
TL;DR: A new deep architecture is proposed that combines two heterogeneous classification techniques, CNNs and support vector machines (SVMs); it improves the result by 13.33% and 2.31% over a baseline CNN and segmental recurrent neural networks, respectively.
Abstract: Convolutional neural networks (CNNs) have demonstrated state-of-the-art performance on automatic speech recognition. Most CNNs employ a softmax activation function for prediction and minimize the cross-entropy loss. This paper proposes a new deep architecture that combines two heterogeneous classification techniques, CNNs and support vector machines (SVMs). In the proposed model, features are learned using convolution layers and classified by SVMs. The last layer of the CNN, i.e. the softmax layer, is replaced by SVMs to deal efficiently with high-dimensional features. This model should be interpreted as a special form of structured SVM and is named the convolutional support vector machine (CSVM). Instead of training each component separately, the parameters of the CNN and SVMs are jointly trained using frame-level max-margin, sequence-level max-margin, and state-level minimum Bayes risk criteria. The performance of CSVM is evaluated on the TIMIT and Wall Street Journal datasets for phone recognition. By incorporating the features of both CNNs and SVMs, CSVM improves the result by 13.33% and 2.31% over a baseline CNN and segmental recurrent neural networks, respectively.

20 citations

Journal Article
Dong Yu1, Li Deng1, Alex Acero1
TL;DR: Improved likelihood score computation in the HTM and a novel A*-based time-asynchronous lattice-constrained decoding algorithm for HTM evaluation are described, and the new search algorithm is shown to improve recognition accuracy on recognition lattices over the traditional N-best rescoring paradigm.

20 citations

Proceedings Article
01 Jul 2017
TL;DR: In this article, an unsupervised algorithm based on sequence prediction models such as Markov chains and recurrent neural networks is proposed for phonemic segmentation of speech, which consists in analyzing the error profile of a model trained to predict speech features frame-by-frame.
Abstract: Phonemic segmentation of speech is a critical step in speech recognition systems. We propose a novel unsupervised algorithm based on sequence prediction models such as Markov chains and recurrent neural networks. Our approach consists in analyzing the error profile of a model trained to predict speech features frame by frame. Specifically, we try to learn the dynamics of speech in the MFCC space and hypothesize boundaries from local maxima in the prediction error. We evaluate our system on the TIMIT dataset, with improvements over similar methods.

20 citations
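
The boundary-hypothesis step described in the abstract above (local maxima in the frame-wise prediction error) might be sketched as follows. This is only a simple peak picker under stated assumptions: the per-frame errors are taken as already computed by some predictive model, and the function name `boundaries_from_error` and the `threshold` parameter are hypothetical additions, not details from the paper.

```python
def boundaries_from_error(errors, threshold=0.0):
    """Return frame indices where the prediction error peaks above threshold.

    A frame counts as a peak when its error is strictly greater than the
    previous frame's and at least as large as the next frame's, so only
    the first frame of a flat plateau is reported.
    """
    peaks = []
    for t in range(1, len(errors) - 1):
        rises = errors[t] > errors[t - 1]
        does_not_rise_after = errors[t] >= errors[t + 1]
        if rises and does_not_rise_after and errors[t] > threshold:
            peaks.append(t)
    return peaks


# Hypothetical per-frame prediction errors from a sequence model.
frame_errors = [0.1, 0.9, 0.2, 0.3, 0.25, 0.8, 0.1]
hypothesized = boundaries_from_error(frame_errors, threshold=0.5)
```

With a threshold of 0.5, the small bump at frame 3 is ignored and only the two large peaks are kept as candidate phone boundaries.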


Network Information
Related Topics (5)
Recurrent neural network
29.2K papers, 890K citations
76% related
Feature (machine learning)
33.9K papers, 798.7K citations
75% related
Feature vector
48.8K papers, 954.4K citations
74% related
Natural language
31.1K papers, 806.8K citations
73% related
Deep learning
79.8K papers, 2.1M citations
72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95