Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Journal ArticleDOI
TL;DR: This work explores the estimation of multiple physical parameters of a speaker from short durations of speech in a multilingual setting, and is the first attempt to use a common set of features for estimating the different physical traits of a speaker.
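The abstract is not reproduced here, so the actual feature set and estimators are unknown; the following is only a minimal sketch of the "common features, multiple traits" idea, assuming MFCC summary statistics as the shared features and simple ridge regressors per trait. The paths and labels (train_paths, y_height, y_age) are hypothetical placeholders.

```python
# Hedged sketch: one shared feature vector per short utterance feeding separate
# regressors for different physical traits. librosa/scikit-learn and all data
# names are assumptions, not the paper's actual pipeline.
import numpy as np
import librosa
from sklearn.linear_model import Ridge

def shared_features(wav_path, sr=16000):
    """Mean and std of MFCCs over a short utterance: one common feature vector."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# X: shared features; y_height / y_age: example physical traits (hypothetical data)
X = np.stack([shared_features(p) for p in train_paths])
height_model = Ridge().fit(X, y_height)   # one regressor per trait,
age_model = Ridge().fit(X, y_age)         # all driven by the same features
```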

19 citations

Proceedings ArticleDOI
02 Dec 1997
TL;DR: Presents results of an investigation into the effect of three common speech coding systems (CELP, LPC and GSM) on the pitch and formant frequencies of speech extracted from several dialect regions of the TIMIT Speech Corpus.
Abstract: The introduction of speech coding systems in the telephone network raises the question of their impact on formant frequencies, fundamental frequency trajectories and other acoustic features used for text-dependent speaker identification. This paper presents results of an investigation into the effect of three common speech coding systems (CELP, LPC and GSM) on the pitch and formant frequencies of speech extracted from several dialect regions of the TIMIT Speech Corpus. Voice pitch (F0) and formant frequencies (F1, F2, F3) extracted from time-aligned, uncoded and coded speech samples are compared to establish the statistical distribution of the error attributed to the coding system.
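The paper does not state its extraction tools here, so the sketch below only illustrates the measurement it describes: estimate F0 and formants from time-aligned uncoded and coded versions of the same utterance and look at the differences. librosa (pyin for F0, LPC-root formant estimation) and the file names are assumptions.

```python
# Hedged sketch of the coded-vs-uncoded comparison; not the paper's implementation.
import numpy as np
import librosa

def formants(frame, sr, order=12):
    """Estimate F1-F3 of one frame from the angles of LPC polynomial roots."""
    a = librosa.lpc(frame, order=order)
    roots = [r for r in np.roots(a) if np.imag(r) > 0]
    return sorted(np.angle(roots) * sr / (2 * np.pi))[:3]

# "uncoded.wav" / "coded.wav" are hypothetical time-aligned versions of one utterance
y_clean, sr = librosa.load("uncoded.wav", sr=None)
y_coded, _ = librosa.load("coded.wav", sr=sr)

# per-frame F0 difference attributable to the coder
f0_clean, _, _ = librosa.pyin(y_clean, fmin=60, fmax=400, sr=sr)
f0_coded, _, _ = librosa.pyin(y_coded, fmin=60, fmax=400, sr=sr)
print("mean |F0 error| (Hz):", np.nanmean(np.abs(f0_coded - f0_clean)))

# formant shift for one 25 ms frame taken at the same time point in both signals
i, n = 8000, int(0.025 * sr)
print("F1-F3 shift (Hz):",
      np.subtract(formants(y_coded[i:i+n], sr), formants(y_clean[i:i+n], sr)))
```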

19 citations

Journal ArticleDOI
TL;DR: In this paper, an adaptive windows convolutional neural network (AWCNN) is proposed to analyze joint temporal-spectral feature variation, which makes the model more robust against both intra- and inter-speaker variations.
Abstract: The hybrid convolutional neural network and hidden Markov model (CNN-HMM) has recently achieved considerable performance in speech recognition because deep neural networks model complex correlations between features. Automatic speech recognition (ASR), as an input to many intelligent and expert systems, has an impact in various fields such as evolving search engines (inclusion of speech recognition in search engines), the healthcare industry (medical reporting by medical personnel, and disease-diagnosis expert systems), service delivery, and communication with service providers (to establish the caller's demands and then direct them to the appropriate operator for assistance), among others. This paper introduces a method which further reduces the recognition error rate. We first propose an adaptive windows convolutional neural network (AWCNN) to analyze joint temporal-spectral feature variation. AWCNN makes the model more robust against both intra- and inter-speaker variations. We further propose a new residual learning scheme, which leads to better utilization of information in deep layers and provides better control over transferring input information. The proposed speech recognition system can be used as the vocal input for many artificial and expert systems. We evaluated the proposed method on the TIMIT, FARSDAT, Switchboard, and CallHome datasets and one image database, MNIST. The experimental results show that the proposed method reduces the absolute error rate by 7% compared with state-of-the-art methods in some speech recognition tasks.
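The abstract describes parallel convolutions over time-frequency features with different temporal window sizes plus a residual path, but not the exact AWCNN architecture. Below is only a minimal NumPy sketch of that multi-window idea; the kernel widths, single random filter per branch, and averaging are illustrative assumptions rather than the paper's design.

```python
# Hedged sketch of the multi-window + residual idea, not the AWCNN itself.
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
spectrogram = rng.standard_normal((40, 100))      # (mel bands, frames), dummy input

def branch(x, time_width):
    """One convolutional branch with its own temporal window width (conv + ReLU)."""
    kernel = rng.standard_normal((3, time_width)) / (3 * time_width)
    return np.maximum(convolve2d(x, kernel, mode="same"), 0.0)

# parallel branches with different temporal windows, then a residual connection
windows = (3, 7, 11)
merged = sum(branch(spectrogram, w) for w in windows) / len(windows)
output = merged + spectrogram        # residual path carries the input forward
print(output.shape)                  # (40, 100)
```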

19 citations

Proceedings Article
27 Sep 2018
TL;DR: In this paper, an augmented cyclic adversarial learning model is proposed to enforce the cycle-consistency constraint via an external task-specific model, which encourages the preservation of task-relevant content as opposed to exact reconstruction.
Abstract: Training a model to perform a task typically requires a large amount of data from the domains in which the task will be applied. However, it is often the case that data are abundant in some domains but scarce in others. Domain adaptation deals with the challenge of adapting a model trained on a data-rich source domain to perform well in a data-poor target domain. In general, this requires learning plausible mappings between domains. CycleGAN is a powerful framework that efficiently learns to map inputs from one domain to another using adversarial training and a cycle-consistency constraint. However, the conventional approach of enforcing cycle-consistency via reconstruction may be overly restrictive in cases where one or more domains have limited training data. In this paper, we propose an augmented cyclic adversarial learning model that enforces the cycle-consistency constraint via an external task-specific model, which encourages the preservation of task-relevant content as opposed to exact reconstruction. We explore digit classification in low-resource supervised, semi-supervised, and unsupervised settings, as well as a high-resource unsupervised setting. In the low-resource supervised setting, the results show that our approach improves absolute performance by 14% and 4% when adapting SVHN to MNIST and vice versa, respectively, outperforming unsupervised domain adaptation methods that require a high-resource unlabeled target domain. Moreover, using only a few unlabeled target samples, our approach can still outperform many high-resource unsupervised models. In the speech domain, we similarly use a speech recognition model from each domain as the task-specific model. Our approach improves the absolute performance of speech recognition by 2% for female speakers in the TIMIT dataset, where the majority of training samples are from male voices.
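The core substitution the abstract describes can be sketched in a few lines: rather than penalizing the L1 distance between an input and its A -> B -> A round trip, the round trip is scored by a frozen task model. The functions below are hypothetical stand-ins for the generators and task model, not the paper's implementation.

```python
# Hedged sketch of reconstruction-based vs. task-based cycle-consistency losses.
import numpy as np

def reconstruction_cycle_loss(x, g_ab, g_ba):
    """Conventional CycleGAN: L1 distance between x and its round trip."""
    return np.mean(np.abs(g_ba(g_ab(x)) - x))

def task_cycle_loss(x, y, g_ab, g_ba, task_model, task_loss):
    """Augmented cycle loss: the round trip only has to preserve what the
    task model needs to predict the label y (task-relevant content)."""
    return task_loss(task_model(g_ba(g_ab(x))), y)
```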

19 citations

Journal ArticleDOI
TL;DR: This paper demonstrates the application of the Laplacian eigenmaps latent variable model (LELVM) to the task of speech recognition and suggests the superiority of the proposed method over standard PCA-based methods.
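Only the TL;DR is shown, so the comparison can be illustrated rather than reproduced: the sketch below contrasts a Laplacian-eigenmaps-style embedding (scikit-learn's SpectralEmbedding, used here as a stand-in for the LELVM, which additionally provides out-of-sample mappings) with PCA on acoustic feature vectors; the data is a dummy placeholder.

```python
# Hedged sketch: nonlinear (Laplacian eigenmaps) vs. linear (PCA) reduction of
# per-frame acoustic features; not the paper's LELVM implementation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)
frames = rng.standard_normal((500, 39))          # e.g. 39-dim MFCC+delta features

linear_2d = PCA(n_components=2).fit_transform(frames)
manifold_2d = SpectralEmbedding(n_components=2, n_neighbors=10).fit_transform(frames)
print(linear_2d.shape, manifold_2d.shape)        # (500, 2) (500, 2)
```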

19 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 76% related
Feature (machine learning): 33.9K papers, 798.7K citations, 75% related
Feature vector: 48.8K papers, 954.4K citations, 74% related
Natural language: 31.1K papers, 806.8K citations, 73% related
Deep learning: 79.8K papers, 2.1M citations, 72% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95