Topic

TIMIT

About: TIMIT is a research topic. To date, 1,401 publications have appeared within this topic, receiving 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Proceedings Article
14 May 2006
TL;DR: A generalized feature projection scheme in which each feature dimension can be classified into a set of 1 to M classes, where M is the total number of classes, allowing a better trade-off between number of parameters and model complexity.
Abstract: This paper presents a generalized feature projection scheme which allows each feature dimension to be classified in a set of 1 to M classes, where M is the total number of classes. Our method is an extension of the classical full-space null-space approach, where each dimension can only be classified in either M classes or 1 class. We believe that this more general formulation allows for a better trade-off between number of parameters and model complexity, which in turn should provide better classification. We first tested the proposed scheme, GLDA, on TIMIT and obtained an improvement of up to 1% in phone classification rate over the best HLDA classifier. Preliminary results on Wall Street Journal 20K also show an improvement over the best HLDA system of about 0.2% absolute.

1 citation

Journal Article
TL;DR: The evaluation results demonstrate that the proposed feature extraction method outperforms classic methods such as Perceptual Linear Prediction, Linear Predictive Coding, Linear Prediction Cepstral Coefficients and Mel Frequency Cepstral Coefficients.
Abstract: In this paper, a new method is presented to extract robust speech features in the presence of external noise. The proposed method, based on two-dimensional Gabor filters, takes into account the spectro-temporal modulation frequencies and also limits redundancy at the feature level. The performance of the proposed feature extraction method was evaluated on isolated spoken words extracted from the TIMIT corpus and corrupted by background noise. The evaluation results demonstrate that the proposed feature extraction method outperforms classic methods such as Perceptual Linear Prediction, Linear Predictive Coding, Linear Prediction Cepstral Coefficients and Mel Frequency Cepstral Coefficients.

1 citation
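
The two-dimensional Gabor filtering mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the Gaussian widths (one sixth of the kernel size), the zero-padding at the edges, and the mean removal are all assumptions made for illustration. The kernel is a Gaussian envelope multiplied by a cosine carrier whose two frequencies select a spectral and a temporal modulation rate.

```python
import math

def gabor_kernel_2d(size_f, size_t, omega_f, omega_t):
    """Real 2D Gabor kernel over (frequency channel, time frame).

    omega_f, omega_t: modulation frequencies in radians per channel and
    radians per frame (hypothetical parameter names).
    """
    cf, ct = (size_f - 1) / 2.0, (size_t - 1) / 2.0
    sigma_f, sigma_t = size_f / 6.0, size_t / 6.0  # assumed envelope widths
    kernel = []
    for i in range(size_f):
        row = []
        for j in range(size_t):
            f, t = i - cf, j - ct
            envelope = math.exp(-0.5 * ((f / sigma_f) ** 2 + (t / sigma_t) ** 2))
            row.append(envelope * math.cos(omega_f * f + omega_t * t))
        kernel.append(row)
    # Subtract the mean so a flat spectrogram region gives zero response.
    mean = sum(map(sum, kernel)) / (size_f * size_t)
    return [[v - mean for v in row] for row in kernel]

def filter_spectrogram(spec, kernel):
    """Same-size 2D correlation of spec (channels x frames) with kernel.

    Values outside the spectrogram are treated as zero.
    """
    n_f, n_t = len(spec), len(spec[0])
    k_f, k_t = len(kernel), len(kernel[0])
    off_f, off_t = k_f // 2, k_t // 2
    out = [[0.0] * n_t for _ in range(n_f)]
    for i in range(n_f):
        for j in range(n_t):
            acc = 0.0
            for a in range(k_f):
                for b in range(k_t):
                    y, x = i + a - off_f, j + b - off_t
                    if 0 <= y < n_f and 0 <= x < n_t:
                        acc += spec[y][x] * kernel[a][b]
            out[i][j] = acc
    return out
```

A feature extractor of this kind would typically apply a bank of such kernels with different `omega_f` and `omega_t` values to a log-mel spectrogram and subsample the outputs to limit redundancy; the specific filter bank used in the paper is not given here.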

Book Chapter
22 Oct 2010
TL;DR: This paper presents a feature selection method based on ant colony optimization (ACO), compares it with a genetic algorithm on the task of feature selection on the TIMIT corpus, and indicates that with the optimized feature set, the performance of the ASV system is improved.
Abstract: With the growing trend toward remote security verification procedures for telephone banking, biometric security measures and similar applications, automatic speaker verification (ASV) has received a lot of attention in recent years. The complexity of an ASV system and its verification time depend on the number of feature vectors, their dimensionality, the complexity of the speaker models and the number of speakers. In this paper, we concentrate on optimizing the dimensionality of the feature space by selecting relevant features. We present another method, based on ant colony optimization (ACO). The performance of the proposed algorithm is compared to the performance of a genetic algorithm on the task of feature selection on the TIMIT corpus. The results of experiments indicate that with the optimized feature set, the performance of the ASV system is improved.

1 citation

Proceedings Article
20 Feb 2020
TL;DR: The results clearly show a significant difference in perceptual quality score between female and male speech signals, which demonstrates another reliability issue of PESQ as a perceptual quality measure for speech signals in mobile communications.
Abstract: Perceptual evaluation of speech signals is a crucial measure of quality of service in mobile speech communication. Several subjective and objective quality measures are used to evaluate the perceptual quality of speech signals. Perceptual Evaluation of Speech Quality (PESQ) has been found to reliably predict the quality of processed speech signals, with a high correlation with the perceived quality. However, some studies have shown issues with the PESQ measure in specific environments or for specific speech signals. This paper investigates the effect of speaker gender on the PESQ measure of the perceptual quality of GSM Full Rate (GSM-FR) encoded speech signals. A MATLAB experiment is carried out to encode 350 speech files from the TIMIT corpus using the GSM-FR vocoder and calculate the PESQ scores. The results clearly show a significant difference in perceptual quality score between female and male speech signals, which demonstrates another reliability issue of PESQ as a perceptual quality measure for speech signals in mobile communications.

1 citation
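
One common way to check whether two groups of scores differ, as in the female versus male PESQ comparison above, is Welch's unequal-variance t-test. The abstract does not say which statistical test the authors used, so the sketch below is an assumption for illustration only, and `welch_t` is a hypothetical helper name. It returns the t statistic and the Welch-Satterthwaite degrees of freedom for two lists of scores.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom.

    a, b: lists of per-file scores for the two groups (each with at
    least two values). No score data is bundled here; callers supply it.
    """
    # Squared standard errors of the two sample means.
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df
```

Calling `welch_t(female_scores, male_scores)` on two lists of PESQ values gives a statistic that can be compared against a t distribution with `df` degrees of freedom; a large absolute value of `t` indicates that the group means differ by more than the within-group spread would explain.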

Proceedings Article
01 Aug 2018
TL;DR: Instead of taking the whole utterance as a sequence, the frame-level LSTM exploits the sequence information in each segment and yields a more precise segment-level speaking rate estimate.
Abstract: Speaking rate has applications in many domains such as speech recognition, speaker verification and emotion recognition. It conveys long-term information in speech and changes over time, so it can be seen as a kind of time sequence. This paper proposes a frame-level LSTM speaking rate estimation method. Instead of taking the whole utterance as a sequence, the frame-level LSTM exploits the sequence information in each segment and yields a more precise segment-level speaking rate estimate. We also evaluate the influence of fixed-length segmentation and voice activity detection (VAD) segmentation on speaking rate estimation. Results show that the proposed frame-level LSTM method yields a high correlation between the estimated speaking rate and the ground truth. It achieves a relative improvement of 13.0% over the state-of-the-art statistical learning method and 16.3% over support vector regression (SVR), evaluated on the same TIMIT corpus.

1 citation


Network Information
Related Topics (5)
Recurrent neural network
29.2K papers, 890K citations
76% related
Feature (machine learning)
33.9K papers, 798.7K citations
75% related
Feature vector
48.8K papers, 954.4K citations
74% related
Natural language
31.1K papers, 806.8K citations
73% related
Deep learning
79.8K papers, 2.1M citations
72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95