Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications on this topic have received 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
13 Nov 2014
TL;DR: A novel class-dependent extension of two-dimensional linear discriminant analysis (2DLDA), applied to automatic speech recognition using a two-pass recognition strategy, which markedly outperforms the 2DLDA method.
Abstract: In this paper, we introduce a novel class-dependent extension of two-dimensional linear discriminant analysis (2DLDA), named CD-2DLDA, applied to automatic speech recognition using a two-pass recognition strategy. In the first pass, the class labels of the test samples are obtained using baseline recognition. The labels are then used in the CD transformation of the test features. In the second pass, the previously transformed test samples are recognized using the CD-2DLDA acoustic model. The novelty of the paper lies in improving the existing 2DLDA algorithm by modifying it to use more precise, class-dependent estimates computed separately for each class. The proposed approach is evaluated in several scenarios using the TIMIT corpus in a phoneme-based continuous speech recognition task. CD-2DLDA features are compared to state-of-the-art MFCCs, conventional LDA and 2DLDA features. The experimental results show that our method performs better than MFCCs and LDA. Furthermore, the results confirm that CD-2DLDA markedly outperforms the 2DLDA method.

7 citations

Journal Article
TL;DR: An adaptive network fuzzy inference system (ANFIS) is applied to phoneme recognition, with supervised learning on the TIMIT speech database after pre-processing the acoustic signal and extracting the MFCC coefficients relevant to the recognition system.
Abstract: Fuzzy modeling requires two main steps: structure identification and parameter optimization. The first determines the number of membership functions and fuzzy if-then rules, while the second identifies a feasible set of parameters under the given structure. However, as the input dimension increases, the number of rules grows exponentially, causing the problem of “rule disaster”. In this paper, we apply an adaptive network fuzzy inference system (ANFIS) to phoneme recognition. Supervised learning is performed on the TIMIT speech database, after pre-processing the acoustic signal and extracting the MFCC coefficients relevant to the recognition system. The network structure is first learned by subtractive clustering, in order to define an optimal structure and obtain a small number of rules; the network parameters are then learned by hybrid learning, which combines gradient descent and least squares estimation (LSE) to find a feasible set of antecedent and consequent parameters. The results show the effectiveness of the method in terms of recognition rate and number of fuzzy rules generated.

7 citations

Proceedings Article
01 Nov 2017
TL;DR: An effort to build a TIMIT-like corpus in Standard Chinese, demonstrating that males in this dataset speak faster but pause more frequently and longer, and have shorter phones/tones but more and longer utterance internal silences than females.
Abstract: This paper describes an effort to build a TIMIT-like corpus in Standard Chinese, which is part of our "Global TIMIT" project. Three steps are involved and detailed in the paper: selection of sentences; speaker recruitment and recording; and phonetic segmentation. The corpus consists of 6000 sentences read by 50 speakers (25 females and 25 males). Phonetic segmentation obtained from forced alignment is provided, which has 93.2% agreement (of phone boundaries) within 20 ms compared to manual segmentation on 50 randomly selected sentences. Statistics on the number of tokens and mean duration of phones and tones in the corpus are also reported. Males have shorter phones/tones but more and longer utterance internal silences than females, demonstrating that males in this dataset speak faster but pause more frequently and longer.

7 citations
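
The 93.2% figure in the abstract above is a boundary-agreement score: the share of phone boundaries from forced alignment that fall within 20 ms of the manually placed boundaries. A minimal sketch of that kind of metric follows, assuming both segmentations cover the same phone sequence so that boundaries can be paired by index. The function name `boundary_agreement` and the sample boundary times are hypothetical and are not taken from the paper.

```python
def boundary_agreement(manual_ms, forced_ms, tolerance_ms=20):
    """Fraction of paired boundaries that differ by at most tolerance_ms.

    Assumes both lists describe the same phone sequence, so boundary i in
    one list corresponds to boundary i in the other (an assumption of this
    sketch, not a detail confirmed by the paper).
    """
    if len(manual_ms) != len(forced_ms):
        raise ValueError("boundary lists must have the same length")
    if not manual_ms:
        raise ValueError("at least one boundary is required")
    within = sum(
        1 for m, f in zip(manual_ms, forced_ms) if abs(m - f) <= tolerance_ms
    )
    return within / len(manual_ms)


# Made-up boundary times in milliseconds, for illustration only.
manual = [100, 250, 410, 600]
forced = [110, 275, 405, 600]
print(boundary_agreement(manual, forced))  # 0.75
```

Working in integer milliseconds avoids floating-point edge cases when a difference lands exactly on the tolerance.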

Proceedings Article
01 Jan 2013
TL;DR: Two approaches to improve the performance of supervised ISA are proposed, and the effect of applying a linear discriminant technique in the intrinsic subspace, compared with the extrinsic one, is examined.
Abstract: Intrinsic Spectral Analysis (ISA) has been formulated within a manifold learning setting, allowing natural extensions to out-of-sample data together with feature reduction in a learning framework. In this paper, we propose two approaches to improve the performance of supervised ISA, and then we examine the effect of applying a linear discriminant technique in the intrinsic subspace compared with the extrinsic one. In the interest of reducing complexity, we propose a preprocessing operation to find a small subset of data points that is well representative of the manifold structure; this is accomplished by maximizing the quadratic Renyi entropy. Furthermore, we use class-based graphs, which not only simplify our problem but can also be helpful in a classification task. Experimental results for a phone classification task on the TIMIT dataset showed that ISA features improve performance compared with traditional features, and that supervised discriminant techniques perform better in the ISA subspace than in conventional feature spaces.

7 citations
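
The abstract above selects a representative subset of data points by maximizing the quadratic Renyi entropy, H2 = -log ∫ p(x)^2 dx. The sketch below shows one common way to estimate that quantity from samples, using a Gaussian Parzen window. The estimator, the function name `quadratic_renyi_entropy`, and the `sigma` parameter are assumptions of this sketch; the abstract does not state which estimator or selection procedure the authors use.

```python
import math


def quadratic_renyi_entropy(points, sigma=1.0):
    """Parzen-window estimate of the quadratic Renyi entropy of a sample.

    With Gaussian kernels of width sigma, the integral of the squared
    density estimate reduces to a double sum of Gaussians of variance
    2 * sigma**2 evaluated at pairwise differences.
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")
    dim = len(points[0])
    # Normalising constant of a dim-dimensional Gaussian with variance 2*sigma^2.
    norm = (4.0 * math.pi * sigma ** 2) ** (-dim / 2.0)
    total = 0.0
    for a in points:
        for b in points:
            sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
            total += norm * math.exp(-sq_dist / (4.0 * sigma ** 2))
    return -math.log(total / n ** 2)


# Made-up 2-D points: a spread-out subset scores higher than a clustered one.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
spread = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
print(quadratic_renyi_entropy(spread) > quadratic_renyi_entropy(clustered))  # True
```

A subset that maximizes this estimate tends to cover the data evenly, which is consistent with the abstract's goal of a small subset that represents the manifold structure. The double loop is quadratic in the number of points, so it only suits small candidate subsets.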


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 76% related
Feature (machine learning): 33.9K papers, 798.7K citations, 75% related
Feature vector: 48.8K papers, 954.4K citations, 74% related
Natural language: 31.1K papers, 806.8K citations, 73% related
Deep learning: 79.8K papers, 2.1M citations, 72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95