TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Proceedings Article
06 Sep 2009
TL;DR: A novel phonetic filter based on the active articulator is introduced; it has higher recall than previous filters. The system can process any language transcribed in IPA and is currently being used to assist the phonemic analysis of unwritten languages.
Abstract: Phonemic analysis, the process of identifying the contrastive sounds in a language, involves finding allophones, the phonetic variants of those contrastive sounds. An algorithm for finding allophones (developed by Peperkamp et al.) is evaluated on consonants in the TIMIT acoustic-phonetic transcripts. A novel phonetic filter based on the active articulator is introduced and has a higher recall than previous filters. The combined retrieval performance, measured by area under the ROC curve, is 83%. The implemented system can process any language transcribed in IPA and is currently being used to assist the phonemic analysis of unwritten languages.

7 citations
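
The entry above reports retrieval performance as area under the ROC curve. As a rough illustration of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive item is scored above a randomly chosen negative one. The function name, scores, and labels are hypothetical and are not taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive outscores a random
    negative; ties between a positive and a negative count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for candidate allophone pairs (1 = true allophones).
    example_scores = [0.9, 0.4, 0.6, 0.1]
    example_labels = [1, 1, 0, 0]
    print(roc_auc(example_scores, example_labels))
```

The pairwise form is quadratic in the number of items, which is fine for small candidate sets; a rank-based formulation gives the same value for larger ones.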

Journal Article
TL;DR: Speech enhancement and speaker verification experiments showed that the proposed ImNMF effectively enhances speech in the noise environment of electric vehicles and also reduces the equal error rate of the speaker verification system.
Abstract: Speech-based human–machine interaction (HMI) is essential to electronic navigation, autonomous cars, and intelligent vehicles. Noise generated by mechanical motion or electric power equipment degrades speech quality and prevents HMI from working effectively. However, there is relatively little literature on speech enhancement under electric-vehicle noise conditions. This paper presents a speech enhancement method based on improved nonnegative matrix factorization (ImNMF). Traditional nonnegative matrix factorization (NMF) trains its speech dictionary on speech recorded in advance, which inevitably contains a small noise component; ImNMF instead generates the speech dictionary from the spectra of pitches and their harmonics using a mathematical model, in order to guarantee the purity of the speech dictionary. In addition, to alleviate the loss of information from the noise sample, ImNMF constructs the noise dictionary from a combination of gain-adjusted spectrum frames of noise samples separated online. Compared with traditional NMF, the ImNMF noise atoms are relatively larger, so the representation of the speech signal mixed with noise atoms is greatly reduced. ImNMF can therefore reduce distortion of the reconstructed speech while enhancing the recovered speech quality. Speech enhancement and speaker verification experiments on NUST603 and TIMIT data showed that the proposed ImNMF effectively enhances speech in the noise environment of electric vehicles and also reduces the equal error rate of the speaker verification system.

7 citations
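
The entry above builds on dictionary-based NMF enhancement: a noisy magnitude spectrogram is explained by a speech dictionary plus a noise dictionary, and the speech part is reconstructed from the speech atoms alone. The sketch below shows only that generic baseline, using standard multiplicative updates for the Euclidean cost with both dictionaries held fixed. It is not the ImNMF method, and all names and the toy matrices are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def update_activations(v, w, h, n_iter=200, eps=1e-12):
    """Multiplicative updates for H with the dictionary W held fixed."""
    wt = transpose(w)
    wtv = matmul(wt, v)   # numerator, constant across iterations
    wtw = matmul(wt, w)
    for _ in range(n_iter):
        denom = matmul(wtw, h)
        h = [[h_ij * n_ij / (d_ij + eps)
              for h_ij, n_ij, d_ij in zip(h_row, n_row, d_row)]
             for h_row, n_row, d_row in zip(h, wtv, denom)]
    return h


def enhance(v, w_speech, w_noise, n_iter=200):
    """Estimate the speech part of V using fixed speech and noise dictionaries.

    v: frequency-by-time magnitude matrix (hypothetical toy input).
    Returns (speech_estimate, activations).
    """
    # Concatenate the dictionaries column-wise: W = [W_speech | W_noise].
    w = [row_s + row_n for row_s, row_n in zip(w_speech, w_noise)]
    n_speech_atoms = len(w_speech[0])
    # Start from strictly positive activations so the updates can move them.
    h = [[1.0] * len(v[0]) for _ in range(len(w[0]))]
    h = update_activations(v, w, h, n_iter)
    # Reconstruct using only the speech atoms and their activations.
    speech_estimate = matmul(w_speech, h[:n_speech_atoms])
    return speech_estimate, h
```

Keeping both dictionaries fixed and updating only the activations mirrors the supervised setting; how the dictionaries themselves are built is exactly where the paper's method differs from this baseline.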

Proceedings Article
13 Apr 1994
TL;DR: This paper compares a mixture-Gaussian vector quantisation method, ergodic continuous hidden Markov models (CHMMs) and phone-level left-to-right CHMMs for text-independent speaker recognition; the three methods represent a progression of phonetic specificity prior to the generation of probabilities against which speakers are compared.
Abstract: This paper compares a mixture-Gaussian vector quantisation (VQ) method, ergodic continuous hidden Markov models (CHMMs) and phone-level left-to-right CHMMs for text-independent speaker recognition. These three methods represent a progression of phonetic specificity prior to the generation of probabilities against which speakers are compared. The mixture-Gaussian VQ uses a single distribution for all phones, the ergodic CHMM uses several distributions which have been shown in a previous text-independent speaker recognition study to represent broad phonetic classes, and the phone-based left-to-right CHMM uses many distributions representing the specific phones in the test utterance. Our experiments with speaker recognition on 40 TIMIT speakers show that the recognition rates of the mixture-Gaussian VQ, ergodic CHMMs and phone-based left-to-right CHMMs are 87.5%, 87.5% and 100% respectively.

7 citations

Proceedings Article
01 Jan 1998
TL;DR: A new method for automatic segmentation of continuous speech into phone-like units is presented, based on a very fast presegmentation algorithm that uses a new statistical model of speech, followed by a search in a multilevel structure, called a dendrogram, to decrease the insertion rate.
Abstract: In this paper a new method for automatic segmentation of continuous speech into phone-like units is presented. Our method is based on a very fast presegmentation algorithm that uses a new statistical model of speech, followed by a search in a multilevel structure, called a dendrogram, to decrease the insertion rate. At each step the performance of the algorithms was tested over a large set of TIMIT sentences. According to these tests, our final segmentation algorithm is capable of detecting nearly 97% of segments with an average boundary position error of less than 7 msec and an average insertion rate of less than 12.6%. The paper describes the algorithms for determining the acoustic segments and reports performance results.

7 citations
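
The entry above reports a detection rate, an average boundary position error, and an insertion rate. The abstract does not define these measures, so the sketch below shows one common way to compute them, under stated assumptions: each reference boundary is matched to at most one hypothesized boundary within a tolerance (20 ms here, an arbitrary choice), and unmatched hypotheses count as insertions relative to the number of reference boundaries. The function name and the example times are hypothetical.

```python
def evaluate_boundaries(ref, hyp, tol=0.020):
    """Score hypothesized segment boundaries against reference boundaries.

    ref, hyp: boundary times in seconds.
    tol: matching tolerance in seconds (an assumption, not from the paper).
    Returns detection rate, insertion rate, and mean absolute error of matches.
    """
    if not ref:
        raise ValueError("reference boundary list must not be empty")
    ref = sorted(ref)
    hyp = sorted(hyp)
    used = [False] * len(hyp)
    errors = []

    for r in ref:
        best = None
        for j, h in enumerate(hyp):
            if used[j]:
                continue
            dist = abs(h - r)
            # Keep the closest unused hypothesis that lies within tolerance.
            if dist <= tol and (best is None or dist < abs(hyp[best] - r)):
                best = j
        if best is not None:
            used[best] = True
            errors.append(abs(hyp[best] - r))

    detected = len(errors)
    inserted = len(hyp) - detected
    return {
        "detection_rate": detected / len(ref),
        "insertion_rate": inserted / len(ref),
        "mean_abs_error": sum(errors) / detected if detected else None,
    }


if __name__ == "__main__":
    reference = [0.10, 0.25, 0.40, 0.60]
    hypothesis = [0.105, 0.18, 0.245, 0.41, 0.75]
    print(evaluate_boundaries(reference, hypothesis))
```

The greedy matching is a simplification; an optimal one-to-one assignment can differ when boundaries are closer together than the tolerance.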

Proceedings Article
20 Mar 2016
TL;DR: An experimental evaluation for speech separation showed that the use of adaptive constraints increases the performance of the source/filter model for speaker-dependent speech separation, and compares favorably to fully-supervised speech separation.
Abstract: This paper introduces a constrained source/filter model for semi-supervised speech separation based on non-negative matrix factorization (NMF). The objective is to inform NMF with prior knowledge about speech, providing a physically meaningful speech separation. To do so, a source/filter model (indicated as Instantaneous Mixture Model or IMM) is integrated in the NMF. Furthermore, constraints are added to the IMM-NMF, in order to control the NMF behaviour during separation, and to enforce its physical meaning. In particular, a speech specific constraint — based on the source/filter coherence of speech — and a method for the automatic adaptation of constraints' weights during separation are presented. Also, the proposed source/filter model is semi-supervised: during training, one filter basis is estimated for each phoneme of a speaker; during separation, the estimated filter bases are then used in the constrained source/filter model. An experimental evaluation for speech separation was conducted on the TIMIT speakers database mixed with various environmental background noises from the QUT-NOISE database. This evaluation showed that the use of adaptive constraints increases the performance of the source/filter model for speaker-dependent speech separation, and compares favorably to fully-supervised speech separation.

7 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 76% related
Feature (machine learning): 33.9K papers, 798.7K citations, 75% related
Feature vector: 48.8K papers, 954.4K citations, 74% related
Natural language: 31.1K papers, 806.8K citations, 73% related
Deep learning: 79.8K papers, 2.1M citations, 72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95