Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1401 publications have appeared within this topic, receiving 59888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Papers

Proceedings Article

Analysis of a low-dimensional bottleneck neural network representation of speech for modelling speech dynamics.

Linxue Bai¹, Peter Jancovic¹, Martin J. Russell¹, Philip Weber¹

University of Birmingham¹

06 Sep 2015

TL;DR: It is demonstrated that the bottleneck features preserve well the trajectory continuity over time and can provide a suitable representation for the continuous-state hidden Markov model (CS-HMM), which considers speech as a sequence of dwell and transition regions.

Abstract: This paper presents an analysis of a low-dimensional representation of speech for modelling speech dynamics, extracted using bottleneck neural networks. The input to the neural network is a set of spectral feature vectors. We explore the effect of various designs and training of the network, such as varying the size of context in the input layer, size of the bottleneck and other hidden layers, and using input reconstruction or phone posteriors as targets. Experiments are performed on TIMIT. The bottleneck features are employed in a conventional HMM-based phoneme recognition system, with recognition accuracy of 70.6% on the core test achieved using only 9-dimensional features. We also analyse how the bottleneck features fit the assumptions of dynamic models of speech. Specifically, we employ the continuous-state hidden Markov model (CS-HMM), which considers speech as a sequence of dwell and transition regions. We demonstrate that the bottleneck features preserve well the trajectory continuity over time and can provide a suitable representation for CS-HMM.

16 citations
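
As a rough illustration of the bottleneck idea in the abstract above, the following sketch stacks each spectral frame with its neighbours, runs the result through a small feed-forward network, and reads out the activations of a narrow layer as features. Only the 9-dimensional bottleneck comes from the abstract. The 26-dimensional input, the context of 5 frames, the 512-unit hidden layer, the tanh nonlinearity, the linear bottleneck and the random (untrained) weights are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def stack_context(frames, context):
    """Concatenate each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first and last frame.
    """
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    num_frames = frames.shape[0]
    return np.hstack([padded[i:i + num_frames] for i in range(2 * context + 1)])


def bottleneck_features(frames, weights, biases, bottleneck_layer, context):
    """Forward pass that stops at the bottleneck layer and returns its outputs."""
    h = stack_context(frames, context)
    for layer, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W + b
        if layer == bottleneck_layer:
            return h  # assumed linear bottleneck activations
        h = np.tanh(h)
    raise ValueError("bottleneck_layer is out of range")


# Toy usage with random data and untrained weights (assumed sizes).
rng = np.random.default_rng(0)
frames = rng.standard_normal((100, 26))   # 100 frames of 26-dim spectral features
context = 5                               # 11 stacked frames -> 286 inputs
sizes = [26 * (2 * context + 1), 512, 9]  # input -> hidden -> bottleneck
weights = [0.01 * rng.standard_normal((m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

feats = bottleneck_features(frames, weights, biases, bottleneck_layer=1, context=context)
print(feats.shape)  # (100, 9)
```

In a trained system the layers after the bottleneck would map to the reconstruction or phone-posterior targets mentioned in the abstract; they are omitted here because only the bottleneck outputs are used as features.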

Journal Article

Image Processing Techniques for Segments Grouping in Monaural Speech Separation

S. Shoba¹, R. Rajavel¹

Sri Sivasubramaniya Nadar College of Engineering¹

01 Aug 2018-Circuits Systems and Signal Processing

TL;DR: An image analysis-based algorithm is proposed to enhance the binary T–F mask obtained in the initial segmentation stage of CASA-based monaural speech separation systems to improve the speech quality and reduce the noise residue.

Abstract: Monaural speech separation is the process of separating the target speech from a noisy speech mixture recorded using a single microphone. It is a challenging problem in speech signal processing, and computational auditory scene analysis (CASA) has recently offered a reasonable solution to it. This research work proposes an image analysis-based algorithm to enhance the binary T–F mask obtained in the initial segmentation stage of CASA-based monaural speech separation systems to improve the speech quality. The proposed algorithm consists of labeling the initial segmentation mask, boundary extraction, active pixel detection and finally eliminating the noisy non-active pixels. In labeling, the T–F mask obtained from the initial segmentation is labeled as periodicity pixel matrix and non-periodicity pixel matrix. Next, boundaries are created by connecting all the possible nearby periodicity pixel matrix and non-periodicity pixel matrix as speech boundary. Some speech boundaries may include noisy T–F units as holes, and these holes are treated using the proposed algorithm to properly classify them as the speech-dominant or noise-dominant T–F units in the active pixel detection process. Finally, the noisy T–F units are eliminated. The performance of the proposed algorithm is evaluated using the TIMIT speech database. The experimental results show that the proposed algorithm improves the quality of the separated speech by increasing the signal-to-noise ratio by an average value of 9.64 dB and reduces the noise residue by 25.55% as compared to the noisy speech mixture.

16 citations
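
To make the notion of "holes" in a binary T–F mask concrete, here is a minimal sketch that locates zero-valued units fully enclosed by speech-labelled units. It uses a generic flood fill from the mask border, which is a standard image-processing technique and not necessarily the authors' procedure. The abstract does not say how a hole is then classified as speech-dominant or noise-dominant, so that decision is left out. The function name `find_holes` and the toy mask are hypothetical.

```python
from collections import deque


def find_holes(mask):
    """Return sorted (row, col) positions of 0-units enclosed by 1-units.

    A 0-unit counts as a hole if it cannot be reached from the mask border
    by moving through 4-connected 0-units.
    """
    rows, cols = len(mask), len(mask[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the flood fill with every 0-unit on the border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and mask[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))

    # Spread through 4-connected 0-units reachable from the border.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and mask[nr][nc] == 0 and not outside[nr][nc]):
                outside[nr][nc] = True
                queue.append((nr, nc))

    return sorted((r, c) for r in range(rows) for c in range(cols)
                  if mask[r][c] == 0 and not outside[r][c])


# Toy mask: rows are frequency channels, columns are time frames.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(find_holes(toy_mask))  # [(2, 2)]
```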

Proceedings Article

Learning Phonological Rule Probabilities from Speech Corpora with Exploratory Computational Phonology

Gary N. Tajchman¹, Dan Jurafsky¹, Eric Fosler¹

University of California, Berkeley¹

26 Jun 1995

TL;DR: The algorithm is based on using a speech recognition system to discover the surface pronunciations of words in speech corpora and shows the probabilities the system has learned for ten common phonological rules which model reductions and coarticulation effects.

Abstract: This paper presents an algorithm for learning the probabilities of optional phonological rules from corpora. The algorithm is based on using a speech recognition system to discover the surface pronunciations of words in speech corpora; using an automatic system obviates expensive phonetic labeling by hand. We describe the details of our algorithm and show the probabilities the system has learned for ten common phonological rules which model reductions and coarticulation effects. These probabilities were derived from a corpus of 7203 sentences of read speech from the Wall Street Journal, and are shown to be a reasonably close match to probabilities from phonetically hand-transcribed data (TIMIT). Finally, we analyze the probability differences between rule use in male versus female speech, and suggest that the differences are caused by differing average rates of speech.

16 citations
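
One simple way to turn recognized surface pronunciations into rule probabilities is a relative-frequency estimate: for each optional rule, divide the number of times it applied by the number of contexts where it could have applied. The abstract does not state the estimator used, so this is an assumed formulation. The rule names and observations below are made-up toy data, not results from the paper.

```python
from collections import Counter


def rule_probabilities(observations):
    """Estimate P(rule applies | rule is applicable) by relative frequency.

    `observations` is an iterable of (rule_name, applied) pairs, one per
    context in which the rule could have applied.
    """
    applicable = Counter()
    applied = Counter()
    for rule, fired in observations:
        applicable[rule] += 1
        if fired:
            applied[rule] += 1
    return {rule: applied[rule] / total for rule, total in applicable.items()}


# Hypothetical toy observations (illustrative only).
toy_observations = [
    ("flapping", True),
    ("flapping", True),
    ("flapping", False),
    ("h-deletion", False),
    ("h-deletion", True),
]
probs = rule_probabilities(toy_observations)
print(probs["h-deletion"])  # 0.5
```

Running the same estimate separately on male and female speakers would give the kind of per-group comparison the abstract describes.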

Proceedings Article

Lagrangian support vector machines for phoneme classification

Ahmed Ech-Cherif¹, M. Kohili¹, Abdelkader Benyettou¹, M. Benyettou¹

University of Science and Technology of Oran Mohamed-Boudiaf¹

01 Jan 2002

TL;DR: This paper attempts to overcome the above difficulty by using the alternative Lagrangian formulation which only requires the inversion of a matrix whose dimension is proportional to the size of the MFCC sequence of vectors.

Abstract: We study the performance of binary and multi-category SVMs for phoneme classification. The training process of the standard formulation involves the solution of a quadratic programming problem whose complexity depends on the size of the training set. The large size of speech corpora such as TIMIT seriously limits their practical use in continuous speech recognition tasks, using off-the-shelf personal computers in a reasonable time. In this paper, we attempt to overcome the above difficulty by using the alternative Lagrangian formulation which only requires the inversion of a matrix whose dimension is proportional to the size of the MFCC sequence of vectors. We provide computational results of all possible binary classifiers (1830) on the TIMIT database which are shown to be competitive in terms of recognition rates (96.8%) with those found in the literature (95.6%). The binary classifiers are introduced in the DAGSVM and voting algorithms to perform multi-category classification on some hand-picked subsets from the TIMIT corpus.

16 citations
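
The figure of 1830 binary classifiers in the abstract above matches the number of unordered pairs among 61 classes, which is the size of the standard TIMIT phone label set (the abstract does not state the class count, so 61 is an inference). The sketch below checks that arithmetic and shows max-wins voting over one-vs-one classifiers, one of the two combination schemes the abstract names. The binary decision function is a toy stand-in, not an SVM, and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb


def pairwise_count(num_classes):
    """Number of one-vs-one binary classifiers for `num_classes` classes."""
    return comb(num_classes, 2)


def vote(sample, classes, binary_decide):
    """Max-wins voting: each class pair casts one vote for its winner."""
    tally = Counter()
    for a, b in combinations(classes, 2):
        tally[binary_decide(a, b, sample)] += 1
    return tally.most_common(1)[0][0]


def nearest_of_two(a, b, x):
    """Toy binary classifier: classes are numeric prototypes, pick the closer."""
    return a if abs(x - a) <= abs(x - b) else b


print(pairwise_count(61))  # 1830
print(vote(2.2, [0.0, 1.0, 2.0, 3.0], nearest_of_two))  # 2.0
```

Voting evaluates every pairwise classifier per sample, whereas a DAGSVM walks a decision graph and evaluates only one classifier per eliminated class, so it needs fewer evaluations at test time.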

Journal Article

A two-stage speech activity detection system considering fractal aspects of prosody

Soheil Shafiee¹, Farshad Almasganj¹, Bahram Vazirnezhad¹, Ayyoob Jafari¹

Amirkabir University of Technology¹

01 Jul 2010-Pattern Recognition Letters

TL;DR: A two-stage speech activity detection system is presented which at first takes advantage of a voice activity detector to discard pause segments out of the audio signals; this is done even in presence of stationary background noises.

16 citations

Performance Metrics

Papers: 1,488
Citations: 68,688

No. of papers in the topic in previous years
Year	Papers
2023	24
2022	62
2021	67
2020	86
2019	77
2018	95