
TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Proceedings Article
01 Jan 2001
TL;DR: The algorithm presented here is applied to plosive detection but can easily be adapted to any class of phonemes; it uses loss-based multi-class decisions.
Abstract: This paper presents a novel algorithm for precise spotting of plosives. The algorithm is based on a pattern-matching technique implemented with margin classifiers, such as support vector machines (SVMs). A special hierarchical treatment to overcome the problem of fricative and false silence detection is presented; it uses loss-based multi-class decisions. Furthermore, a method for smoothing the overall decisions by sequential linear programming is described. The proposed algorithm was tested on the TIMIT corpus and achieved very high spotting accuracy. The algorithm presented here is applied to plosive detection but can easily be adapted to any class of phonemes.

22 citations
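
As a rough illustration of the margin-classifier spotting idea in the paper above, the sketch below trains an SVM on frame-level features and flags frames whose decision margin clears a threshold. The features, data, and threshold are synthetic placeholders, not the paper's setup; the hierarchical fricative/silence treatment and the sequential-linear-programming smoothing are omitted.

```python
# Minimal sketch of margin-classifier phoneme spotting (illustrative only).
# Assumes frame-level feature vectors (e.g. MFCCs); the data here is synthetic.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins: 200 "plosive" frames and 200 "non-plosive" frames, 13-dim.
X = np.vstack([rng.normal(1.0, 1.0, (200, 13)),
               rng.normal(-1.0, 1.0, (200, 13))])
y = np.array([1] * 200 + [0] * 200)

svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Spotting: slide over an utterance's frames and mark frames whose SVM margin
# exceeds a threshold as plosive candidates.
utterance = rng.normal(0.0, 1.5, (50, 13))   # 50 frames of a test utterance
margins = svm.decision_function(utterance)   # signed distance to the hyperplane
candidates = np.where(margins > 0.5)[0]      # threshold would be tuned on dev data
print("candidate plosive frames:", candidates)
```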

Proceedings ArticleDOI
14 Mar 2010
TL;DR: Two nonlinear feature dimensionality reduction methods based on neural networks are presented for an HMM-based phone recognition system; recognition accuracies with the transformed features are slightly higher than those obtained with the original features and considerably higher than those obtained with linear dimensionality reduction methods.
Abstract: This paper presents two nonlinear feature dimensionality reduction methods based on neural networks for an HMM-based phone recognition system. The neural networks are trained as feature classifiers to reduce feature dimensionality as well as to maximize discrimination among speech features. The outputs of different network layers are used to obtain transformed features. Moreover, the training of the neural networks uses category information that corresponds to a state in the HMMs, so that the trained networks can better accommodate the temporal variability of features and obtain more discriminative features in a low-dimensional space. Experimental evaluation using the TIMIT database shows that recognition accuracies with the transformed features are slightly higher than those obtained with the original features and considerably higher than those obtained with linear dimensionality reduction methods. The highest phone accuracy obtained with 39 phone classes on TIMIT was 74.9%, using a large number of training iterations based on the state-specific targets.

22 citations
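
A minimal sketch of the layer-output feature transform the paper above describes, using scikit-learn's MLPClassifier as a stand-in for the paper's network: a classifier is trained against state-level targets (randomly generated here), and the activations of its narrow second hidden layer are read out as low-dimensional features. The dimensions, targets, and data are all assumptions for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Synthetic stand-ins for 39-dim acoustic frames labelled with HMM-state targets.
X = rng.normal(size=(1000, 39))
y = rng.integers(0, 10, size=1000)   # 10 hypothetical state classes

# The narrow second hidden layer acts as the low-dimensional feature transform.
net = MLPClassifier(hidden_layer_sizes=(64, 8), activation="tanh",
                    max_iter=300, random_state=0).fit(X, y)

def transformed_features(net, X):
    """Forward pass up to the second hidden layer; its activations are the features."""
    h = X
    for W, b in zip(net.coefs_[:2], net.intercepts_[:2]):  # stop before output layer
        h = np.tanh(h @ W + b)
    return h

feats = transformed_features(net, X)   # 8-dim discriminative features for the HMM
print(feats.shape)                     # (1000, 8)
```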

Book ChapterDOI
19 Apr 2005
TL;DR: The output from the second hidden layer (compression layer) of an MLP with three hidden layers, trained to identify a subset of 100 speakers selected at random from a set of 300 training speakers in TIMIT, can provide a 77% relative error reduction for common Gaussian mixture model (GMM) based speaker identification.
Abstract: Feature projection by non-linear discriminant analysis (NLDA) can substantially increase classification performance. In automatic speech recognition (ASR), the projection provided by the pre-squashed outputs from a one-hidden-layer multi-layer perceptron (MLP) trained to recognise speech sub-units (phonemes) has previously been shown to significantly increase ASR performance. An analogous approach cannot be applied directly to speaker recognition because there is no recognised set of "speaker sub-units" to provide a finite set of MLP target classes, and for many applications it is not practical to train an MLP with one output for each target speaker. In this paper we show that the output from the second hidden layer (compression layer) of an MLP with three hidden layers, trained to identify a subset of 100 speakers selected at random from a set of 300 training speakers in TIMIT, can provide a 77% relative error reduction for common Gaussian mixture model (GMM) based speaker identification.

22 citations
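
To show where such compression-layer features would be consumed, here is a minimal GMM speaker-identification sketch of the kind the paper above evaluates against: one GaussianMixture per enrolled speaker, scored by average log-likelihood on test frames. The 8-dimensional "bottleneck" features are simulated with Gaussians; nothing here reproduces the paper's MLP.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Pretend these are compression-layer features, one array per enrolled speaker.
speakers = {s: rng.normal(loc=s, scale=1.0, size=(300, 8)) for s in range(3)}

# One GMM per enrolled speaker, trained on that speaker's features.
models = {s: GaussianMixture(n_components=4, random_state=0).fit(f)
          for s, f in speakers.items()}

# Identification: pick the speaker whose GMM gives the test frames the
# highest average log-likelihood.
test = rng.normal(loc=1, scale=1.0, size=(50, 8))   # frames from speaker 1
scores = {s: m.score(test) for s, m in models.items()}
print("identified speaker:", max(scores, key=scores.get))
```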

Journal ArticleDOI
TL;DR: Single-talk and double-talk scenarios using speech signals from the TIMIT database show that the proposed algorithm achieves better performance, with more than 3 dB of additional attenuation in the misalignment evaluation compared to the GSVSS-NLMS, non-parametric VSS-NLMS, and standard NLMS algorithms for a non-stationary input in noisy environments.

22 citations
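
The TL;DR above gives only the evaluation, not the update rule, so the sketch below implements the standard NLMS baseline the paper compares against, together with the misalignment metric it reports; the paper's variable-step-size rule would replace the fixed mu. The filter order, signal lengths, and toy echo path are assumptions.

```python
import numpy as np

def nlms(x, d, L=32, mu=0.5, eps=1e-6):
    """Standard NLMS: estimate the echo path from far-end signal x and
    microphone signal d. A variable-step-size variant would adapt mu."""
    w = np.zeros(L)            # filter weights (echo-path estimate)
    e = np.zeros(len(x))       # error signal (echo-cancelled output)
    for n in range(L - 1, len(x)):
        u = x[n - L + 1:n + 1][::-1]          # [x[n], x[n-1], ..., x[n-L+1]]
        e[n] = d[n] - w @ u                   # mic signal minus echo estimate
        w += mu * e[n] * u / (u @ u + eps)    # normalised weight update
    return w, e

rng = np.random.default_rng(0)
x = rng.normal(size=5000)                               # far-end signal
h = rng.normal(size=32) * np.exp(-np.arange(32) / 8.0)  # toy echo path
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.normal(size=len(x))  # microphone
w, e = nlms(x, d)
print("misalignment (dB):",
      20 * np.log10(np.linalg.norm(h - w) / np.linalg.norm(h)))
```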

Proceedings ArticleDOI
15 Mar 2010
TL;DR: The focus of this paper is to develop a knowledge-based robust syllable segmentation algorithm and to establish the importance of accurate segmentation in both the training and testing phases of a speech recognition system.
Abstract: The focus of this paper is two-fold: (a) to develop a knowledge-based robust syllable segmentation algorithm and (b) to establish the importance of accurate segmentation in both the training and testing phases of a speech recognition system. A robust algorithm for segmenting the speech signal into syllables is first developed. This uses a non-statistical technique based on group delay (GD) segmentation and vowel onset point (VOP) detection. The transcription corresponding to the utterance is syllabified using rules, which produces an annotation for the training data. The annotated training data is then used to train a syllable-based speech recognition system. The test signal is also segmented using the proposed algorithm, and this segmentation information is incorporated into the linguistic search space to reduce both computational complexity and word error rate (WER). WERs of 4.4% and 21.2% are reported on the TIMIT and NTIMIT databases, respectively.

22 citations
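
As a simplified stand-in for the segmentation step in the paper above, the sketch below picks low-energy stretches of a smoothed short-term energy contour as candidate syllable boundaries. The paper's actual algorithm processes this kind of contour with a group-delay function and combines it with vowel onset point detection, which is considerably more robust; the frame sizes and threshold here are assumptions.

```python
import numpy as np

def candidate_boundaries(signal, sr, frame_ms=20, hop_ms=10, smooth=15):
    # Short-term energy contour, then a moving-average smoothing.
    frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    energy = np.array([np.sum(signal[i:i + frame] ** 2)
                       for i in range(0, len(signal) - frame, hop)])
    energy = np.convolve(energy, np.ones(smooth) / smooth, mode="same")
    # Mark low-energy runs; the centre of each run is a candidate boundary.
    low = energy < 0.4 * np.median(energy)   # assumed threshold
    bounds, i = [], 0
    while i < len(low):
        if low[i]:
            j = i
            while j < len(low) and low[j]:
                j += 1
            bounds.append(((i + j) // 2) * hop / sr)   # boundary time in seconds
            i = j
        else:
            i += 1
    return bounds

# Toy run: two synthetic "syllables" separated by a 100 ms energy dip.
sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 200 * t)
sig[int(0.45 * sr):int(0.55 * sr)] = 0.0
print(candidate_boundaries(sig, sr))   # roughly [0.5]
```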


Network Information
Related Topics (5)
Recurrent neural network
29.2K papers, 890K citations
76% related
Feature (machine learning)
33.9K papers, 798.7K citations
75% related
Feature vector
48.8K papers, 954.4K citations
74% related
Natural language
31.1K papers, 806.8K citations
73% related
Deep learning
79.8K papers, 2.1M citations
72% related
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95