
TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Journal Article
TL;DR: A novel Self-Adjustable Neural Network is presented that adjusts itself to different input data sizes; it is benchmarked against the Hidden Markov Model, the standard and state-of-the-art recogniser.

16 citations

Posted Content
TL;DR: Results show that this approach significantly outperforms previous HMM-based acoustic unit discovery systems and compares favorably with the Variational Auto Encoder-HMM.
Abstract: This work tackles the problem of learning a set of language-specific acoustic units from unlabeled speech recordings, given a set of labeled recordings from other languages. Our approach may be described as a two-step procedure: first, the model learns the notion of acoustic units from the labeled data; then, the model uses this knowledge to find new acoustic units in the target language. We implement this process with the Bayesian Subspace Hidden Markov Model (SHMM), a model akin to the Subspace Gaussian Mixture Model (SGMM) in which each low-dimensional embedding represents an acoustic unit rather than just an HMM state. The subspace is trained on three languages from the GlobalPhone corpus (German, Polish and Spanish), and the acoustic units (AUs) are discovered on the TIMIT corpus. Results, measured in equivalent Phone Error Rate, show that this approach significantly outperforms previous HMM-based acoustic unit discovery systems and compares favorably with the Variational Auto Encoder-HMM.

16 citations
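
The abstract above reports results as an equivalent Phone Error Rate. As a rough illustration only, the sketch below computes the standard Phone Error Rate as the edit distance between a reference and a hypothesis phone sequence, divided by the reference length. It does not reproduce the mapping from discovered acoustic units to phones that an "equivalent" rate requires, and the function names and the toy phone labels are assumptions made for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phone labels."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_error_rate(ref, hyp):
    """Edit distance normalised by the number of reference phones."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

For instance, a five-phone reference whose hypothesis contains one substitution and one deletion yields a rate of 2/5.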

Proceedings Article
22 May 2011
TL;DR: Experimental results on continuous phoneme recognition on the TIMIT core test set and on a Japanese read speech recognition task using monophones showed that HCNF was superior to HCRF and to HMM trained in the MPE manner.
Abstract: Hidden Conditional Random Fields (HCRF) are a very promising approach to modeling speech. However, because an HCRF computes the score of a hypothesis by summing linearly weighted features, it cannot capture non-linearity among features, which is crucial for speech recognition. In this paper, we extend HCRF by incorporating the gate function used in neural networks and propose a new model called Hidden Conditional Neural Fields (HCNF). Unlike conventional approaches, HCNF can be trained without any initial model and can incorporate any kind of feature. Experimental results on continuous phoneme recognition on the TIMIT core test set and on a Japanese read speech recognition task using monophones showed that HCNF was superior to HCRF and to HMM trained in the MPE manner.

15 citations
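
The contrast drawn in the abstract above, between a linearly weighted feature sum and a score that passes features through a neural-network gate, can be illustrated with a small sketch. This is not the paper's model: the sigmoid gate, the single layer of gates, and all names (`linear_score`, `gated_score`, `gate_weights`, `output_weights`) are assumptions chosen to show the general idea.

```python
import math


def linear_score(weights, features):
    """HCRF-style score: a plain weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(z):
    """Numerically stable logistic gate."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_score(gate_weights, output_weights, features):
    """Gated score (assumed form): each gate squashes its own weighted
    feature sum, and the gate outputs are then combined linearly."""
    return sum(u * sigmoid(linear_score(theta, features))
               for theta, u in zip(gate_weights, output_weights))
```

Because each gate saturates, the gated score can respond to combinations of features in a way that no single weight vector in `linear_score` can.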

Proceedings Article
M. Cutajar, Edward Gatt, Ivan Grech, Owen Casha, Joseph Micallef
01 Jul 2013
TL;DR: This paper presents the design of a digital hardware implementation based on Support Vector Machines (SVMs), for the task of multi-speaker phoneme recognition, and a priority scheme was also included in the architecture, in order to forecast the three most likely phonemes.
Abstract: This paper presents the design of a digital hardware implementation based on Support Vector Machines (SVMs) for the task of multi-speaker phoneme recognition. The one-against-one multiclass SVM method with the Radial Basis Function (RBF) kernel was considered. Furthermore, a priority scheme was included in the architecture in order to forecast the three most likely phonemes. The designed system was synthesised on a Xilinx Virtex-II XC2V3000 FPGA and evaluated with the TIMIT corpus. This phoneme recognition system is intended to be implemented on a dedicated chip, along with Discrete Wavelet Transforms (DWTs) for feature extraction, to further improve the resultant performance.

15 citations
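
The one-against-one method with an RBF kernel mentioned in the abstract above can be sketched in software as follows. This is a minimal illustration, not the paper's hardware design: ranking classes by vote count to obtain the three most likely phonemes is an assumption about how such a priority scheme might work, and the names `svm_decision`, `top_k_by_votes`, and `pairwise_winner` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, coefficients, bias, gamma):
    """Decision value of one binary SVM.

    Each coefficient is the product alpha_i * y_i for its support vector;
    a positive result favours the first class of the pair.
    """
    return bias + sum(c * rbf_kernel(sv, x, gamma)
                      for sv, c in zip(support_vectors, coefficients))


def top_k_by_votes(classes, pairwise_winner, k=3):
    """One-against-one voting: every pair of classes casts one vote.

    pairwise_winner(a, b) stands in for a trained binary classifier and
    must return either a or b. Ties keep the original class order.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_winner(a, b)] += 1
    position = {c: i for i, c in enumerate(classes)}
    return sorted(classes, key=lambda c: (-votes[c], position[c]))[:k]
```

With N classes this scheme evaluates N(N-1)/2 binary classifiers per input, which is one reason the pairwise structure matters for a hardware design.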

Proceedings Article
04 Sep 2000
TL;DR: The influence of GSM speech coding on the performance of a text-independent speaker recognition system based on Gaussian Mixture Models (GMM) is investigated, and feature calculation directly from the GSM EFR encoded parameters is explored.
Abstract: We have investigated the influence of GSM speech coding on the performance of a text-independent speaker recognition system based on Gaussian Mixture Models (GMM). The performance degradation due to the use of the three GSM speech coders was assessed using three transcoded databases, obtained by passing TIMIT through each GSM coder/decoder. The recognition performance was also assessed using the original TIMIT and its 8 kHz downsampled version. Then, different experiments were carried out to explore feature calculation directly from the GSM EFR encoded parameters and to measure the degradation introduced by different aspects of the coder.

15 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 76% related
Feature (machine learning): 33.9K papers, 798.7K citations, 75% related
Feature vector: 48.8K papers, 954.4K citations, 74% related
Natural language: 31.1K papers, 806.8K citations, 73% related
Deep learning: 79.8K papers, 2.1M citations, 72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95