TIMIT

About: TIMIT is a research topic. To date, 1,401 publications on this topic have received a total of 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Papers

Proceedings Article•DOI•

On using parameterized multi-channel non-causal Wiener filter-adapted convolutional neural networks for distant speech recognition

Jeehye Lee¹, Joon-Hyuk Chang¹, Jinho Sohn²•Institutions (2)

Hanyang University¹, LG Electronics²

01 Jan 2016

TL;DR: Experimental results on the TIMIT dataset show that the proposed PMWF-based CNN approach outperforms the cross-channel CNN and the DS beamformer when evaluating the word error rate (WER) in various DSR environments.

Abstract: Recently, the convolutional neural network (CNN) with multiple microphones was proposed to use the delay-sum (DS) beamformer for distant speech recognition (DSR) and compared to the direct use of multiple acoustic channels as a parallel input to the CNN [1]. We explore the parameterized multi-channel non-causal Wiener filter (PMWF) as the front-end to train the CNN, which is applied to acoustic modeling for DSR. For this, we first present a concise description of the basic PMWF as well as its advantages and then explain how to organize the PMWF into the CNN with a novel architecture. Experimental results on the TIMIT dataset show that the proposed PMWF-based CNN approach outperforms the cross-channel CNN and the DS beamformer when evaluating the word error rate (WER) in various DSR environments.
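The abstract above compares systems by word error rate (WER). As a minimal sketch of how that metric is usually computed (word-level edit distance divided by the number of reference words), the following may help; the function name and the whitespace tokenization are assumptions for illustration, not the authors' scoring setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.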

5 citations

Proceedings Article•DOI•

Arabic and English speech recognition using cross-language acoustic models

Yousef Ajami Alotaibi¹, Sid-Ahmed Selouani², Mansour M. Alghamdi³, Ali H. Meftah¹•Institutions (3)

King Saud University¹, Université de Moncton², King Abdulaziz City for Science and Technology³

02 Jul 2012

TL;DR: The results show that the shortage of speech resources for Arabic can be mitigated by using model features of the phonemes it shares with English.

Abstract: In recent years there has been increasing interest in speech and language processing systems dedicated to the Arabic language. Adequate design and evaluation of those systems requires speech databases. The aim of this paper is to evaluate the design of Arabic and English speech recognition systems that use common acoustic models. Cross-language experiments between Arabic and English are conducted and discussed with respect to the main classes of phonemes in each language. The LDC WestPoint Arabic database and TIMIT are used in these experiments. The results show that the shortage of speech resources for Arabic can be mitigated by using model features of the phonemes it shares with English.

5 citations

Proceedings Article•

Analysis of L2 English speech corpus by automatic phoneme alignment.

Hajime Tsubaki¹, Mariko Kondo¹•Institutions (1)

Waseda University¹

01 Jan 2011

TL;DR: The test showed that the new alignment module, which incorporates L2 pronunciations, aligned L2 English data more accurately, and the same methods should be applicable to data from other languages.

Abstract: This study tested an adapted version of HTK for automatic alignment of a speech corpus of Asian speakers' English. The HTK tool trained with TIMIT has problems aligning non-native speakers' English. New sets of phoneme sequences for each word were listed to test whether an adapted alignment module could accurately analyze the pronunciation of Japanese speakers' English. The new sets of phoneme sequences produced better alignment of Japanese-accented English and showed that the new alignment module, which incorporates L2 pronunciations, aligned L2 English data more accurately. The same methods should be applicable to data from other languages.

5 citations

Proceedings Article•DOI•

The fixed-point optimization of mel frequency cepstrum coefficients for speech recognition

Ge Zhang¹, Jinghua Yin¹, Qian Liu¹, Chao Yang¹•Institutions (1)

Harbin University of Science and Technology¹

15 Sep 2011

TL;DR: The optimized algorithm uses a binary-search-based look-up table in place of the original Taylor expansion algorithm, reducing per-frame execution time enough to meet the needs of a real-time speech recognition system.

Abstract: Speech recognition is computationally complex, yet it is used on battery-powered devices such as mobile phones and PDAs. In particular, computing mel-frequency cepstral coefficients (MFCCs) is a dimension-reduction step that lowers the resources needed to describe speech samples accurately. The optimized algorithm uses a binary-search-based look-up table in place of the original Taylor expansion algorithm, reducing per-frame execution time enough to meet the needs of a real-time speech recognition system. The look-up tables were constructed by analysing the pseudocode so as to reduce memory size. The conversion of floating-point MFCCs to fixed-point ones was investigated to reach higher precision in the first-order (linear interpolation) approximation of the logarithm. The Hidden Markov Model Toolkit (HTK) was used for training on speech samples from the Texas Instruments and Massachusetts Institute of Technology (TIMIT) corpus. The speech recognition rate improved by 12.02% with the optimized algorithm.
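The general idea of replacing a series expansion with a look-up table, a binary search, and first-order interpolation can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the Q-format (FRAC_BITS), the roughly geometric spacing of breakpoints, and the function names build_log_table and fixed_log are all assumptions.

```python
import bisect
import math

FRAC_BITS = 12            # assumed fixed-point format: value * 2**12, rounded
SCALE = 1 << FRAC_BITS


def build_log_table(max_input, steps_per_octave=8):
    """Precompute integer breakpoints and ln(breakpoint) in fixed point."""
    points = {1, max_input}
    k = 0
    while True:
        x = round(2 ** (k / steps_per_octave))  # roughly geometric spacing
        if x >= max_input:
            break
        points.add(x)
        k += 1
    xs = sorted(points)
    ys = [round(math.log(x) * SCALE) for x in xs]
    return xs, ys


def fixed_log(x, xs, ys):
    """Approximate ln(x) in fixed point using integer arithmetic only."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("input outside table range")
    i = bisect.bisect_right(xs, x) - 1  # binary search: xs[i] <= x
    if i == len(xs) - 1:
        return ys[i]                    # x is the last breakpoint
    x0, x1 = xs[i], xs[i + 1]
    y0, y1 = ys[i], ys[i + 1]
    # First-order (linear) interpolation between neighbouring table entries.
    return y0 + ((y1 - y0) * (x - x0)) // (x1 - x0)
```

The floating-point calls appear only when the table is built; at run time the lookup needs a binary search, one multiplication, and one integer division. Dividing the result by SCALE recovers an approximate natural logarithm.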

5 citations

Automatic segmentation of TIMIT by dynamic programming

Thomas Niesler¹, S. Van Vuuren², L.F.M. ten Bosch¹•Institutions (2)

Radboud University Nijmegen¹, Stellenbosch University²

01 Jan 2012

TL;DR: This work proposes an algorithm based on the principle of dynamic programming for the automatic segmentation of continuous speech into phoneme-like units and shows that a hybrid approach which combines aspects of all three algorithms leads to even better results.

Abstract: We propose an algorithm based on the principle of dynamic programming for the automatic segmentation of continuous speech into phoneme-like units. A measure of local dissimilarity among consecutive feature vectors is combined with knowledge of the expected statistical distribution of the segment lengths within a dynamic programming framework to obtain an optimal placement of segment boundaries. We compare the performance of our algorithm with that of two recently proposed alternatives by measuring how closely the hypothesised boundaries match the TIMIT phone boundaries. The results show that we are able to improve on the performance of the two contrasting approaches. Furthermore, we show that a hybrid approach which combines aspects of all three algorithms leads to even better results.
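One way to combine a local dissimilarity score with a segment-length prior inside a dynamic program is sketched below. It is a hypothetical formulation for illustration, not the authors' algorithm: the additive objective, the function name segment, and its parameters are assumptions.

```python
import math


def segment(dissimilarity, log_length_prior, max_len):
    """Place boundaries that maximise dissimilarity plus length log-prior.

    dissimilarity[t] scores a boundary between frames t-1 and t (index 0 is
    unused); log_length_prior(n) scores a segment that is n frames long.
    Returns the sorted list of boundary frame indices.
    """
    n = len(dissimilarity)
    best = [-math.inf] * (n + 1)  # best[t]: best score for frames [0, t)
    back = [0] * (n + 1)          # back[t]: start of the last segment in best[t]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + log_length_prior(end - start)
            if start > 0:
                score += dissimilarity[start]  # reward for a boundary at start
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Follow the back-pointers from the final frame to recover the boundaries.
    boundaries = []
    t = n
    while t > 0:
        t = back[t]
        if t > 0:
            boundaries.append(t)
    return boundaries[::-1]
```

For example, with dissimilarity peaks at frames 3 and 7 in a 10-frame signal and a prior that favours 3-frame segments, the search places its boundaries at those two peaks. The double loop costs O(n * max_len), so bounding the segment length keeps the search tractable on long utterances.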

5 citations

Performance Metrics

Papers: 1,488
Citations: 68,688

No. of papers in the topic in previous years
Year	Papers
2023	24
2022	62
2021	67
2020	86
2019	77
2018	95
