Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Book Chapter DOI
TL;DR: The article presents an example of using open-source speech processing software to perform speaker verification experiments designed to test various speaker recognition models under different scenarios.
Abstract: Creating a speaker recognition application requires advanced speech processing techniques realized by specialized speech processing software. Speaker recognition research can be improved considerably by building on a speech processing platform based on open-source software. The article presents an example of using open-source speech processing software to perform speaker verification experiments designed to test various speaker recognition models under different scenarios. Speaker verification efficiency was evaluated for each scenario using the TIMIT speech corpus distributed by the Linguistic Data Consortium. The experimental results allowed us to compare the scenarios and select the best one for building the speaker model of a speaker verification application.
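As a rough illustration of the kind of verification scenario such experiments compare, here is a minimal GMM-based sketch: enrol a speaker by fitting a Gaussian mixture to MFCC features, then accept or reject a test utterance by thresholding its log-likelihood. The article relies on dedicated open-source speech tools, so the librosa/scikit-learn pipeline, file names, mixture size, and threshold below are illustrative assumptions, not the authors' setup.

```python
# Hedged sketch of GMM-based speaker verification (illustrative, not the
# article's actual toolchain or parameters).
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_feats(wav_path):
    """Frame-level MFCC features; TIMIT audio is sampled at 16 kHz."""
    y, sr = librosa.load(wav_path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T  # (frames, 13)

# Enrolment: fit one GMM per speaker on that speaker's training speech.
enrol = mfcc_feats("speaker1_train.wav")  # hypothetical file name
model = GaussianMixture(n_components=32, covariance_type="diag").fit(enrol)

# Verification: accept if the mean per-frame log-likelihood of the test
# utterance under the claimed speaker's model exceeds a tuned threshold.
test = mfcc_feats("claimed_speaker_test.wav")  # hypothetical file name
print("accept" if model.score(test) > -45.0 else "reject")  # threshold illustrative
```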

12 citations

Posted Content
TL;DR: This work investigates modern quaternion-valued models such as convolutional and recurrent quaternion neural networks for speech recognition on the TIMIT dataset, and shows that QNNs consistently outperform equivalent real-valued models with far fewer free parameters, leading to a more efficient, compact, and expressive representation of the relevant information.
Abstract: Neural network architectures are at the core of powerful automatic speech recognition (ASR) systems. However, while recent research focuses on novel model architectures, the acoustic input features remain almost unchanged. Traditional ASR systems rely on multidimensional acoustic features such as the Mel filter bank energies, along with the first- and second-order derivatives, to characterize the time-frames that compose the signal sequence. Considering that these components describe three different views of the same element, neural networks have to learn both the internal relations that exist within these features and the external or global dependencies that exist between the time-frames. Quaternion-valued neural networks (QNNs) have recently received considerable interest from researchers aiming to process and learn such relations in multidimensional spaces. Indeed, quaternion numbers and QNNs have shown their efficiency at processing multidimensional inputs as entities, encoding internal dependencies, and solving many tasks with up to four times fewer learning parameters than real-valued models. We propose to investigate modern quaternion-valued models such as convolutional and recurrent quaternion neural networks in the context of speech recognition with the TIMIT dataset. The experiments show that QNNs consistently outperform equivalent real-valued models with far fewer free parameters, leading to a more efficient, compact, and expressive representation of the relevant information.
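The parameter saving the abstract mentions comes from weight sharing in the Hamilton product: a single quaternion weight set (W_r, W_i, W_j, W_k) is reused across all four input components. Below is a minimal NumPy sketch of such a quaternion dense layer; the shapes, and the idea of packing the static, first-derivative, and second-derivative feature views into quaternion components, are illustrative assumptions rather than the paper's exact architecture.

```python
# Hedged sketch of a quaternion dense layer via the Hamilton product.
# One (out, in) weight matrix per quaternion component is shared across all
# four input components, giving roughly 4x fewer parameters than a
# real-valued layer of the same width.
import numpy as np

def quaternion_dense(x, W):
    """x = (x_r, x_i, x_j, x_k), each (in,); W = (W_r, W_i, W_j, W_k), each (out, in)."""
    x_r, x_i, x_j, x_k = x
    W_r, W_i, W_j, W_k = W
    r = W_r @ x_r - W_i @ x_i - W_j @ x_j - W_k @ x_k
    i = W_r @ x_i + W_i @ x_r + W_j @ x_k - W_k @ x_j
    j = W_r @ x_j - W_i @ x_k + W_j @ x_r + W_k @ x_i
    k = W_r @ x_k + W_i @ x_j - W_j @ x_i + W_k @ x_r
    return r, i, j, k  # output quaternion components, each (out,)
```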

12 citations

Journal Article DOI
TL;DR: A new model is proposed based on components called matched filters (MFs): instead of using a fixed filter bank for the entire speech signal, the proposed TOC is generated by adopting a pair of vowel and consonant MFs for each voiced speech frame.

12 citations

Journal Article DOI
TL;DR: Simulation results show the proposed method produces much higher speaker identification rates in all signal-to-noise ratio (SNR) conditions than the baseline system using mel-frequency cepstral coefficients.
Abstract: Spectro-temporal modulations of speech encode speech structures and speaker characteristics. An algorithm which distinguishes speech from non-speech based on spectro-temporal modulation energies is proposed and evaluated in robust text-independent closed-set speaker identification simulations using the TIMIT and GRID corpora. Simulation results show the proposed method produces much higher speaker identification rates in all signal-to-noise ratio (SNR) conditions than the baseline system using mel-frequency cepstral coefficients. In addition, the proposed method also outperforms the system, which uses auditory-based nonnegative tensor cepstral coefficients [Q. Wu and L. Zhang, “Auditory sparse representation for robust speaker recognition based on tensor structure,” EURASIP J. Audio, Speech, Music Process. 2008, 578612 (2008)], in low SNR (≤ 10 dB) conditions.
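As a rough sketch of how spectro-temporal modulation energies can be obtained, one common approximation is a 2-D Fourier transform of a log spectrogram, whose axes then index spectral and temporal modulation rates. The function below is a generic approximation under that assumption, not the paper's auditory front-end, and the STFT settings are illustrative.

```python
# Hedged sketch: spectro-temporal modulation energies as the 2-D FFT
# magnitude of a log-magnitude spectrogram (generic approximation).
import numpy as np
from scipy.signal import stft

def modulation_energies(x, fs, nperseg=400, noverlap=240):
    """Return 2-D modulation-energy magnitudes; axis 0 indexes spectral
    modulations, axis 1 temporal modulations (both fftshift-centred)."""
    _, _, Z = stft(x, fs=fs, nperseg=nperseg, noverlap=noverlap)
    log_spec = np.log(np.abs(Z) + 1e-10)  # (freq bins, time frames)
    return np.abs(np.fft.fftshift(np.fft.fft2(log_spec)))
```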

12 citations

Proceedings Article DOI
22 May 2011
TL;DR: This paper combines three simple refinements proposed recently to improve HMM/ANN hybrid models, the first being to apply a hierarchy of two nets, where the second net models the contextual relations of the state posteriors produced by the first network.
Abstract: In this paper we combine three simple refinements proposed recently to improve HMM/ANN hybrid models. The first refinement is to apply a hierarchy of two nets, where the second net models the contextual relations of the state posteriors produced by the first network. The second idea is to train the network on context-dependent units (HMM states) instead of context-independent phones or phone states. As the latter refinement results in a lot of output neurons, combining the two methods directly would be problematic. Hence the third trick is to shrink the output layer of the first net using the bottleneck technique before applying the second net on top of it. The phone recognition results obtained on the TIMIT database demonstrate that both the context-dependent and the 2-stage modeling methods can bring about marked improvements. Using them in combination, however, results in a further significant gain in accuracy. With the bottleneck technique a further improvement can be obtained, especially when the number of context-dependent units is large.
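A minimal PyTorch sketch of the hierarchy described above is given below, assuming illustrative layer sizes, a 9-frame posterior context, and sigmoid hidden units; it is not the authors' exact configuration.

```python
# Hedged sketch of a two-stage hierarchical net with a bottleneck:
# net 1 maps acoustic features to a narrow bottleneck code; net 2 reads a
# context window of those codes and predicts context-dependent (CD) state
# posteriors. All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

N_FEATS, BOTTLENECK, N_CD_STATES, CONTEXT = 39, 40, 600, 9

first_net = nn.Sequential(               # features -> bottleneck code
    nn.Linear(N_FEATS, 1000), nn.Sigmoid(),
    nn.Linear(1000, BOTTLENECK),         # narrow layer replaces the large CD softmax
)
second_net = nn.Sequential(              # stacked codes -> CD state posteriors
    nn.Linear(BOTTLENECK * CONTEXT, 1000), nn.Sigmoid(),
    nn.Linear(1000, N_CD_STATES),
)

frames = torch.randn(CONTEXT, N_FEATS)        # a window of consecutive frames
codes = first_net(frames).flatten()           # one code per frame, concatenated
log_post = second_net(codes).log_softmax(-1)  # CD log-posteriors for the centre frame
```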

12 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 76% related
Feature (machine learning): 33.9K papers, 798.7K citations, 75% related
Feature vector: 48.8K papers, 954.4K citations, 74% related
Natural language: 31.1K papers, 806.8K citations, 73% related
Deep learning: 79.8K papers, 2.1M citations, 72% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  24
2022  62
2021  67
2020  86
2019  77
2018  95