TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Papers published on a yearly basis (chart; x-axis: year, 1989 to 2023)

Papers

Journal Article

Speech Enhancement Based on Noise Eigenspace Projection

Dongwen Ying¹, Masashi Unoki¹, Xugang Lu¹, Jianwu Dang¹

Japan Advanced Institute of Science and Technology¹

01 May 2009-IEICE Transactions on Information and Systems

TL;DR: An approach based on noise eigenspace projections to pack the color component into a subspace, named “noise subspace”, which efficiently reduces noise with little speech distortion.

Abstract: Reducing noise while limiting speech distortion is a challenging issue for speech enhancement. We propose a novel approach for reducing noise at the cost of less speech distortion. A noise signal can generally be considered to consist of two components: a “white-like” component with a uniform energy distribution and a “color” component whose energy is concentrated in some frequency bands. An approach based on noise eigenspace projections is proposed to pack the color component into a subspace, named the “noise subspace”. This subspace is then removed from the eigenspace to reduce the color component. For the white-like component, a conventional enhancement algorithm is adopted as a complementary processor. We tested our algorithm on a speech enhancement task using speech data from the Texas Instruments and Massachusetts Institute of Technology (TIMIT) dataset and noise data from NOISEX-92. The experimental results show that the proposed algorithm efficiently reduces noise with little speech distortion. Objective and subjective evaluations confirmed that the proposed algorithm outperformed conventional enhancement algorithms.
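
The subspace-removal idea in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it assumes the noise subspace is spanned by the k dominant eigenvectors of a covariance matrix estimated from noise-only frames, and the function names `noise_subspace` and `remove_noise_subspace`, the value of `k`, and the synthetic data are all hypothetical.

```python
import numpy as np


def noise_subspace(noise_frames, k):
    """Return an orthonormal basis (dim x k) for the k dominant noise directions.

    Assumption: rows of noise_frames are noise-only feature frames.
    """
    # Sample covariance across frames (columns are treated as variables).
    cov = np.cov(noise_frames, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order,
    # so the last k eigenvectors carry the most noise energy.
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -k:]


def remove_noise_subspace(frames, basis):
    """Subtract each frame's projection onto the noise subspace."""
    return frames - (frames @ basis) @ basis.T


# Synthetic "color" noise: strong energy along one axis plus weak white-like noise.
rng = np.random.default_rng(0)
dim = 8
direction = np.zeros(dim)
direction[2] = 1.0
noise = rng.normal(size=(500, 1)) * 5.0 * direction + 0.1 * rng.normal(size=(500, dim))

basis = noise_subspace(noise, k=1)
cleaned = remove_noise_subspace(noise, basis)
```

In this toy setting the projection removes the concentrated component and leaves the weak white-like residue, which the abstract says is handled by a separate conventional enhancer.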

Proceedings Article

Phoneme dependent frame selection preference

Tingyao Wu, Jacques Duchateau, Dirk Van Compernolle¹

Katholieke Universiteit Leuven¹

01 Jan 2007

TL;DR: This paper shows that frame selection behavior is phoneme dependent, and observes that some phonemes benefit from frame selection while others do not, and that this separation matches the phonetic categories.

Abstract: In a previous study we proposed algorithms to select representative frames from a segment for phoneme likelihood evaluation. In this paper we show that this frame selection behavior is phoneme dependent. We observe that some phonemes benefit from frame selection while others do not, and that this separation matches the phonetic categories. For those phonemes sensitive to frame selection, we find that selecting frames at some pre-defined positions in the segment enhances the discrimination between phonemes. These phoneme-dependent positions are explicitly retrieved and used in a phoneme classification task. Experimental results on the TIMIT phonetic database show that the frame selection method significantly outperforms decoding by the classical Viterbi decoder.

Template Based Low Data Rate Speech Encoder

Lawrence Fransen

30 Sep 1993

TL;DR: There is a need for lower-data-rate voice encoders for special applications: improved performance in high bit-error conditions, low-probability-of-intercept (LPI) voice communication, and narrowband integrated voice/data systems.

Abstract: The 2400-b/s linear predictive coder (LPC) is currently being widely deployed to support tactical voice communication over narrowband channels. However, there is a need for lower-data-rate voice encoders for special applications: improved performance in high bit-error conditions, low-probability-of-intercept (LPI) voice communication, and narrowband integrated voice/data systems. An 800-b/s voice encoding algorithm is presented which is an extension of the 2400-b/s LPC. To construct template tables, speech samples of 420 speakers uttering 8 sentences each were excerpted from the Texas Instruments - Massachusetts Institute of Technology (TIMIT) Acoustic-Phonetic Speech Data Base. Speech intelligibility of the 800-b/s voice encoding algorithm, measured by the diagnostic rhyme test (DRT), is 91.5 for three male speakers. This score compares favorably with the 2400-b/s LPC of a few years ago.

Journal Article

Speaker identification using hybrid neural network support vector machine classifier

V. Karthikeyan, Suja Priyadharsini Subramoniam, K. Balamurugan, Manickam Ramasamy

30 Nov 2022-International Journal of Speech Technology

Journal Article

Method of an Acoustic Echo Suppression Based on Recurrent Neural Network and Clustering

01 May 2022-Вестник Южно-Уральского государственного университета (Bulletin of the South Ural State University)

TL;DR: In this paper, a neural network that estimates an ideal binary mask (IBM) using features extracted from a mixture of near-end and far-end signals was used to solve the problem of acoustic echo suppression.

Abstract: The article addresses the problem of acoustic echo suppression with a neural network that estimates an ideal binary mask (IBM) using features extracted from a mixture of near-end and far-end signals. The novelty of the proposed method lies in the use of a clustering algorithm in addition to the bidirectional recurrent neural network (BLSTM). To evaluate the EM, Mean-Shift, and K-Means clustering algorithms, the models were trained and tested on the TIMIT database. For each model, the ERLE, PESQ, and STOI metrics were calculated to characterize its quality. The EM and Mean-Shift clustering algorithms appeared to be inefficient compared to the BLSTM algorithm at a signal-to-echo ratio of 10 dB. At a signal-to-echo ratio of 6 dB, BLSTM+Mean-Shift resulted in a marginal improvement in the PESQ metric compared to the BLSTM algorithm. The experimental results show the effectiveness of the proposed BLSTM model when combined with the K-Means algorithm, compared to a pure BLSTM, for echo cancellation in double-talk scenarios. At a signal-to-echo ratio of 10 dB, the STOI metric, which characterizes speech intelligibility, improved by 7%, and the PESQ metric, which characterizes the quality of speech restoration, improved by 18.8%.
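
Two quantities named in this abstract, the ideal binary mask and ERLE (echo return loss enhancement), have simple textbook definitions that a short sketch can make concrete. This is not the paper's pipeline: the 0 dB threshold, the function names, and the toy numbers are assumptions chosen for illustration, and the mask here is computed from known target and interference powers rather than estimated by a network.

```python
import math


def ideal_binary_mask(target_power, interference_power, threshold_db=0.0):
    """Return 1 for each time-frequency cell whose local
    target-to-interference ratio exceeds threshold_db, else 0."""
    eps = 1e-12  # guards against division by zero and log of zero
    mask = []
    for t_row, i_row in zip(target_power, interference_power):
        row = []
        for t, i in zip(t_row, i_row):
            ratio_db = 10.0 * math.log10((t + eps) / (i + eps))
            row.append(1 if ratio_db > threshold_db else 0)
        mask.append(row)
    return mask


def erle_db(mic, residual):
    """ERLE in dB: microphone signal power over residual power after suppression."""
    mic_power = sum(x * x for x in mic) / len(mic)
    residual_power = sum(x * x for x in residual) / len(residual)
    return 10.0 * math.log10(mic_power / residual_power)


# Toy 2x2 power spectrograms (hypothetical values): near-end speech vs. echo.
near_end = [[4.0, 0.1], [0.5, 9.0]]
echo = [[1.0, 2.0], [0.5, 1.0]]
mask = ideal_binary_mask(near_end, echo)

# Toy echo-only segment: the suppressor scales the amplitude by 0.1.
erle = erle_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1])
```

A cell where near-end and echo power are equal sits exactly at 0 dB and is masked out under the strict comparison used here; a different threshold would shift that boundary.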

Performance

Metrics

Papers: 1,488
Citations: 68,688

No. of papers in the topic in previous years
Year	Papers
2023	24
2022	62
2021	67
2020	86
2019	77
2018	95
