Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1401 publications have appeared within this topic, receiving 59888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Papers published on a yearly basis (chart; x-axis years 1989–2023)

Papers

Proceedings Article•DOI•

Speaker-independent voiced-stop-consonant recognition using a block-windowed neural network architecture

B.D. Bryant¹, J.N. Gowdy¹•Institutions (1)

Clemson University¹

07 Mar 1993

TL;DR: The authors study several of the more well-known connectionist models, and how they address the time and frequency variability of the multispeaker, voiced-stop-consonant recognition task.

Abstract: The authors study several of the more well-known connectionist models, and how they address the time and frequency variability of the multispeaker, voiced-stop-consonant recognition task. Among the network architectures reviewed or tested were the self-organizing feature map (SOFM) architecture and various derivatives of it, the time-delay neural network (TDNN) architecture and various derivatives of it, and two frequency-and-time-shift-invariant architectures: the frequency-shift-invariant TDNN (FTDNN) and the block-windowed neural network (BWNN). Voiced-stop speech was extracted from up to four dialect regions of the TIMIT continuous speech corpus for subsequent preprocessing and for training and testing of network instances. Various feature representations were tested for their robustness in representing the voiced-stop consonants.
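
To make the time-shift idea concrete, here is a minimal sketch; it is not the authors' networks. A single time-delay unit applies the same weights to every window of consecutive frames, so shifting the input in time shifts the activations by the same amount, and taking the maximum over time removes the dependence on position. The function names, the ReLU nonlinearity, and the toy frames are all assumptions made for illustration.

```python
def time_delay_unit(frames, weights, bias=0.0):
    """Apply one time-delay unit to a sequence of feature frames.

    frames  : list of T feature vectors, each of length D
    weights : list of K weight vectors (length D), one per delayed frame
    Returns T - K + 1 activations, one per window position.
    The ReLU used here is an assumption, not taken from the paper.
    """
    k = len(weights)
    activations = []
    for t in range(len(frames) - k + 1):
        total = bias
        for delay in range(k):
            total += sum(w * x for w, x in zip(weights[delay], frames[t + delay]))
        activations.append(max(0.0, total))
    return activations


def max_over_time(activations):
    """Pool over time so the result no longer depends on where the pattern sits."""
    return max(activations)


# Hypothetical 2-dimensional frames: the same pattern, early and two frames later.
pattern_early = [[0, 0], [1, 2], [3, 1], [0, 0], [0, 0], [0, 0]]
pattern_late = [[0, 0], [0, 0], [0, 0], [1, 2], [3, 1], [0, 0]]
weights = [[1, 0], [0, 1]]  # hypothetical kernel spanning two consecutive frames

early = time_delay_unit(pattern_early, weights)
late = time_delay_unit(pattern_late, weights)
```

In this toy case the late activations are the early ones delayed by two positions, and both sequences have the same maximum. Invariance to frequency shifts needs weight sharing along the frequency axis as well, which this sketch does not attempt.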

2 citations

Proceedings Article•DOI•

Dynamic evidence models in a DBN phone recognizer.

William Schuler¹, Timothy A. Miller¹, Stephen Wu¹, Andrew Exley¹•Institutions (1)

University of Minnesota¹

17 Sep 2006

TL;DR: Results on the standard TIMIT phone recognition task show this CRF evidence model, even with a relatively simple first-order feature set, is competitive with standard HMMs and DBN variants using static Gaussian mixture models on MFCC features.

Abstract: This paper describes an implementation of a discriminative acoustical model – a Conditional Random Field (CRF) – within a Dynamic Bayes Net (DBN) formulation of a Hierarchic Hidden Markov Model (HHMM) phone recognizer. This CRF-DBN topology accounts for phone transition dynamics in conditional probability distributions over random variables associated with observed evidence, and therefore has less need for hidden variable states corresponding to transitions between phones, leaving more hypothesis space available for modeling higher-level linguistic phenomena such as syntax and semantics. The model also has the interesting property that it explicitly represents likely formant trajectories and formant targets of modeled phones in its random variable distributions, making it more linguistically transparent than models based on traditional HMMs with conditionally independent evidence variables. Results on the standard TIMIT phone recognition task show that this CRF evidence model, even with a relatively simple first-order feature set, is competitive with standard HMMs and DBN variants using static Gaussian mixture models on MFCC features.

2 citations

Proceedings Article•DOI•

A simple perceptual method for quantizing wavelet packet coefficients of wideband speech

Omid Ghahabi¹, Mohammad H. Savoji¹•Institutions (1)

Shahid Beheshti University¹

03 Dec 2010

TL;DR: The results on 500 TIMIT files show that this method, based on some basic perceptual considerations, achieves about a 15–35% reduction in the average bit rate with almost the same or even better perceptual quality.

Abstract: In this paper, an efficient, low-complexity perceptual method is proposed for quantizing the wavelet packet coefficients of high-quality speech signals. The performance of the proposed method is compared, using the same codec, with the case where all coefficients are quantized using a fixed number of bits. The results on 500 TIMIT files show that this method, based on some basic perceptual considerations, achieves about a 15–35% reduction in the average bit rate with almost the same or even better perceptual quality.
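
The comparison described above, a fixed number of bits for every coefficient versus a band-dependent allocation, can be illustrated with a short sketch. It is not the paper's codec or its perceptual rule: the uniform quantizer, the band sizes, and the per-band bit counts are hypothetical and serve only to show how the average bit rate and its relative reduction are computed.

```python
def uniform_quantize(x, bits, x_max=1.0):
    """Uniform mid-rise quantizer: clamp x to [-x_max, x_max] and use 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 * x_max / levels
    clamped = min(max(x, -x_max), x_max)
    index = min(int((clamped + x_max) / step), levels - 1)
    return -x_max + (index + 0.5) * step


def average_bits(band_sizes, bits_per_band):
    """Average number of bits per coefficient over all sub-bands."""
    total_bits = sum(n * b for n, b in zip(band_sizes, bits_per_band))
    return total_bits / sum(band_sizes)


def bit_rate_reduction(band_sizes, fixed_bits, bits_per_band):
    """Relative saving of a per-band allocation over a fixed allocation."""
    return 1.0 - average_bits(band_sizes, bits_per_band) / fixed_bits


# Hypothetical numbers: four sub-bands, fewer bits given to the larger bands.
band_sizes = [64, 64, 128, 256]
fixed_bits = 8
perceptual_bits = [8, 7, 6, 5]

avg = average_bits(band_sizes, perceptual_bits)
saving = bit_rate_reduction(band_sizes, fixed_bits, perceptual_bits)
```

A real perceptual scheme would derive the per-band bit counts from a hearing model, and the quality side of the comparison needs listening tests or an objective quality measure, neither of which is modeled here.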

2 citations

Journal Article•DOI•

Speech enhancement using U-nets with wide-context units

Tomasz Grzywalski, Szymon Drgas

09 Mar 2022-Multimedia Tools and Applications

2 citations

Proceedings Article•DOI•

Syllable nucleus Durations Estimation using Linear Regression based ensemble model

Jingli Lu¹, Ruili Wang¹, Liyanage C. De Silva¹, Yang Gao²•Institutions (2)

Massey University¹, Nanjing University²

19 Apr 2009

TL;DR: The paper proposes an interval-data-based Linear Regression Model for syllable nucleus Durations Estimation (LRM-DE) that treats syllable boundary time-marks in pairs, which makes it more suitable for estimating syllable durations in English sentences and hence useful for sentence stress detection.

Abstract: Unlike conventional automatic continuous speech segmentation models that deal with each boundary time-mark individually, in this paper we propose an interval-data-based Linear Regression Model for syllable nucleus Durations Estimation (LRM-DE), which treats syllable boundary time-marks in pairs. This characteristic of LRM-DE makes it more suitable for estimating syllable durations for English sentences, which can be used for sentence stress detection. LRM-DE combines the outcomes of multiple base automatic speech segmentation machines (ASMs) to generate final boundary time-marks that minimize the average distance between the predicted and reference boundary pairs of syllable nuclei. Experimental results show that, on the TIMIT dataset, LRM-DE reduces the average difference between the predicted syllable nucleus durations and their reference ones from 13.64 ms (the best result of a single ASM) to 11.81 ms. LRM-DE also improves the syllable nucleus segmentation accuracy from 81.59% to 83.98% within a tolerance of 20 ms.
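
The two measures quoted above, average duration difference and boundary accuracy within a tolerance, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. The combination step uses plain averaging of the base segmenters' boundary pairs as a simple stand-in for the learned linear regression, and all time-marks are hypothetical values in milliseconds.

```python
def combine_by_averaging(base_outputs):
    """Average the (start, end) pairs proposed by several base segmenters.

    base_outputs: one list of (start_ms, end_ms) pairs per segmenter, aligned by nucleus.
    Plain averaging is a stand-in for the paper's regression, not the method itself.
    """
    n = len(base_outputs)
    combined = []
    for pairs in zip(*base_outputs):
        start = sum(p[0] for p in pairs) / n
        end = sum(p[1] for p in pairs) / n
        combined.append((start, end))
    return combined


def mean_duration_error(predicted, reference):
    """Mean absolute difference between predicted and reference durations (ms)."""
    errors = [abs((p_end - p_start) - (r_end - r_start))
              for (p_start, p_end), (r_start, r_end) in zip(predicted, reference)]
    return sum(errors) / len(errors)


def boundary_accuracy(predicted, reference, tolerance_ms=20.0):
    """Fraction of boundaries lying within tolerance_ms of their reference."""
    hits = 0
    total = 0
    for (p_start, p_end), (r_start, r_end) in zip(predicted, reference):
        hits += abs(p_start - r_start) <= tolerance_ms
        hits += abs(p_end - r_end) <= tolerance_ms
        total += 2
    return hits / total


# Hypothetical time-marks for two syllable nuclei.
predicted = [(100, 180), (300, 390)]
reference = [(110, 185), (295, 420)]

error_ms = mean_duration_error(predicted, reference)
accuracy = boundary_accuracy(predicted, reference)
merged = combine_by_averaging([[(100, 180)], [(120, 200)]])
```

Treating boundaries in pairs matters because a duration depends on both ends of the interval: two boundaries that are each slightly off in opposite directions can still produce a large duration error.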

2 citations

Metrics

Papers: 1,488
Citations: 68,688

No. of papers in the topic in previous years
Year	Papers
2023	24
2022	62
2021	67
2020	86
2019	77
2018	95