Topic
TIMIT
About: TIMIT is a research topic. Over the lifetime, 1401 publications have been published within this topic receiving 59888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.
Papers published on a yearly basis
Papers
••
08 Dec 2008
TL;DR: Two methods for adding a confidence measure (CM) to binary SVM outputs using trainable intelligent systems are described: the first simulates Platt's method with a neural network, while the second is a linear combination of Platt sigmoid functions using a multi-layer perceptron.
Abstract: Although the recognition results of support vector machines are promising in many applications, there is a gap between the accuracy of SVM-based speech recognizers and that of time-series models (e.g., HMMs). The main reason is the lack of a reliable confidence measure (CM) in SVM outputs. This paper describes two methods for adding a CM to binary SVM outputs using trainable intelligent systems. The first method simulates Platt's method with a neural network, while the second is a linear combination of Platt sigmoid functions using a multi-layer perceptron. Experiments on a set of confusable phonemes from the TIMIT corpus show that the second method outperforms the first: after rejecting 20% of classifications by CM, the error rates for the "/p/,/t/", "/p/,/q/" and "/t/,/q/" phoneme pairs are 3.86%, 2.1% and 0.6% respectively, whereas the error rate is much higher without the neural networks. However, as the number of phonemes increases, the performance of the second method approaches that of the first.
2 citations
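The standard Platt calibration that the first method above simulates with a neural network can be sketched directly: fit a two-parameter sigmoid mapping raw SVM decision values to posterior probabilities. A minimal sketch, assuming plain gradient descent on the negative log-likelihood and illustrative toy data (the paper's neural-network variants are not reproduced here):

```python
import numpy as np

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit Platt's sigmoid P(y=1|f) = 1 / (1 + exp(A*f + B)) to raw SVM
    decision values f by gradient descent on the negative log-likelihood."""
    f = np.asarray(scores, float)
    y = np.asarray(labels, float)      # labels in {0, 1}
    A, B = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(A * f + B))
        gA = np.mean((y - p) * f)      # dNLL/dA
        gB = np.mean(y - p)            # dNLL/dB
        A -= lr * gA
        B -= lr * gB
    return A, B

# toy data: the positive class has larger decision values
scores = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
labels = np.array([0, 0, 0, 1, 1, 1])
A, B = fit_platt(scores, labels)
prob = 1.0 / (1.0 + np.exp(A * 1.5 + B))   # calibrated P(y=1 | f=1.5)
```

The calibrated probability `prob` for a strongly positive decision value should be close to 1; rejecting classifications whose calibrated confidence falls below a threshold yields the CM-based rejection used in the experiments.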
••
01 Aug 2016
TL;DR: A novel two-layer decision model based on noise classification is proposed to detect voice activity robustly; experimental results show that the method outperforms a global classifier, especially in low-SNR conditions.
Abstract: In general, the performance of endpoint detection is affected by noise. In this paper, we propose a novel two-layer decision model based on noise classification to detect voice activity robustly. The training process contains two main steps: first, we use the NOISEX-92 database, which consists of different types of pure noise, to train a BP neural network to classify the noise type precisely; second, we train a BP neural network for each noise type covering a large range of signal-to-noise ratios (SNR). In the testing phase, we assume that the short period of silence at the beginning of the signal contains noise features and use them to determine the noise type. Then, we apply the classifier corresponding to that noise type to detect voice activity. We conduct experiments on the TIMIT corpus for 5 noise types under 7 SNR conditions. Experimental results show that our method outperforms a global classifier, especially in low-SNR conditions.
2 citations
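The two-layer decision above (classify the noise type from the leading silence frames, then dispatch to a detector trained for that noise type) can be sketched with placeholders. This is a minimal sketch only: the nearest-centroid noise classifier, the per-frame feature, and the threshold detectors stand in for the paper's BP neural networks.

```python
import numpy as np

def classify_noise(leading_frames, centroids):
    """Layer 1: pick the noise type whose feature centroid is nearest
    to the mean feature of the leading silence frames."""
    mu = leading_frames.mean(axis=0)
    dists = {name: np.linalg.norm(mu - c) for name, c in centroids.items()}
    return min(dists, key=dists.get)

def detect_voice(frames, detectors, noise_type):
    """Layer 2: apply the detector trained for that noise type."""
    return detectors[noise_type](frames)

# toy setup: a 1-D "feature" per frame, one threshold detector per noise type
centroids = {"white": np.array([0.0]), "babble": np.array([3.0])}
detectors = {
    "white":  lambda x: x[:, 0] > 1.5,   # threshold tuned for white noise
    "babble": lambda x: x[:, 0] > 4.5,   # higher threshold for babble
}

frames = np.array([[0.1], [0.2], [2.5], [2.8], [0.1]])
noise = classify_noise(frames[:2], centroids)      # leading frames = noise only
activity = detect_voice(frames, detectors, noise)  # per-frame voice decisions
```

The key design point survives the simplification: the second-layer detector is specialized to the detected noise type, rather than being a single global classifier over all noise conditions.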
••
07 Mar 2000
TL;DR: Simulation results for classifying the utterances show that the required size of the BDRNN is very small compared to multilayer perceptron networks with time-delayed feedforward connections.
Abstract: The objective of this paper is to recognize speech based on speech prediction techniques using a discrete-time recurrent neural network (DTRNN) with a block-diagonal feedback weight matrix, called the block diagonal recurrent neural network (BDRNN). The ability of this network has been investigated for the TIMIT isolated digits spoken by a representative speaker. Simulation results for classifying the utterances show that the required size of the BDRNN is very small compared to multilayer perceptron networks with time-delayed feedforward connections.
2 citations
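The defining constraint of the BDRNN is that the recurrent (feedback) weight matrix is block-diagonal rather than dense, which sharply reduces the number of trainable weights. A minimal sketch of that structure and one recurrence step, with illustrative block sizes and a tanh nonlinearity assumed:

```python
import numpy as np

def block_diagonal(blocks):
    """Assemble a block-diagonal matrix from a list of square blocks."""
    n = sum(b.shape[0] for b in blocks)
    W = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        W[i:i + k, i:i + k] = b
        i += k
    return W

# feedback weights constrained to 2x2 blocks; three blocks give a
# 6-unit hidden state with only 12 recurrent weights instead of 36
blocks = [np.array([[0.5, -0.3], [0.3, 0.5]]) for _ in range(3)]
W_fb = block_diagonal(blocks)                       # 6x6 block-diagonal matrix
W_in = np.random.default_rng(0).normal(size=(6, 1)) * 0.1

def step(h, x):
    """One DTRNN recurrence: h[t] = tanh(W_fb @ h[t-1] + W_in @ x[t])."""
    return np.tanh(W_fb @ h + W_in @ x)

h = np.zeros(6)
h = step(h, np.array([1.0]))   # hidden state after one input sample
```

Because cross-block feedback entries are fixed at zero, both the parameter count and the per-step cost scale with the block size rather than the full hidden dimension, which is why the paper can report a very small network.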
••
TL;DR: This work proposes a novel noise-invariant speech enhancement method that manipulates the latent features to distinguish between speech and noise features in the intermediate layers using an adversarial training scheme, and offers a more robust noise-invariance property than conventional speech enhancement techniques.
Abstract: Most of the recently proposed deep learning-based speech enhancement techniques have focused on designing the neural network architecture as a black box. However, it is often beneficial to understand what kinds of hidden representations the model has learned. Since real-world speech data are drawn from a generative process involving multiple entangled factors, disentangling the speech factor can encourage the trained model to achieve better speech enhancement performance. Motivated by the recent success in learning disentangled representations with neural networks, we explore a framework for disentangling speech and noise, which has not been exploited in conventional speech enhancement algorithms. In this work, we propose a novel noise-invariant speech enhancement method that manipulates the latent features to distinguish between speech and noise features in the intermediate layers using an adversarial training scheme. To compare the performance of the proposed method with other conventional algorithms, we conducted experiments in both matched and mismatched noise conditions using the TIMIT and TSPspeech datasets. Experimental results show that our model successfully disentangles the speech and noise latent features. Consequently, the proposed model not only achieves better enhancement performance but also offers a more robust noise-invariance property than conventional speech enhancement techniques.
2 citations
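One common way to realize the adversarial scheme described above is a gradient reversal layer: a discriminator tries to predict the noise label from the latent features, while the encoder receives the sign-flipped gradient so the latent hides that label. The sketch below is an assumption about the mechanism, not the paper's exact architecture; the tiny linear encoder/discriminator and BCE loss are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(4, 8)) * 0.1   # encoder: 8-dim input -> 4-dim latent
W_dis = rng.normal(size=(1, 4)) * 0.1   # discriminator predicts a noise label

def forward(x):
    z = np.tanh(W_enc @ x)              # latent features
    logit = W_dis @ z                   # discriminator output
    return z, logit

def adversarial_grads(x, noise_label, lam=1.0):
    """Discriminator descends its BCE loss; the encoder receives the
    REVERSED gradient, pushing the latent to hide the noise label."""
    z, logit = forward(x)
    p = 1.0 / (1.0 + np.exp(-logit))
    dlogit = p - noise_label            # grad of BCE w.r.t. the logit
    g_dis = np.outer(dlogit, z)         # discriminator update direction
    dz = W_dis.T @ dlogit               # backprop into the latent
    dpre = dz * (1.0 - z ** 2)          # through the tanh
    g_enc = -lam * np.outer(dpre, x)    # sign flipped: gradient reversal
    return g_enc, g_dis
```

Scaling `lam` trades off how strongly the encoder is pushed toward noise-invariance against the enhancement objective (which is omitted from this sketch).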
••
15 Jun 2011
TL;DR: This paper investigates improvements in phoneme classification and recognition using an ensemble of small multi-layer perceptrons (MLPs) instead of a large monolithic MLP.
Abstract: In this paper we investigate improvements in phoneme classification and recognition using an ensemble of small multi-layer perceptrons (MLPs) instead of a large monolithic MLP. The ensemble members adopt different input context spans. The ensemble is trained with the AdaBoost algorithm, and the output posteriors are combined according to two combination rules, one static and one adaptive: weighting based on static classifier error, and inverse-entropy weighting. The proposed method improves accuracy without increasing the total number of connection weights. Experimental results on the TIMIT corpus show promising improvements in phoneme classification and recognition rates.
2 citations
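The adaptive rule mentioned above, inverse-entropy weighting, gives more weight to ensemble members whose posterior distributions are confident (low entropy). A minimal sketch with illustrative toy posteriors:

```python
import numpy as np

def inverse_entropy_combine(posteriors, eps=1e-12):
    """Combine per-classifier posterior vectors, weighting each MLP by
    the inverse entropy of its output (confident experts count more)."""
    P = np.asarray(posteriors, float)          # shape: (n_experts, n_classes)
    H = -np.sum(P * np.log(P + eps), axis=1)   # entropy of each expert
    w = 1.0 / (H + eps)
    w /= w.sum()                               # normalize the weights
    return w @ P                               # weighted combined posterior

# a confident expert and an uncertain one
p1 = [0.90, 0.05, 0.05]
p2 = [0.40, 0.30, 0.30]
combined = inverse_entropy_combine([p1, p2])
```

The confident expert dominates the combination, so the combined posterior stays sharply peaked on its predicted class while remaining a valid distribution.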