Topic

TIMIT

About: TIMIT is a research topic. Over the lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Proceedings ArticleDOI
04 May 2014
TL;DR: Experiments show that the proposed TNMF-based methods outperform traditional NMF-based methods for separating monophonic mixtures of speech signals from known speakers.
Abstract: Owing to the non-negativity of the magnitude spectrogram of speech signals, nonnegative matrix factorization (NMF) has achieved promising performance for speech separation by independently learning a dictionary on the speech signals of each known speaker. However, traditional NMF fails to represent the mixture signals accurately because the dictionaries for the speakers are learned in the absence of mixture signals. In this paper, we propose a new transductive NMF algorithm (TNMF) that jointly learns a dictionary on both the speech signals of each speaker and the mixture signals to be separated. Since TNMF encodes the mixture signals and thus learns a more descriptive dictionary than NMF does, it significantly boosts separation performance. Experimental results on the popular TIMIT dataset show that the proposed TNMF-based methods outperform traditional NMF-based methods for separating monophonic mixtures of speech signals from known speakers.
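For context, here is a minimal sketch of the supervised-NMF separation baseline that TNMF extends, assuming the speaker and mixture magnitude spectrograms are already available as NumPy arrays; the rank, iteration count, and function names are illustrative assumptions, not taken from the paper. TNMF would additionally include the mixture frames when the dictionaries are learned, rather than fixing them beforehand.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, eps=1e-10):
    """Multiplicative-update NMF: V ~ W @ H. A supplied W is kept fixed."""
    rng = np.random.default_rng(0)
    learn_W = W is None
    if learn_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if learn_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, V_spk1, V_spk2, rank=40):
    """Learn one dictionary per speaker, then explain the mixture with both."""
    W1, _ = nmf(V_spk1, rank)
    W2, _ = nmf(V_spk2, rank)
    W = np.hstack([W1, W2])                # concatenated speaker dictionaries
    _, H = nmf(V_mix, 2 * rank, W=W)       # encode the mixture; W stays fixed
    V1, V2 = W1 @ H[:rank], W2 @ H[rank:]  # per-speaker reconstructions
    mask = V1 / (V1 + V2 + 1e-10)          # soft Wiener-like mask
    return mask * V_mix, (1 - mask) * V_mix
```

Separated waveforms would then be resynthesized by applying the masks to the complex mixture spectrogram and inverting the STFT.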

13 citations

Proceedings ArticleDOI
13 Dec 2010
TL;DR: An effective algorithm for classification of one group of phonemes, namely the unvoiced fricatives, which are characterized by a relatively large amount of spectral energy in the high frequency range is presented.
Abstract: Classification of phonemes is the process of assigning a phonetic category to a short section of speech signal. It is a key stage in various applications such as Spoken Term Detection, continuous speech recognition, and music-to-lyrics synchronization, but it can also be useful on its own, for example in the professional music industry and in applications for the hearing impaired. In this study we present an effective algorithm for classifying one group of phonemes, namely the unvoiced fricatives, which are characterized by a relatively large amount of spectral energy in the high frequency range. Classification between individual phonemes within this group is fairly difficult because their acoustic-phonetic characteristics are quite similar. A three-stage classification algorithm for the unvoiced fricatives is utilized. In the first, preprocessing stage, each phoneme segment is divided into consecutive non-overlapping short windowed frames, each of which is represented by a 15-dimensional feature vector. In the second stage a support vector machine (SVM) is trained, using a radial basis kernel function and an automatic grid search to optimize the SVM parameters. A tree-based algorithm is used in the classification stage, where the phonemes are first classified into two subgroups according to their articulation: the sibilants (/s/ and /sh/) and the non-sibilants (/f/ and /th/). Each subgroup is then further classified using another SVM. To evaluate the performance of the algorithm we used more than 11000 phonemes extracted from the TIMIT speech database. Using a majority vote over the feature vectors of the same phoneme, an overall accuracy of 85% is obtained (91% for the subset /s/, /sh/ and /f/). These results are comparable to, and in some cases better than, those achieved in other studies. The efficiency and robustness of the algorithm make it implementable in real-time applications for the hearing impaired or in recording studios.
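A hedged sketch of the tree-structured SVM stages in scikit-learn follows; the 15-dimensional frame features are assumed to be precomputed, and the grid-search ranges are placeholders rather than the paper's actual values.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Placeholder RBF grid; the paper's actual search range is not specified here.
PARAM_GRID = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}

def fit_rbf_svm(X, y):
    """Grid-search an RBF-kernel SVM, mirroring the paper's second stage."""
    search = GridSearchCV(SVC(kernel="rbf"), PARAM_GRID, cv=5)
    search.fit(X, y)
    return search.best_estimator_

def train_tree(X, y):
    """X: per-frame 15-dim feature vectors; y: labels in {'s','sh','f','th'}."""
    sib = np.isin(y, ["s", "sh"])
    top = fit_rbf_svm(X, sib)                # sibilant vs. non-sibilant
    sib_svm = fit_rbf_svm(X[sib], y[sib])    # /s/ vs. /sh/
    non_svm = fit_rbf_svm(X[~sib], y[~sib])  # /f/ vs. /th/
    return top, sib_svm, non_svm

def classify_segment(frames, top, sib_svm, non_svm):
    """Classify each frame down the tree, then majority-vote over the segment."""
    labels = np.where(top.predict(frames),
                      sib_svm.predict(frames),
                      non_svm.predict(frames))
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]
```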

13 citations

Posted Content
TL;DR: In this article, an unsupervised algorithm based on sequence prediction models such as Markov chains and recurrent neural networks is proposed, which hypothesizes phoneme boundaries from the error profile of a model trained to predict speech features frame-by-frame.
Abstract: Phonemic segmentation of speech is a critical step of speech recognition systems. We propose a novel unsupervised algorithm based on sequence prediction models such as Markov chains and recurrent neural networks. Our approach consists of analyzing the error profile of a model trained to predict speech features frame-by-frame. Specifically, we try to learn the dynamics of speech in the MFCC space and hypothesize boundaries from local maxima in the prediction error. We evaluate our system on the TIMIT dataset, with improvements over similar methods.
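A minimal sketch of the error-profile idea follows, with an ordinary least-squares frame predictor standing in for the paper's Markov-chain or recurrent models; the MFCC frames are assumed precomputed, and the context and peak-spacing values are illustrative.

```python
import numpy as np
from scipy.signal import find_peaks

def boundary_hypotheses(mfcc, context=3, min_gap=5):
    """mfcc: (n_frames, n_coeffs) array. Predict each frame from the previous
    `context` frames with least squares, then take local maxima of the
    prediction error as phoneme-boundary candidates."""
    n = len(mfcc)
    X = np.hstack([mfcc[i:n - context + i] for i in range(context)])
    Y = mfcc[context:]
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # linear frame predictor
    err = np.linalg.norm(X @ W - Y, axis=1)    # per-frame prediction error
    peaks, _ = find_peaks(err, distance=min_gap)
    return peaks + context                     # back to original frame indices
```

An RNN variant would simply replace the linear predictor; the peak-picking step over the error profile is unchanged.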

13 citations

Posted Content
TL;DR: It is shown that a large improvement in the accuracy of deep speech models can be achieved with effective Neural Architecture Optimization at a very low computational cost.
Abstract: Deep neural networks (DNNs) have been demonstrated to outperform many traditional machine learning algorithms in Automatic Speech Recognition (ASR). In this paper, we show that a large improvement in the accuracy of deep speech models can be achieved with effective Neural Architecture Optimization at very low computational cost. Recognition tests on the popular LibriSpeech and TIMIT benchmarks support this claim: novel candidate models can be discovered and trained within a few hours (less than a day), many times faster than attention-based seq2seq models. Our method achieves a test error of 7% Word Error Rate (WER) on the LibriSpeech corpus and 13% Phone Error Rate (PER) on the TIMIT corpus, on par with state-of-the-art results.

13 citations

Proceedings ArticleDOI
01 May 2017
TL;DR: Four systems based on different speech features were combined at the score level to improve verification accuracy under clean and noisy speech conditions, reducing equal error rates by up to 44% in some cases.
Abstract: Many methods have been proposed for speaker verification that provide good results, but their performance degrades in real noisy environments. A common approach to partially alleviate this problem is the fusion of several methods. In this paper, four systems based on different speech features, i.e., MFCC, IMFCC, LFCC, and PNCC, were combined at the score level to improve verification accuracy under clean and noisy speech conditions. Pairwise and four-way fusions of the features were evaluated in a speaker verification system that models speakers with Gaussian mixture models (GMMs). The TIMIT and NOISEX92 databases were used as the speech and noise datasets, respectively. The experimental results show that score-level fusion of different feature vectors enhances the accuracy of the speaker verification system, reducing equal error rates by up to 44% in some cases.
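A hedged sketch of GMM-based score-level fusion is shown below, assuming the per-stream frame features (MFCC, IMFCC, LFCC, PNCC) are already extracted; the model size and uniform weights are placeholder assumptions, and in practice per-stream scores are usually normalized before fusing.

```python
from sklearn.mixture import GaussianMixture

def train_stream_models(features_by_speaker, n_components=64):
    """One GMM per enrolled speaker for a single feature stream."""
    return {spk: GaussianMixture(n_components, covariance_type="diag").fit(X)
            for spk, X in features_by_speaker.items()}

def fused_score(test_streams, models_per_stream, claimed_speaker, weights=None):
    """Score-level fusion: weighted sum of per-stream average log-likelihoods.
    test_streams:      {"mfcc": frames, "lfcc": frames, ...} for one utterance
    models_per_stream: {"mfcc": {speaker: gmm, ...}, ...}"""
    names = list(test_streams)
    w = weights or {name: 1.0 / len(names) for name in names}
    return sum(w[name] *
               models_per_stream[name][claimed_speaker].score(test_streams[name])
               for name in names)
```

The fused score would then be compared against a threshold, with the equal error rate read off at the operating point where false acceptances equal false rejections.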

13 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations (76% related)
Feature (machine learning): 33.9K papers, 798.7K citations (75% related)
Feature vector: 48.8K papers, 954.4K citations (74% related)
Natural language: 31.1K papers, 806.8K citations (73% related)
Deep learning: 79.8K papers, 2.1M citations (72% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95