Topic

TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published on this topic, receiving 59,888 citations. The topic is also known as the TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Papers

Proceedings Article

An initial attempt for phoneme recognition using Structured Support Vector Machine (SVM)

Hao Tang¹, Chao-hong Meng¹, Lin-Shan Lee¹

National Taiwan University¹

14 Mar 2010

TL;DR: An initial attempt for phoneme recognition using structured SVM is presented, which was able to offer an absolute performance improvement of 1.33% over HMMs even with a highly simplified initial approach, probably because of the concept of maximized margin of SVM.

Abstract: Structured Support Vector Machine (SVM) is a recently developed extension of the very successful SVM approach, which can efficiently classify structured patterns with maximized margin. This paper presents an initial attempt for phoneme recognition using structured SVM. We simply learn the basic framework of HMMs in configuring the structured SVM. In the preliminary experiments with the TIMIT corpus, the proposed approach was able to offer an absolute performance improvement of 1.33% over HMMs even with a highly simplified initial approach, probably because of the concept of maximized margin of SVM. We see the potential of this approach because of the high generality, high flexibility, and high power of structured SVM.

16 citations

Proceedings Article

Automatic Speaker Segmentation using Multiple Features and Distance Measures: A Comparison of Three Approaches

M. Kotti, Luis Gustavo Martins, Emmanouil Benetos¹, Jaime S. Cardoso, Constantine Kotropoulos¹

Aristotle University of Thessaloniki¹

09 Jul 2006

TL;DR: This paper addresses unsupervised speaker change detection by testing three systems based on the Bayesian information criterion (BIC), including a real-time, metric-based approach that uses line spectral pairs and the BIC to validate a potential speaker change point.

Abstract: This paper addresses the problem of unsupervised speaker change detection. Three systems based on the Bayesian Information Criterion (BIC) are tested. The first system investigates the AudioSpectrumCentroid and the AudioWaveformEnvelope features, implements dynamic thresholding followed by a fusion scheme, and finally applies BIC. The second method is a real-time one that uses a metric-based approach employing the line spectral pairs and the BIC to validate a potential speaker change point. The third method consists of three modules. In the first module, a measure based on second-order statistics is used; in the second module, the Euclidean distance and Hotelling's T² statistic are applied; and in the third module, the BIC is utilized. The experiments are carried out on a dataset created by concatenating speakers from the TIMIT database, which is referred to as the TIMIT data set. The performance of the three systems is compared using t-statistics.

16 citations
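
Illustrative sketch (not taken from the paper): the BIC test that the three systems in the entry above rely on can be pictured as comparing one Gaussian fitted to a whole window against two Gaussians fitted to its two halves. The version below assumes diagonal-covariance Gaussians and synthetic feature frames, both simplifications chosen for brevity; `delta_bic`, `make_frames`, and `penalty_weight` are hypothetical names, not identifiers from the paper.

```python
import math
import random


def _log_var_sum(frames):
    """Sum over dimensions of the log of the biased per-dimension variance."""
    n = len(frames)
    dims = len(frames[0])
    total = 0.0
    for j in range(dims):
        column = [frame[j] for frame in frames]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        total += math.log(var + 1e-10)  # small floor avoids log(0)
    return total


def delta_bic(left, right, penalty_weight=1.0):
    """Delta-BIC for 'left and right come from different Gaussians'.

    Assumption: each segment is modelled by one diagonal-covariance Gaussian.
    """
    both = left + right
    n_left, n_right, n = len(left), len(right), len(both)
    dims = len(both[0])
    # Log-likelihood gain from using two models instead of one.
    gain = 0.5 * (
        n * _log_var_sum(both)
        - n_left * _log_var_sum(left)
        - n_right * _log_var_sum(right)
    )
    # Extra parameters of the second model: one mean and one variance per dimension.
    n_params = 2 * dims
    penalty = 0.5 * penalty_weight * n_params * math.log(n)
    return gain - penalty


def make_frames(rng, mean, count, dims=4):
    """Synthetic stand-in for real acoustic feature frames (hypothetical data)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dims)] for _ in range(count)]


rng = random.Random(0)
same = delta_bic(make_frames(rng, 0.0, 200), make_frames(rng, 0.0, 200))
different = delta_bic(make_frames(rng, 0.0, 200), make_frames(rng, 3.0, 200))
print("same speaker:", same)
print("different speakers:", different)
```

A positive value favours the two-speaker hypothesis, so a candidate change point would be accepted; a negative value favours a single speaker.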

Journal Article

Autoregressive modeling of speech trajectory transformed to the reconstructed phase space for ASR purposes

Yasser Shekofteh¹, Farshad Almasganj¹

Amirkabir University of Technology¹

01 Dec 2013 - Digital Signal Processing

TL;DR: This paper proposes a new method for extracting features from the trajectory of the speech signal in the reconstructed phase space (RPS) using the multivariate autoregressive (MVAR) method, with linear discriminant analysis (LDA) applied for dimension reduction.

16 citations

Journal Article

Hybrid orthogonal projection and estimation (HOPE): a new framework to learn neural networks

Shiliang Zhang¹, Hui Jiang², Li-Rong Dai¹

University of Science and Technology of China¹, York University²

01 Jan 2016 - Journal of Machine Learning Research

TL;DR: Experimental results have shown that the HOPE framework yields significant performance gains over the current state-of-the-art methods in various types of NN learning problems, including unsupervised feature learning, supervised or semi-supervised learning.

Abstract: In this paper, we propose a novel model for high-dimensional data, called the Hybrid Orthogonal Projection and Estimation (HOPE) model, which combines a linear orthogonal projection and a finite mixture model under a unified generative modeling framework. The HOPE model itself can be learned unsupervised from unlabelled data based on maximum likelihood estimation as well as discriminatively from labelled data. More interestingly, we have shown that the proposed HOPE models are closely related to neural networks (NNs) in the sense that each hidden layer can be reformulated as a HOPE model. As a result, the HOPE framework can be used as a novel tool to probe why and how NNs work and, more importantly, to learn NNs in either supervised or unsupervised ways. In this work, we have investigated the HOPE framework to learn NNs for several standard tasks, including image recognition on MNIST and speech recognition on TIMIT. Experimental results have shown that the HOPE framework yields significant performance gains over the current state-of-the-art methods in various types of NN learning problems, including unsupervised feature learning, supervised or semi-supervised learning.

16 citations

Proceedings Article

Using Mutual Information to design class-specific phone recognizers

Patricia Scanlon¹, Daniel P. W. Ellis¹, Richard B. Reilly²

Columbia University¹, University College Dublin²

01 Jan 2003

TL;DR: This work uses Mutual Information as a measure of the usefulness of individual time-frequency cells for various speech classification tasks and shows that selecting input features according to the mutual information criterion can provide a significant increase in classification accuracy.

Abstract: Information concerning the identity of subword units such as phones cannot easily be pinpointed because it is broadly distributed in time and frequency. Continuing earlier work, we use Mutual Information as a measure of the usefulness of individual time-frequency cells for various speech classification tasks, using the hand-annotations of the TIMIT database as our ground truth. Since different broad phonetic classes such as vowels and stops have such different temporal characteristics, we examine mutual information separately for each class, revealing structure that was not uncovered in earlier work; further structure is revealed by aligning the time-frequency displays of each phone at the center of their hand-marked segments, rather than averaging across all possible alignments within each segment. Based on these results, we evaluate a range of vowel classifiers over the TIMIT test set and show that selecting input features according to the mutual information criterion can provide a significant increase in classification accuracy.

16 citations
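
Illustrative sketch (not the authors' procedure): ranking candidate input features by their mutual information with a class label, as the entry above describes, could look roughly like this for discrete values. The binary quantization of each time-frequency cell, the cell names, and the toy labels are all hypothetical; `mutual_information` is a name introduced here for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


# Toy ground-truth labels standing in for hand annotations (hypothetical data).
labels = ["vowel", "vowel", "stop", "stop", "vowel", "stop", "vowel", "stop"]

# Each "cell" is a quantized time-frequency value per example (hypothetical data).
cells = {
    "cell_a": [1, 1, 0, 0, 1, 0, 1, 0],  # tracks the label exactly
    "cell_b": [1, 0, 1, 0, 0, 0, 1, 1],  # statistically independent of the label
}

scores = {name: mutual_information(values, labels) for name, values in cells.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranked)
```

Keeping only the top-ranked cells as classifier inputs is the selection step; with real continuous features, the estimate would depend on how the values are binned.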

Performance Metrics

Papers: 1,488
Citations: 68,688

No. of papers in the topic in previous years
Year	Papers
2023	24
2022	62
2021	67
2020	86
2019	77
2018	95
