
TIMIT

About: TIMIT is a research topic. Over its lifetime, 1,401 publications have been published within this topic, receiving 59,888 citations. The topic is also known as: TIMIT Acoustic-Phonetic Continuous Speech Corpus.


Papers
Posted Content
TL;DR: In this article, the authors presented three new algorithms based on solutions for the maximum feasible subsystem problem (MAX FS) that improve on the state-of-the-art in recovery of compressed speech signals.
Abstract: The goal in signal compression is to reduce the size of the input signal without a significant loss in the quality of the recovered signal. One way to achieve this goal is to apply the principles of compressive sensing, but this has not been particularly successful for real-world signals that are insufficiently sparse, such as speech. We present three new algorithms based on solutions for the maximum feasible subsystem problem (MAX FS) that improve on the state of the art in recovery of compressed speech signals: more highly compressed signals can be successfully recovered with greater quality. The new recovery algorithms deliver sparser solutions when compared with those obtained using traditional compressive sensing recovery algorithms. When tested by recovering compressively sensed speech signals in the TIMIT speech database, the recovered speech has better perceptual quality than speech recovered using traditional compressive sensing recovery algorithms.
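The reference point for the MAX FS methods above is classical compressive sensing recovery. As an illustration of that baseline (not the paper's algorithm), a minimal orthogonal matching pursuit sketch in NumPy:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily recover a k-sparse x with A @ x ~ y."""
    m, n = A.shape
    residual = y.copy()
    support = []
    x = np.zeros(n)
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares fit restricted to the chosen support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

# demo: recover a 3-sparse signal from 40 random measurements of a length-100 signal
rng = np.random.default_rng(0)
n, m, k = 100, 40, 3
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true
x_hat = omp(A, y, k)
print(np.linalg.norm(x_hat - x_true))
```

Real speech frames are only approximately sparse in a transform basis, which is exactly the regime where the paper argues MAX FS-based recovery improves on greedy baselines like this one.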
Proceedings Article
16 Jul 2006
TL;DR: This ongoing research project investigates articulatory feature (AF) classification using multiclass support vector machines (SVMs) to assess the AF classification performance of different multiclass generalizations of the SVM, including one-versus-rest, one-versus-one, Decision Directed Acyclic Graph (DDAG), and direct methods for multiclass learning.
Abstract: This ongoing research project investigates articulatory feature (AF) classification using multiclass support vector machines (SVMs). SVMs are being constructed for each AF in the multi-valued feature set (Table 1), using speech data and annotation from the IFA Dutch “Open-Source” (van Son et al. 2001) and TIMIT English (Garofolo et al. 1993) corpora. The primary objective of this research is to assess the AF classification performance of different multiclass generalizations of the SVM, including one-versus-rest, one-versus-one, Decision Directed Acyclic Graph (DDAG), and direct methods for multiclass learning. Observing the successful application of SVMs to numerous classification problems (Bennett and Campbell 2000), it is hoped that multiclass SVMs will outperform existing state-of-the-art AF classifiers. One of the most basic challenges for speech recognition and other spoken language systems is to accurately map data from the acoustic domain into the linguistic domain. Much speech processing research has approached this task by taking advantage of the correlation between phones, the basic units of speech sound, and their acoustic manifestation (intuitively, there is a range of sounds that humans would consider to be an “e”). The mapping of acoustic data to phones has been largely successful, and is used in many speech systems today. Despite its success, there are drawbacks to using phones as the point of entry from the acoustic to linguistic domains. Notably, the granularity of the “phonetic-segmental” model, in which speech is represented as a series of phones, makes it difficult to account for various subphone phenomena that affect performance on spontaneous speech. Researchers have pursued an alternative approach to the acoustic-linguistic mapping through the use of articulatory modeling. This approach more directly exploits the intimate relation between articulation and acoustics: the state of one’s speech articulators (e.g. 
vocal folds, tongue) uniquely determines the parameters of the acoustic speech signal. Unfortunately, while the mapping from articulator to acoustics is straightforward, the problem of recovering the state of the articulators from an acoustic speech representation, acoustic-to-articulatory inversion, poses a formidable challenge (Toutios and Margaritis 2003). Nevertheless, re-
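The one-versus-rest and one-versus-one strategies compared in the abstract can be sketched with scikit-learn's generic multiclass wrappers. The data here is synthetic (the paper trains on articulatory-feature labels from the IFA and TIMIT corpora), so this is an illustration of the two strategies only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
from sklearn.svm import SVC

# stand-in for AF-labeled acoustic feature vectors (4 classes, 20-dim features)
X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)

accs = {}
for name, clf in [
    ("one-vs-rest", OneVsRestClassifier(SVC(kernel="rbf", C=1.0))),  # K binary SVMs
    ("one-vs-one", OneVsOneClassifier(SVC(kernel="rbf", C=1.0))),    # K(K-1)/2 pairwise SVMs
]:
    accs[name] = clf.fit(Xtr, ytr).score(Xte, yte)
    print(f"{name}: {accs[name]:.3f}")
```

One-versus-one trains more (but smaller) binary problems than one-versus-rest; which generalization wins in practice is exactly the empirical question the project investigates.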
01 Jan 2015
TL;DR: Experimental results showed that the proposed algorithm yielded relative reductions in error rate of 24.4% and 37.3% over the baseline systems for IVIE and TIMIT, respectively.
Abstract: In this paper, we propose an algorithm to improve the performance of speaker identification systems. A baseline speaker identification system uses a scoring of a test utterance against all speakers' models; this could be termed as an evaluation at the observation level. In the proposed approach, and prior to the standard evaluation phase, an algorithm based on a frame-level evaluation is applied. The speaker identification study is conducted using the IVIE corpus and 120 randomly selected speakers from TIMIT. Mel-frequency cepstral coefficients (MFCC) and Gaussian mixture models (GMM) are the main components in state-of-the-art speaker identification systems and will be adopted in this work. Experimental results based on several systems with different training and testing conditions showed that our proposed algorithm yielded relative reductions in error rate of 24.4% and 37.3% over the baseline systems for IVIE and TIMIT, respectively. The final performances, measured by identification error rate, are 3.4% and 5.2% for the IVIE and TIMIT corpora.
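The baseline MFCC + GMM pipeline the abstract builds on can be sketched as follows. To keep the example self-contained, real MFCC extraction (typically done with a signal-processing library) is replaced by synthetic 13-dimensional frames, and the mixtures are far smaller than the ones production systems use:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
n_speakers, dim = 3, 13  # 13 MFCC-like coefficients per frame (illustrative)

# stand-in for per-speaker MFCC training frames: each speaker gets a distinct mean
means = rng.standard_normal((n_speakers, dim)) * 3
train = {s: means[s] + rng.standard_normal((500, dim)) for s in range(n_speakers)}

# one GMM per speaker (real systems use many more components, e.g. 32-1024)
models = {s: GaussianMixture(n_components=4, random_state=0).fit(X)
          for s, X in train.items()}

def identify(frames):
    # score() is the average per-frame log-likelihood; pick the best-scoring model
    return max(models, key=lambda s: models[s].score(frames))

test_frames = means[2] + rng.standard_normal((200, dim))
print(identify(test_frames))
```

Scoring every test frame against every speaker model is the "observation level" evaluation the abstract describes; the paper's contribution is an extra frame-level screening step applied before it.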
Proceedings ArticleDOI
01 Nov 2007
TL;DR: An adaptive system for voiced/unvoiced (V/UV) speech detection in the presence of background noise was implemented and the results were compared with a non-adaptive classification system and the V/UV detectors adopted by three important speech coding standards.
Abstract: The paper presents an adaptive system for voiced/unvoiced (V/UV) speech detection in the presence of background noise. Genetic algorithms were used to select the features that offer the best V/UV detection according to the output of a background noise classifier (NC) and a signal-to-noise ratio estimation (SNRE) system. The system was implemented and the tests performed using the TIMIT speech corpus and its phonetic classification. The results were compared with a non-adaptive classification system and the V/UV detectors adopted by three important speech coding standards: LPC10, ITU-T G.723.1, and ETSI AMR. In all cases the adaptive V/UV classifier outperformed the traditional solutions.
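A non-adaptive V/UV baseline of the kind the adaptive classifier is compared against can be sketched with two classic frame features, short-time energy and zero-crossing rate. The thresholds below are illustrative choices, not values from the paper:

```python
import numpy as np

def vuv_frames(signal, sr, frame_ms=25, energy_thr=0.01, zcr_thr=0.25):
    """Label fixed-size frames voiced (True) / unvoiced (False) using
    short-time energy and zero-crossing rate with fixed thresholds."""
    n = int(sr * frame_ms / 1000)
    labels = []
    for start in range(0, len(signal) - n + 1, n):
        frame = signal[start:start + n]
        energy = np.mean(frame ** 2)
        # fraction of adjacent sample pairs that change sign
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
        # voiced speech: high energy and low zero-crossing rate
        labels.append(bool(energy > energy_thr and zcr < zcr_thr))
    return labels

# toy signal: half a second of a vowel-like 120 Hz tone, then fricative-like noise
sr = 16000
t = np.arange(sr) / sr
voiced = 0.5 * np.sin(2 * np.pi * 120 * t[:sr // 2])
rng = np.random.default_rng(0)
unvoiced = 0.05 * rng.standard_normal(sr // 2)
labels = vuv_frames(np.concatenate([voiced, unvoiced]), sr)
```

Fixed thresholds like these degrade quickly as noise conditions change, which is the gap the paper's noise-classifier-driven adaptive feature selection is designed to close.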

Network Information
Related Topics (5)
- Recurrent neural network: 29.2K papers, 890K citations, 76% related
- Feature (machine learning): 33.9K papers, 798.7K citations, 75% related
- Feature vector: 48.8K papers, 954.4K citations, 74% related
- Natural language: 31.1K papers, 806.8K citations, 73% related
- Deep learning: 79.8K papers, 2.1M citations, 72% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    24
2022    62
2021    67
2020    86
2019    77
2018    95