Proceedings ArticleDOI

Robust pitch tracking in the car environment

TLDR
Four pitch tracking algorithms (autocorrelation, cepstrum, harmonic product spectrum, and a new method based on the modulation spectrum) are compared, and their fitness for noisy environments is investigated.
Abstract
In this paper we compare four pitch tracking algorithms (autocorrelation, cepstrum, harmonic product spectrum, and a new method based on the modulation spectrum) and investigate their fitness for noisy environments. From each tracker, possible f0 candidates in every time window are stored over a 10-frame interval, and the best contour is computed through a Viterbi search. The audio data comprise speech samples recorded in a moving car with a microphone setup commonly used for in-car speech applications such as speakerphones. To see how performance deteriorates with increasing noise level, a second audio test set is prepared by blending clean speech recordings and noise at different signal-to-noise ratios.
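The blending of clean speech and noise at a chosen signal-to-noise ratio can be illustrated with a minimal sketch. The abstract does not say how signal power is measured or how the noise is aligned with the speech, so the choices here are assumptions: power is the plain mean square over the whole signal, a noise recording shorter than the speech is looped, and the names `mean_power` and `mix_at_snr` are hypothetical.

```python
import math
import random


def mean_power(samples):
    """Mean-square power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise power ratio is snr_db.

    Assumption: power is the mean square over the whole signal (no
    voice-activity weighting), and the noise is looped to the speech length.
    Returns the noisy mixture and the scaled noise that was added.
    """
    # Loop the noise so it covers the full speech signal.
    looped = [noise[i % len(noise)] for i in range(len(speech))]
    p_speech = mean_power(speech)
    p_noise = mean_power(looped)
    if p_speech == 0.0 or p_noise == 0.0:
        raise ValueError("speech and noise must both have non-zero power")
    # Solve 10*log10(p_speech / (gain^2 * p_noise)) = snr_db for the gain.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    scaled = [gain * v for v in looped]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


if __name__ == "__main__":
    rate = 8000  # assumed sampling rate in Hz, for illustration only
    # Stand-in for clean speech: a 120 Hz tone, one second long.
    speech = [math.sin(2.0 * math.pi * 120.0 * n / rate) for n in range(rate)]
    # Stand-in for car noise: half a second of Gaussian noise, looped by mix_at_snr.
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(rate // 2)]
    for snr_db in (20, 10, 0):
        noisy, scaled = mix_at_snr(speech, noise, snr_db)
        measured = 10.0 * math.log10(mean_power(speech) / mean_power(scaled))
        print(f"target {snr_db} dB, measured {measured:.2f} dB")
```

Because the gain is derived from the measured powers, the mixture meets the target ratio under this power definition. A test set built on an active-speech-level measure would give different gains for the same nominal SNR.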


Citations
Proceedings ArticleDOI

An improved non-intrusive intelligibility metric for noisy and reverberant speech

TL;DR: Experimental results show that the updated SRMR metric achieves higher performance and lower variability than the original SRMR when assessing speech intelligibility in noisy and reverberant environments, and that it outperforms several standard intrusive and non-intrusive benchmark metrics.
Journal ArticleDOI

Noisy Speech Enhancement Using Harmonic-Noise Model and Codebook-Based Post-Processing

TL;DR: Evaluations of the performance gain obtained from the proposed post-processing speech restoration module are presented and compared with standard speech enhancement systems, showing substantial gains in perceptual quality.
Proceedings ArticleDOI

Extraction of pitch in adverse conditions

TL;DR: The performance of the proposed algorithm is found to be superior, even in adverse conditions, compared with the simple inverse filtering technique (SIFT) algorithm.
Proceedings ArticleDOI

Evaluation of Pitch Detection Algorithms Under Real Conditions

TL;DR: This paper is a first initiative to perform an evaluation of widely used PDA algorithms over an extensive and realistic database and proves the good performance of the described algorithm in noisy conditions.
Proceedings ArticleDOI

Speech Bandwidth Extension: Extrapolations of Spectral Envelop and Harmonicity Quality of Excitation

TL;DR: The method is successful in restoring the harmonicity of speech and converts telephone quality speech to perceptually high quality wideband speech.