Proceedings ArticleDOI

Pitch tracking in reverberant environments

TLDR
This paper compares Neural Network (NN) based approaches such as the Subband Autocorrelation Classifier (SAcC) with signal-processing-based methods such as YIN and RAPT, and shows that multi-style training of the NN using the CC+SAcC feature outperforms all the other methods.
Abstract
Pitch, or fundamental frequency, estimation is an important problem in speech processing. Research on pitch extraction is several years old, and numerous algorithms have been developed over the years to improve its accuracy. Pitch estimation becomes more difficult in the presence of additive noise and reverberation because these distortions corrupt the periodicity information that is vital for estimating the pitch. In this paper, we present a quantitative analysis of pitch tracking in the presence of reverberation by different state-of-the-art methods. We compare Neural Network (NN) based approaches such as the Subband Autocorrelation Classifier (SAcC) with signal-processing-based methods such as YIN and RAPT. We enhance the performance of SAcC by introducing a cross-correlogram feature (CC+SAcC). We further show that multi-style training of the NN using the CC+SAcC feature outperforms all the other methods. Experiments were conducted using artificially reverberated Keele and TIMIT databases with room impulse responses of varying T60 values.
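The experimental setup described in the abstract (speech artificially reverberated with room impulse responses of a given T60, then passed to a pitch tracker) can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the paper's method: the impulse response is modelled as exponentially decaying white noise, the test signal is a synthetic harmonic tone rather than Keele or TIMIT speech, and the pitch estimator is a plain autocorrelation peak picker rather than YIN, RAPT, or SAcC. The function names `synthetic_rir`, `reverberate`, `harmonic_tone`, and `autocorr_pitch` are hypothetical.

```python
import numpy as np


def synthetic_rir(t60, fs, seed=0):
    """Hypothetical room impulse response: white noise under an exponential
    envelope whose level falls by 60 dB after t60 seconds."""
    rng = np.random.default_rng(seed)
    n = int(t60 * fs)
    t = np.arange(n) / fs
    # Amplitude 10^(-3 t / t60) corresponds to -60 dB at t = t60.
    envelope = 10.0 ** (-3.0 * t / t60)
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # crude stand-in for the direct path
    return rir / np.max(np.abs(rir))


def reverberate(signal, rir):
    """Convolve the dry signal with the impulse response, keeping its length."""
    return np.convolve(signal, rir)[: len(signal)]


def harmonic_tone(f0_track, fs, n_harmonics=5):
    """Synthetic voiced-like signal: harmonics of a per-sample f0 track (Hz)."""
    phase = 2.0 * np.pi * np.cumsum(f0_track) / fs
    return sum(np.sin(k * phase) / k for k in range(1, n_harmonics + 1))


def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Baseline estimator (an assumption, not YIN/RAPT/SAcC): pick the lag of
    the largest autocorrelation value inside the allowed pitch range."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return fs / lag


if __name__ == "__main__":
    fs = 8000
    # A pitch glide from 100 Hz to 200 Hz, so that reverberant tails of
    # earlier pitch values overlap later frames.
    f0_track = np.linspace(100.0, 200.0, fs)
    dry = harmonic_tone(f0_track, fs)
    wet = reverberate(dry, synthetic_rir(t60=0.6, fs=fs))
    start, length = fs // 2, 400  # one 50 ms frame in the middle
    print("true f0 at frame start:", f0_track[start])
    print("dry estimate:", autocorr_pitch(dry[start:start + length], fs))
    print("wet estimate:", autocorr_pitch(wet[start:start + length], fs))
```

A pitch glide is used in the demo because a perfectly steady tone remains periodic after convolution; it is the overlap of earlier and later pitch values inside one analysis frame that degrades the periodicity cue. Varying `t60` changes how long that overlap lasts.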


Citations
Journal ArticleDOI

A Comparative Study between Pitch Detection Techniques on Reverberant Speech Signals

TL;DR: Several methods for pitch frequency estimation are investigated and compared on clear and reverberant male and female speech signals to select the one that is not affected so much by the reverberation effect.
References
Proceedings Article

A pitch extraction reference database.

TL;DR: A database for the comparison of pitch extraction algorithms based on a core speech module and several additional modules, based on speech and laryngograph data for 15 speakers reading a phonetically balanced text.
Journal ArticleDOI

A Tandem Algorithm for Pitch Estimation and Voiced Speech Segregation

TL;DR: A tandem algorithm is proposed that performs pitch estimation of a target utterance and segregation of voiced portions of target speech jointly and iteratively and performs substantially better than previous systems for either pitch extraction or voiced speech segregation.
Proceedings ArticleDOI

A perceptual pitch detector

TL;DR: A pitch detector based on Licklider's (1979) duplex theory of pitch perception was implemented and tested on a variety of stimuli from human perceptual tests and it is shown that it correctly identifies the pitch of complex harmonic and inharmonic stimuli and that it is robust in the face of noise and phase changes.
Journal ArticleDOI

A spectral/temporal method for robust fundamental frequency tracking

TL;DR: A fundamental frequency (F0) tracking algorithm is presented that is extremely robust for both high quality and telephone speech, at signal to noise ratios ranging from clean speech to very noisy speech.
Journal ArticleDOI

A computational auditory scene analysis system for speech segregation and robust speech recognition

TL;DR: This work estimates the ideal binary time-frequency (T-F) mask which retains the mixture in a local T-F unit if and only if the target is stronger than the interference within the unit.