Journal Article

A spectral autocorrelation method for measurement of the fundamental frequency of noise-corrupted speech

TL;DR: A method is described for measuring the fundamental frequency of a voiced speech signal corrupted by high levels of additive white Gaussian noise; a byproduct is a voiced/unvoiced classifier that uses a two-dimensional, nearest-neighbor pattern recognition approach.
Abstract
A method for measurement of the fundamental frequency of a voiced speech signal corrupted by high levels of additive white Gaussian noise is described. The method is based on flattening the spectrum of the signal by a bank of bandpass lifters and extracting the pitch frequency from autocorrelation functions calculated at the output of the lifters. A modified median smoothing filter is applied to the calculated pitch frequency contour to improve the accuracy of the method. A byproduct of the pitch tracker is a voiced/unvoiced classifier. The maximum and the variance of the autocorrelation function maxima, over the bank of lifters, serve as the basis for voiced/unvoiced classification by means of a two-dimensional, nearest-neighbor pattern recognition approach. Results are presented for fundamental frequency measurement and voiced/unvoiced classification at several signal-to-noise ratios.
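
The pipeline in the abstract (flatten the spectrum, autocorrelate it, smooth the pitch contour, classify by nearest neighbor) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the moving-average flattening is a crude stand-in for the bank of bandpass lifters, the plain running median stands in for the modified median filter, and every function name, parameter, and default value below is an assumption made for illustration.

```python
import numpy as np

def spectral_autocorr_f0(frame, fs, fmin=60.0, fmax=400.0, nfft=4096, smooth_hz=300.0):
    """Return (f0_hz, peak_strength) from the autocorrelation of a flattened spectrum.

    Assumes nfft >= len(frame). All defaults are illustrative, not from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), nfft))
    df = fs / nfft  # spacing between FFT bins in Hz

    # Crude flattening (stand-in for the lifter bank): divide the magnitude
    # spectrum by a moving-average envelope, with a small floor so that
    # nearly empty bands are not amplified.
    width = max(3, int(smooth_hz / df) | 1)  # odd window length in bins
    envelope = np.convolve(mag, np.ones(width) / width, mode="same")
    flat = mag / (envelope + 1e-3 * mag.max() + 1e-12)
    flat = flat - flat.mean()

    # Autocorrelation along the frequency axis: harmonics spaced F0 apart
    # produce a peak at a frequency lag of F0.
    acf = np.correlate(flat, flat, mode="full")[flat.size - 1:]
    if acf[0] <= 0.0:
        return 0.0, 0.0  # silent frame: nothing to measure

    lo = max(1, int(fmin / df))
    hi = min(int(fmax / df), acf.size - 1)
    lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return lag * df, float(acf[lag] / acf[0])

def median_smooth(contour, width=5):
    """Plain running median over a pitch contour; edges are padded by repetition."""
    contour = np.asarray(contour, dtype=float)
    half = width // 2
    padded = np.pad(contour, half, mode="edge")
    return np.array([np.median(padded[i:i + width]) for i in range(contour.size)])

def nearest_neighbor_label(feature, train_features, train_labels):
    """1-nearest-neighbor rule: return the label of the closest training point."""
    diffs = np.asarray(train_features, dtype=float) - np.asarray(feature, dtype=float)
    return train_labels[int(np.argmin(np.linalg.norm(diffs, axis=1)))]
```

In this sketch, multiples of F0 that fall inside the search range `[fmin, fmax]` produce peaks of comparable height, so octave errors are possible without further logic. A two-dimensional feature for `nearest_neighbor_label` could be built from the maximum and variance of `peak_strength` over several values of `smooth_hz`, loosely mirroring the features named in the abstract; the training points would have to come from labeled speech.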


Citations
Journal Article

Multiple fundamental frequency estimation based on harmonicity and spectral smoothness

TL;DR: The spectral smoothness principle is proposed as an efficient new mechanism for estimating the spectral envelopes of detected sounds; the method works robustly in noise and can handle sounds that exhibit inharmonicities.
Journal Article

Automatic Music Transcription as We Know it Today

TL;DR: The aim of this overview is to describe methods for the automatic transcription of Western polyphonic music, understood as transforming an acoustic musical signal into a MIDI-like symbolic representation, with the main emphasis on estimating the multiple fundamental frequencies of several concurrent sounds.

Signal Processing Methods for the Automatic Transcription of Music

TL;DR: This thesis develops signal processing methods for the automatic transcription of music; its main part is dedicated to multiple fundamental frequency (F0) estimation, that is, estimation of the F0s of several concurrent musical sounds.
Proceedings Article

Bayesian harmonic models for musical pitch estimation and analysis

TL;DR: An earlier Bayesian model, which describes each component signal in terms of fundamental frequency, partials (‘harmonics’), and amplitude, is modified for greater realism to include a non-white residual spectrum, time-varying amplitudes, and partials ‘detuned’ from the natural linear relationship.

Automatic transcription of music

TL;DR: The aim of this tutorial paper is to introduce and discuss different approaches to the automatic music transcription problem, here understood as a transformation from an acoustic signal into a MIDI-like symbolic representation.
References
Journal Article

Nearest neighbor pattern classification

TL;DR: The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points. In the large-sample case its probability of error is at most twice the Bayes error, so it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor.
Journal Article

Cepstrum Pitch Determination

TL;DR: Algorithms were developed heuristically for picking those peaks corresponding to voiced‐speech segments and the vocal pitch periods, which were then used to derive the excitation for a computer‐simulated channel vocoder.
Journal Article

A comparative performance study of several pitch detection algorithms

TL;DR: A comparative performance study of seven pitch detection algorithms was conducted on a speech database consisting of eight utterances spoken by three males, three females, and one child, to assess their relative performance as a function of recording condition and pitch range of the various speakers.
Journal Article

On the use of autocorrelation analysis for pitch detection

TL;DR: Several types of (nonlinear) preprocessing which can be used to effectively spectrally flatten the speech signal are presented and an algorithm for adaptively choosing a frame size for an autocorrelation pitch analysis is discussed.
Journal Article

A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition

TL;DR: A pattern recognition approach is described for deciding whether a given segment of a speech signal should be classified as voiced speech, unvoiced speech, or silence, based on measurements made on the signal; it has been found to provide reliable classification with speech segments as short as 10 ms.