Journal ArticleDOI
A spectral autocorrelation method for measurement of the fundamental frequency of noise-corrupted speech
TL;DR: A method is described for measuring the fundamental frequency of a voiced speech signal corrupted by high levels of additive white Gaussian noise, together with a voiced/unvoiced classifier that uses a two-dimensional, nearest-neighbor pattern recognition approach.
Abstract:
A method for measurement of the fundamental frequency of a voiced speech signal corrupted by high levels of additive white Gaussian noise is described. The method is based on flattening the spectrum of the signal by a bank of bandpass lifters and extracting the pitch frequency from autocorrelation functions calculated at the output of the lifters. A smoothing modified median filter is applied to the calculated pitch frequency contour to improve the accuracy of the method. A byproduct of the pitch tracker is a voiced/unvoiced classifier. The maximum and the variance of the autocorrelation function maxima, over the bank of lifters, serve as the basis for voiced/unvoiced classification by making use of a two-dimensional, nearest-neighbor pattern recognition approach. Results are presented for fundamental frequency measurement and voiced/unvoiced classification for several signal-to-noise ratios.
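The processing chain in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It omits the bank of bandpass lifters, uses a plain running median instead of the paper's modified median filter, and works on a single time-domain autocorrelation per frame. All function names, the 60 to 400 Hz search range, and the window width are assumptions chosen for illustration.

```python
import numpy as np

def autocorr_peak(frame, fs, fmin=60.0, fmax=400.0):
    """Return (f0 estimate in Hz, normalized autocorrelation peak) for one frame.

    Assumption: a single autocorrelation of the raw frame stands in for the
    per-lifter autocorrelation functions described in the abstract.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    n = len(frame)
    # Linear autocorrelation via FFT; zero-padding to 2n avoids wrap-around.
    spec = np.fft.rfft(frame, 2 * n)
    acf = np.fft.irfft(np.abs(spec) ** 2)[:n]
    if acf[0] <= 0:
        return 0.0, 0.0
    acf = acf / acf[0]
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), n - 1)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag, float(acf[lag])

def median_smooth(contour, width=5):
    """Plain running median over the pitch contour (edge values repeated)."""
    contour = np.asarray(contour, dtype=float)
    half = width // 2
    padded = np.pad(contour, half, mode="edge")
    return np.array([np.median(padded[i:i + width]) for i in range(len(contour))])

def vuv_features(peak_heights):
    """Two-dimensional feature: maximum and variance of the autocorrelation maxima.

    peak_heights is assumed to hold one peak value per band of the bank.
    """
    peaks = np.asarray(peak_heights, dtype=float)
    return np.array([peaks.max(), peaks.var()])

def nearest_neighbor_label(feature, train_features, train_labels):
    """Assign the label of the closest training point (1-nearest-neighbor rule)."""
    dists = np.linalg.norm(np.asarray(train_features, dtype=float)
                           - np.asarray(feature, dtype=float), axis=1)
    return train_labels[int(np.argmin(dists))]
```

In use, `autocorr_peak` would be called once per analysis frame, `median_smooth` applied to the resulting contour to suppress isolated octave errors, and `nearest_neighbor_label` applied to the output of `vuv_features` with a small set of hand-labeled training frames.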
Citations
Journal ArticleDOI
Multiple fundamental frequency estimation based on harmonicity and spectral smoothness
TL;DR: The spectral smoothness principle is proposed as an efficient new mechanism for estimating the spectral envelopes of detected sounds; the method works robustly in noise and is able to handle sounds that exhibit inharmonicities.
Journal ArticleDOI
Automatic Music Transcription as We Know it Today
TL;DR: The aim of this overview is to describe methods for the automatic transcription of Western polyphonic music as transforming an acoustic musical signal into a MIDI-like symbolic representation, with main emphasis on estimating the multiple fundamental frequencies of several concurrent sounds.
Signal Processing Methods for the Automatic Transcription of Music
TL;DR: Signal processing methods for the automatic transcription of music are developed in this thesis and the main part of the thesis is dedicated to multiple fundamental frequency (F0) estimation, that is, estimation of the F0s of several concurrent musical sounds.
Proceedings ArticleDOI
Bayesian harmonic models for musical pitch estimation and analysis
Simon J. Godsill, Manuel Davy +1 more
TL;DR: An earlier Bayesian model, which describes each component signal in terms of fundamental frequency, partials (‘harmonics’), and amplitude, is modified for greater realism to include a non-white residual spectrum, time-varying amplitudes, and partials ‘detuned’ from the natural linear relationship.
Automatic transcription of music
TL;DR: The aim of this tutorial paper is to introduce and discuss different approaches to the automatic music transcription problem, here understood as a transformation from an acoustic signal into a MIDI-like symbolic representation.
References
Journal ArticleDOI
Nearest neighbor pattern classification
Thomas M. Cover, Peter E. Hart +1 more
TL;DR: The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points, so it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor.
Journal ArticleDOI
Cepstrum Pitch Determination
TL;DR: Algorithms were developed heuristically for picking those peaks corresponding to voiced‐speech segments and the vocal pitch periods, which were then used to derive the excitation for a computer‐simulated channel vocoder.
Journal ArticleDOI
A comparative performance study of several pitch detection algorithms
TL;DR: A comparative performance study of seven pitch detection algorithms was conducted on a speech database consisting of eight utterances spoken by three males, three females, and one child, to assess their relative performance as a function of recording condition and pitch range of the various speakers.
Journal ArticleDOI
On the use of autocorrelation analysis for pitch detection
TL;DR: Several types of (nonlinear) preprocessing which can be used to effectively spectrally flatten the speech signal are presented and an algorithm for adaptively choosing a frame size for an autocorrelation pitch analysis is discussed.
Journal ArticleDOI
A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition
TL;DR: A pattern recognition approach for deciding whether a given segment of a speech signal should be classified as voiced speech, unvoiced speech, or silence, based on measurements made on the signal, which has been found to provide reliable classification with speech segments as short as 10 ms.