Proceedings ArticleDOI

Pitch tracking in reverberant environments

07 Dec 2015, pp. 192-196



Citations
Journal ArticleDOI

[...]

01 Jan 2020
TL;DR: Several methods for pitch frequency estimation are investigated and compared on clean and reverberant male and female speech signals, in order to select the one least affected by reverberation.
Abstract: Reverberation is an effect that occurs regularly in closed rooms due to multiple reflections. This paper investigates the effect of reverberation on both male and female speech signals, as reflected in the pitch frequency of the signals. Pitch frequency is an important parameter because it is commonly used for speaker identification. Hence, several methods for pitch frequency estimation are investigated and compared on clean and reverberant male and female speech signals, in order to select the one least affected by reverberation.
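
The comparison described above is easy to prototype. The sketch below is not the paper's procedure, only an illustration of the idea: reverberate a clean recording by convolving it with a room impulse response, run the same pitch tracker on both versions (librosa's YIN implementation stands in for the methods compared in the paper), and count how often the estimate moves appreciably. The file names, frequency range, and 20% deviation threshold are assumptions.

```python
import numpy as np
import librosa
from scipy.signal import fftconvolve

speech, sr = librosa.load("clean.wav", sr=16000)   # hypothetical clean recording
rir, _ = librosa.load("rir.wav", sr=16000)         # hypothetical room impulse response

# Reverberant version: convolve the clean speech with the impulse response.
reverb = fftconvolve(speech, rir)[: len(speech)]
reverb *= np.max(np.abs(speech)) / np.max(np.abs(reverb))   # rough level match

def track(x):
    # Frame-wise F0 estimates over a typical speech range (65-500 Hz).
    return librosa.yin(x, fmin=65, fmax=500, sr=sr, frame_length=1024)

f0_clean = track(speech)
f0_reverb = track(reverb)

# Fraction of frames whose estimate moves by more than 20% once
# reverberation is added (a crude "gross error" style robustness score).
rel_dev = np.abs(f0_reverb - f0_clean) / f0_clean
print("frames deviating >20%%: %.1f%%" % (100 * np.mean(rel_dev > 0.2)))
```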

Cites background from "Pitch tracking in reverberant environments"

  • [...]


References
Journal ArticleDOI

[...]

TL;DR: Discusses the theoretical and practical use of image techniques for simulating the impulse response between two points in a small rectangular room; the resulting impulse response, when convolved with any desired input signal, simulates room reverberation of that signal.
Abstract: Image methods are commonly used for the analysis of the acoustic properties of enclosures. In this paper we discuss the theoretical and practical use of image techniques for simulating, on a digital computer, the impulse response between two points in a small rectangular room. The resulting impulse response, when convolved with any desired input signal, such as speech, simulates room reverberation of the input signal. This technique is useful in signal processing or psychoacoustic studies. The entire process is carried out on a digital computer so that a wide range of room parameters can be studied with accurate control over the experimental conditions. A Fortran implementation of this model is included.
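
As a rough illustration of the construction described above (not a faithful reimplementation of the published model), the NumPy sketch below enumerates image sources for a rectangular room using a single frequency-independent reflection coefficient shared by all six walls, rounds each delay to the nearest sample, and omits the fractional-delay and filtering refinements; the room geometry and parameter values are placeholders.

```python
# Simplified image-source room impulse response: one uniform reflection
# coefficient, nearest-sample delays, no post-filtering.
import itertools
import numpy as np

def image_source_rir(room, src, mic, beta=0.9, fs=16000, c=343.0,
                     order=8, length=4096):
    room, src, mic = (np.asarray(v, dtype=float) for v in (room, src, mic))
    h = np.zeros(length)
    for n, l, m in itertools.product(range(-order, order + 1), repeat=3):
        for q, j, k in itertools.product((0, 1), repeat=3):
            # Vector from this image source to the microphone.
            vec = np.array([
                (1 - 2 * q) * src[0] - mic[0] + 2 * n * room[0],
                (1 - 2 * j) * src[1] - mic[1] + 2 * l * room[1],
                (1 - 2 * k) * src[2] - mic[2] + 2 * m * room[2],
            ])
            dist = np.linalg.norm(vec)
            # Number of wall reflections encoded by this image.
            refl = (abs(n - q) + abs(n) + abs(l - j) + abs(l)
                    + abs(m - k) + abs(m))
            t = int(round(dist / c * fs))
            if t < length:
                h[t] += beta ** refl / (4.0 * np.pi * max(dist, 1e-3))
    return h

# Example: a 6 x 4 x 3 m room; convolving speech with h reverberates it,
# e.g. scipy.signal.fftconvolve(speech, h).
h = image_source_rir(room=[6.0, 4.0, 3.0], src=[2.0, 3.0, 1.5], mic=[4.0, 1.0, 1.5])
```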

3,284 citations


"Pitch tracking in reverberant envir..." refers background in this paper

  • [...]

Journal ArticleDOI

[...]

TL;DR: An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds, based on the well-known autocorrelation method with a number of modifications that combine to prevent errors.
Abstract: An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds. It is based on the well-known autocorrelation method with a number of modifications that combine to prevent errors. The algorithm has several desirable features. Error rates are about three times lower than the best competing methods, as evaluated over a database of speech recorded together with a laryngograph signal. There is no upper limit on the frequency search range, so the algorithm is suited for high-pitched voices and music. The algorithm is relatively simple and may be implemented efficiently and with low latency, and it involves few parameters that must be tuned. It is based on a signal model (periodic signal) that may be extended in several ways to handle various forms of aperiodicity that occur in particular applications. Finally, interesting parallels may be drawn with models of auditory processing.
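
A bare-bones version of the autocorrelation-style core described above (the difference function, its cumulative-mean normalisation, and the absolute-threshold dip search) can be sketched as follows; the parabolic-interpolation and best-local-estimate refinements of the full algorithm are left out, and the threshold and frequency range are illustrative.

```python
# Minimal sketch of the core steps of a YIN-style F0 estimator.
import numpy as np

def yin_f0(frame, sr, fmin=65.0, fmax=500.0, threshold=0.1):
    tau_min = int(sr / fmax)
    tau_max = int(sr / fmin)          # frame must be longer than tau_max
    taus = np.arange(1, tau_max + 1)
    # Difference function d(tau).
    d = np.array([np.sum((frame[:-tau] - frame[tau:]) ** 2) for tau in taus])
    # Cumulative mean normalised difference d'(tau).
    d_norm = d * taus / np.maximum(np.cumsum(d), 1e-12)
    # Absolute threshold: first lag whose d' dips below the threshold,
    # falling back to the global minimum over the search range.
    search = d_norm[tau_min - 1:]
    below = np.flatnonzero(search < threshold)
    tau = (below[0] if below.size else np.argmin(search)) + tau_min
    return sr / tau

# e.g. f0 = yin_f0(frame, 16000) on a 40 ms voiced frame (640 samples).
```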

1,835 citations


"Pitch tracking in reverberant envir..." refers background or methods in this paper

  • [...]

  • [...]

  • [...]

  • [...]

  • [...]

Journal ArticleDOI

[...]

TL;DR: Presents a segregation system that is consistent with psychological and physiological findings and whose performance is significantly better than that of the frame-based segregation scheme described by Meddis and Hewitt (1992).
Abstract: Although the ability of human listeners to perceptually segregate concurrent sounds is well documented in the literature, there have been few attempts to exploit this research in the design of computational systems for sound source segregation. In this paper, we present a segregation system that is consistent with psychological and physiological findings. The system is able to segregate speech from a variety of intrusive sounds, including other speech, with some success. The segregation system consists of four stages. Firstly, the auditory periphery is modelled by a bank of bandpass filters and a simulation of neuromechanical transduction by inner hair cells. In the second stage of the system, periodicities, frequency transitions, onsets and offsets in auditory nerve firing patterns are made explicit by separate auditory representations. The representations, auditory maps, are based on the known topographical organization of the higher auditory pathways. Information from the auditory maps is used to construct a symbolic description of the auditory scene. Specifically, the acoustic input is characterized as a collection of time-frequency elements, each of which describes the movement of a spectral peak in time and frequency. In the final stage of the system, a search strategy is employed which groups elements according to the similarity of their fundamental frequencies, onset times and offset times. Following the search, a waveform can be resynthesized from a group of elements so that segregation performance may be assessed by informal listening tests. The system has been evaluated using a database of voiced speech mixed with a variety of intrusive noises such as music, "office" noise and other speech. A technique for quantitative evaluation of the system is described, in which the signal-to-noise ratio (SNR) is compared before and after the segregation process. After segregation, an increase in SNR is obtained for each noise condition. Additionally, the performance of our system is significantly better than that of the frame-based segregation scheme described by Meddis and Hewitt (1992).
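
The evaluation step mentioned at the end of the abstract (SNR compared before and after segregation) is easy to sketch. The snippet below uses synthetic stand-in signals and measures each SNR against the clean target, which is one common convention, though not necessarily the exact metric used in the paper.

```python
# Sketch of an SNR before/after comparison, with stand-in signals.
import numpy as np

def snr_db(target, estimate):
    """SNR (dB) of `estimate` measured against the clean `target`."""
    residual = estimate - target
    return 10.0 * np.log10(np.sum(target ** 2) /
                           np.maximum(np.sum(residual ** 2), 1e-12))

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = np.sin(2 * np.pi * 150 * t)                       # stand-in "voiced speech"
intrusion = 0.5 * rng.standard_normal(t.size)              # stand-in intrusive noise
mixture = speech + intrusion                               # signal before segregation
segregated = speech + 0.1 * rng.standard_normal(t.size)    # pretend system output
print("SNR before: %.1f dB  after: %.1f dB"
      % (snr_db(speech, mixture), snr_db(speech, segregated)))
```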

788 citations

Journal ArticleDOI

[...]

TL;DR: A comparative performance study of seven pitch detection algorithms was conducted on a speech database of eight utterances spoken by three males, three females, and one child, to assess the detectors' relative performance as a function of recording condition and pitch range of the various speakers.
Abstract: A comparative performance study of seven pitch detection algorithms was conducted. A speech data base, consisting of eight utterances spoken by three males, three females, and one child, was constructed. Telephone, close talking microphone, and wideband recordings were made of each of the utterances. For each of the utterances in the data base, a "standard" pitch contour was semiautomatically measured using a highly sophisticated interactive pitch detection program. The "standard" pitch contour was then compared with the pitch contour that was obtained from each of the seven programmed pitch detectors. The algorithms used in this study were 1) a center clipping, infinite-peak clipping, modified autocorrelation method (AUTOC), 2) the cepstral method (CEP), 3) the simplified inverse filtering technique (SIFT) method, 4) the parallel processing time-domain method (PPROC), 5) the data reduction method (DARD), 6) a spectral flattening linear predictive coding (LPC) method, and 7) the average magnitude difference function (AMDF) method. A set of measurements was made on the pitch contours to quantify the various types of errors which occur in each of the above methods. Included among the error measurements were the average and standard deviation of the error in pitch period during voiced regions, the number of gross errors in the pitch period, and the average number of voiced-unvoiced classification errors. For each of the error measurements, the individual pitch detectors could be rank ordered as a measure of their relative performance as a function of recording condition and pitch range of the various speakers. Performance scores are presented for each of the seven pitch detectors based on each of the categories of error.
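
The error measurements listed above are straightforward to compute once a "standard" contour is available. The sketch below assumes both contours are arrays of pitch periods in samples with 0 marking unvoiced frames, and uses an illustrative 10% deviation rule for gross errors rather than the paper's exact thresholds.

```python
# Pitch-contour error measurements against a "standard" reference contour.
# Contours: pitch period in samples per frame, 0 = unvoiced frame.
import numpy as np

def pitch_error_stats(standard, estimated, gross_frac=0.10):
    standard = np.asarray(standard, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    voiced_ref = standard > 0
    voiced_est = estimated > 0
    # Voiced/unvoiced classification errors.
    vuv_errors = int(np.sum(voiced_ref != voiced_est))
    # Frames where both contours are voiced.
    both = voiced_ref & voiced_est
    dev = np.abs(estimated[both] - standard[both])
    gross = dev > gross_frac * standard[both]   # gross pitch-period errors
    fine = dev[~gross]                          # fine errors on the remaining frames
    return {
        "gross_errors": int(np.sum(gross)),
        "fine_error_mean": float(fine.mean()) if fine.size else 0.0,
        "fine_error_std": float(fine.std()) if fine.size else 0.0,
        "vuv_errors": vuv_errors,
    }
```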

778 citations


"Pitch tracking in reverberant envir..." refers background in this paper

  • [...]

Journal ArticleDOI

[...]

TL;DR: Describes the experiences of researchers at MIT in the collection of two large speech databases, TIMIT and Voyager, which have somewhat complementary objectives.
Abstract: Automatic speech recognition by computers can provide the most natural and efficient method of communication between humans and computers. While high-performance speech recognition systems have begun to emerge from research institutions in recent years, scientists unequivocally agree that the deployment of speech recognition systems into realistic operating environments will require many hours of speech data to help us model the inherent variability in the speech signal. This paper describes the experiences of researchers at MIT in the collection of two large speech databases, which have somewhat complementary objectives. The TIMIT database was designed to be task- and speaker-independent, and is suitable for general acoustic-phonetic research. The Voyager database, on the other hand, was intended for development and evaluation of a system which incorporates both speech and natural language processing. This database is particularly valuable as a source of spontaneous utterances elicited in a realistic goal-oriented environment.

499 citations


"Pitch tracking in reverberant envir..." refers background or methods in this paper

  • [...]

  • [...]

  • [...]

  • [...]