Glottal inverse filtering analysis of human voice production — A review of estimation and parameterization methods of the glottal excitation and their applications

doi:10.1007/S12046-011-0041-5

Open Access Journal Article

Paavo Alku

- 22 Nov 2011 -

Sadhana-academy Proceedings in Engineeri...

- Vol. 36, Iss: 5, pp 623-650

TLDR

This review examines five decades of research on glottal inverse filtering (GIF), covering the estimation methods of the glottal source, the parameterization techniques developed to express the estimated glottal excitations in numerical form, and the application areas of GIF.

Abstract:

Glottal inverse filtering (GIF) refers to methods of estimating the source of voiced speech, the glottal volume velocity waveform. GIF is based on the idea of inversion, in which the effects of the vocal tract and lip radiation are cancelled from the output of the voice production mechanism, the speech signal. This article provides a review on GIF research by examining an era spanning five decades during which this topic has been under development. The topic is handled from three main perspectives: the estimation methods of the glottal source, the parameterization techniques that have been developed to express the estimated glottal excitations in numerical forms, and the application areas of GIF. Finally, the strengths and limitations of the GIF approach are discussed.
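The inversion idea in the abstract can be illustrated with a minimal Python sketch. It assumes the vocal tract is modelled as an all-pole filter estimated by ordinary linear prediction over the whole frame, and that lip radiation is modelled as a first-order differentiator, which is cancelled with a leaky integrator. The function names (`lpc`, `fir`, `leaky_integrate`, `glottal_inverse_filter`) and the default values for `order` and `rho` are choices made for this illustration and do not come from the article. Whole-frame linear prediction also absorbs part of the glottal contribution into the tract estimate, so this should be read as an illustration of the inversion principle rather than as any specific method covered by the review.

```python
def lpc(frame, order):
    """Autocorrelation-method linear prediction (Levinson-Durbin recursion).

    Returns the inverse-filter coefficients [1, a1, ..., ap] of A(z),
    where 1/A(z) is the assumed all-pole vocal tract model.
    """
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy of the order-0 model
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def fir(coeffs, x):
    """Filter x with the FIR polynomial A(z); this cancels the 1/A(z) tract model."""
    return [sum(c * x[n - j] for j, c in enumerate(coeffs) if n - j >= 0)
            for n in range(len(x))]


def leaky_integrate(x, rho=0.99):
    """Cancel lip radiation, assumed here to be the differentiator 1 - rho * z^-1."""
    y, state = [], 0.0
    for v in x:
        state = v + rho * state
        y.append(state)
    return y


def glottal_inverse_filter(speech, order=10, rho=0.99):
    """Return (glottal flow estimate, tract inverse-filter coefficients)."""
    a = lpc(speech, order)        # estimate the vocal tract
    residual = fir(a, speech)     # remove the vocal tract contribution
    return leaky_integrate(residual, rho), a  # remove the lip radiation effect
```

Calling `glottal_inverse_filter(frame)` on a list of speech samples returns the flow estimate together with the estimated tract coefficients. In practice the frame would usually be windowed and the model order chosen according to the sampling rate; both are left out here to keep the sketch short.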

Citations

Proceedings Article

COVAREP — A collaborative voice analysis repository for speech technologies

Gilles Degottex, +4 more

TL;DR: Provides an overview of the current offerings of COVAREP and demonstrates the algorithms through an emotion classification experiment. The repository aims to allow more reproducible research by strengthening complex implementations through shared contributions and openly available code.

Videokymography: High-speed line scanning of vocal fold vibration

Jan G. Švec, +1 more

TL;DR: In this paper, a video camera is used for high-speed visualization of the vocal folds of the human larynx: the camera selects one active horizontal line (transverse to the glottis) from the whole image, and the successive line images are presented in real time on a commercial TV monitor.

Journal Article

Quasi Closed Phase Glottal Inverse Filtering Analysis With Weighted Linear Prediction

Manu Airaksinen, +3 more

- 01 Mar 2014 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: The proposed quasi closed phase (QCP) analysis method utilizes weighted linear prediction with a specific attenuated main excitation (AME) weight function that attenuates the contribution of the glottal source in the linear prediction model optimization.

Journal Article

Robust and complex approach of pathological speech signal analysis

Jiri Mekyska, +10 more

- 01 Nov 2015 -

Neurocomputing

TL;DR: Introduces 36 new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy, and empirical mode decomposition. Among the newly designed features, those that quantify hoarseness or breathiness are good candidates for pathological speech identification.

Book

The Psychophysiology Primer: A Guide to Methods and a Broad Review with a Focus on Human-Computer Interaction

Benjamin Ultan Cowley, +11 more

TL;DR: A foundational review of the field of psychophysiology is provided to serve as a primer for the novice, enabling rapid familiarisation with the core concepts, or as a quick-reference resource for advanced readers.

References

Book

Acoustic theory of speech production

Gunnar Fant

Book

Digital Processing of Speech Signals

Lawrence R. Rabiner, +1 more

TL;DR: A textbook on the digital processing of speech signals; its topics include digital speech processing for man-machine communication by voice.

Book

Principles of voice production

Ingo R. Titze, +1 more

TL;DR: Topics include the basic anatomy of the larynx, the biomechanics of laryngeal tissue, and fluctuations and perturbations in vocal output.

Journal Article

Vocal communication of emotion: a review of research paradigms

Klaus R. Scherer

- 01 Apr 2003 -

Speech Communication

TL;DR: Suggests using the Brunswikian lens model as a basis for research on the vocal communication of emotion, since it allows one to model the complete process, including the encoding, transmission, and decoding of vocal emotion communication.

Journal Article

Analysis, synthesis, and perception of voice quality variations among female and male talkers

Dennis H. Klatt, +1 more

- 01 Feb 1990 -

Journal of the Acoustical Society of Ame...

TL;DR: Perceptual validation of the relative importance of acoustic cues for signaling a breathy voice quality has been accomplished using a new voicing source model for synthesis of more natural male and female voices.

Related Papers (5)

Glottal wave analysis with Pitch Synchronous Iterative Adaptive Inverse Filtering

A four-parameter model of glottal flow

Vocal quality factors: analysis, synthesis, and perception.

Normalized amplitude quotient for parametrization of the glottal flow

Least squares glottal inverse filtering from the acoustic speech waveform