Open Access, Journal Article

Glottal inverse filtering analysis of human voice production — A review of estimation and parameterization methods of the glottal excitation and their applications

Paavo Alku
22 Nov 2011, Vol. 36, Iss. 5, pp. 623-650
TLDR
This review examines five decades of research on glottal inverse filtering (GIF), covering the estimation methods of the glottal source, the parameterization techniques developed to express the estimated glottal excitations in numerical form, and the application areas of GIF.
Abstract
Glottal inverse filtering (GIF) refers to methods of estimating the source of voiced speech, the glottal volume velocity waveform. GIF is based on the idea of inversion, in which the effects of the vocal tract and lip radiation are cancelled from the output of the voice production mechanism, the speech signal. This article provides a review on GIF research by examining an era spanning five decades during which this topic has been under development. The topic is handled from three main perspectives: the estimation methods of the glottal source, the parameterization techniques that have been developed to express the estimated glottal excitations in numerical forms, and the application areas of GIF. Finally, the strengths and limitations of the GIF approach are discussed.
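
The inversion idea described in the abstract can be illustrated with a minimal sketch. It is not one of the specific methods covered by the review. It assumes, for illustration only, that the vocal tract is an all-pole filter estimated by plain linear prediction (autocorrelation method) and that lip radiation is a first-order differentiator, which is cancelled with a leaky integrator. The function names, the coefficient `rho`, and the synthetic test signal are all hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the linear prediction normal equations.
    Returns the inverse filter A(z) = 1 + a1 z^-1 + ... + ap z^-p as [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def fir_filter(b, x):
    """Apply the FIR filter b to x (here: the inverse vocal tract filter A(z))."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]

def all_pole_filter(a, x):
    """Apply 1 / A(z) to x; used only to synthesize the toy signal."""
    y = []
    for n in range(len(x)):
        y.append(x[n] - sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0))
    return y

def leaky_integrate(x, rho=0.99):
    """Cancel an assumed lip radiation model (1 - rho z^-1) by leaky integration."""
    y, prev = [], 0.0
    for v in x:
        prev = v + rho * prev
        y.append(prev)
    return y

def inverse_filter(speech, order=10, rho=0.99):
    """Assumed two-step inversion: remove the vocal tract, then the lip radiation."""
    a = levinson_durbin(autocorr(speech, order), order)
    residual = fir_filter(a, speech)        # vocal tract contribution removed
    flow = leaky_integrate(residual, rho)   # lip radiation contribution removed
    return a, residual, flow

# Toy signal (an assumption, not real speech): an impulse train passed
# through a known two-pole "vocal tract" with a single resonance.
a_true = [1.0, -1.3, 0.8]
excitation = [1.0 if n % 80 == 0 else 0.0 for n in range(800)]
speech = all_pole_filter(a_true, excitation)

a_est, residual, flow = inverse_filter(speech, order=2)
print("estimated inverse filter:", [round(c, 3) for c in a_est])
```

On this toy signal the estimated coefficients should land close to `a_true`, and `residual` should resemble the impulse train. Real speech is harder: the glottal source itself shapes the spectrum, so plain linear prediction over the whole pitch period mixes source and tract contributions.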


Citations
Proceedings Article

COVAREP — A collaborative voice analysis repository for speech technologies

TL;DR: An overview of the current offerings of COVAREP is provided and a demonstration of the algorithms through an emotion classification experiment is included, to allow more reproducible research by strengthening complex implementations through shared contributions and openly available code.

Videokymography : High-speed line scanning of vocal fold vibration

TL;DR: In this paper, a video camera is used for high-speed visualization of the vocal folds in the human larynx: the camera selects one active horizontal line (transverse to the glottis) from the whole image, and the successive line images are presented in real time on a commercial TV monitor.
Journal Article

Quasi Closed Phase Glottal Inverse Filtering Analysis With Weighted Linear Prediction

TL;DR: The proposed quasi closed phase (QCP) analysis method utilizes weighted linear prediction with a specific attenuated main excitation (AME) weight function that attenuates the contribution of the glottal source in the linear prediction model optimization.
Journal Article

Robust and complex approach of pathological speech signal analysis

TL;DR: Introduces 36 new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy, and empirical mode decomposition; among the newly designed features, those that quantify hoarseness or breathiness in particular are good candidates for pathological speech identification.
Book

The Psychophysiology Primer: A Guide to Methods and a Broad Review with a Focus on Human-Computer Interaction

TL;DR: A foundational review of the field of psychophysiology is provided to serve as a primer for the novice, enabling rapid familiarisation with the core concepts, or as a quick-reference resource for advanced readers.