Glottal inverse filtering analysis of human voice production — A review of estimation and parameterization methods of the glottal excitation and their applications
TLDR
This review examines five decades of research on glottal inverse filtering (GIF), covering the estimation methods of the glottal source, the parameterization techniques developed to express the estimated glottal excitations in numerical form, and the application areas of GIF.
Abstract:
Glottal inverse filtering (GIF) refers to methods of estimating the source of voiced speech, the glottal volume velocity waveform. GIF is based on the idea of inversion, in which the effects of the vocal tract and lip radiation are cancelled from the output of the voice production mechanism, the speech signal. This article provides a review of GIF research by examining an era spanning five decades during which this topic has been under development. The topic is handled from three main perspectives: the estimation methods of the glottal source, the parameterization techniques that have been developed to express the estimated glottal excitations in numerical forms, and the application areas of GIF. Finally, the strengths and limitations of the GIF approach are discussed.
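The inversion idea described in the abstract can be illustrated with a toy sketch. It is not one of the estimation methods reviewed in the article: it assumes the vocal tract filter is known exactly, whereas real GIF methods must estimate it from the speech signal. The pulse shape, the single resonance, the lip radiation coefficient 0.99, and all function and variable names are illustrative assumptions.

```python
import math

def fir_filter(x, b):
    """Apply the FIR filter B(z) = b[0] + b[1] z^-1 + ... to x."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        y.append(acc)
    return y

def all_pole_filter(x, a):
    """Apply the all-pole filter 1/A(z), with A(z) = 1 + a[1] z^-1 + ... (a[0] == 1)."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

# Toy glottal flow (assumption): a sin^2 pulse during the open phase, zero otherwise.
PERIOD, OPEN_PHASE = 80, 50
glottal_flow = [
    math.sin(math.pi * (n % PERIOD) / OPEN_PHASE) ** 2 if (n % PERIOD) < OPEN_PHASE else 0.0
    for n in range(400)
]

# Toy vocal tract (assumption): a single resonance modelled as an all-pole filter 1/A(z).
r, theta = 0.97, 2 * math.pi * 0.1
vocal_tract_a = [1.0, -2 * r * math.cos(theta), r * r]

# Lip radiation (assumption): a first-order differentiator 1 - 0.99 z^-1.
LIP_COEFF = 0.99
lip_b = [1.0, -LIP_COEFF]

# Forward model: glottal flow -> vocal tract -> lip radiation -> speech.
speech = fir_filter(all_pole_filter(glottal_flow, vocal_tract_a), lip_b)

# Inversion: cancel the vocal tract with A(z), then the lip radiation with a leaky integrator.
tract_removed = fir_filter(speech, vocal_tract_a)
estimate = all_pole_filter(tract_removed, [1.0, -LIP_COEFF])

max_error = max(abs(e - g) for e, g in zip(estimate, glottal_flow))
print("max reconstruction error:", max_error)
```

Because the same filters are used in the forward model and in the inversion, the reconstruction here is exact up to rounding. In practice the vocal tract must be estimated, so the quality of the estimated glottal flow depends on the accuracy of that estimate.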
Citations
Proceedings Article
COVAREP — A collaborative voice analysis repository for speech technologies
TL;DR: An overview of the current offerings of COVAREP is provided and a demonstration of the algorithms through an emotion classification experiment is included, to allow more reproducible research by strengthening complex implementations through shared contributions and openly available code.
Videokymography : High-speed line scanning of vocal fold vibration
Jan G. Švec, Harm K. Schutte
TL;DR: In this paper, a video camera is used for high-speed visualization of the vocal folds of the human larynx: the camera selects one active horizontal line (transverse to the glottis) from the whole image, and the successive line images are presented in real time on a commercial TV monitor.
Journal Article
Quasi Closed Phase Glottal Inverse Filtering Analysis With Weighted Linear Prediction
TL;DR: The proposed quasi closed phase (QCP) analysis method utilizes weighted linear prediction with a specific attenuated main excitation (AME) weight function that attenuates the contribution of the glottal source in the linear prediction model optimization.
Journal Article
Robust and complex approach of pathological speech signal analysis
Jiri Mekyska, Eva Janoušová, Pedro Gómez-Vilda, Zdenek Smekal, Irena Rektorová, Ilona Eliasova, Milena Kostalova, Martina Mrackova, Jesús B. Alonso-Hernández, Marcos Faundez-Zanuy, Karmele López-de-Ipiña
TL;DR: 36 completely new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy, and empirical mode decomposition are introduced. Among all newly designed features, those that quantify hoarseness or breathiness in particular are good candidates for pathological speech identification.
Book
The Psychophysiology Primer: A Guide to Methods and a Broad Review with a Focus on Human-Computer Interaction
Benjamin Ultan Cowley, Marco Filetti, Kristian Lukander, Jari Torniainen, Andreas Henelius, Lauri Ahonen, Oswald Barral, Ilkka Kosunen, Teppo Valtonen, Minna Huotilainen, Niklas Ravaja, Giulio Jacucci
TL;DR: A foundational review of the field of psychophysiology is provided to serve as a primer for the novice, enabling rapid familiarisation with the core concepts, or as a quick-reference resource for advanced readers.
References
Book
Digital Processing of Speech Signals
TL;DR: This paper presents a meta-modelling framework for digital Speech Processing for Man-Machine Communication by Voice that automates the very labor-intensive and therefore time-heavy and expensive process of encoding and decoding speech.
Book
Principles of voice production
Ingo R. Titze, Daniel W. Martin
TL;DR: Topics include Basic Anatomy of the Larynx, Biomechanics of Laryngeal Tissue, and Fluctuations and Perturbations in Vocal Output.
Journal Article
Vocal communication of emotion: a review of research paradigms
TL;DR: It is suggested to use the Brunswikian lens model as a base for research on the vocal communication of emotion, which allows one to model the complete process, including the encoding, transmission, and decoding of vocal emotion communication.
Journal Article
Analysis, synthesis, and perception of voice quality variations among female and male talkers
Dennis H. Klatt, Laura C. Klatt
TL;DR: Perceptual validation of the relative importance of acoustic cues for signaling a breathy voice quality has been accomplished using a new voicing source model for synthesis of more natural male and female voices.