Proceedings Article

Proposal and evaluation of models for the glottal source waveform

Hidehiko Fujisaki, +1 more
- Vol. 11, pp 1605-1608
TLDR
The paper describes a method for simultaneously estimating the glottal source and vocal tract parameters, and its results indicate the importance of detailed modeling of the period of glottal closure for accurate analysis.
Abstract
Speech analysis for high-quality speech synthesis or high-accuracy speech recognition requires realistic models not only for the vocal tract but also for the voice source. In the present paper, we investigate models for the glottal volume velocity waveform. Previously proposed models are reviewed and classified according to their level of elaboration in expressing the glottal characteristics. A new model is then proposed which possesses all the important features of previously proposed models. A method is also described for simultaneously estimating the glottal source and vocal tract parameters. Using this method, evaluation of glottal model parameters is carried out on real speech by varying the number of parameters in the proposed model. The results indicate the importance of detailed modeling of the period of glottal closure for accurate analysis.

