Proceedings Article

Timing patterns in fluent and disfluent spontaneous speech

TLDR
This work examines and models global speaking rate and how it varies in both fluent and disfluent spontaneous speech as a function of the linguistic content of the utterances; the resulting model has applications in automatic speech synthesis and recognition.
Abstract
Most previous acoustic analysis of speech has examined data from speakers who carefully pronounce their speech, usually by reading prepared texts. Natural spontaneous or conversational speech differs from careful or read speech, especially concerning hesitation phenomena and variable speaking rates. We examine and model global speaking rate, how it varies for both fluent and disfluent spontaneous speech, in terms of the linguistic content of the utterances. Speakers tend to maintain a fixed speaking rate during most utterances, but often adopt a faster or slower rate, depending on the cognitive load (i.e., slowing down when having to make unanticipated choices, or accelerating when repeating some words). Such a model can find application in automatic speech synthesis and recognition, because most synthesizers maintain a constant (and unnatural) speaking rate and most recognizers are not capable of adapting their templates or probabilistic models to reflect global changes in speaking rate.
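
The notion of a global speaking rate that stays roughly fixed but occasionally speeds up or slows down can be illustrated with a minimal sketch. This is not the authors' model: the unit (syllables per second), the use of the speaker's median rate as a baseline, the function names, and the `tolerance` threshold are all assumptions chosen for illustration.

```python
from statistics import median


def speaking_rate(syllables: int, duration_s: float) -> float:
    """Global speaking rate of one utterance, in syllables per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return syllables / duration_s


def label_rate_changes(utterances, tolerance=0.15):
    """Label each utterance relative to the speaker's median rate.

    `utterances` is a list of (syllable_count, duration_in_seconds) pairs.
    `tolerance` is a hypothetical relative threshold, not a value from the paper.
    """
    rates = [speaking_rate(n, d) for n, d in utterances]
    baseline = median(rates)  # assumed stand-in for the speaker's usual rate
    labels = []
    for rate in rates:
        if rate < baseline * (1 - tolerance):
            labels.append("slower")   # e.g. hesitation under cognitive load
        elif rate > baseline * (1 + tolerance):
            labels.append("faster")   # e.g. repeating already-planned words
        else:
            labels.append("typical")
    return rates, baseline, labels


if __name__ == "__main__":
    # Made-up example data: (syllables, seconds) per utterance.
    rates, baseline, labels = label_rate_changes(
        [(10, 2.0), (10, 2.0), (6, 2.0), (14, 2.0)]
    )
    print(rates, baseline, labels)
```

A synthesizer or recognizer could, under these assumptions, use such labels to decide when to deviate from a constant rate or to rescale its duration models.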


Citations
Journal Article

A review of large-vocabulary continuous-speech

TL;DR: The principles and architecture of current LVR systems are discussed and the key issues affecting their future deployment are identified; to illustrate the various points raised, the Cambridge University HTK system is described.
Journal Article

Robust Speech Rate Estimation for Spontaneous Speech

TL;DR: This paper compares spectral and temporal signal analysis and smoothing strategies for characterizing the underlying syllable structure from which speech rate is derived. It also describes an automated approach for learning algorithm parameters from data, finding the optimal settings through Monte Carlo simulations and parameter sensitivity analysis.
Journal Article

The Delta-Phase Spectrum With Application to Voice Activity Detection and Speaker Recognition

TL;DR: Experiments show that mel-frequency cepstral coefficient features derived from the delta-phase spectrum can produce broadly similar performance to equivalent magnitude-domain features for both voice activity detection and speaker recognition tasks.
Book Chapter

The significance of empty speech pauses: cognitive and algorithmic issues

TL;DR: The high consistency, among subjects, in the distribution of speech pauses suggests that, at least in the Italian context, the speaker in narration makes use of an intrinsic timing behavior, probably a general pattern of rules, to control speech flow for discourse organization.
Book Chapter

On the Significance of Speech Pauses in Depressive Disorders: Results on Read and Spontaneous Narratives

TL;DR: The results suggest that depressive disorders affect speech quality and speech production through pause and clause durations as well as clause quantities, indicating a strong general effect of depressive symptoms on cognitive and psychomotor functions.
References
Journal Article

Linguistic uses of segmental duration in English: Acoustic and perceptual evidence

TL;DR: It is concluded that duration often serves as a primary perceptual cue in the distinctions between inherently long versus short vowels, voiced versus voiceless fricatives, phrase‐final versus non‐final syllables, and the presence or absence of emphasis.
Journal Article

Speaking Clearly for the Hard of Hearing II: Acoustic Characteristics of Clear and Conversational Speech.

TL;DR: In this article, the authors report the results of acoustic analyses of conversational and clear speech, and show that speaking clearly cannot be regarded as equivalent to the application of high-frequency emphasis.
Journal Article

Effects of noise on speech production: Acoustic and perceptual analyses

TL;DR: The nature of the acoustic changes that take place when speakers produce speech under adverse conditions such as noise, psychological stress, or high cognitive load is discussed, along with the role of training and feedback in controlling and modifying a talker's speech to improve the performance of current speech recognizers.
Journal Article

Articulation Rate and Its Variability in Spontaneous Speech: A Reanalysis and Some Implications

TL;DR: A modified measurement procedure is used to reanalyze the speech data from 30 talkers in an interview situation and it is found that there was indeed substantial variation in articulation rate for these speakers, even within a single utterance of a single talker.
Journal Article

Consonant duration in American English

TL;DR: In this paper, the temporal behavior of all measurable consonants, detailed in all possible conditions, in an extensive reading by one speaker, was discussed and a strong parallelism in duration distributions among similar kinds of consonants was found.