Proceedings ArticleDOI
Timing patterns in fluent and disfluent spontaneous speech
Douglas D. O'Shaughnessy
- Vol. 1, pp 600-603
TLDR
This work examines and models global speaking rate and how it varies, for both fluent and disfluent spontaneous speech, in terms of the linguistic content of the utterances; the model finds application in automatic speech synthesis and recognition.
Abstract:
Most previous acoustic analysis of speech has examined data from speakers who carefully pronounce their speech, usually by reading prepared texts. Natural spontaneous or conversational speech differs from careful or read speech, especially concerning hesitation phenomena and variable speaking rates. We examine and model global speaking rate, how it varies for both fluent and disfluent spontaneous speech, in terms of the linguistic content of the utterances. Speakers tend to maintain a fixed speaking rate during most utterances, but often adopt a faster or slower rate, depending on the cognitive load (i.e., slowing down when having to make unanticipated choices, or accelerating when repeating some words). Such a model can find application in automatic speech synthesis and recognition, because most synthesizers maintain a constant (and unnatural) speaking rate and most recognizers are not capable of adapting their templates or probabilistic models to reflect global changes in speaking rate.
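As a rough illustration of the idea that speakers hold a mostly fixed rate but speed up or slow down in places, the sketch below compares each segment's local rate with the global rate of the utterance. It is not the paper's model: the function name `label_rate_changes`, the syllables-per-second measure, the `(syllable_count, duration_seconds)` input format, and the 20% `tolerance` threshold are all assumptions made for this example.

```python
from typing import List, Tuple


def label_rate_changes(
    segments: List[Tuple[int, float]],
    tolerance: float = 0.2,  # hypothetical threshold, not taken from the paper
) -> Tuple[float, List[str]]:
    """Label each segment as 'faster', 'slower', or 'steady'.

    Each segment is (syllable_count, duration_seconds). The global rate is
    total syllables divided by total duration; a segment is 'steady' when
    its local rate is within `tolerance` (as a fraction) of the global rate.
    """
    total_syllables = sum(count for count, _ in segments)
    total_duration = sum(duration for _, duration in segments)
    if total_syllables <= 0 or total_duration <= 0:
        raise ValueError("need positive total syllable count and duration")

    global_rate = total_syllables / total_duration  # syllables per second

    labels = []
    for count, duration in segments:
        if duration <= 0:
            raise ValueError("segment durations must be positive")
        ratio = (count / duration) / global_rate
        if ratio > 1 + tolerance:
            labels.append("faster")
        elif ratio < 1 - tolerance:
            labels.append("slower")
        else:
            labels.append("steady")
    return global_rate, labels


if __name__ == "__main__":
    # Made-up segments: (syllable_count, duration_seconds)
    example = [(10, 2.0), (4, 2.0), (12, 1.5), (10, 2.0)]
    rate, labels = label_rate_changes(example)
    print(f"global rate: {rate:.2f} syllables/s")
    print(labels)
```

In this toy input, the second segment might correspond to a hesitation (a slower stretch) and the third to a repetition (a faster stretch). A synthesizer or recognizer could use such labels as a cue to adjust its timing, although the source does not specify how.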
Citations
Journal ArticleDOI
A review of large-vocabulary continuous-speech
TL;DR: The principles and architecture of current LVR systems are discussed and the key issues affecting their future deployment are identified; to illustrate the various points raised, the Cambridge University HTK system is described.
Journal ArticleDOI
Robust Speech Rate Estimation for Spontaneous Speech
TL;DR: This paper compares various spectral and temporal signal analysis and smoothing strategies to better characterize the underlying syllable structure to derive speech rate and describes an automated approach for learning algorithm parameters from data, and finds the optimal settings through Monte Carlo simulations and parameter sensitivity analysis.
Journal ArticleDOI
The Delta-Phase Spectrum With Application to Voice Activity Detection and Speaker Recognition
TL;DR: Experiments show that mel-frequency cepstral coefficient features derived from the delta-phase spectrum can produce broadly similar performance to equivalent magnitude-domain features for both voice activity detection and speaker recognition tasks.
Book ChapterDOI
The significance of empty speech pauses: cognitive and algorithmic issues
TL;DR: The high consistency, among subjects, in the distribution of speech pauses suggests that, at least in the Italian context, the speaker in narration makes use of an intrinsic timing behavior, probably a general pattern of rules, to control speech flow for discourse organization.
Book ChapterDOI
On the Significance of Speech Pauses in Depressive Disorders: Results on Read and Spontaneous Narratives
Anna Esposito,Antonietta M. Esposito,Laurence Likforman-Sulem,Mauro Maldonato,Alessandro Vinciarelli +4 more
TL;DR: The results suggest that depressive disorders affect speech quality and speech production through pause and clause durations, as well as clause quantities, suggesting a strong general effect of depressive symptoms on cognitive and psychomotor functions.
References
Journal ArticleDOI
Linguistic uses of segmental duration in English: Acoustic and perceptual evidence
TL;DR: It is concluded that duration often serves as a primary perceptual cue in the distinctions between inherently long versus short vowels, voiced versus voiceless fricatives, phrase‐final versus non‐final syllables, and the presence or absence of emphasis.
Journal ArticleDOI
Speaking Clearly for the Hard of Hearing II: Acoustic Characteristics of Clear and Conversational Speech.
TL;DR: In this article, the authors report the results of acoustic analyses performed on the conversational and clear speech, and show that speaking clearly cannot be regarded as equivalent to the application of high-frequency emphasis.
Journal ArticleDOI
Effects of noise on speech production: Acoustic and perceptual analyses
TL;DR: The nature of the acoustic changes that take place when speakers produce speech under adverse conditions such as noise, psychological stress, or high cognitive load is discussed, along with the role of training and feedback in controlling and modifying a talker's speech to improve performance of current speech recognizers.
Journal ArticleDOI
Articulation Rate and Its Variability in Spontaneous Speech: A Reanalysis and Some Implications
TL;DR: A modified measurement procedure is used to reanalyze the speech data from 30 talkers in an interview situation and it is found that there was indeed substantial variation in articulation rate for these speakers, even within a single utterance of a single talker.
Journal ArticleDOI
Consonant duration in American English
TL;DR: In this paper, the temporal behavior of all measurable consonants, detailed in all possible conditions, in an extensive reading by one speaker, was discussed and a strong parallelism in duration distributions among similar kinds of consonants was found.