Showing papers in "Speech Communication in 1989"


Journal ArticleDOI
M. H. Savoji
TL;DR: A robust new algorithm for accurate endpointing of speech signals is described in this paper after an overview of the literature, which uses simple measures based on energy and zero-crossing rate for speech/silence detection.

113 citations
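
The kind of energy and zero-crossing measures mentioned in this summary can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function names, the frame length and every threshold are assumptions chosen for the synthetic example.

```python
import math


def frame_features(samples, frame_len):
    """Return (mean energy, zero-crossing rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between neighbouring samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def find_endpoints(samples, frame_len=80, energy_thresh=0.01,
                   energy_floor=0.001, zcr_thresh=0.3):
    """Return (first, last) speech frame indices, or None if no frame qualifies.

    All thresholds are hypothetical placeholders, not values from the paper.
    """
    speech = []
    for i, (energy, zcr) in enumerate(frame_features(samples, frame_len)):
        loud = energy > energy_thresh
        # Weak fricatives: low energy but many zero crossings.
        noisy = energy > energy_floor and zcr > zcr_thresh
        if loud or noisy:
            speech.append(i)
    return (speech[0], speech[-1]) if speech else None


# Synthetic check: silence, a 0.5-amplitude tone, then silence again.
tone = [0.5 * math.sin(2 * math.pi * i / 16) for i in range(800)]
signal = [0.0] * 400 + tone + [0.0] * 400
endpoints = find_endpoints(signal)  # frames 5 through 14 contain the tone
```

In practice the thresholds would probably be estimated from a stretch of background noise rather than fixed in advance.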


Journal ArticleDOI
TL;DR: A four-parameter model of the glottis with similar kinematic parameters is described to complement this approach; it provides an alternative to flow pulse modeling because it can include some source-system interactions with relatively little computational overhead.

70 citations


Journal ArticleDOI
TL;DR: Anticipatory effects appear to be more tightly controlled than carryover effects presumably because of phonemic preplanning, and gestural antagonism in the contextual phonemes affects the two coarticulatory types differently.

64 citations


Journal ArticleDOI
TL;DR: In articulatory phonetics, speech is described as a sequence of distinct articulatory gestures, each of which produces an acoustic event that should approximate a phonetic target; because the gestures overlap, these phonetic targets are often only partly realized.

51 citations


Journal ArticleDOI
TL;DR: The large-scale (3–3.5 Bark) spectral integration theory, derived from the work of Chistovich and colleagues and supposed to provide a basis for the computation of the F2 parameter, is not in fact supported by actual proof, since all presumed evidence can be understood without it.

50 citations


Journal ArticleDOI
Patti Price
TL;DR: The magnitudes of the male-female differences are similar to those observed for the creaky-normal voicing differences and breathy-normal differences, and may arise from a combination of biological, sociological and acoustical effects.

47 citations


Journal ArticleDOI
TL;DR: The model presented here shows that syntax-driven and rhythm-driven strategies could be extreme cases of a more complex model which integrates both syntactic and rhythmic constraints.

42 citations


Journal ArticleDOI
TL;DR: It is concluded that the UP strongly mediates the recognition of spoken words with early UP, and the shadowing of late-UP items is best predicted by word length in slower, and by word frequency in faster subjects; this suggests the intervention of different mechanisms.

34 citations


Journal ArticleDOI
TL;DR: No intrinsic superiority in the discrimination performance of connected speech as opposed to sustained vowels could be found; in running speech, absolute microperturbation values appeared to be higher during inter-segment transitions and during voice onset and offset.

29 citations


Journal ArticleDOI
TL;DR: A tentative conclusion from these experiments is that it is easier for the perceptual system to compensate for the effects of a transmission channel if it only changes the relative amplitudes of formants than if it changes estimated formant frequencies.

24 citations


Journal ArticleDOI
TL;DR: A two-channel approach to speech analysis is recommended to aid the automatic processing of speech, where one channel is the conventional acoustic signal and the other is the electroglottogram (EGG).

Journal ArticleDOI
TL;DR: This paper compares the results obtained with nine different sets of speech parameters, including log-area parameters, formants, reflection coefficients and band-filter parameters, and concludes that log-area parameters form the most suitable parameter set available for temporal decomposition.

Journal ArticleDOI
TL;DR: The algorithm is based on the iterative use of a linear filter with zero phase and monotonically decreasing frequency response, providing an estimate for the locations of the closure and opening of the vocal cords.

Journal ArticleDOI
TL;DR: The use of the quadratic classifier together with the individual feature space is shown to drastically improve recognition accuracy while the added memory requirements are shown to be negligible.

Journal ArticleDOI
TL;DR: In a paired comparison task, two factors appeared to affect the tempo judgements to a certain extent: the response category to be used by the listeners and the position of the stimulus with standard tempo.

Journal ArticleDOI
TL;DR: The LSP representation is studied for speech recognition, and the weighted LSP distance measure is found to perform significantly better than popular LP distance measures.

Journal ArticleDOI
Joseph Picone
TL;DR: A clustering algorithm based on the standard KMEANS procedure that generates reference models for continuous density Hidden Markov Model (HMM) based systems by simultaneously considering spectral and duration information is introduced.

Journal ArticleDOI
TL;DR: The results indicated that the effects of the parameters are additive and that, although presence/absence of periodicity (VOT and VTT) is the most important determinant of perceived voicing, perception is also to a large extent affected by “C2”-duration and “preceding vowel” duration.

Journal ArticleDOI
TL;DR: A CELP speech coding algorithm in which the coder parameters are jointly optimized; the relation between pitch period, pitch predictor coefficient, codebook entry and scaling factor is derived.

Journal ArticleDOI
TL;DR: Investigations on a population of 22 speakers showed that the elimination of the time-invariant spectral components from the speech features, taking place when performing cepstral normalization or computing first-order orthogonal coefficients, brings a substantial reliability improvement.
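
Cepstral normalization of the kind this summary refers to is commonly done by subtracting the per-coefficient mean over time, which removes any component that stays constant across the utterance. The sketch below assumes that common form; the function name and the toy two-coefficient frames are illustrative and are not taken from the paper.

```python
def cepstral_mean_normalize(frames):
    """Subtract the time average of each cepstral coefficient from every frame.

    `frames` is a list of equal-length lists, one list of coefficients per frame.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(dim)]
    return [[frame[k] - means[k] for k in range(dim)] for frame in frames]


# Toy data: the second utterance is the first plus a constant offset,
# standing in for a time-invariant spectral component.
clean = [[1.0, 2.0], [3.0, 6.0]]
offset = [[11.0, -3.0], [13.0, 1.0]]
normalized_clean = cepstral_mean_normalize(clean)
normalized_offset = cepstral_mean_normalize(offset)
```

Both utterances normalize to the same frames, which is the sense in which the time-invariant part is eliminated.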

Journal ArticleDOI
TL;DR: A Partial Connection Multilayered Network (PCMN), based on a technique of partial connection between layers, is presented; it permits the efficient treatment of temporal information, which is very important in speech processing, unlike image processing.

Journal ArticleDOI
TL;DR: The Dempster-Shafer formalism is applied in order to combine information in the lexicon, using a frequency distribution as the basis for evidence evaluation and has suitable properties in the case of an oral dialogue system, as it preserves module autonomy and allows backtracking at any time during the recognition process.
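
For readers unfamiliar with the formalism, Dempster's rule of combination can be sketched as below. The word hypotheses, the mass values and the two "sources" are invented for illustration; how the paper derives its masses from a frequency distribution is not reproduced here.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to incompatible hypotheses counts as conflict.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to one.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical word hypotheses and evidence from two hypothetical sources.
CAT, CAP = frozenset({"cat"}), frozenset({"cap"})
EITHER = CAT | CAP
acoustic = {CAT: 0.6, EITHER: 0.4}
lexical = {CAP: 0.5, EITHER: 0.5}
fused = dempster_combine(acoustic, lexical)
```

Here the conflicting mass is 0.3, so the fused masses are 3/7 for "cat", 2/7 for "cap" and 2/7 left uncommitted between the two.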


Journal ArticleDOI
TL;DR: This work used both a more conventional articulation test and a monosyllabic adaptive speech interference test to evaluate the intelligibility of nine different speech-coding techniques, and found different patterns of responses.

Journal ArticleDOI
TL;DR: It is proposed that the study of lexical stress in continuous speech be accompanied by the study of prosodics and their general use in sentences, to avoid the problem of syllable segmentation.

Journal ArticleDOI
TL;DR: An experimental Dutch keyboard-to-speech system, using diphones and a formant synthesizer chip for speech synthesis, has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired.

Journal ArticleDOI
TL;DR: Discrete power spectrum features, i.e. the sign and rank-order functions of a bandpass filter output, are analyzed together with more standard features such as LPC coefficients and the short-time spectrum measured by means of a bandpass filter bank.

Journal ArticleDOI
TL;DR: It is concluded that phonetic and psycholinguistic feature representations need not match.

Journal ArticleDOI
TL;DR: The gain portion of a shape-gain quantizer is made adaptive, yielding a vector quantizer that can adjust itself to the time-varying amplitude of a speech signal.
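
A plain, non-adaptive shape-gain quantizer, the starting point that this summary builds on, can be sketched as follows. The codebooks are toy values, the function names are illustrative, and the paper's gain adaptation is not shown.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def shape_gain_quantize(x, shape_codebook, gain_levels):
    """Return (shape index, gain index) for input vector x.

    Assumes every shape codeword has unit norm, so the best shape is the one
    with the largest inner product and the ideal gain is that inner product.
    """
    shape_idx = max(range(len(shape_codebook)),
                    key=lambda i: dot(x, shape_codebook[i]))
    target_gain = dot(x, shape_codebook[shape_idx])
    gain_idx = min(range(len(gain_levels)),
                   key=lambda j: abs(gain_levels[j] - target_gain))
    return shape_idx, gain_idx


def reconstruct(shape_idx, gain_idx, shape_codebook, gain_levels):
    """Rebuild the quantized vector as gain times shape."""
    return [gain_levels[gain_idx] * s for s in shape_codebook[shape_idx]]


# Toy codebooks (hypothetical values).
shapes = [[1.0, 0.0], [0.0, 1.0]]
gains = [0.5, 2.0, 4.0]
indices = shape_gain_quantize([0.2, 1.9], shapes, gains)
decoded = reconstruct(indices[0], indices[1], shapes, gains)
```

An adaptive variant would rescale `gain_levels` over time to follow the signal amplitude; the specific adaptation rule is left out because the summary does not state it.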