
Showing papers in "Computer Speech & Language in 1991"


Journal Article
TL;DR: The “Wizard of Oz” technique for simulating future interactive technology is reviewed, along with a partial taxonomy of such simulations, and a general Wizard of Oz methodology is suggested.

425 citations


Journal Article
TL;DR: The enhanced Good-Turing method is introduced, which is three to four times as efficient in its use of data as the enhanced deleted estimate method and provides accurate predictions of the variances of the standard probabilities estimated from the test corpus.

273 citations
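
For orientation, the basic (unsmoothed) Good-Turing adjustment that the enhanced method builds on can be sketched as follows. This is not the enhanced method from the paper: the function names, the toy counts, and the fallback used when no item has count r + 1 are all assumptions made for illustration.

```python
from collections import Counter

def good_turing_adjusted_counts(counts):
    """Basic Good-Turing: r* = (r + 1) * N_{r+1} / N_r.

    N_r is the number of distinct items observed exactly r times.
    When N_{r+1} is zero the raw count r is kept; that fallback is an
    illustration-only choice, not part of any published method.
    """
    freq_of_freq = Counter(counts.values())  # r -> N_r
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        adjusted[r] = (r + 1) * n_next / n_r if n_next else float(r)
    return adjusted

def unseen_mass(counts):
    """Probability mass reserved for unseen items: N_1 / N."""
    total = sum(counts.values())
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total

# Hypothetical toy counts (not data from the paper).
toy_counts = {"a b": 1, "b c": 1, "c d": 1, "d e": 2, "e f": 2, "f g": 3}
adjusted = good_turing_adjusted_counts(toy_counts)
reserved = unseen_mass(toy_counts)
```

Here `adjusted` maps each observed count r to its discounted value r*, and `reserved` is the share of probability set aside for items never seen in the sample.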


Journal Article
Biing-Hwang Juang
TL;DR: This paper reviews several promising methods that were proposed in the past few years to deal with the problem of automatic speech recognition in an adverse environment, discussing methods or algorithms in six categories: signal enhancement preprocessing; special transducer arrangements; noise masking; stress compensation; robust distortion measures; and novel speech representations.

206 citations


Journal Article
TL;DR: A speaker-independent phoneme and word recognition system based on a recurrent error propagation network trained on the TIMIT database is presented; analysis of the phoneme recognition results shows that information available from bigram and durational constraints is adequately handled within the network, allowing for efficient parsing of the network output.

170 citations


Journal Article
TL;DR: A computer program called PRONOUNCE is developed that automatically generates a set of rank-ordered pronunciations, in the form of a sequence of phonetic segments, using pronunciation-by-analogy with a lexicon of approximately 20 000 words based on Webster's Pocket Dictionary; the work suggests a new approach to spelling-to-sound conversion for text-to-speech systems.

98 citations


Journal Article
TL;DR: Two speech modalities that represent opposites on the spectrum of speaker interaction, the telephone dialogue and the audiotape monologue, are examined, providing a comprehensive analysis of basic differences in discourse organization, referential characteristics and performance efficiency.

79 citations


Journal Article
TL;DR: In this article, the authors propose a language acquisition mechanism based on connectionist methods, in which the network builds associations between messages and meaningful responses to them, and evaluate the utility of this approach on an inward-call-management task.

55 citations


Journal Article
Neri Merhav, Yariv Ephraim
TL;DR: Approximate maximum likelihood (ML) hidden Markov modeling using the most likely state sequence (MLSS) is examined and compared with the exact ML approach that considers all possible state sequences.

42 citations
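
The two quantities being compared can be illustrated on a toy discrete HMM: the exact likelihood sums over every state sequence (forward algorithm), while the MLSS score keeps only the single best sequence (Viterbi). This is a minimal sketch; the model parameters and function names are hypothetical and not taken from the paper.

```python
def forward_likelihood(pi, A, B, obs):
    """Exact likelihood: P(obs) summed over all state sequences."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def viterbi_likelihood(pi, A, B, obs):
    """MLSS score: joint probability of obs and the single best state sequence."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        delta = [max(delta[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return max(delta)

# Hypothetical two-state model with two output symbols.
pi = [0.6, 0.4]                   # initial state probabilities
A = [[0.7, 0.3], [0.4, 0.6]]      # A[i][j] = P(next state j | state i)
B = [[0.9, 0.1], [0.2, 0.8]]      # B[i][k] = P(symbol k | state i)
obs = [0, 1, 0]

exact = forward_likelihood(pi, A, B, obs)
best_path = viterbi_likelihood(pi, A, B, obs)
```

The MLSS score can never exceed the exact likelihood, because it is one term of the sum that the forward recursion accumulates.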


Journal Article
TL;DR: Results are presented which show that, using standard three-state models, the addition of the variable frame rate analysis results in considerably improved performance, which is close to that obtained using simple duration sensitive models.

41 citations


Journal Article
TL;DR: A set of phonological rules is used to redefine word junctures, specifying how to replace or delete the boundary phones according to the neighboring phones, thereby avoiding most of the training issues on the 991-word speaker-independent DARPA task.

37 citations


Journal Article
TL;DR: Markov models for Mandarin Chinese are developed to solve the decoding problem effectively, and an efficient dynamic-programming algorithm suitable for parallel processing is proposed that fully searches a solution space of exponential size in polynomial time.

Journal Article
TL;DR: The results show that when the spoken input deviates from the predefined task grammar, weak a priori syntax rules combined with full a posteriori parsing on a lattice of alternative word matches provide the most robust recognition performance.

Journal Article
TL;DR: It is demonstrated that the new method is time optimal and that its performance tends to that of the VQ with the LBG algorithm, and the optimal design is derived for the Itakura distance measures.

Journal Article
TL;DR: It is shown that isomorphic context-specific connectionist networks for phoneme recognition can be merged into a single context-modulated network that makes use of second-order unit interconnections.

Journal Article
TL;DR: Several specialized speech recognition approaches based on hidden Markov models (HMMs) are presented for the highly confusable Mandarin syllables; they exploit the characteristics of the vocabulary to improve performance.

Journal Article
TL;DR: The method is based on the observation that function words such as case markers, pronouns and conjunctions occur frequently in Hindi text. Spotting these frequently occurring patterns is proposed as a means of hypothesizing word boundaries in a speech-to-text conversion system for Hindi.

Journal Article
TL;DR: It is argued that successful segment labeling will not bring all the expected benefits and that some higher-level knowledge is needed; the advantages and remaining difficulties of phonology in an ASR system are also discussed.

Journal Article
TL;DR: An overall accuracy of 96% has been obtained for speaker-independent recognition of a small vocabulary, and the simplicity of the algorithm enables a low-cost real-time implementation of the recognizer.

Journal Article
TL;DR: It is shown that the formant tracks of rapidly time varying speech are displayed correctly by spectrograms, and if the model of the time variant formant is based on the notion of instantaneous frequency, the discrepancies in the interpretation of the spectrogram disappear.

Journal Article
Li Deng
TL;DR: This paper presents an algorithm that exploits the fact that the entries of the trellis are essentially zero except near the block diagonal, achieving significant computational savings; it is shown to be about an order of magnitude faster than the full Baum-Welch algorithm.