Showing papers in "Computer Speech & Language in 1991"
••
TL;DR: The “Wizard of Oz” technique for simulating future interactive technology and a partial taxonomy of such simulations are reviewed, and a general Wizard of Oz methodology is suggested.
425 citations
••
TL;DR: The enhanced Good-Turing method is introduced, which is three to four times as efficient in its use of data as the enhanced deleted estimate method and provides accurate predictions of the variances of the standard probabilities estimated from the test corpus.
273 citations
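For orientation, the basic Good-Turing adjustment that the enhanced method builds on can be sketched as follows. This is a minimal illustration of the textbook (unsmoothed) estimator only, not the enhanced variant the paper introduces; the function name `good_turing` and the toy bigram counts are assumptions made up for this example.

```python
from collections import Counter

def good_turing(counts):
    """Basic (unsmoothed) Good-Turing adjusted counts.

    counts: dict mapping item -> observed frequency r.
    Returns (p_unseen, adjusted), where p_unseen is the total probability
    mass reserved for unseen items and adjusted maps r -> r*.
    """
    total = sum(counts.values())
    freq_of_freq = Counter(counts.values())  # N_r: number of items seen r times
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        # r* = (r + 1) * N_{r+1} / N_r; keep r unchanged when N_{r+1} is zero
        adjusted[r] = (r + 1) * n_next / n_r if n_next else float(r)
    # Mass assigned to unseen items: N_1 / N
    p_unseen = freq_of_freq.get(1, 0) / total
    return p_unseen, adjusted

# Hypothetical toy counts, for illustration only
bigram_counts = {"a b": 1, "b c": 1, "c d": 1, "d e": 2, "e f": 2, "f g": 3}
p_unseen, adjusted = good_turing(bigram_counts)
```

In practice the frequency-of-frequency values N_r are sparse and noisy for large r, which is why smoothed or otherwise enhanced variants are used instead of this raw form.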
••
TL;DR: This paper reviews several promising methods that were proposed in the past few years to deal with the problem of automatic speech recognition in an adverse environment, discussing methods or algorithms in six categories: signal enhancement preprocessing; special transducer arrangements; noise masking; stress compensation; robust distortion measures; and novel speech representations.
206 citations
••
TL;DR: A speaker-independent phoneme and word recognition system based on a recurrent error propagation network trained on the TIMIT database is presented; analysis of the phoneme recognition results shows that information available from bigram and durational constraints is adequately handled within the network, allowing for efficient parsing of the network output.
170 citations
••
TL;DR: A computer program called PRONOUNCE is developed that automatically generates a set of rank-ordered pronunciations, in the form of a sequence of phonetic segments, using pronunciation-by-analogy with a lexicon of approximately 20 000 words based on Webster's Pocket Dictionary, and suggests a new approach to spelling-to-sound conversion for text-to-speech conversion systems.
98 citations
••
TL;DR: Two speech modalities that represent opposites on the spectrum of speaker interaction, the telephone dialogue and the audiotape monologue, are examined, providing a comprehensive analysis of basic differences in discourse organization, referential characteristics and performance efficiency.
79 citations
••
TL;DR: In this article, the authors propose a language acquisition mechanism based on connectionist methods, in which the network builds associations between messages and meaningful responses to them, and evaluate the utility of this approach on an inward-call-management task.
55 citations
••
TL;DR: Approximate maximum likelihood (ML) hidden Markov modeling using the most likely state sequence (MLSS) is examined and compared with the exact ML approach that considers all possible state sequences.
42 citations
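The difference between the two likelihoods compared here can be sketched for a small discrete-emission HMM. This is an illustrative sketch under stated assumptions, not the paper's experimental setup: the function names and the two-state toy parameters (`pi`, `A`, `B`, `obs`) are made up for this example.

```python
def forward_likelihood(pi, A, B, obs):
    """Exact likelihood: sums over all state sequences (forward recursion)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def viterbi_likelihood(pi, A, B, obs):
    """Approximate likelihood: probability of the single most likely
    state sequence (the sum in the forward recursion becomes a max)."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        delta = [max(delta[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return max(delta)

# Hypothetical two-state model with two output symbols
pi = [0.6, 0.4]                      # initial state probabilities
A = [[0.7, 0.3], [0.4, 0.6]]         # transition probabilities A[i][j]
B = [[0.9, 0.1], [0.2, 0.8]]         # emission probabilities B[state][symbol]
obs = [0, 1, 0]                      # observed symbol sequence

exact = forward_likelihood(pi, A, B, obs)
approx = viterbi_likelihood(pi, A, B, obs)
```

Because the best path is one term of the sum over all paths, `approx` can never exceed `exact`; how close the two are depends on how strongly one state sequence dominates.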
••
TL;DR: Results are presented which show that, using standard three-state models, the addition of the variable frame rate analysis results in considerably improved performance, which is close to that obtained using simple duration sensitive models.
41 citations
••
TL;DR: A set of phonological rules is used to redefine word junctures, specifying how to replace or delete the boundary phones according to the neighboring phones, thereby avoiding most of the training issues on the 991-word speaker-independent DARPA task.
37 citations
••
TL;DR: Markov models for Mandarin Chinese are developed to solve the above decoding problem effectively, and an efficient dynamic-programming algorithm suitable for parallel processing is further proposed to search the exponentially large solution space fully in polynomial time.
••
TL;DR: The results show that when the spoken input deviates from the predefined task grammar, a combination of weak a priori syntax rules in conjunction with full a posteriori parsing on a lattice of alternative word matches provides the most robust recognition performance.
••
TL;DR: It is demonstrated that the new method is time optimal and that its performance tends to that of the VQ with the LBG algorithm, and the optimal design is derived for the Itakura distance measures.
••
TL;DR: It is shown that isomorphic context-specific connectionist networks for phoneme recognition can be merged into a single context-modulated network that makes use of second-order unit interconnections.
••
TL;DR: Several special speech recognition approaches based on hidden Markov models (HMMs) are presented for the highly confusing Mandarin syllables by considering the characteristics of the vocabulary to provide better performance.
••
TL;DR: The method is based on the observation that function words such as case markers, pronouns and conjunctions occur frequently in Hindi text and spotting of these frequently occurring patterns is proposed as a means for hypothesizing word boundaries in a speech-to-text conversion system for Hindi.
••
TL;DR: It is argued that successful segment labeling will not bring all the expected benefits and that some higher-level knowledge is needed; the advantages and remaining difficulties of phonology in an ASR system are also considered.
••
TL;DR: Overall accuracy of 96% has been obtained for speaker independent recognition of a small vocabulary and the simplicity of the algorithm enables a low-cost real-time implementation of the recognizer.
••
TL;DR: It is shown that the formant tracks of rapidly time varying speech are displayed correctly by spectrograms, and if the model of the time variant formant is based on the notion of instantaneous frequency, the discrepancies in the interpretation of the spectrogram disappear.
••
TL;DR: This paper presents an algorithm which exploits the fact that the entries of the trellis are essentially zero except near the block diagonal and hence achieves significant computational saving and is shown to be about an order of magnitude faster than the full Baum-Welch algorithm.