Showing papers in "Computer Speech & Language in 1991"

A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams

TL;DR: The “Wizard of Oz” technique for simulating future interactive technology and a partial taxonomy of such simulations is reviewed and a general Wizard of Oz methodology is suggested.

425 citations

Journal Article

Kenneth Church¹, William A. Gale¹•Institutions (1)

Speech recognition in adverse environments

TL;DR: The enhanced Good-Turing method is introduced, which is three to four times as efficient in its use of data as the enhanced deleted estimate method and provides accurate predictions of the variances of the standard probabilities estimated from the test corpus.

273 citations

Journal Article

Biing-Hwang Juang¹•Institutions (1)

A recurrent error propagation network speech recognition system

TL;DR: This paper reviews several promising methods that were proposed in the past few years to deal with the problem of automatic speech recognition in an adverse environment, discussing methods or algorithms in six categories: signal enhancement preprocessing; special transducer arrangements; noise masking; stress compensation; robust distortion measures; and novel speech representations.

206 citations

Journal Article

Tony Robinson¹, Frank Fallside¹•Institutions (1)

University of Cambridge¹

Pronounce : a program for pronunciation by analogy

TL;DR: A speaker-independent phoneme and word recognition system based on a recurrent error propagation network trained on the TIMIT database and analysis of the phoneme recognition results shows that information available from bigram and durational constraints is adequately handled within the network allowing for efficient parsing of the network output.

170 citations

Journal Article

Michael J. Dedina¹, Howard C. Nusbaum²•Institutions (2)

Indiana University¹, University of Illinois at Chicago²

Discourse structure and performance efficiency in interactive and non-interactive spoken modalities

TL;DR: A computer program called PRONOUNCE is developed that automatically generates a set of rank-ordered pronunciations, in the form of a sequence of phonetic segments, using pronunciation-by-analogy with a lexicon of approximately 20 000 words based on Webster's Pocket Dictionary, and suggests a new approach to spelling-to-sound conversion for text-to-speech conversion systems.

98 citations

Journal Article

Sharon Oviatt¹, Philip R. Cohen¹•Institutions (1)

Artificial Intelligence Center¹

Adaptive acquisition of language

TL;DR: Two speech modalities that represent opposites on the spectrum of speaker interaction, the telephone dialogue and the audiotape monologue, are examined, providing a comprehensive analysis of basic differences in discourse organization, referential characteristics and performance efficiency.

79 citations

Journal Article

Allen Louis Gorin¹, Stephen E. Levinson¹, A.N. Gertner¹, E. Goldman¹•Institutions (1)

Hidden Markov modeling using a dominant state sequence with application to speech recognition

TL;DR: In this article, the authors propose a language acquisition mechanism based on connectionist methods, in which the network builds associations between messages and meaningful responses to them, and evaluate the utility of this approach on an inward-call-management task.

55 citations

Journal Article

Neri Merhav¹, Yariv Ephraim¹•Institutions (1)

The use of variable frame rate analysis in speech recognition

TL;DR: Approximate maximum likelihood (ML) hidden Markov modeling using the most likely state sequence (MLSS) is examined and compared with the exact ML approach that considers all possible state sequences.

42 citations

Journal Article

K. M. Ponting, S.M. Peeling

Word juncture modeling using phonological rules for HMM-based continuous speech recognition

TL;DR: Results are presented which show that, using standard three-state models, the addition of the variable frame rate analysis results in considerably improved performance, which is close to that obtained using simple duration sensitive models.

41 citations

Journal Article

E. Giachin¹, Aaron E. Rosenberg¹, Chin-Hui Lee¹•Institutions (1)

Markov modeling of Mandarin Chinese for decoding the phonetic sequence into Chinese characters

TL;DR: A set of phonological rules are used to redefine word junctures, specifying how to replace or delete the boundary phones according to the neighboring phones, so avoiding most of the training issues on the 991-word speaker-independent DARPA task.

37 citations

Journal Article

Hung-yan Gu¹, Chiu-yu Tseng², Lin-Shan Lee¹•Institutions (2)

National Taiwan University¹, Academia Sinica²

The use of syntax and multiple alternatives in the VODIS voice operated database inquiry system

TL;DR: Markov models for Mandarin Chinese are developed to solve effectively the above decoding problem, and an efficient algorithm suitable for parallel processing based on dynamic programming is further proposed to search fully the solution space of exponential size in polynomial time.

Journal Article

Steve Young¹, N.H. Russell¹, J.H.S. Thornton¹•Institutions (1)

University of Cambridge¹

On the design of connectionist vector quantizers

TL;DR: The results show that when the spoken input deviates from the predefined task grammar, a combination of weak a priori syntax rules in conjunction with full a posteriori parsing on a lattice of alternative word matches provides the most robust recognition performance.

Journal Article

Lizhong Wu¹, Frank Fallside¹•Institutions (1)

University of Cambridge¹

Context-modulated vowel discrimination using connectionist networks

TL;DR: It is demonstrated that the new method is time optimal and that its performance tends to that of the VQ with the LBG algorithm, and the optimal design is derived for the Itakura distance measures.

Journal Article

Raymond L. Watrous¹•Institutions (1)

University of Toronto¹

Special speech recognition approaches for the highly confusing Mandarin syllables based on hidden Markov models

TL;DR: It is shown that isomorphic context-specific connectionist networks for phoneme recognition can be merged into a single context-modulated network that makes use of second-order unit interconnections.

Journal Article

Lin-Shan Lee¹, Chiu-yu Tseng², Fu-hua Liu¹, C.H. Chang¹, Hung-yan Gu¹, S.H. Hsieh¹, C.H. Chen¹•Institutions (2)

National Taiwan University¹, Academia Sinica²

Word boundary hypothesization in Hindi speech

TL;DR: Several special speech recognition approaches based on hidden Markov models (HMMs) are presented for the highly confusing Mandarin syllables by considering the characteristics of the vocabulary to provide better performance.

Journal Article

G. V. Ramana Rao¹, B. Yegnanarayana¹•Institutions (1)

Indian Institute of Technology Madras¹

On an application of phonological knowledge in automatic speech recognition

TL;DR: The method is based on the observation that function words such as case markers, pronouns and conjunctions occur frequently in Hindi text and spotting of these frequently occurring patterns is proposed as a means for hypothesizing word boundaries in a speech-to-text conversion system for Hindi.

Journal Article

Charles Hoequist¹, Francis Nolan²•Institutions (2)

Research Triangle Park¹, University of Cambridge²

A probability decision criterion for speech recognition

TL;DR: It is argued that successful segment labeling will not bring all the expected benefits, and that some higher-level knowledge is needed, as well as the advantages and remaining difficulties of phonology in an ASR system.

Journal Article

C.K. Yu¹, P. C. Ching¹•Institutions (1)

The Chinese University of Hong Kong¹

Comments on "On the spectrographic representation of rapidly time varying speech"

TL;DR: Overall accuracy of 96% has been obtained for speaker independent recognition of a small vocabulary and the simplicity of the algorithm enables a low-cost real-time implementation of the recognizer.

Journal Article

Wolfgang Wokurek¹•Institutions (1)

Vienna University of Technology¹

The semi-relaxed algorithm for estimating parameters of hidden Markov models

TL;DR: It is shown that the formant tracks of rapidly time varying speech are displayed correctly by spectrograms, and if the model of the time variant formant is based on the notion of instantaneous frequency, the discrepancies in the interpretation of the spectrogram disappear.

Journal Article

Li Deng¹•Institutions (1)

University of Waterloo¹