
Showing papers on "Perplexity published in 1992"


Proceedings ArticleDOI
23 Feb 1992
TL;DR: This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus, a corpus containing significant quantities of both speech data and text data.
Abstract: The DARPA Spoken Language System (SLS) community has long taken a leadership position in designing, implementing, and globally distributing significant speech corpora widely used for advancing speech recognition research. The Wall Street Journal (WSJ) CSR Corpus described here is the newest addition to this valuable set of resources. In contrast to previous corpora, the WSJ corpus will provide DARPA its first general-purpose English, large vocabulary, natural language, high-perplexity corpus containing significant quantities of both speech data (400 hrs.) and text data (47M words), thereby providing a means to integrate speech recognition and natural language processing in application domains with high potential practical value. This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus.

1,100 citations


Proceedings Article
01 Jan 1992
TL;DR: The WSJ CSR Corpus as mentioned in this paper is the first general-purpose English, large vocabulary, natural language, high-perplexity corpus containing significant quantities of both speech data (400 hrs.) and text data (47M words), thereby providing a means to integrate speech recognition and natural language processing in application domains with high potential practical value.
Abstract: The DARPA Spoken Language System (SLS) community has long taken a leadership position in designing, implementing, and globally distributing significant speech corpora widely used for advancing speech recognition research. The Wall Street Journal (WSJ) CSR Corpus described here is the newest addition to this valuable set of resources. In contrast to previous corpora, the WSJ corpus will provide DARPA its first general-purpose English, large vocabulary, natural language, high-perplexity corpus containing significant quantities of both speech data (400 hrs.) and text data (47M words), thereby providing a means to integrate speech recognition and natural language processing in application domains with high potential practical value. This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus.

1,032 citations


Journal ArticleDOI
TL;DR: The SPHINX-II speech recognition system is reviewed and recent efforts on improved speech recognition are summarized.

576 citations



Proceedings ArticleDOI
Ute Essen1, Volker Steinbiss1
23 Mar 1992
TL;DR: Using word-bigram language models, cooccurrence smoothing improved the test-set perplexity by 14% on a German 100,000-word text corpus and by 10% on an English one-million-word corpus.
Abstract: Training corpora for stochastic language models are virtually always too small for maximum-likelihood estimation, so smoothing the models is of great importance. The authors derive the cooccurrence smoothing technique for stochastic language modeling and give experimental evidence for its validity. Using word-bigram language models, cooccurrence smoothing improved the test-set perplexity by 14% on a German 100,000-word text corpus and by 10% on an English one-million-word corpus.

91 citations
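Since test-set perplexity is the figure of merit quoted throughout this listing, a minimal sketch of how it is computed for a bigram model may be useful; the function name, the start symbol, and the probability interface below are illustrative assumptions, not details taken from the paper above.

    import math

    def bigram_perplexity(test_tokens, bigram_prob):
        """Test-set perplexity of a bigram model: exp of the average negative
        log-probability per token. bigram_prob(prev, word) must return a
        smoothed probability > 0 (e.g. after cooccurrence smoothing)."""
        log_sum = 0.0
        prev = "<s>"  # assumed sentence-start symbol
        for word in test_tokens:
            log_sum += math.log(bigram_prob(prev, word))
            prev = word
        return math.exp(-log_sum / len(test_tokens))

A 14% improvement, as reported above, means this value drops by 14% on held-out text.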


Proceedings ArticleDOI
23 Feb 1992
TL;DR: An algorithm to adapt an n-gram language model to a document as it is dictated is presented; the resulting minimum discrimination information model achieves a perplexity of 208 instead of 290 for the static trigram model on a document of 321 words.
Abstract: We present an algorithm to adapt an n-gram language model to a document as it is dictated. The observed partial document is used to estimate a unigram distribution for the words that have already occurred. Then, we find the n-gram distribution closest to the static n-gram distribution (using the discrimination information distance measure) that satisfies the marginal constraints derived from the document. The resulting minimum discrimination information model yields a perplexity of 208, compared with 290 for the static trigram model, on a document of 321 words.

90 citations
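A sketch of the minimum discrimination information formulation described above, in generic notation that is ours rather than the paper's: p_s is the static n-gram model, h a word history, and c(w) the unigram marginals estimated from the dictated portion of the document.

    \min_{p} \; D(p \,\|\, p_s) \;=\; \sum_{h,w} p(h)\, p(w \mid h) \,\log \frac{p(w \mid h)}{p_s(w \mid h)}
    \qquad \text{subject to} \qquad \sum_{h} p(h)\, p(w \mid h) = c(w) \;\; \forall w .

The constrained minimizer takes the standard exponential form

    p(w \mid h) \;=\; \frac{\alpha(w)\, p_s(w \mid h)}{\sum_{w'} \alpha(w')\, p_s(w' \mid h)} ,

i.e. each word receives a multiplicative factor \alpha(w), chosen so that the adapted model reproduces the unigram marginals observed in the document so far.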


Proceedings ArticleDOI
23 Mar 1992
TL;DR: The authors present an algorithm to adapt an n-gram language model to a document as it is dictated, resulting in a perplexity of 208 instead of 290 for the static trigram model on a document of 321 words.
Abstract: The authors present an algorithm to adapt an n-gram language model to a document as it is dictated. The observed partial document is used to estimate a unigram distribution for the words that have already occurred. Then, they find the n-gram distribution closest to the static n-gram distribution (using the discrimination information distance measure) that satisfies the marginal constraints derived from the document. The resulting minimum discrimination information model yields a perplexity of 208, compared with 290 for the static trigram model, on a document of 321 words.

88 citations


Proceedings ArticleDOI
23 Feb 1992
TL;DR: While users may adapt to some aspects of an SLS, certain types of user behavior may require technological solutions; hyperarticulation increases recognition errors, and while instructions can reduce this behavior, they do not result in improved recognition performance.
Abstract: We have analyzed three factors affecting user satisfaction and system performance using an SLS implemented in the ATIS domain. We have found that: (1) trade-offs between speed and accuracy have different implications for user satisfaction; (2) recognition performance improves over time, at least in part because of a reduction in sentence perplexity; and (3) hyperarticulation increases recognition errors, and while instructions can reduce this behavior, they do not result in improved recognition performance. We conclude that while users may adapt to some aspects of an SLS, certain types of user behavior may require technological solutions.

81 citations


Proceedings ArticleDOI
23 Feb 1992
TL;DR: Two attempts to improve stochastic language models are described, and a new type of adaptive language model is proposed, using a framework where one word sequence triggers another, causing its estimated probability to be raised.
Abstract: We describe two attempts to improve our stochastic language models. In the first, we identify a systematic overestimation in the traditional backoff model, and use statistical reasoning to correct it. Our modification results in up to a 6% reduction in the perplexity of various tasks. Although the improvement is modest, it is achieved with hardly any increase in the complexity of the model. Both analysis and empirical data suggest that the modification is most suitable when training data is sparse. In the second attempt, we propose a new type of adaptive language model. Existing adaptive models use a dynamic cache, based on the history of the document seen up to that point. But another source of information in the history, within-document word sequence correlations, has not yet been tapped. We describe a model that attempts to capture this information, using a framework where one word sequence triggers another, causing its estimated probability to be raised. We discuss various issues in the design of such a model, and describe our first attempt at building one. Our preliminary results include a perplexity reduction of between 10% and 32%, depending on the test set.

45 citations
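The trigger idea can be illustrated with its simplest special case, a cache in which a word triggers itself. The sketch below interpolates a static model with a document-history cache; it is an illustration of the general mechanism only, not the authors' model, and the class and parameter names are ours.

    from collections import Counter

    class CacheInterpolatedLM:
        """Toy adaptive model: the static probability is interpolated with a
        'cache' distribution built from the document seen so far, so that
        words observed earlier in the document get boosted."""

        def __init__(self, static_prob, lam=0.9):
            self.static_prob = static_prob  # callable: (history, word) -> probability
            self.lam = lam                  # weight of the static model
            self.cache = Counter()
            self.total = 0

        def prob(self, history, word):
            cache_p = self.cache[word] / self.total if self.total else 0.0
            return self.lam * self.static_prob(history, word) + (1.0 - self.lam) * cache_p

        def observe(self, word):
            """Update the cache after each dictated word."""
            self.cache[word] += 1
            self.total += 1

A full trigger model generalizes this by letting any word sequence raise the estimated probability of correlated sequences, not only of itself.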


Proceedings ArticleDOI
Giulio Maltese1, F. Mancini1
23 Mar 1992
TL;DR: A technique for incorporating grammatical and morphological information into a trigram-based statistical language model is presented; it reduces the effect of data sparseness in the trigram model, owing in part to the way the interpolation coefficients are chosen.
Abstract: A technique to take grammatical and morphological information into account in a trigram-based statistical language model is presented. This is achieved automatically by interpolating the trigram model (which uses sequences of words) with statistical models based on sequences of grammatical categories and/or lemmas. Such an approach reduces the effect of data sparseness in the trigram model, owing in part to the way the interpolation coefficients are chosen. With respect to trigrams, the authors obtained a significant reduction in perplexity on various texts, even when combining a well-trained trigram model with a small grammatical/morphological model.

35 citations
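One standard way to combine such models is linear interpolation of the word trigram with a category-based trigram; the factorization and weights below are the generic textbook form, written in our own notation, and not necessarily the exact scheme of the paper.

    p(w_i \mid w_{i-2}, w_{i-1}) \;\approx\;
    \lambda_1\, p_{\mathrm{word}}(w_i \mid w_{i-2}, w_{i-1})
    \;+\; \lambda_2\, p(g_i \mid g_{i-2}, g_{i-1})\, p(w_i \mid g_i),
    \qquad \lambda_1 + \lambda_2 = 1,

where g_i is the grammatical category (or lemma) of w_i and the weights \lambda can be estimated on held-out data.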


Proceedings ArticleDOI
23 Mar 1992
TL;DR: A novel type of hierarchical phoneme model for speaker adaptation, based on both hidden Markov models (HMM) and learning vector quantization (LVQ) networks, is presented, achieving 82% word accuracy for speaker-dependent recognition and 73% in the speaker-adaptive mode.
Abstract: A novel type of hierarchical phoneme model for speaker adaptation, based on both hidden Markov models (HMM) and learning vector quantization (LVQ) networks, is presented. Low-level tied LVQ phoneme models are trained speaker-dependently and speaker-independently, yielding a pool of speaker-biased phoneme models which can be mixed into high-level speaker-adaptive phoneme models. Rapid speaker adaptation is performed by finding an optimal mixture of these models at recognition time, given only a small amount of speech data; subsequently, the models are fine-tuned to the new speaker's voice by further parameter reestimation. In preliminary experiments with a continuous speech task using 40 context-free phoneme models at a task perplexity of 111, the authors achieved 82% word accuracy for speaker-dependent recognition and 73% in the speaker-adaptive mode.
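One generic way to formalize the mixing step, in our own notation (the paper does not necessarily use exactly this criterion): given speaker-biased phoneme models p_k and a small set of adaptation frames x_1, ..., x_T from the new speaker, choose mixture weights that maximize the likelihood of the adaptation data,

    \hat{w} \;=\; \arg\max_{w_k \ge 0,\; \sum_k w_k = 1} \; \sum_{t=1}^{T} \log \sum_{k} w_k\, p_k(x_t),
    \qquad p_{\mathrm{adapted}}(x) \;=\; \sum_{k} \hat{w}_k\, p_k(x) .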

Proceedings ArticleDOI
23 Mar 1992
TL;DR: Two approaches for adapting a specific syllable trigram model to a new task are described: one uses a small amount of text data similar to the target task, and the other applies supervised learning to the most recent input phrases.
Abstract: The authors describe two approaches for adapting a specific syllable trigram model to a new task. One uses a small amount of text data similar to the target task, and the other applies supervised learning to the most recent input phrases. The effect of each adaptation is verified with syllable perplexity and phrase recognition. When the only syntactic knowledge was the syllable trigram model, the perplexity was reduced from 54.5 to 18.1 by the adaptation using 100 phrases of similar text, and to 14.6 by the supervised learning. The recognition rates were also improved from 42.3% to 46.6% and 50.9%, respectively. Text similarity for speech recognition is also studied.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: A continuous speech recognition system, 'niNja' (Natural language INterface in JApanese), is presented, and an LR parsing algorithm with context-dependent phone models is proposed to achieve high accuracy and reduce the required computation.
Abstract: A continuous speech recognition system, 'niNja' (Natural language INterface in JApanese), is presented. Efficient search algorithms are proposed to achieve high accuracy and to reduce the required computation. First, an LR parsing algorithm with context-dependent phone models is proposed. Second, the scores of the same phone models in different hypotheses at the phone level are represented by the single score of the best hypothesis. The system is tested on a task with a 113-word vocabulary and a word perplexity of 4.1. It produces a sentence accuracy of 97.3% on the 10 open speakers' 110 sentences, and the error reduction is as much as 77% compared with using context-independent phone models.

Book ChapterDOI
01 Jan 1992
TL;DR: A large vocabulary continuous speech recognition system developed at AT&T Bell Laboratories is described, and the methods used to provide high word recognition accuracy are discussed, focusing on the techniques adopted to select the set of fundamental speech units and to provide the acoustic models of these sub-word units based on a continuous density HMM (CDHMM) framework.
Abstract: The field of large vocabulary continuous speech recognition has advanced to the point where there are several systems capable of providing greater than 95% word accuracy for speaker-independent recognition of a 1000-word vocabulary, spoken fluently, for a task with a perplexity of about 60. There are several factors which account for the high performance achieved by these systems, including the use of effective feature analysis, the use of hidden Markov model (HMM) methodology, the use of context-dependent sub-word units to capture intra-word and inter-word phonemic variations, and the use of corrective training techniques to emphasize differences between acoustically similar words in the vocabulary. In this paper we describe a large vocabulary continuous speech recognition system developed at AT&T Bell Laboratories, and discuss the methods used to provide high word recognition accuracy. In particular, we focus our discussion on the techniques adopted to select the set of fundamental speech units and to provide the acoustic models of these sub-word units based on a continuous density HMM (CDHMM) framework. Different modeling approaches, such as a discrete HMM and a tied-mixture HMM, are also discussed and compared to the CDHMM approach.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: The authors describe their evaluation method, which is based on a relationship among the word-level perplexity (V_p), the sentence length, the word (or phoneme) recognition rate (R_w), and the sentence recognition rate, from which the sentence recognition rate can be predicted.
Abstract: An evaluation technique is very important for developing a successful continuous speech recognition system. The branching factor and the perplexity have been used to measure the complexity of a speech recognition task. The authors describe an evaluation method based on such a measure. They found a relationship among the word-level (or phoneme-level) perplexity V_p, the sentence length L, the word (or phoneme) recognition rate R_w, and the sentence recognition rate. From this relationship, the sentence recognition rate can be predicted once the word (or phoneme) recognition performance and the task definition are given. The approximate equation is: sentence recognition rate = (f(V_p, R_w))^L, where f(V_p, R_w) denotes the word recognition rate of a recognizer with word accuracy R_w on a vocabulary of size V_p, estimated from the relationship between the number of categories and the recognition rate.
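A worked example with illustrative numbers (not taken from the paper): if the effective per-word recognition rate is f(V_p, R_w) = 0.95 and a sentence is L = 10 words long, the predicted sentence recognition rate is

    0.95^{10} \approx 0.60,

i.e. roughly 60% of such sentences would be recognized entirely correctly under the independence assumption behind the formula.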

Proceedings ArticleDOI
23 Feb 1992
TL;DR: This paper introduces three recent topics in speech recognition research at NTT (Nippon Telegraph and Telephone) Human Interface Laboratories, including a new HMM (hidden Markov model) technique that uses VQ-code bigrams to constrain the output probability distribution of the model according to the VQ-codes of previous frames.
Abstract: This paper introduces three recent topics in speech recognition research at NTT (Nippon Telegraph and Telephone) Human Interface Laboratories. The first topic is a new HMM (hidden Markov model) technique that uses VQ-code bigrams to constrain the output probability distribution of the model according to the VQ-codes of previous frames. The output probability distribution changes depending on the previous frames even in the same state, so this method reduces the overlap between feature distributions of different phonemes. The second topic is approaches for adapting a syllable trigram model to a new task in Japanese continuous speech recognition. An approach which uses the most recent input phrases for adaptation is effective in reducing the perplexity and improving phrase recognition rates. The third topic is stochastic language models for sequences of Japanese characters to be used in a Japanese dictation system with unlimited vocabulary. Japanese characters consist of Kanji (Chinese characters) and Kana (Japanese syllabaries), and each Kanji has several readings depending on the context. Our dictation system uses character-trigram probabilities as a source model obtained from a text database consisting of both Kanji and Kana, and generates Kanji-and-Kana sequences directly from input speech.
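A minimal sketch of the character-trigram source model mentioned as the third topic, treating Kanji and Kana characters uniformly; the add-k smoothing used here is an illustrative choice of ours, not necessarily NTT's.

    from collections import Counter

    def train_char_trigram(text, k=1.0):
        """Estimate P(c3 | c1, c2) over a character stream (Kanji and Kana alike)
        from a training text, with simple add-k smoothing."""
        tri = Counter(zip(text, text[1:], text[2:]))
        bi = Counter(zip(text, text[1:]))
        vocab_size = len(set(text))

        def prob(c1, c2, c3):
            return (tri[(c1, c2, c3)] + k) / (bi[(c1, c2)] + k * vocab_size)

        return prob

Such a model can then score candidate Kanji-and-Kana character sequences during the dictation hypothesis search.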

Proceedings ArticleDOI
23 Feb 1992
TL;DR: A large, multi-component "general-purpose English, large vocabulary, natural language, high perplexity corpus" known as the DARPA Continuous Speech Recognition (CSR) Corpus is being developed.
Abstract: Continuous speech recognition research activities within the DARPA Spoken Language community have, within the past several years, been focussed on the Resource Management (RM) and Air Travel Information System (ATIS) corpora. Within the past year, plans have been developed for a large, multi-component "general-purpose English, large vocabulary, natural language, high perplexity corpus" known as the DARPA [Wall Street Journal-based] Continuous Speech Recognition (CSR) Corpus [1]. Doug Paul, of MIT Lincoln Laboratory (MIT/LL), and Janet Baker, of Dragon Systems, are responsible for many of the details of these plans. This corpus is intended to supplant the RM corpora and to supplement the ATIS corpora as resources for the DARPA speech recognition research community.

Book ChapterDOI
01 Jan 1992
TL;DR: It is shown that the bigram language model component, which can be dynamically changed according to the context of the dialogue, can further reduce the perplexity in the continuous speech recognizer.
Abstract: The German prototype (SUn Germ) of the European project SUndial aims at a telephone-based real-time system for oral dialogues that query a database of intercity train schedules. To enhance system performance, one of our goals is to improve the recognition accuracy by minimizing the perplexity. We will show that our bigram language model component, which can be dynamically changed according to the context of the dialogue, can further reduce this perplexity. This paper describes the integration of linguistic knowledge into the continuous speech recognizer. In particular, our experiments on the interaction between the language model component and the dialogue manager are presented. In the evaluation part we compare our approaches with and without dynamically varied language models, giving an overview of the performance measurement results.
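A sketch of the idea of a dynamically varied language model: the dialogue manager predicts the current dialogue state and the recognizer scores word pairs with the bigram model trained for that state. Class and method names are illustrative, not taken from the SUNDIAL implementation.

    class DialogueDependentBigramLM:
        """Select a state-specific bigram model per utterance, falling back to a
        state-independent model when no specialized model exists."""

        def __init__(self, models_by_state, fallback):
            self.models_by_state = models_by_state  # e.g. {"ask_departure_time": lm1, ...}
            self.fallback = fallback                # state-independent bigram model

        def prob(self, dialogue_state, prev_word, word):
            lm = self.models_by_state.get(dialogue_state, self.fallback)
            return lm.prob(prev_word, word)         # assumes each lm exposes prob(prev, word)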

Book ChapterDOI
01 Jan 1992
TL;DR: An automatic speech recognition system based on syllabic segmentation of the speech signal is described, in which stochastic models (HMMs) are used to represent demisyllable segments.
Abstract: The paper describes an automatic speech recognition system which is based on syllabic segmentation of the speech signal. Stochastic models (HMMs) are used for representing demisyllable segments. The advantages of syllabic processing within the different stages of the system (i.e. segmentation, phonetic classification, word and sentence recognition) are demonstrated and discussed on the basis of experimental results. Word and sentence recognition with a perplexity of 27 reached 74% and 96%, respectively.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: It is shown how the compositional representation (CR) previously used for lexical access from sub-word recognizers for a relatively small word vocabulary can be extended to much larger vocabularies without further training.
Abstract: It is shown how the compositional representation (CR) previously used for lexical access from sub-word recognizers for a relatively small word vocabulary can be extended to much larger vocabularies without further training. This is demonstrated for the DARPA Resource Management database where, using sub-word units as input, words are represented distributively over a fixed number of units and classified using a simple network. Initially, the architecture is trained on 147 words, achieving an accuracy of 91.2%. Then, leaving the recognizer unchanged, it is shown how additional output units can be added to the network to increase the vocabulary to the complete set of 975 phonetically distinct words. On this extended vocabulary the performance dropped to 66%, but this drop is smaller than would be expected from the increase in perplexity. Further improvement could be achieved by improving performance on the original data set.

01 Mar 1992
TL;DR: The use of the 'thaumazo' expression in papyri letters has been investigated in this article, and it is shown how Paul's use of the expression concurs with that found in the papyri.
Abstract: The results of a previous study on the 'thaumazo' expression in a number of papyri letters are summarised, and it is shown how Paul's use of the expression concurs with them. By means of the expression 'thaumazo', Paul expresses his perplexity about the conduct of the Galatians. The use of this expression results in a number of emotive implications regarding writer and recipients that have a direct bearing on the function it performs in the letter. Among other things, it implies a severe rebuke of the recipients, with a view, however, to challenging them to return to the only message that holds good news for them. Since it also implies that returning to the one true gospel of salvation by faith is the only way open to them as people called by God, it is possible that this emotional transition to the arguments of the letter may already have won the day at this stage, at least in the case of some recipients.