Proceedings ArticleDOI
Comparison of part-of-speech and automatically derived category-based language models for speech recognition
Thomas Niesler, E.W.D. Whittaker, Philip C. Woodland +2 more
Vol. 1, pp. 177–180
TLDR
This paper compares several category-based language models, each combined with a word-based trigram by linear interpolation; the largest improvement is obtained with a model using automatically determined categories.
Abstract
This paper compares various category-based language models when used in conjunction with a word-based trigram by means of linear interpolation. Categories corresponding to parts-of-speech as well as automatically clustered groupings are considered. The category-based model employs variable-length n-grams and permits each word to belong to multiple categories. Relative word error rate reductions of between 2 and 7% over the baseline are achieved in N-best rescoring experiments on the Wall Street Journal corpus. The largest improvement is obtained with a model using automatically determined categories. Perplexities continue to decrease as the number of different categories is increased, but improvements in the word error rate reach an optimum.
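The interpolation scheme described in the abstract can be sketched as follows. This is a toy illustration with made-up probabilities and an arbitrary interpolation weight, not the paper's actual models; since a word may belong to several categories, the category model sums over its memberships:

```python
def category_prob(word, p_cat_given_history, p_word_given_cat):
    # A word may belong to multiple categories, so sum
    # P(c | history) * P(word | c) over all categories c.
    return sum(p_cat_given_history[c] * p_word_given_cat[c].get(word, 0.0)
               for c in p_cat_given_history)

def interpolated_prob(p_trigram, p_category, lam=0.7):
    # Linear interpolation of the word trigram and the category model.
    # lam is a hypothetical weight; in practice it is tuned on held-out data.
    return lam * p_trigram + (1.0 - lam) * p_category

# Toy example: "bank" belongs to two categories.
p_cat_given_history = {"NOUN": 0.6, "VERB": 0.4}
p_word_given_cat = {"NOUN": {"bank": 0.05}, "VERB": {"bank": 0.01}}
p_cat = category_prob("bank", p_cat_given_history, p_word_given_cat)  # 0.034
p_mix = interpolated_prob(0.02, p_cat)                                # 0.0242
```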
Citations
Journal ArticleDOI
A neural probabilistic language model
TL;DR: The authors propose learning a distributed representation for words, which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences that can be expressed in terms of these representations.
Proceedings Article
Hierarchical Probabilistic Neural Network Language Model.
Frederic Morin, Yoshua Bengio +1 more
TL;DR: A hierarchical decomposition of the conditional probabilities, constrained by prior knowledge extracted from the WordNet semantic hierarchy, yields a speed-up of about 200 in both training and recognition.
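The speed-up comes from replacing a flat softmax over the vocabulary with a product of binary decisions along a tree path. A minimal sketch, with a hard-coded toy tree and a constant standing in for the learned per-node predictor (a real implementation derives the tree from, e.g., WordNet or clustering and predicts each branch from the context):

```python
# word -> path of (node, went_left) binary decisions in a toy tree
TREE_PATHS = {
    "cat": [("root", True), ("n1", True)],
    "dog": [("root", True), ("n1", False)],
    "run": [("root", False)],
}

def p_left(node, context):
    # Stand-in for a learned per-node binary classifier.
    return 0.6

def hierarchical_prob(word, context=None):
    # P(word | context) is the product of branch probabilities on its path,
    # so the cost is O(tree depth) instead of O(vocabulary size).
    prob = 1.0
    for node, went_left in TREE_PATHS[word]:
        p = p_left(node, context)
        prob *= p if went_left else (1.0 - p)
    return prob

# Probabilities over the toy vocabulary sum to one:
# cat: 0.36, dog: 0.24, run: 0.4
```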
Book ChapterDOI
Neural Probabilistic Language Models
TL;DR: This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, and incorporates this new language model into a state-of-the-art speech recognizer of conversational speech.
Posted Content
A Bit of Progress in Language Modeling
TL;DR: A combination of all techniques together to a Katz smoothed trigram model with no count cutoffs achieves perplexity reductions between 38 and 50% (1 bit of entropy), depending on training data size, as well as a word error rate reduction of 8.9%.
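The "1 bit of entropy" phrasing follows from the standard relation between perplexity and per-word cross-entropy: halving perplexity corresponds to exactly a one-bit reduction in entropy.

```latex
\mathrm{PP} = 2^{H}, \qquad
H = -\frac{1}{N}\sum_{i=1}^{N}\log_2 P(w_i \mid w_1 \ldots w_{i-1}),
\qquad \frac{\mathrm{PP}}{2} = 2^{H-1}.
```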
References
Journal ArticleDOI
Estimation of probabilities from sparse data for the language model component of a speech recognizer
TL;DR: The model offers, via a nonlinear recursive procedure, a computation and space efficient solution to the problem of estimating probabilities from sparse data, and compares favorably to other proposed methods.
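The back-off idea behind this line of work can be illustrated with a deliberately simplified sketch: it uses a fixed penalty weight rather than the nonlinear recursive discounting the paper proposes, and the counts and alpha value below are hypothetical.

```python
def backoff_prob(w, h, counts, alpha=0.4):
    # Simplified back-off: use the relative frequency of the longest
    # observed history; otherwise recurse on a shortened history with a
    # fixed penalty alpha. (Real back-off models redistribute discounted
    # probability mass instead of applying a constant penalty.)
    ngram = h + (w,)
    if counts.get(h, 0) > 0 and counts.get(ngram, 0) > 0:
        return counts[ngram] / counts[h]
    if not h:                 # word unseen even as a unigram
        return 0.0
    return alpha * backoff_prob(w, h[1:], counts, alpha)

# Toy counts: the key () holds the total token count.
counts = {
    (): 10,
    ("the",): 4, ("cat",): 3, ("sat",): 3,
    ("the", "cat"): 2,
    ("cat", "sat"): 1,
}
p_seen   = backoff_prob("cat", ("the",), counts)  # bigram observed: 2/4
p_backed = backoff_prob("sat", ("the",), counts)  # backs off: 0.4 * 3/10
```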
Proceedings ArticleDOI
The design for the wall street journal-based CSR corpus
Douglas B. Paul, Janet M. Baker +1 more
TL;DR: This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus, a corpus containing significant quantities of both speech data and text data.
Proceedings Article
Statistical Language Modeling using the CMU-Cambridge Toolkit
Philip Clarkson, Ronald Rosenfeld +1 more
TL;DR: The CMU Statistical Language Modeling toolkit was released in order to facilitate the construction and testing of bigram and trigram language models; the paper outlines the technology as implemented in the toolkit.
Journal ArticleDOI
On structuring probabilistic dependences in stochastic language modelling
TL;DR: The problem of stochastic language modelling is studied from the viewpoint of introducing suitable structures into the conditional probability distributions; the structures considered include nonlinear interpolation as an alternative to linear interpolation, equivalence classes for word histories and single words, and cache memory and word associations.