Proceedings ArticleDOI

Comparison of part-of-speech and automatically derived category-based language models for speech recognition

TLDR
This paper compares various category-based language models combined with a word-based trigram by means of linear interpolation; the largest improvement is obtained with a model using automatically determined categories.
Abstract
This paper compares various category-based language models when used in conjunction with a word-based trigram by means of linear interpolation. Categories corresponding to parts-of-speech as well as automatically clustered groupings are considered. The category-based model employs variable-length n-grams and permits each word to belong to multiple categories. Relative word error rate reductions of between 2 and 7% over the baseline are achieved in N-best rescoring experiments on the Wall Street Journal corpus. The largest improvement is obtained with a model using automatically determined categories. Perplexities continue to decrease as the number of different categories is increased, but improvements in the word error rate reach an optimum.
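The combination described in the abstract operates at the probability level: each word probability is a weighted mixture of the word-based trigram estimate and the category-based estimate. The sketch below illustrates this for N-best rescoring; it is a minimal illustration rather than the authors' implementation, and the function names, the fixed interpolation weight lam, and the trigram truncation of the history are assumptions.

import math

def interpolated_logprob(word, history, p_word_trigram, p_category_model, lam=0.7):
    # Linear interpolation of two language models:
    # P(w | h) = lam * P_trigram(w | h) + (1 - lam) * P_category(w | h).
    # In a category model that allows multiple categories per word,
    # P_category(w | h) itself sums P(w | c) * P(c | category history) over categories c.
    p = lam * p_word_trigram(word, history) + (1.0 - lam) * p_category_model(word, history)
    return math.log(p)

def rescore_hypothesis(words, p_word_trigram, p_category_model, lam=0.7):
    # N-best rescoring: score one hypothesis by summing interpolated log-probabilities.
    total, history = 0.0, []
    for w in words:
        total += interpolated_logprob(w, tuple(history[-2:]), p_word_trigram, p_category_model, lam)
        history.append(w)
    return total

In practice the weight lam would be optimized on held-out data rather than fixed as above.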


Citations
Journal ArticleDOI

A neural probabilistic language model

TL;DR: The authors propose to learn a distributed representation for words that allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, which can be expressed in terms of these representations.
Proceedings Article

Hierarchical Probabilistic Neural Network Language Model.

TL;DR: This work introduces a hierarchical decomposition of the conditional probabilities, constrained by prior knowledge extracted from the WordNet semantic hierarchy, that yields a speed-up of about 200 during both training and recognition (a simplified sketch follows at the end of this list).
Book ChapterDOI

Neural Probabilistic Language Models

TL;DR: This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, and incorporates this new language model into a state-of-the-art speech recognizer of conversational speech.
Posted Content

A Bit of Progress in Language Modeling

TL;DR: Combining all of these techniques and comparing against a Katz smoothed trigram model with no count cutoffs achieves perplexity reductions between 38% and 50% (1 bit of entropy), depending on training data size, as well as a word error rate reduction of 8.9%.
Journal ArticleDOI

A bit of progress in language modeling

TL;DR: The authors compare a combination of all of these techniques together to a Katz smoothed trigram model with no count cutoffs, achieving perplexity reductions between 38 and 50% depending on training data size, as well as a word error rate reduction of 8.9%.
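The hierarchical decomposition mentioned in the "Hierarchical Probabilistic Neural Network Language Model" entry above speeds up normalization by factorizing the word probability. The sketch below shows a simplified two-level (class-based) factorization rather than that paper's full binary WordNet tree; all names and shapes are illustrative assumptions.

import numpy as np

def two_level_logprob(word, hidden, word2class, class_weights, word_weights):
    # Factorize P(word | h) = P(class(word) | h) * P(word | class(word), h),
    # reducing normalization cost from |V| to roughly |C| + |V|/|C|.
    c, idx_in_class = word2class[word]          # class id and position within that class
    class_scores = class_weights @ hidden       # one score per class
    class_logp = class_scores[c] - np.logaddexp.reduce(class_scores)
    word_scores = word_weights[c] @ hidden      # scores over words inside class c only
    word_logp = word_scores[idx_in_class] - np.logaddexp.reduce(word_scores)
    return class_logp + word_logp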
References
Journal ArticleDOI

Estimation of probabilities from sparse data for the language model component of a speech recognizer

TL;DR: The model offers, via a nonlinear recursive procedure, a computation- and space-efficient solution to the problem of estimating probabilities from sparse data, and compares favorably with other proposed methods.
Proceedings ArticleDOI

The design for the wall street journal-based CSR corpus

TL;DR: This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus, a corpus containing significant quantities of both speech data and text data.
Proceedings Article

The design for the wall street journal-based CSR corpus.

TL;DR: The WSJ CSR Corpus is the first general-purpose English, large-vocabulary, natural-language, high-perplexity corpus containing significant quantities of both speech data (400 hrs.) and text data (47M words), thereby providing a means to integrate speech recognition and natural language processing in application domains of high potential practical value.
Proceedings Article

Statistical Language Modeling using the CMU-Cambridge Toolkit

TL;DR: The CMU Statistical Language Modeling toolkit was released in 1994 in order to facilitate the construction and testing of bigram and trigram language models; the technology as implemented in the toolkit is outlined.
Journal ArticleDOI

On structuring probabilistic dependences in stochastic language modelling

TL;DR: The problem of stochastic language modelling is studied from the viewpoint of introducing suitable structures into the conditional probability distributions; the aspects considered are nonlinear interpolation as an alternative to linear interpolation, equivalence classes for word histories and single words, cache memory, and word associations.
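The equivalence-class structuring described in the last reference above is also the core of the category models compared in this paper: the word history is mapped onto category classes before probabilities are estimated. A minimal sketch, assuming a hypothetical word2cat mapping from a word to its possible categories and hypothetical component models p_cat_ngram and p_word_given_cat:

def category_based_prob(word, history, word2cat, p_cat_ngram, p_word_given_cat):
    # Replace the word history by category equivalence classes, then sum
    # P(c | category history) * P(word | c) over the categories c of the word.
    cat_history = tuple(word2cat[h][0] for h in history)  # simplification: first listed category per history word
    return sum(p_cat_ngram(c, cat_history) * p_word_given_cat(word, c) for c in word2cat[word])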