Journal Article

Variable-length sequence modeling: multigrams

F. Bimbot, +3 more
01 Jun 1995, Vol. 2, Iss. 6, pp. 111-113
TLDR
Presents a model that represents sentences as a concatenation of variable-length sequences of units, together with an algorithm for unsupervised estimation of the model parameters, and illustrates the approach on the segmentation of letter sequences into subword-like units.
Abstract
The conventional n-gram language model exploits dependencies between words and their fixed-length past. This letter presents a model that represents sentences as a concatenation of variable-length sequences of units and describes an algorithm for unsupervised estimation of the model parameters. The approach is illustrated for the segmentation of sequences of letters into subword-like units. It is evaluated as a language model on a corpus of transcribed spoken sentences. Multigrams can provide a significantly lower test set perplexity than n-gram models.
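
To make the idea of "a concatenation of variable-length sequences" concrete, here is a minimal Python sketch of how a letter string could be segmented once sequence probabilities are available. It is an illustration under stated assumptions, not the letter's own procedure: the probability table `TOY_PROBS`, the length limit `max_len`, and the function names are all hypothetical, the segments are assumed to be independent, and the unsupervised estimation step mentioned in the abstract is not shown. `best_segmentation` finds the single most likely segmentation by dynamic programming, and `total_likelihood` sums over all segmentations.

```python
import math

# Hypothetical probabilities for illustration only (not taken from the paper).
TOY_PROBS = {"a": 0.1, "b": 0.1, "c": 0.1, "ab": 0.3, "bc": 0.2, "abc": 0.2}


def best_segmentation(text, probs, max_len):
    """Return (segments, log_prob) for the most likely segmentation of text.

    Segments are assumed independent, so a segmentation's probability is the
    product of its segment probabilities. Segments longer than max_len, or
    absent from probs, are not allowed.
    """
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            p = probs.get(text[start:end])
            if p is None or best[start] == -math.inf:
                continue
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end] = score
                back[end] = start
    if best[n] == -math.inf:
        raise ValueError("no segmentation covers the input")
    # Walk the back-pointers from the end to recover the segments.
    segments = []
    i = n
    while i > 0:
        segments.append(text[back[i]:i])
        i = back[i]
    return segments[::-1], best[n]


def total_likelihood(text, probs, max_len):
    """Return the probability of text summed over all allowed segmentations."""
    n = len(text)
    alpha = [0.0] * (n + 1)  # alpha[i]: summed probability of text[:i]
    alpha[0] = 1.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            alpha[end] += alpha[start] * probs.get(text[start:end], 0.0)
    return alpha[n]


segments, log_prob = best_segmentation("abcab", TOY_PROBS, max_len=3)
likelihood = total_likelihood("abcab", TOY_PROBS, max_len=3)
print(segments, math.exp(log_prob), likelihood)
```

With this toy table the longer units win because "abc" followed by "ab" is more probable than any split into shorter pieces. The summed likelihood is what a perplexity evaluation would use, since it accounts for every segmentation rather than only the best one.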

Citations
Patent

Disambiguating user intent in conversational interaction system for large corpus information retrieval

TL;DR: A method of disambiguating user intent in conversational interactions for information retrieval is presented, which includes providing access to a set of content items with metadata describing the content items and providing access to structural knowledge showing semantic relationships and links among the content items.
Journal Article

Inference of variable-length linguistic and acoustic units by multigrams

TL;DR: A general formulation of the multigram model, applicable to single or multiple parallel strings of data having either discrete or continuous values, is presented and used to infer a set of variable-length acoustic units directly from speech data.
Patent

Method of and system for inferring user intent in search input in a conversational interaction system

TL;DR: In this article, a method of inferring user intent in search input in a conversational interaction system is disclosed, which includes providing a user preference signature that describes preferences of the user, and determining that a portion of the search input contains an ambiguous identifier.
Patent

Method of and system for using conversation state information in a conversational interaction system

TL;DR: In this paper, a method for inferring a change of a conversation session during continuous user interaction with an interactive content providing system includes receiving input from the user including linguistic elements intended by the user to identify an item, associating a linguistic element of the input with a first conversation session, and providing a response based on the input.
Proceedings Article

Segmental vocoder-going beyond the phonetic approach

TL;DR: The problem of very low bit rate segmental speech coding is addressed and future extensions of the scheme (diphone-like synthesis and speaker adaptation) as well as possible use of automatically derived units in recognition are discussed.
References
Proceedings Article

Multi-site data collection for a spoken language corpus

TL;DR: A recently collected spoken language corpus for the ATIS (Air Travel Information System) domain is described and the motivation for this effort, the goals, the implementation of a multi-site data collection paradigm, and the accomplishments of MADCOW are summarized.