
Showing papers by "R. De Mori" published in 1996


Proceedings Article
03 Oct 1996
TL;DR: The paper presents a mixed approach to spoken language understanding that tries to make the best use of the advantages of both statistical and knowledge-based algorithms.
Abstract: The paper presents a mixed approach to spoken language understanding that tries to make the best use of the advantages of both statistical and knowledge-based algorithms. Results obtained on the ATIS (Air Travel Information System) scenario transferred to the Italian language are presented and discussed.

10 citations


Proceedings Article
07 May 1996
TL;DR: A search technique incorporating the automatic modeling of lexical variability is introduced for medium or large-vocabulary speaker-independent speech recognition and a new approach for word hypothesization is proposed, based on an acoustic-phonetic unit called the pseudo-syllable segment.
Abstract: A search technique incorporating the automatic modeling of lexical variability is introduced for medium or large-vocabulary speaker-independent speech recognition. Current state-of-the-art systems depend on being able to model the entire language based on acoustic features and the constraints of syntax or inter-word probabilities. These methods often fail in the presence of multiple speakers, new vocabulary, noise, and spontaneous speech phenomena. A new approach for word hypothesization is proposed, based on an acoustic-phonetic unit called the pseudo-syllable segment. An algorithm is described for transforming a sequence of syllables into words. Techniques are suggested for controlling the accuracy of the syllabic hypothesis set, and learning the phonotactics of syllables automatically in a statistical framework.

7 citations


Proceedings Article
07 May 1996
TL;DR: The proposed multilevel semantic classification trees allow one to combine sources of different types: it is no longer necessary for each source to yield a probability, and the tree can look at several information sources simultaneously.
Abstract: We propose multilevel semantic classification trees to combine different information sources for predicting speech events (e.g. word chains, phrases, etc.). Traditionally in speech recognition systems these information sources (acoustic evidence, language model) are calculated independently and combined via Bayes rule. The proposed approach allows one to combine sources of different types; it is no longer necessary for each source to yield a probability. Moreover, the tree can look at several information sources simultaneously. The approach is demonstrated for the prediction of prosodically marked phrase boundaries, combining information about the spoken word chain, word category information, prosodic parameters, and the result of a neural network predicting the boundary on the basis of acoustic-prosodic features. The recognition rates of up to 90% for the two-class problem (boundary vs. no boundary) are already comparable to results achieved with the above-mentioned Bayes rule approach that combines the acoustic classifier with a 5-gram categorical language model. This is remarkable, since so far only a small set of questions combining information from different sources has been implemented.
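
For reference, the conventional Bayes rule combination that this abstract contrasts against is usually written as below. This is the standard textbook decision rule, not a formula taken from the paper; the symbols $A$ (acoustic evidence), $W$ (a hypothesized word chain or event), and $\hat{W}$ (the chosen hypothesis) are notation assumed here for illustration.

```latex
\hat{W} = \arg\max_{W} P(W \mid A)
        = \arg\max_{W} \frac{P(A \mid W)\, P(W)}{P(A)}
        = \arg\max_{W} P(A \mid W)\, P(W)
```

Under this scheme $P(A \mid W)$ is supplied by the acoustic model and $P(W)$ by the language model, so each source has to deliver a probability before the two can be multiplied. That is the requirement the classification-tree approach is said to relax.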

7 citations