Showing papers by "R. De Mori" published in 2004


Proceedings Article
17 May 2004
TL;DR: Non-linear evaluations of the noise overestimation factor and spectral floor are used in the same way for the proposed gain modification and for non-linear spectral subtraction (NSS). Consistent and statistically significant ASR improvements of the proposed approach over NSS are observed.
Abstract: A soft decision gain modification is introduced and applied, after amplitude compression, to the Ephraim-Malah gain function based on minimum mean square error (MMSE) estimation (Ephraim, Y. and Malah, D., IEEE Trans. Acoust. Speech Sig. Process., vol.ASSP-32, no.6, p.1109-21, 1984; vol.ASSP-33, no.2, p.443-5, 1985). Non-linear evaluations of the noise overestimation factor and spectral floor are used in the same way for the proposed gain modification and for non-linear spectral subtraction (NSS). Consistent and statistically significant ASR improvements of the proposed approach with respect to NSS are observed for the different noise conditions considered in the AURORA2 and AURORA3 corpora. Because the non-linearity affects the two approaches in the same way, the comparison result is particularly interesting.
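
To make the roles of the noise overestimation factor and the spectral floor concrete, here is a minimal sketch of generic power spectral subtraction in one common form. It is not the paper's NSS variant or its proposed gain modification: the function name, the constant `alpha` and `beta` defaults, and the choice of a floor proportional to the noise estimate are all assumptions made for illustration. In a non-linear scheme these parameters would typically vary, for example with the estimated signal-to-noise ratio.

```python
def spectral_subtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Generic power spectral subtraction for one frame (illustrative only).

    noisy_power: per-bin power of the noisy speech frame
    noise_power: per-bin estimate of the noise power
    alpha:       noise overestimation factor (hypothetical constant value)
    beta:        spectral floor factor (hypothetical constant value)
    """
    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        # Subtract an overestimated noise power from the noisy power.
        residual = y - alpha * n
        # Never let a bin fall below a small fraction of the noise estimate.
        floor = beta * n
        enhanced.append(max(residual, floor))
    return enhanced


# Hypothetical two-bin frame: the first bin is speech-dominated,
# the second is noise-dominated and ends up at the spectral floor.
frame = spectral_subtract([10.0, 1.0], [1.0, 1.0], alpha=2.0, beta=0.1)
```

With these made-up inputs the first bin keeps most of its power while the second is clamped to the floor, which is the behaviour the two parameters are meant to control.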

17 citations


Proceedings Article
17 May 2004
TL;DR: Automatically trained decision trees are applied to derive the interpretation of a spoken sentence, yielding a relative reduction in the understanding error rate; a new strategy for building structured cohorts of candidates is also described.
Abstract: The paper proposes a new application of automatically trained decision trees to derive the interpretation of a spoken sentence. A new strategy for building structured cohorts of candidates is also described. By evaluating predicates related to the acoustic confidence of the words expressing a concept, the linguistic and semantic consistency of candidates in the cohort, and the rank of a candidate within a cohort, the decision tree automatically learns a decision strategy for rescoring or rejecting an n-best list of candidates representing a user's utterance. A relative reduction of 18.6% in the understanding error rate is obtained by our rescoring strategy with no utterance rejection, and a relative reduction of 43.1% in the same error rate is achieved with a rejection rate of only 8% of the utterances.

13 citations


Journal Article
TL;DR: Experimental results on a spoken dialogue corpus show the performance of the proposed augmentation method, combined with maximum a posteriori probability adaptation, in terms of word error rate reduction.

5 citations