Lattice kernels for spoken-dialog classification

doi:10.1109/ICASSP.2003.1198859

Proceedings Article

Corinna Cortes, +2 more

- Vol. 1, pp 628-631

TLDR

This paper presents the first principled approach to classification based on full lattices, gives efficient algorithms for computing kernels for arbitrary lattices, and reports experiments on a difficult call-classification task with 38 categories.

Abstract:

Classification is a key task in spoken-dialog systems. The response of a spoken-dialog system is often guided by the category assigned to the speaker's utterance. Unfortunately, classifiers based on the one-best transcription of the speech utterances are not satisfactory because of the high word error rate of conversational speech recognition systems. Since the correct transcription may not be the highest ranking one, but often will be represented in the word lattices output by the recognizer, the classification accuracy can be much higher if the full lattice is exploited both during training and classification. In this paper we present the first principled approach for classification based on full lattices. For this purpose, we use the support vector machine framework with kernels for lattices. The lattice kernels we define belong to the general class of rational kernels. We give efficient algorithms for computing kernels for arbitrary lattices and report experiments using the algorithm in a difficult call-classification task with 38 categories. Our experiments with a trigram lattice kernel show a 15% reduction in error rate at a 30% rejection level.
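
As a rough illustration of what a trigram lattice kernel measures, the sketch below scores two small word lattices by the inner product of their expected trigram counts. It is a sketch under stated assumptions, not the method of the paper: the lattice encoding (a dict of weighted arcs), the function names, the example lattices, and the brute-force path enumeration are all hypothetical and chosen for readability, whereas the paper describes efficient algorithms for arbitrary lattices.

```python
from collections import defaultdict

# Hypothetical lattice encoding (an assumption for this sketch):
#   state -> list of (word, probability, next_state)
# State 0 is the start state; a state with no outgoing arcs is final.

def paths(lattice, state=0, words=(), prob=1.0):
    """Yield (word_sequence, path_probability) for every complete path.

    Brute-force enumeration: fine for toy lattices, exponential in general.
    """
    arcs = lattice.get(state, [])
    if not arcs:
        yield words, prob
        return
    for word, p, nxt in arcs:
        yield from paths(lattice, nxt, words + (word,), prob * p)

def expected_ngram_counts(lattice, n):
    """Expected count of each n-gram under the lattice's path distribution."""
    counts = defaultdict(float)
    for words, prob in paths(lattice):
        for i in range(len(words) - n + 1):
            counts[words[i:i + n]] += prob
    return counts

def ngram_kernel(lat_a, lat_b, n=3):
    """Inner product of the expected n-gram count vectors of two lattices."""
    counts_a = expected_ngram_counts(lat_a, n)
    counts_b = expected_ngram_counts(lat_b, n)
    return sum(v * counts_b[g] for g, v in counts_a.items() if g in counts_b)

# Two toy lattices that share the hypothesis "i want my bill".
lat_a = {
    0: [("i", 1.0, 1)],
    1: [("want", 0.6, 2), ("won't", 0.4, 2)],
    2: [("my", 1.0, 3)],
    3: [("bill", 1.0, 4)],
}
lat_b = {
    0: [("i", 1.0, 1)],
    1: [("want", 1.0, 2)],
    2: [("my", 1.0, 3)],
    3: [("bill", 0.5, 4), ("bell", 0.5, 4)],
}

print(ngram_kernel(lat_a, lat_b, n=3))  # approximately 0.9
```

Because the score is an inner product of count vectors, a matrix of such scores over a training set could in principle be passed to any support vector machine implementation that accepts a precomputed kernel.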

Citations

Book Chapter

Theory and Algorithms

Peter Schlattmann

Journal Article

From sample similarity to ensemble similarity: probabilistic distance measures in reproducing kernel Hilbert space

Shaohua Zhou, +1 more

- 01 Jun 2006 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper addresses the problem of characterizing ensemble similarity from sample similarity in a principled manner by using a reproducing kernel as a characterization of sample similarity, and suggests a probabilistic distance measure in the reproducing kernel Hilbert space (RKHS) as the ensemble similarity.

Journal Article

Rational Kernels: Theory and Algorithms

Corinna Cortes, +2 more

- 01 Dec 2004 -

Journal of Machine Learning Research

TL;DR: This paper introduces rational kernels, a general family of kernels based on weighted transducers or rational relations that extend kernel methods to the analysis of variable-length sequences or, more generally, weighted automata, and shows that rational kernels are easy to design and implement and lead to substantial improvements in classification accuracy.

Journal Article

Beyond ASR 1-best: Using word confusion networks in spoken language understanding

Dilek Hakkani-Tur, +3 more

- 01 Oct 2006 -

Computer Speech & Language

TL;DR: This paper proposes methods for a tighter integration of ASR and SLU using word confusion networks (WCNs), which provide a compact representation of multiple aligned ASR hypotheses along with word confidence scores, without compromising recognition accuracy.

Journal Article

Optimal Transport in Reproducing Kernel Hilbert Spaces: Theory and Applications

Zhen Zhang, +2 more

- 01 Jul 2020 -

IEEE Transactions on Pattern Analysis an...

TL;DR: The case in which data distributions in RKHS are Gaussian is explored, obtaining closed-form expressions of both the estimated Wasserstein distance and optimal transport map via kernel matrices, and the Bures metric on covariance matrices is generalized to infinite-dimensional settings, providing a new metric between covariance operators.

References

Journal Article

Support-Vector Networks

Corinna Cortes, +1 more

- 15 Sep 1995 -

Machine Learning

TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

Statistical learning theory

Vladimir Vapnik

TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.

Proceedings Article

A training algorithm for optimal margin classifiers

Bernhard E. Boser, +2 more

TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented, applicable to a wide variety of the classification functions, including Perceptrons, polynomials, and Radial Basis Functions.

Book

Learning with kernels

Bernhard Schölkopf

Journal Article

Efficient string matching: an aid to bibliographic search

Alfred V. Aho, +1 more

- 01 Jun 1975 -

Communications of The ACM

TL;DR: A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text that has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.