
Showing papers by "Patrick Haffner" published in 2003


Proceedings Article
06 Apr 2003
TL;DR: Proposes a global optimization process, based on an optimal channel communication model, that combines possibly heterogeneous binary classifiers and reduces the call-type classification error rate of AT&T's How May I Help You (HMIHY(SM)) natural dialog system by 50%.
Abstract: Large margin classifiers such as support vector machines (SVM) or AdaBoost are obvious choices for natural language document or call routing. However, how to combine several binary classifiers to optimize the whole routing process and how this process scales when it involves many different decisions (or classes) is a complex problem that has only received partial answers. We propose a global optimization process based on an optimal channel communication model that allows a combination of possibly heterogeneous binary classifiers. As in Markov modeling, computational feasibility is achieved through simplifications and independence assumptions that are easy to interpret. Using this approach, we have managed to decrease the call-type classification error rate for AT&T's How May I Help You (HMIHY(SM)) natural dialog system by 50%.

238 citations


Book Chapter
01 Jan 2003
TL;DR: It is shown that under some conditions these kernels are closed under sum, product, or Kleene-closure and a general method for constructing a PDS rational kernel from an arbitrary transducer defined on some non-idempotent semirings is given.
Abstract: Kernel methods are widely used in statistical learning techniques. We recently introduced a general kernel framework based on weighted transducers or rational relations, rational kernels, to extend kernel methods to the analysis of variable-length sequences or more generally weighted automata. These kernels are efficient to compute and have been successfully used in applications such as spoken-dialog classification. Not all rational kernels, however, are positive definite and symmetric (PDS), a sufficient property for guaranteeing the convergence of discriminant classification algorithms such as Support Vector Machines. We present several theoretical results related to PDS rational kernels. We show in particular that under some conditions these kernels are closed under sum, product, or Kleene-closure and give a general method for constructing a PDS rational kernel from an arbitrary transducer defined on some non-idempotent semirings. We also show that some commonly used string kernels or similarity measures such as the edit-distance, the convolution kernels of Haussler, and some string kernels used in the context of computational biology are specific instances of rational kernels. Our results include the proof that the edit-distance over a non-trivial alphabet is not negative definite, which, to the best of our knowledge, was never stated or proved before.
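
For readers unfamiliar with the edit distance discussed in this abstract, the following is a minimal sketch of the standard dynamic-programming (Levenshtein) computation with unit costs. It illustrates only the distance itself and does not reproduce the chapter's transducer-based formulation or its negative-definiteness result; the function name `edit_distance` is an illustrative assumption.

```python
def edit_distance(s, t):
    """Levenshtein distance with unit insertion, deletion and substitution costs."""
    # prev[j] holds the distance between the current prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty string
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cs
                curr[j - 1] + 1,           # insert ct
                prev[j - 1] + (cs != ct),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
```

Keeping only two rows of the table keeps memory linear in the length of the second string.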

44 citations


Patent
04 Apr 2003
TL;DR: Systems and methods for generating an annotation guide: a user selects utterances in the speech data, the selected utterances are assigned to a class and/or call type, and once all call types are complete the annotation guide is generated.
Abstract: Systems and methods for generating an annotation guide. Speech data is organized and presented to a user. After the user selects some of the utterances in the speech data, the selected utterances are included in a class and/or call type. Additional utterances that belong to the class and/or call type can be found in the speech data using relevance feedback, data mining, data clustering, support vector machines, and the like. After a call type is complete, it is committed to the annotation guide. After all call types are completed, the annotation guide is generated.

26 citations


Proceedings Article
06 Apr 2003
TL;DR: This paper presents the first principled approach for classification based on full lattices with efficient algorithms for computing kernels for arbitrary lattices and reports experiments using the algorithm in a difficult call-classification task with 38 categories.
Abstract: Classification is a key task in spoken-dialog systems. The response of a spoken-dialog system is often guided by the category assigned to the speaker's utterance. Unfortunately, classifiers based on the one-best transcription of the speech utterances are not satisfactory because of the high word error rate of conversational speech recognition systems. Since the correct transcription may not be the highest ranking one, but often will be represented in the word lattices output by the recognizer, the classification accuracy can be much higher if the full lattice is exploited both during training and classification. In this paper we present the first principled approach for classification based on full lattices. For this purpose, we use the support vector machine framework with kernels for lattices. The lattice kernels we define belong to the general class of rational kernels. We give efficient algorithms for computing kernels for arbitrary lattices and report experiments using the algorithm in a difficult call-classification task with 38 categories. Our experiments with a trigram lattice kernel show a 15% reduction in error rate at a 30% rejection level.
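
As a rough illustration of the kind of kernel involved, the sketch below computes an n-gram count kernel between two plain word sequences. It is a simplification and not the paper's algorithm: the abstract describes kernels over full word lattices, whereas this sketch handles only single transcriptions. The function names `ngram_counts` and `ngram_kernel` and the example utterances are hypothetical.

```python
from collections import Counter


def ngram_counts(words, n):
    """Count every contiguous n-gram (as a tuple) in a list of words."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def ngram_kernel(a, b, n=3):
    """Sum, over all shared k-grams with k = 1..n, the product of their counts."""
    total = 0
    for k in range(1, n + 1):
        counts_a, counts_b = ngram_counts(a, k), ngram_counts(b, k)
        # A Counter returns 0 for k-grams that do not occur in b.
        total += sum(c * counts_b[gram] for gram, c in counts_a.items())
    return total


x = "i want to check my bill".split()
y = "i want to pay my bill".split()
print(ngram_kernel(x, y, n=3))
```

Because the value is an inner product of count vectors, it is symmetric in its two arguments, which is one of the properties a kernel needs for use with a support vector machine.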

24 citations


Proceedings Article
01 Jan 2003
TL;DR: This paper introduced a general kernel framework based on weighted transducers, rational kernels, and presented a constructive algorithm for ensuring that rational kernels are positive definite symmetric, a property which guarantees the convergence of discriminant classification algorithms such as Support Vector Machines.
Abstract: Kernel methods have found wide use in statistical learning techniques in recent years due to their good performance and their computational efficiency in high-dimensional feature space. However, text or speech data cannot always be represented by the fixed-length vectors that the traditional kernels handle. We recently introduced a general kernel framework based on weighted transducers, rational kernels, to extend kernel methods to the analysis of variable-length sequences and weighted automata [5] and described their application to spoken-dialog applications. We presented a constructive algorithm for ensuring that rational kernels are positive definite symmetric, a property which guarantees the convergence of discriminant classification algorithms such as Support Vector Machines, and showed that many string kernels previously introduced in the computational biology literature are special instances of such positive definite symmetric rational kernels [4]. This paper reviews the essential results given in [5, 3, 4] and presents them in the form of a short tutorial.

11 citations