Showing papers by "Patrick Haffner" published in 2008


Patent
28 Apr 2008
TL;DR: Butler et al. describe a method for creating a personal animated entity that delivers a multi-media message from a sender to a recipient.
Abstract: In an embodiment, a method is provided for creating a personal animated entity for delivering a multi-media message from a sender to a recipient. An image file from the sender may be received by a server. The image file may include an image of an entity. The sender may be requested to provide input with respect to facial features of the image of the entity in preparation for animating the image of the entity. After the sender provides the input with respect to the facial features of the image of the entity, the image of the entity may be presented as a personal animated entity to the sender to preview. Upon approval of the preview from the sender, the image of the entity may be presented as a sender-selectable personal animated entity for delivering the multi-media message to the recipient.

65 citations


Patent
31 Dec 2008
TL;DR: A method and apparatus are disclosed for processing a query with a search engine whose discriminative classifier is trained with a plurality of artificial query examples.
Abstract: A method and apparatus for using a classifier for processing a query are disclosed. For example, the method receives a query from a user, and processes the query to locate one or more documents in accordance with a search engine having a discriminative classifier, wherein the discriminative classifier is trained with a plurality of artificial query examples. The method then presents a result of the processing to the user.

17 citations


Book Chapter
01 Jan 2008
TL;DR: This chapter presents a general kernel-based learning framework for the design of classification algorithms for weighted automata, and introduces a family of kernels, rational kernels, that combined with support vector machines form powerful techniques for spoken-dialog classification and other classification tasks in text and speech processing.
Abstract: One of the key tasks in the design of large-scale dialog systems is classification. This consists of assigning, out of a finite set, a specific category to each spoken utterance, based on the output of a speech recognizer. Classification in general is a standard machine-learning problem, but the objects to classify in this particular case are word lattices, or weighted automata, and not the fixed-size vectors for which learning algorithms were originally designed. This chapter presents a general kernel-based learning framework for the design of classification algorithms for weighted automata. It introduces a family of kernels, rational kernels, that combined with support vector machines form powerful techniques for spoken-dialog classification and other classification tasks in text and speech processing. It describes efficient algorithms for their computation and reports the results of their use in several difficult spoken-dialog classification tasks based on deployed systems. Our results show that rational kernels are easy to design and implement, and lead to substantial improvements of the classification accuracy. The chapter also provides some theoretical results helpful for the design of rational kernels.
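As a rough illustration of the idea in this abstract, the sketch below computes an n-gram kernel between two word lattices. It is a simplification, not the chapter's algorithm: each lattice is flattened into a hypothetical list of (word sequence, probability) paths, and the kernel is the dot product of expected n-gram counts, whereas the chapter works on weighted automata directly. The function names and example lattices are assumptions made for this sketch.

```python
from collections import Counter


def expected_ngram_counts(paths, n=2):
    """Sum the path probability over every n-gram occurrence in the lattice.

    `paths` is a list of (words, probability) pairs standing in for a
    word lattice; this flattening is an assumption of the sketch.
    """
    counts = Counter()
    for words, prob in paths:
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += prob
    return counts


def ngram_kernel(paths_a, paths_b, n=2):
    """Dot product of the expected n-gram count vectors of two lattices."""
    counts_a = expected_ngram_counts(paths_a, n)
    counts_b = expected_ngram_counts(paths_b, n)
    return sum(w * counts_b[g] for g, w in counts_a.items() if g in counts_b)


# Hypothetical recognizer outputs for two utterances.
lattice_a = [(["pay", "my", "bill"], 0.7), (["play", "my", "bill"], 0.3)]
lattice_b = [(["pay", "my", "phone", "bill"], 1.0)]

k_ab = ngram_kernel(lattice_a, lattice_b)
k_ba = ngram_kernel(lattice_b, lattice_a)
k_aa = ngram_kernel(lattice_a, lattice_a)
```

A matrix of such kernel values over a training set could then be passed to any support vector machine implementation that accepts a precomputed kernel.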

9 citations


Proceedings Article
12 May 2008
TL;DR: This work investigates N-best speaker identification (SID) accuracy for matched (telephone/telephone) and mismatched (far-field/telephone) train/test channel conditions. It reduces the matched-channel top-1 error rate by over 25% relative to the baseline (GMM-UBM) and achieves mismatched N-best accuracy comparable to the baseline.
Abstract: Under severe channel mismatch conditions, such as training with far-field speech and testing with telephone data, performance of speaker identification (SID) degrades significantly, often below practical use. But for many SID tasks, it is sufficient to recognize an N-best list of speakers for further human analysis. We investigate N-best SID accuracy for matched (telephone/telephone) and mismatched (far-field/telephone) train/test channel conditions. Using an SVM-GMM supervector (GSV), pitch and formant frequency histograms (PFH) and cross-channel adaptation using cohorts, we reduced matched channel error rate by over 25% relative to the baseline (GMM-UBM), for top-1, and achieved mismatched N-best accuracy comparable to the baseline.
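The two evaluation quantities named in this abstract, N-best accuracy and relative error reduction, can be sketched as follows. This is only an illustration of the metrics, not the paper's system: the scores, speaker labels, and error rates below are made-up values, and the function names are assumptions.

```python
def n_best_accuracy(score_rows, true_speakers, n):
    """Fraction of utterances whose true speaker is among the n top-scored."""
    hits = 0
    for scores, truth in zip(score_rows, true_speakers):
        # Rank candidate speakers from highest to lowest score.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if truth in ranked[:n]:
            hits += 1
    return hits / len(true_speakers)


def relative_error_reduction(baseline_error, new_error):
    """Error reduction expressed as a fraction of the baseline error."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical per-utterance scores for three enrolled speakers.
score_rows = [
    {"A": 0.9, "B": 0.5, "C": 0.1},
    {"A": 0.6, "B": 0.4, "C": 0.5},
    {"A": 0.2, "B": 0.3, "C": 0.8},
]
true_speakers = ["A", "C", "B"]

top1 = n_best_accuracy(score_rows, true_speakers, 1)
top2 = n_best_accuracy(score_rows, true_speakers, 2)

# Made-up error rates: dropping from 20% to 15% is a 25% relative reduction.
reduction = relative_error_reduction(0.20, 0.15)
```

In this toy data only one of the three utterances is identified correctly at top-1, while all three true speakers appear in the 2-best list, which is the kind of gap that makes N-best output useful for further human analysis.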

7 citations