Author

Patrick Haffner

Other affiliations: Nuance Communications, Carnegie Mellon University, Orange S.A.
Bio: Patrick Haffner is an academic researcher from AT&T Labs. The author has contributed to research in topics including support vector machines and speaker recognition, has an h-index of 32, and has co-authored 97 publications receiving 42,604 citations. Previous affiliations of Patrick Haffner include Nuance Communications and Carnegie Mellon University.


Papers
Book Chapter
01 Jan 2008
TL;DR: This chapter presents a general kernel-based learning framework for the design of classification algorithms for weighted automata, and introduces a family of kernels, rational kernels, that, combined with support vector machines, form powerful techniques for spoken-dialog classification and other classification tasks in text and speech processing.
Abstract: One of the key tasks in the design of large-scale dialog systems is classification. This consists of assigning, out of a finite set, a specific category to each spoken utterance, based on the output of a speech recognizer. Classification in general is a standard machine-learning problem, but the objects to classify in this particular case are word lattices, or weighted automata, and not the fixed-size vectors for which learning algorithms were originally designed. This chapter presents a general kernel-based learning framework for the design of classification algorithms for weighted automata. It introduces a family of kernels, rational kernels, that, combined with support vector machines, form powerful techniques for spoken-dialog classification and other classification tasks in text and speech processing. It describes efficient algorithms for their computation and reports the results of their use in several difficult spoken-dialog classification tasks based on deployed systems. Our results show that rational kernels are easy to design and implement, and lead to substantial improvements in classification accuracy. The chapter also provides some theoretical results helpful for the design of rational kernels.
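As an illustration of the kernel-plus-SVM recipe the chapter describes, here is a minimal sketch using an n-gram count kernel on recognizer output strings, a simple special case of a rational kernel; the chapter's kernels operate on full word lattices via weighted-transducer composition, which is not reproduced here. The utterances, labels, and categories are hypothetical placeholders.

```python
# Minimal sketch: an n-gram count kernel (a simple rational-kernel instance
# on strings rather than word lattices) plugged into an SVM via
# scikit-learn's precomputed-kernel interface. All data is hypothetical.
from collections import Counter
import numpy as np
from sklearn.svm import SVC

def ngram_counts(tokens, n=2):
    """Count the n-grams occurring in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_kernel(a, b, n=2):
    """K(x, y) = sum over shared n-grams of count_x * count_y."""
    ca, cb = ngram_counts(a, n), ngram_counts(b, n)
    return sum(ca[g] * cb[g] for g in ca.keys() & cb.keys())

utterances = [
    "i want to pay my bill".split(),
    "pay my bill please".split(),
    "what is my account balance".split(),
    "check the balance on my account".split(),
]
labels = [0, 0, 1, 1]  # 0 = billing request, 1 = balance inquiry (hypothetical)

# Gram matrix between all training utterances.
K = np.array([[ngram_kernel(a, b) for b in utterances] for a in utterances],
             dtype=float)

clf = SVC(kernel="precomputed").fit(K, labels)
print(clf.predict(K))  # predictions on the training Gram matrix
```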

9 citations

Posted Content
TL;DR: This work combines two techniques, parallelism and Nesterov's acceleration, to design faster algorithms for L1-regularized loss; it simplifies BOOM, a variant of gradient descent, and proposes an efficient accelerated version of Shotgun that improves the convergence rate.
Abstract: The growing amount of high dimensional data in different machine learning applications requires more efficient and scalable optimization algorithms. In this work, we consider combining two techniques, parallelism and Nesterov's acceleration, to design faster algorithms for L1-regularized loss. We first simplify BOOM, a variant of gradient descent, and study it in a unified framework, which allows us not only to propose a refined measurement of sparsity to improve BOOM, but also to show that BOOM is provably slower than FISTA. Moving on to parallel coordinate descent methods, we then propose an efficient accelerated version of Shotgun, improving the convergence rate from $O(1/t)$ to $O(1/t^2)$. Our algorithm enjoys a concise form and analysis compared to previous work, and also allows one to study several related works in a unified way.
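For context on the acceleration being discussed, here is a minimal sketch of plain FISTA on an L1-regularized least-squares problem, i.e. the sequential Nesterov-accelerated $O(1/t^2)$ method the paper compares against. It is not the paper's parallel accelerated Shotgun, and the data is synthetic.

```python
# A minimal FISTA sketch for min_w 0.5 * ||X w - y||^2 + lam * ||w||_1,
# illustrating the Nesterov-accelerated O(1/t^2) rate discussed above.
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista(X, y, lam, iters=200):
    L = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    z = w.copy()
    t = 1.0
    for _ in range(iters):
        grad = X.T @ (X @ z - y)
        w_next = soft_threshold(z - grad / L, lam / L)   # proximal gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w) # momentum extrapolation
        w, t = w_next, t_next
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
w_true = np.zeros(50)
w_true[:5] = 1.0                          # sparse ground truth
y = X @ w_true + 0.01 * rng.standard_normal(100)
print(np.round(fista(X, y, lam=1.0)[:8], 2))  # recovers a mostly sparse estimate
```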

9 citations

01 Jan 2001
TL;DR: A new algorithm that prevents overlaps between foreground components while optimizing both document quality and compression ratio is derived from the minimum description length (MDL) criterion, which makes the DjVu compression format significantly more efficient on electronically produced documents.
Abstract: How can we turn the description of a digital (i.e. electronically produced) document into something that is efficient for multi-layer raster formats? It is first shown that a foreground/background segmentation without overlapping foreground components can be more efficient for viewing or printing. Then, a new algorithm that prevents overlaps between foreground components while optimizing both document quality and compression ratio is derived from the minimum description length (MDL) criterion. This algorithm makes the DjVu compression format significantly more efficient on electronically produced documents. Comparisons with other formats are provided.
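The MDL criterion at the heart of the algorithm trades model bits against residual-coding bits. The toy sketch below uses entirely illustrative codelength estimates, not DjVu's actual coders, to show the selection rule: pick the candidate segmentation with the smaller total description length.

```python
# Toy MDL selection: among candidate foreground/background assignments,
# choose the one whose total codelength (model bits + residual bits) is
# smallest. Codelength estimates are illustrative stand-ins only.
import numpy as np

def codelength_bits(residual, bits_per_component, n_components):
    """Approximate total bits: model cost plus Gaussian residual-coding cost."""
    model_bits = bits_per_component * n_components
    var = np.var(residual) + 1e-12
    data_bits = 0.5 * residual.size * np.log2(2 * np.pi * np.e * var)
    return model_bits + data_bits

rng = np.random.default_rng(1)
n_pixels = 64 * 64

# Candidate A: 10 non-overlapping foreground components, moderate residual.
# Candidate B: 14 overlapping components, smaller residual but more model bits.
resid_a = rng.normal(0.0, 8.0, n_pixels)   # residual in gray-level units
resid_b = rng.normal(0.0, 6.5, n_pixels)
bits_a = codelength_bits(resid_a, bits_per_component=400, n_components=10)
bits_b = codelength_bits(resid_b, bits_per_component=400, n_components=14)
print("choose A" if bits_a < bits_b else "choose B")
```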

8 citations

01 Jan 2007
TL;DR: A multimodal rushes summarization method that relies on both face and speech information is proposed; evaluation shows that the enhanced SBD system is highly effective and that the human-centric summaries are concise and easy to understand.
Abstract: AT&T participated in two tasks at TRECVID 2007: shot boundary detection (SBD) and rushes summarization. The SBD system developed for TRECVID 2006 was enhanced for robustness and efficiency. New visual features are extracted for the cut, dissolve, and fast-dissolve detectors, and an SVM-based verification method is used to boost accuracy. Speed is improved by more streamlined processing with on-the-fly result fusion. We submitted 10 runs for the SBD evaluation task. The best result (TT05) was achieved with the following configuration: the SVM-based verification method; more training data, including the 2004, 2005, and 2006 SBD data; no SVM boundary adjustment; and training the SVM for high generalization capability (e.g., a smaller value of C). As a pilot task, rushes summarization aims to show the main objects and events in the raw material with the least redundancy while maximizing usability. We proposed a multimodal rushes summarization method that relies on both face and speech information. Evaluation results show that the new SBD system is highly effective and the human-centric rushes summarization approach is concise and easy to understand.
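A hedged sketch of the SVM-based verification step described above: candidate boundaries proposed by the simple detectors are accepted or rejected by an SVM over frame-similarity features. The three features and all data below are synthetic placeholders, not the paper's actual visual features.

```python
# Sketch: verify candidate shot boundaries with an SVM over
# frame-similarity features. Features and data are synthetic.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Each candidate: [color-histogram distance, edge-change ratio, motion score]
true_cuts = rng.normal(loc=[0.8, 0.7, 0.6], scale=0.1, size=(100, 3))
non_cuts = rng.normal(loc=[0.2, 0.2, 0.3], scale=0.1, size=(100, 3))
X = np.vstack([true_cuts, non_cuts])
y = np.array([1] * 100 + [0] * 100)

# A smaller C widens the margin, i.e. favors generalization over fitting,
# matching the configuration reported for the best run above.
verifier = SVC(C=0.1, kernel="rbf").fit(X, y)

candidate = np.array([[0.75, 0.65, 0.55]])
print("cut" if verifier.predict(candidate)[0] == 1 else "no cut")
```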

8 citations

Proceedings Article
12 May 2008
TL;DR: This work investigates N-best SID accuracy for matched (telephone/telephone) and mismatched (far-field/telephone) train/test channel conditions, reduces the matched-channel error rate by over 25% relative to the baseline (GMM-UBM) for top-1, and achieves mismatched N-best accuracy comparable to the baseline.
Abstract: Under severe channel mismatch conditions, such as training with far-field speech and testing with telephone data, performance of speaker identification (SID) degrades significantly, often below practical use. But for many SID tasks, it is sufficient to recognize an N-best list of speakers for further human analysis. We investigate N-best SID accuracy for matched (telephone/telephone) and mismatched (far-field/telephone) train/test channel conditions. Using an SVM-GMM supervector (GSV), pitch and formant frequency histograms (PFH) and cross-channel adaptation using cohorts, we reduced matched channel error rate by over 25% relative to the baseline (GMM-UBM), for top-1, and achieved mismatched N-best accuracy comparable to the baseline.
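A simplified sketch of the SVM-GMM supervector (GSV) idea: a universal background model's means are adapted toward an utterance and stacked into one fixed-length vector that an SVM can classify. The one-step mean adaptation below is a stand-in for full MAP adaptation, and all frames, speakers, and dimensions are synthetic.

```python
# Sketch: GMM supervectors (adapted UBM means, flattened) fed to a linear
# SVM for speaker identification. Synthetic data; simplified adaptation.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
ubm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
ubm.fit(rng.standard_normal((2000, 12)))        # background "frames"

def supervector(frames, relevance=16.0):
    """Posterior-weighted one-step shift of the UBM means, flattened."""
    post = ubm.predict_proba(frames)             # (T, 8) responsibilities
    counts = post.sum(axis=0)                    # soft counts per component
    fstats = post.T @ frames                     # first-order statistics
    alpha = (counts / (counts + relevance))[:, None]
    means = (alpha * fstats / np.maximum(counts[:, None], 1e-8)
             + (1 - alpha) * ubm.means_)
    return means.ravel()

# Two hypothetical speakers (mean shifts 0.0 and 1.0), two utterances each.
X = [supervector(rng.standard_normal((300, 12)) + s) for s in (0.0, 0.0, 1.0, 1.0)]
y = [0, 0, 1, 1]
clf = LinearSVC().fit(X, y)
print(clf.predict([supervector(rng.standard_normal((300, 12)) + 1.0)]))
```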

7 citations


Cited by
Journal Article
28 May 2015 - Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
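As a concrete rendering of the backpropagation procedure the review describes, here is a minimal two-layer network trained on XOR: each layer's parameters are updated from gradients propagated backward from the output error. This is a generic textbook sketch, not code from the paper.

```python
# Minimal backpropagation: a two-layer network learning XOR.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.standard_normal((2, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)
lr = 0.5

for _ in range(5000):
    # Forward pass: each layer computes its representation from the previous one.
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid output
    # Backward pass: propagate the error signal layer by layer.
    dp = p - y                                  # cross-entropy gradient w.r.t. logits
    dW2, db2 = h.T @ dp, dp.sum(axis=0)
    dh = (dp @ W2.T) * (1.0 - h ** 2)           # back through the tanh
    dW1, db1 = X.T @ dh, dh.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(np.round(p.ravel(), 2))                   # approaches [0, 1, 1, 0]
```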

46,982 citations

Journal Article
01 Jan 1998
TL;DR: In this article, a new learning paradigm called graph transformer networks (GTN) is proposed, which allows multi-module recognition systems to be trained globally using gradient-based methods; convolutional neural networks are shown to synthesize complex decision surfaces that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multi-module systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.
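For readers unfamiliar with the architecture family, here is a hedged LeNet-style convolutional network in PyTorch. It is a modern restatement of the convolutional design the paper evaluates, not the paper's exact LeNet-5, which used subsampling layers with trainable coefficients and a different output layer.

```python
# A LeNet-style convolutional network: alternating convolution and
# pooling layers feeding fully connected classification layers.
import torch
import torch.nn as nn

class LeNetStyle(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),   # 28x28 -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                             # -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),             # -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                             # -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = LeNetStyle()
digits = torch.randn(4, 1, 28, 28)   # a batch of four 28x28 grayscale images
print(model(digits).shape)           # torch.Size([4, 10])
```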

42,067 citations

Proceedings Article
07 Jun 2015
TL;DR: Inception is a deep convolutional neural network architecture, proposed in this paper, that achieved the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
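A hedged sketch of a single Inception module as the abstract describes it: parallel 1x1, 3x3, and 5x5 convolutions plus pooling, with 1x1 bottleneck convolutions keeping the computational budget in check. The channel counts below are illustrative, not necessarily GoogLeNet's published configuration.

```python
# One Inception module: parallel convolution branches concatenated
# along the channel dimension, with 1x1 reductions for efficiency.
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 64, 1)            # plain 1x1 path
        self.branch3 = nn.Sequential(                     # 1x1 reduce, then 3x3
            nn.Conv2d(in_ch, 96, 1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 128, 3, padding=1),
        )
        self.branch5 = nn.Sequential(                     # 1x1 reduce, then 5x5
            nn.Conv2d(in_ch, 16, 1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 5, padding=2),
        )
        self.branch_pool = nn.Sequential(                 # pool, then 1x1 project
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, 32, 1),
        )

    def forward(self, x):
        # Concatenate all branches along the channel dimension.
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

block = InceptionModule(192)
print(block(torch.randn(1, 192, 28, 28)).shape)  # torch.Size([1, 256, 28, 28])
```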

40,257 citations

Book
Vladimir Vapnik
01 Jan 1995
TL;DR: The book covers the setting of the learning problem, consistency of learning processes, bounds on the rate of convergence of learning processes, controlling the generalization ability of learning processes, constructing learning algorithms, and what is important in learning theory.
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
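As one concrete instance of the "bounds on the rate of convergence" the book develops, the standard VC generalization bound is often stated as follows (given here in its common textbook form for 0/1 loss, as an illustration rather than a quotation from the book): with probability at least 1 - η, the true risk R is bounded by the empirical risk R_emp plus a capacity term depending on the VC dimension h and the sample size l.

```latex
R(\alpha) \;\le\; R_{\mathrm{emp}}(\alpha)
  \;+\; \sqrt{\frac{h\left(\ln\frac{2l}{h} + 1\right) - \ln\frac{\eta}{4}}{l}}
```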

40,147 citations

Journal Article
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
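The minimax two-player game the abstract describes can be written compactly as the paper's value function V(D, G), which D maximizes and G minimizes:

```latex
\min_G \max_D V(D, G) \;=\;
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log D(x)\right]
  \;+\; \mathbb{E}_{z \sim p_z(z)}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```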

38,211 citations