
Showing papers by "Patrick Haffner published in 2006"


Journal ArticleDOI
Patrick Haffner1
TL;DR: This paper provides an original and unified presentation of these algorithms within the framework of regularized and large margin linear classifiers, reviews some available optimization techniques, and offers practical solutions to scaling issues.

46 citations


Patent
20 Apr 2006
TL;DR: In this article, a method for identifying traffic to an application is described, comprising monitoring communication traffic in a network, identifying data from the communication traffic content, and constructing a model that maps the communication traffic to an application based on the identified data.
Abstract: A method for identifying traffic to an application including the steps of monitoring communication traffic in a network, identifying data from communication traffic content, and constructing a model for mapping the communication traffic for an application derived from data identified from the communication traffic content is described. A related system and computer readable medium for performing the method is also described. The described method and system has utility in a wide array of networks including IP networks.

31 citations


01 Jan 2006
TL;DR: AT&T participated in one evaluation task at TRECVID 2009, the content-based copy detection task, and submitted three runs: one for the NoFA (no false alarm) profile and two for the balanced profile.

18 citations


Proceedings ArticleDOI
Patrick Haffner1
25 Jun 2006
TL;DR: A new method based on transposition is proposed to speed up this computation on sparse data: instead of computing dot-products over sparse feature vectors, the method incrementally merges lists of training examples and minimizes access to the data.
Abstract: Kernel-based learning algorithms, such as Support Vector Machines (SVMs) or Perceptron, often rely on sequential optimization where a few examples are added at each iteration. Updating the kernel matrix usually requires matrix-vector multiplications. We propose a new method based on transposition to speed up this computation on sparse data. Instead of dot-products over sparse feature vectors, our computation incrementally merges lists of training examples and minimizes access to the data. Caching and shrinking are also optimized for sparsity. On very large natural language tasks (tagging, translation, text classification) with sparse feature representations, a 20 to 80-fold speedup over LIBSVM is observed using the same SMO algorithm. Theory and experiments explain what type of sparsity structure is needed for this approach to work, and why its adaptation to Maxent sequential optimization is inefficient.
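The transposition idea in the abstract can be sketched in a few lines: store the training set as an inverted index (feature → list of (example, value) postings), so that a linear-kernel row against all training examples is computed by merging only the posting lists of the new vector's non-zero features. This is an illustrative sketch (all names are assumptions, not the paper's implementation):

```python
from collections import defaultdict

def transpose(examples):
    """Build an inverted index: feature -> list of (example_id, value)."""
    index = defaultdict(list)
    for i, x in enumerate(examples):
        for f, v in x.items():
            index[f].append((i, v))
    return index

def linear_kernel_row(x, index, n_examples):
    """Dot products of sparse vector x against all training examples,
    obtained by merging the posting lists of x's non-zero features."""
    result = [0.0] * n_examples
    for f, v in x.items():
        for i, w in index.get(f, ()):  # touch only examples sharing feature f
            result[i] += v * w
    return result

examples = [{"a": 1.0, "b": 2.0}, {"b": 3.0}, {"c": 1.0}]
index = transpose(examples)
print(linear_kernel_row({"b": 1.0, "c": 2.0}, index, len(examples)))
# -> [2.0, 3.0, 2.0]
```

The key property is that work is proportional to the postings of the query's features rather than to the full kernel matrix, which is why the approach pays off only for sufficiently sparse data, as the paper's analysis discusses.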

13 citations


01 Jan 2006
TL;DR: A novel approach to machine translation is presented that uses a maximum entropy model for parameter estimation; its performance is compared to that of the finite-state translation model on the IWSLT Chinese-English data sets.
Abstract: In this paper, we present our system for statistical machine translation that is based on weighted finite-state transducers. We describe the construction of the transducer, the estimation of the weights, acquisition of phrases (locally ordered tokens) and the mechanism we use for global reordering. We also present a novel approach to machine translation that uses a maximum entropy model for parameter estimation and contrast its performance to the finite-state translation model on the IWSLT Chinese-English data sets.

4 citations


Patent
Patrick Haffner1
31 Dec 2006
TL;DR: In this article, a method and apparatus based on transposition to speed up learning computations on sparse data are disclosed, where the method receives a support vector comprising at least one feature represented by one non-zero entry.
Abstract: A method and apparatus based on transposition to speed up learning computations on sparse data are disclosed. For example, the method receives a support vector comprising at least one feature represented by one non-zero entry. The method then identifies at least one column within a matrix with non-zero entries, wherein the at least one column is identified in accordance with the at least one feature of the support vector. The method then performs kernel computations using successive list merging on the at least one identified column of the matrix and the support vector to derive a result vector, wherein the result vector is used in a data learning function.

3 citations


Journal ArticleDOI
TL;DR: The main modifications were changing the dependent variable in the training set to account for multiple PEs per patient, and incorporating neighborhood information through augmentation of the set of predictor variables, which resulted in measurable predictive improvement.
Abstract: Task 1 of the 2006 KDD Challenge Cup required classification of pulmonary embolisms (PEs) using variables derived from computed tomography angiography. We present our approach to the challenge and justification for our choices. We used boosted trees to perform the main classification task, but modified the algorithm to address idiosyncrasies of the scoring criteria. The two main modifications were: 1) changing the dependent variable in the training set to account for multiple PEs per patient, and 2) incorporating neighborhood information through augmentation of the set of predictor variables. Both of these resulted in measurable predictive improvement. In addition, we discuss a statistically based method for setting the classification threshold.
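The second modification above, incorporating neighborhood information by augmenting the predictor variables, can be sketched as follows. This is a hypothetical illustration, not the authors' code: each candidate's feature vector is extended with the mean features of its k nearest neighbors (function names, k, and the Euclidean distance metric are all assumptions):

```python
import math

def augment_with_neighbors(features, coords, k=2):
    """Append the mean feature vector of each candidate's k nearest
    neighbors (Euclidean distance on coords) to its own features."""
    n = len(features)
    augmented = []
    for i in range(n):
        # Rank all other candidates by distance to candidate i.
        dists = sorted(
            (math.dist(coords[i], coords[j]), j) for j in range(n) if j != i
        )
        neighbors = [features[j] for _, j in dists[:k]]
        means = [sum(col) / len(col) for col in zip(*neighbors)]
        augmented.append(features[i] + means)
    return augmented

# Three candidates with one feature each, located along a line.
print(augment_with_neighbors([[1.0], [2.0], [10.0]], [(0,), (1,), (10,)]))
# -> [[1.0, 6.0], [2.0, 5.5], [10.0, 1.5]]
```

The augmented vectors then feed the boosted-tree classifier unchanged, which is what makes this kind of feature augmentation a lightweight way to inject spatial context into a standard learner.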

3 citations