Top 9 papers published by Patrick Haffner from AT&T Labs in 2007

Proceedings Article•

Exploiting network structure for proactive spam mitigation

Shobha Venkataraman¹, Subhabrata Sen², Oliver Spatscheck², Patrick Haffner², Dawn Song¹•Institutions (2)

Carnegie Mellon University¹, AT&T²

06 Aug 2007

TL;DR: It is demonstrated that the history and the structure of the IP addresses can reduce the adverse impact of mail server overload, by increasing the number of legitimate e-mails accepted by a factor of 3.

Abstract: E-mail has become indispensable in today's networked society. However, the huge and ever-growing volume of spam has become a serious threat to this important communication medium. It not only affects e-mail recipients, but also significantly overloads the mail servers that handle e-mail transmission. We perform an extensive analysis of IP addresses and IP aggregates given by network-aware clusters in order to investigate properties that can distinguish the bulk of the legitimate mail from spam. Our analysis indicates that the bulk of the legitimate mail comes from long-lived IP addresses. We also find that the bulk of the spam comes from network clusters that are relatively long-lived. Our analysis suggests that network-aware clusters may provide a good aggregation scheme for exploiting the history and structure of IP addresses. We then consider the implications of this analysis for prioritizing legitimate mail. We focus on the situation in which a mail server is overloaded and the goal is to maximize the legitimate mail that it accepts. We demonstrate that the history and the structure of the IP addresses can reduce the adverse impact of mail server overload, by increasing the number of legitimate e-mails accepted by a factor of 3.
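
The idea of using sender history aggregated over IP clusters to decide which mail to accept under overload can be sketched as follows. This is an illustrative sketch only, not the authors' method: the fixed /24 prefix stands in for the network-aware clusters described in the abstract, and the names `History`, `cluster_of`, `legit_ratio`, and `select_under_overload`, as well as the neutral score of 0.5 for unseen clusters, are assumptions made for the example.

```python
import ipaddress
from collections import defaultdict


def cluster_of(ip, prefix_len=24):
    """Map an IP address to its cluster (assumption: a fixed-length prefix)."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


class History:
    """Per-cluster counts of legitimate and spam messages seen so far."""

    def __init__(self):
        # cluster -> [legitimate count, spam count]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, ip, is_spam):
        self.counts[cluster_of(ip)][1 if is_spam else 0] += 1

    def legit_ratio(self, ip):
        legit, spam = self.counts.get(cluster_of(ip), (0, 0))
        total = legit + spam
        # Clusters with no history get a neutral score (assumption).
        return legit / total if total else 0.5


def select_under_overload(history, senders, capacity):
    """Keep the `capacity` senders whose clusters have the best history."""
    ranked = sorted(senders, key=history.legit_ratio, reverse=True)
    return ranked[:capacity]


history = History()
for _ in range(3):
    history.record("10.0.0.1", is_spam=False)
    history.record("10.0.1.5", is_spam=True)

accepted = select_under_overload(
    history, ["10.0.1.9", "10.0.0.7", "192.0.2.1"], capacity=2
)
```

In this toy run the sender from the cluster with a clean history is ranked first, the unseen sender second, and the sender from the spam-heavy cluster is dropped once capacity is reached.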

64 citations

Proceedings Article•

Statistical Machine Translation through Global Lexical Selection and Sentence Reconstruction

Srinivas Bangalore¹, Patrick Haffner¹, Stephan Kanthak¹•Institutions (1)

AT&T¹

01 Jun 2007

TL;DR: This paper presents a novel approach to lexical selection where the target words are associated with the entire source sentence (global) without the need to compute local associations.

Abstract: Machine translation of a source language sentence involves selecting appropriate target language words and ordering the selected words to form a well-formed target language sentence. Most of the previous work on statistical machine translation relies on (local) associations of target words/phrases with source words/phrases for lexical selection. In contrast, in this paper, we present a novel approach to lexical selection where the target words are associated with the entire source sentence (global) without the need to compute local associations. Further, we present a technique for reconstructing the target language sentence from the selected words. We compare the results of this approach against those obtained from a finite-state based statistical machine translation system which relies on local lexical associations.

61 citations

AT&T Research at TRECVID 2007

Zhu Liu, Eric Zavesky, David Crawford Gibbon, Behzad Shahraray, Patrick Haffner¹•Institutions (1)

AT&T¹

01 Jan 2007

TL;DR: In this paper, a multimodal rushes summarization method that relies on both face and speech information was proposed to show the main objects and events in the raw material with least redundancy while maximizing the usability.

Abstract: The best shot boundary detection (SBD) result was achieved with the following configuration: SVM-based verification; more training data, including the 2004, 2005, and 2006 SBD data; no SVM boundary adjustment; and an SVM trained for high generalization capability (e.g., a smaller value of C). As a pilot task, rushes summarization aims to show the main objects and events in the raw material with the least redundancy while maximizing usability. We proposed a multimodal rushes summarization method that relies on both face and speech information. Evaluation results show that the new SBD system is highly effective and that the human-centric rushes summarization approach is concise and easy to understand.

43 citations

Proceedings Article•DOI•

A Fast, Comprehensive Shot Boundary Determination System

Zhu Liu¹, David Crawford Gibbon¹, Eric Zavesky², Behzad Shahraray¹, Patrick Haffner¹•Institutions (2)

AT&T Labs¹, Columbia University²

02 Jul 2007

TL;DR: The proposed shot boundary determination (SBD) algorithm contains a set of finite state machine (FSM) based detectors for pure cut, fast dissolve, fade in, fade out, dissolve, and wipe.

Abstract: The proposed shot boundary determination (SBD) algorithm contains a set of finite state machine (FSM) based detectors for pure cut, fast dissolve, fade in, fade out, dissolve, and wipe. Support vector machines (SVM) are applied to the cut and dissolve detectors to further boost performance. Our SBD system was highly effective when evaluated in TRECVID 2006 (TREC video retrieval evaluation) and its performance was ranked highest overall.

40 citations

Patent•

On-Demand Language Translation for Television Programs

Srinivas Bangalore¹, David Crawford Gibbon¹, Mazin Gilbert¹, Patrick Haffner¹, Zhu Liu¹, Behzad Shahraray¹•Institutions (1)

AT&T¹

13 Apr 2007

TL;DR: An on-demand translation service in which a translation module, including at least one language pair module for translating a source language to a target language, is made available to a subscriber either for a fee or for free in exchange for displaying commercial messages to the subscriber.

Abstract: A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal. The video signal including the translated information in the target language may be sent to a display device.

30 citations

Patent•

Sequence classification for machine translation

Srinivas Bangalore¹, Patrick Haffner¹, Stephan Kanthak¹•Institutions (1)

AT&T¹

11 Dec 2007

TL;DR: Discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word.

Abstract: Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.
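
The independence assumption and the per-target-word weight vectors described above can be illustrated with a small sketch. It is a toy under stated assumptions, not the patented method: the feature templates (`word=`, `prev=`, `next=`), the function names, and the hand-picked weights are all hypothetical, and the discriminative training that would produce the weights is not shown.

```python
def extract_features(words, i):
    """Features of the source word at position i, including its context."""
    feats = {f"word={words[i]}"}
    if i > 0:
        feats.add(f"prev={words[i - 1]}")
    if i + 1 < len(words):
        feats.add(f"next={words[i + 1]}")
    return feats


def score(weights, feats):
    """Linear score: sum of the weights of the features that are present."""
    return sum(weights.get(f, 0.0) for f in feats)


def best_target(models, feats):
    """Pick the target word whose weight vector scores highest."""
    return max(models, key=lambda target: score(models[target], feats))


def translate(models, words):
    """Translate each source word independently of the other choices."""
    return [
        best_target(models, extract_features(words, i))
        for i in range(len(words))
    ]


# One weight vector per target vocabulary word (made-up values).
models = {
    "bank": {"word=banco": 1.0, "next=central": 1.5},
    "bench": {"word=banco": 1.2, "prev=el": 0.1},
    "central": {"word=central": 2.0},
    "the": {"word=el": 2.0},
}

with_context = translate(models, ["el", "banco", "central"])
without_context = translate(models, ["el", "banco"])
```

In this toy setup the context feature `next=central` tips the choice for "banco" toward "bank", while the same word without that context is scored higher for "bench". Each position is decided without looking at the target words chosen elsewhere in the sentence.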

27 citations

Patent•

Discriminative training of models for sequence classification

Srinivas Bangalore¹, Patrick Haffner¹, Stephan Kanthak¹•Institutions (1)

AT&T¹

11 Dec 2007

TL;DR: Discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word.

Abstract: Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.

21 citations

Patent•

Managing email servers by prioritizing emails

Subhabrata Sen¹, Patrick Haffner, Oliver Spatscheck, Shobha Venkataraman•Institutions (1)

AT&T¹

24 Oct 2007

TL;DR: Email server management methods and systems that protect the ability of the email server's infrastructure to process legitimate emails in the presence of large spam volumes, by identifying priority classes of emails during server overload and processing emails according to priority.

Abstract: Disclosed are email server management methods and systems that protect the ability of the infrastructure of the email server to process legitimate emails in the presence of large spam volumes. During a period of server overload, priority classes of emails are identified, and emails are processed according to priority. In a typical embodiment, the server sends emails sequentially in a queue, and the queue has a limited capacity. When the server nears or reaches that capacity, the emails in the queue are analyzed to identify priority emails, and the priority emails are moved to the head of the queue.
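
The queue behavior in the typical embodiment can be sketched roughly as below. This is a minimal illustration, not the disclosed implementation: the `threshold` fraction used to decide that the queue "nears" capacity, the `is_priority` predicate, and the function name `reprioritize` are assumptions, since the abstract does not say how priority emails are identified.

```python
from collections import deque


def reprioritize(queue, capacity, is_priority, threshold=0.9):
    """Move priority emails to the head once the queue nears capacity.

    Relative order is preserved within the priority group and within
    the remaining emails. Below the threshold the queue is returned
    unchanged.
    """
    if len(queue) < threshold * capacity:
        return queue
    priority = [msg for msg in queue if is_priority(msg)]
    rest = [msg for msg in queue if not is_priority(msg)]
    return deque(priority + rest)


# Hypothetical messages; the "ok" prefix marks the priority class here.
pending = deque(["spam1", "ok1", "spam2", "ok2"])
looks_legitimate = lambda msg: msg.startswith("ok")

full = reprioritize(pending, capacity=4, is_priority=looks_legitimate)
not_full = reprioritize(pending, capacity=10, is_priority=looks_legitimate)
```

Keeping the reordering stable means that emails of the same class are still processed in arrival order.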

21 citations

AT&T Research at TRECVID 2007

Zhu Liu, Eric Zavesky, David Crawford Gibbon, Behzad Shahraray, Patrick Haffner

01 Jan 2007

TL;DR: The SBD system developed for TRECVID 2006 was enhanced for robustness and efficiency, and a multimodal rushes summarization method that relies on both face and speech information is proposed; evaluation shows that the SBD system is highly effective and that the summarization approach is concise and easy to understand.

Abstract: AT&T participated in two tasks at TRECVID 2007: shot boundary detection (SBD) and rushes summarization. The SBD system developed for TRECVID 2006 was enhanced for robustness and efficiency. New visual features are extracted for the cut, dissolve, and fast dissolve detectors, and an SVM-based verification method is used to boost accuracy. Speed is improved by more streamlined processing with on-the-fly result fusion. We submitted 10 runs for the SBD evaluation task. The best result (TT05) was achieved with the following configuration: SVM-based verification; more training data, including the 2004, 2005, and 2006 SBD data; no SVM boundary adjustment; and an SVM trained for high generalization capability (e.g., a smaller value of C). As a pilot task, rushes summarization aims to show the main objects and events in the raw material with the least redundancy while maximizing usability. We proposed a multimodal rushes summarization method that relies on both face and speech information. Evaluation results show that the new SBD system is highly effective and that the human-centric rushes summarization approach is concise and easy to understand.

8 citations
