scispace - formally typeset
Search or ask a question

Showing papers by "Patrick Haffner published in 2010"


Proceedings ArticleDOI
30 Nov 2010
TL;DR: Extensive evaluations using an entire year worth of customer tickets and measurement data from a large network show that the method can predict thousands of future customer tickets per week with high accuracy and signifcantly reduce the time and effort for diagnosing these tickets.
Abstract: Traditional DSL troubleshooting solutions are reactive, relying mainly on customers to report problems, and tend to be labor-intensive, time consuming, prone to incorrect resolutions and overall can contribute to increased customer dissatisfaction. In this paper, we propose a proactive approach to facilitate troubleshooting customer edge problems and reducing customer tickets. Our system consists of: i) a ticket predictor which predicts future customer tickets; and ii) a trouble locator which helps technicians accelerate the troubleshooting process during field dispatches. Both components infer future tickets and trouble locations based on existing sparse line measurements, and the inference models are constructed automatically using supervised machine learning techniques. We propose several novel techniques to address the operational constraints in DSL networks and to enhance the accuracy of NEVERMIND. Extensive evaluations using an entire year worth of customer tickets and measurement data from a large network show that our method can predict thousands of future customer tickets per week with high accuracy and signifcantly reduce the time and effort for diagnosing these tickets. This is benefcial as it has the effect of both reducing the number of customer care calls and improving customer satisfaction.

177 citations


Patent
30 Sep 2010
TL;DR: In this article, the authors present systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the received speech.
Abstract: Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination.

152 citations


Proceedings Article
05 Jun 2010
TL;DR: This work addresses the task of automatically predicting ratings, for given aspects of a textual review, by assigning a numerical score to each evaluated aspect in the reviews by handling this task as both a regression and a classification modeling problem.
Abstract: Bloggers, professional reviewers, and consumers continuously create opinion--rich web reviews about products and services, with the result that textual reviews are now abundant on the web and often convey a useful overall rating (number of stars). However, an overall rating cannot express the multiple or conflicting opinions that might be contained in the text, or explicitly rate the different aspects of the evaluated entity. This work addresses the task of automatically predicting ratings, for given aspects of a textual review, by assigning a numerical score to each evaluated aspect in the reviews. We handle this task as both a regression and a classification modeling problem and explore several combinations of syntactic and semantic features. Our results suggest that classification techniques perform better than ranking modeling when handling evaluative text.

38 citations


Patent
03 May 2010
TL;DR: In this paper, a labeler is presented with utterances that have already been identified as belonging to a particular class or call type, and is asked to assign a call type to the utterances.
Abstract: Systems and methods for monitoring labelers of speech data. To test or train labelers, a labeler is presented with utterances that have already been identified as belonging to a particular class or call type. The labeler is asked to assign a call type to the utterances. The performance of the labeler is measured by comparing the call types assigned by the labeler with the existing call types of the utterances. The performance of a labeler can also be monitored as the labeler labels speech data by occasionally having the labeler label an utterance that is already labeled and by storing the results.

24 citations


Proceedings ArticleDOI
Yu Jin1, Nick Duffield2, Patrick Haffner2, Subhabrata Sen2, Zhi-Li Zhang1 
14 Jun 2010
TL;DR: This paper proposes a novel technique for inferring the distribution of application classes present in the aggregated traffic flows between endpoints, that exploits both the measured statistics of the traffic flows, and the spatial distribution of those flows across the network.
Abstract: In this paper, we propose a novel technique for inferring the distribution of application classes present in the aggregated traffic flows between endpoints, which exploits both the statistics of the traffic flows, and the spatial distribution of those flows across the network. Our method employs a two-step supervised model, where the bootstrapping step provides initial (inaccurate) inference on the traffic application classes, and the graph-based calibration step adjusts the initial inference through the collective spatial traffic distribution. In evaluations using real traffic flow measurements from a large ISP, we show how our method can accurately classify application types within aggregate traffic between endpoints, even without the knowledge of ports and other traffic features. While the bootstrap estimate classifies the aggregates with 80% accuracy, incorporating spatial distributions through calibration increases the accuracy to 92%, i.e., roughly halving the number of errors.

16 citations


Patent
17 Aug 2010
TL;DR: In this paper, the authors present a method and apparatus for classifying applications using the collective properties of network traffic, which they call traffic activity graph (TAG) for short.
Abstract: In one embodiment, the present disclosure is a method and apparatus for classifying applications using the collective properties of network traffic. In one embodiment, a method for classifying traffic in a communication network includes receiving a traffic activity graph, the traffic activity graph comprising a plurality of nodes interconnected by a plurality of edges, where each of the nodes represents an endpoint associated with the communication network and each of the edges represents traffic between a corresponding pair of the nodes, generating an initial set of inferences as to an application class associated with each of the edges, based on at least one measured statistic related to at least one traffic flow in the communication network, and refining the initial set of inferences based on a spatial distribution of the traffic flows, to produce a final traffic activity graph.

9 citations


Proceedings ArticleDOI
25 Oct 2010
TL;DR: This paper proposes a novel technique for inferring the distribution of application classes present in the aggregated traffic flows between endpoints, that exploits both the measured statistics of the traffic flows, and the spatial distribution of those flows across the network.
Abstract: Operating, managing and securing networks require a thorough understanding of the demands placed on the network by the endpoints it interconnects, the characteristics of the traffic the endpoints generate, and the distribution of that traffic over the resources of the network infrastructure. A major differentiator in the types of resource required by traffic is the class of endpoint application that generates it. Service providers determine the application mix present in traffic via measurements, e.g., flow measurements furnished by routers. Previous work has shown that a fairly accurate determination of application type can be made from this data. However, protocol level information, such as TCP/UDP ports and other parts of the transport header, and also parts of the network header in some cases, may not be accessible due to the use of encryption or tunneling protocols by endpoints or gateways. Furthermore, the utility of ports as signifiers of application type has some limitations due to abuse and non-standard usage, amongst other reasons. These factors reduce the classification accuracy. In this paper, we propose a novel technique for inferring the distribution of application classes present in the aggregated traffic flows between endpoints, that exploits both the measured statistics of the traffic flows, and the spatial distribution of those flows across the network. Our method employs a two-step supervised model, where the bootstrapping step provides initial (inaccurate) inference on the traffic application classes, and the graph-based calibration step adjusts the initial inference through the collective spatial traffic distribution. In evaluations using real traffic flow measurements from a large ISP, we show how our method can accurately classify application types within aggregate traffic between endpoints, even without knowledge of ports and other traffic features. While the bootstrap estimate classifies the aggregates with 80% accuracy, incorporating spatial distributions through calibration increases the accuracy to 92%, i.e., roughly halving the number of errors.

6 citations


Patent
17 Aug 2010
TL;DR: In this paper, the authors present a method and apparatus for classifying applications using the collective properties of network traffic, which they call traffic activity graph (TAG) for short.
Abstract: In one embodiment, the present disclosure is a method and apparatus for classifying applications using the collective properties of network traffic. In one embodiment, a method for classifying traffic in a communication network includes receiving a traffic activity graph, the traffic activity graph comprising a plurality of nodes interconnected by a plurality of edges, where each of the nodes represents an endpoint associated with the communication network and each of the edges represents traffic between a corresponding pair of the nodes, generating an initial set of inferences as to an application class associated with each of the edges, based on at least one measured statistic related to at least one traffic flow in the communication network, and refining the initial set of inferences based on a spatial distribution of the traffic flows, to produce a final traffic activity graph.

2 citations