scispace - formally typeset
Author

David Plonka

Bio: David Plonka is an academic researcher from Akamai Technologies. The author has contributed to research in topics: the Internet & address space. The author has an h-index of 15 and has co-authored 30 publications receiving 1,818 citations. Previous affiliations of David Plonka include the University of Wisconsin-Madison & the Wisconsin Alumni Research Foundation.

Papers
Proceedings ArticleDOI
06 Nov 2002
TL;DR: This paper reports results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures, and shows that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic.
Abstract: Identifying anomalies rapidly and accurately is critical to the efficient operation of large computer networks. Accurately characterizing important classes of anomalies greatly facilitates their identification; however, the subtleties and complexities of anomalous traffic can easily confound this process. In this paper we report results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures. Data for this study consists of IP flow and SNMP measurements collected over a six month period at the border router of a large university. Our results show that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic. Specifically, we show that a pseudo-spline filter tuned at specific aggregation levels will expose distinct characteristics of each class of anomaly. We show that an effective way of exposing anomalies is via the detection of a sharp increase in the local variance of the filtered data. We evaluate traffic anomaly signals at different points within a network based on topological distance from the anomaly source or destination. We show that anomalies can be exposed effectively even when aggregated with a large amount of additional traffic. We also compare the difference between the same traffic anomaly signals as seen in SNMP and IP flow data, and show that the more coarse-grained SNMP data can also be used to expose anomalies effectively.
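The detection idea above, a sharp rise in the local variance of wavelet-filtered traffic, can be sketched in a few lines. This is an illustrative toy, assuming a plain Haar high-pass filter rather than the paper's tuned pseudo-spline filter, with made-up window and threshold settings:

```python
from statistics import pvariance

def haar_detail(signal):
    """One level of Haar wavelet detail (high-pass) coefficients."""
    return [(signal[i] - signal[i + 1]) / 2
            for i in range(0, len(signal) - 1, 2)]

def local_variance_alarms(signal, window=4, threshold=5.0):
    """Flag sliding windows whose variance exceeds `threshold` times the
    mean window variance; `window` and `threshold` are illustrative knobs."""
    detail = haar_detail(signal)
    variances = [pvariance(detail[i:i + window])
                 for i in range(len(detail) - window + 1)]
    baseline = sum(variances) / len(variances)
    return [i for i, v in enumerate(variances)
            if baseline > 0 and v > threshold * baseline]

# A flat series with one sudden burst (e.g. a flash crowd) trips the alarm;
# a quiet series does not.
traffic = [100] * 32 + [100, 900, 950, 120] + [100] * 32
print(local_variance_alarms(traffic))  # indices of windows covering the burst
```

The same two-step shape (filter, then watch the filtered signal's local variance) applies whichever wavelet filter is used.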

919 citations

Proceedings ArticleDOI
01 Nov 2001
TL;DR: Flow level data is demonstrated to be sufficient for exposing many different types of aberrant network traffic behavior in close to real time and has the benefit of generating much smaller data sets than packet level measurements which can become a significant issue in large, heavily used networks.
Abstract: One of the primary tasks of network administrators is monitoring routers and switches for anomalous traffic behavior such as outages, configuration changes, flash crowds and abuse. Recognizing and identifying anomalous behavior is often based on ad hoc methods developed from years of experience in managing networks. A variety of commercial and open source tools have been developed to assist in this process; however, these require policies and/or thresholds to be defined by the user in order to trigger alerts. The better the description of the anomalous behavior, the more effective these tools become. In this extended abstract we describe a project focused on precise characterization of anomalous network traffic behavior. The first step in our project is to gather passive measurements of network traffic at the IP flow level. IP flow level data as defined in [1] is a unidirectional series of IP packets of a given protocol traveling between a source and a destination IP/port pair within a certain period of time. While flow level data is certainly not as precise as passive measurements of packet level data, we demonstrate that it is sufficient for exposing many different types of aberrant network traffic behavior in close to real time. It also has the benefit of generating much smaller data sets than packet level measurements, which can become a significant issue in large, heavily used networks. We use the FlowScan [2] open source software to gather and analyze network flow data. FlowScan takes NetFlow [3] feeds from Cisco or other Lightweight Flow Accounting Protocol (LFAP) enabled routers, processes the data, and then stores it in an efficient data structure. FlowScan also has a graphical interface which is currently the principal means for anomaly identification by network managers. FlowScan is currently deployed at the border router at the University of Wisconsin-Madison as well as at over 100 other sites nationwide.
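The aggregation step described above can be illustrated with a small sketch. FlowScan itself is existing software consuming NetFlow feeds, so this is only a hypothetical stand-in: the record fields (`start`, `proto`, `bytes`) and the function name are invented for illustration of binning unidirectional flow records into the per-interval, per-protocol totals such a tool graphs:

```python
from collections import defaultdict

def aggregate_flows(flows, bin_seconds=300):
    """Bucket flow byte counts into (time_bin, protocol) totals, the
    kind of time series a flow-monitoring tool plots per interval."""
    totals = defaultdict(int)
    for flow in flows:
        time_bin = flow["start"] - flow["start"] % bin_seconds
        totals[(time_bin, flow["proto"])] += flow["bytes"]
    return dict(totals)

flows = [
    {"start": 10,  "proto": "tcp", "bytes": 1500},
    {"start": 200, "proto": "udp", "bytes": 300},
    {"start": 320, "proto": "tcp", "bytes": 800},
]
print(aggregate_flows(flows))
# e.g. {(0, 'tcp'): 1500, (0, 'udp'): 300, (300, 'tcp'): 800}
```

Because each flow record summarizes many packets, the aggregated series stays small even on heavily used links, which is the data-volume advantage the abstract notes.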

244 citations

Book ChapterDOI
15 Sep 2004
TL;DR: The architecture and implementation of the Internet Sink system, which measures packet traffic on unused IP addresses in an efficient, extensible and scalable fashion, and includes an active component that generates response packets to incoming traffic is described.
Abstract: Monitoring unused or dark IP addresses offers opportunities to significantly improve and expand knowledge of abuse activity without many of the problems associated with typical network intrusion detection and firewall systems. In this paper, we address the problem of designing and deploying a system for monitoring large unused address spaces such as class A telescopes with 16M IP addresses. We describe the architecture and implementation of the Internet Sink (iSink) system which measures packet traffic on unused IP addresses in an efficient, extensible and scalable fashion. In contrast to traditional intrusion detection systems or firewalls, iSink includes an active component that generates response packets to incoming traffic. This gives the iSink an important advantage in discriminating between different types of attacks (through examination of the response payloads). The key feature of iSink’s design that distinguishes it from other unused address space monitors is that its active response component is stateless and thus highly scalable. We report performance results of our iSink implementation in both controlled laboratory experiments and from a case study of a live deployment. Our results demonstrate the efficiency and scalability of our implementation as well as the important perspective on abuse activity that is afforded by its use.
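The stateless-responder design the abstract credits for iSink's scalability can be illustrated as follows. The ports, banner strings, and function names here are invented placeholders, not taken from iSink; the point is only that each reply is a pure function of the incoming packet, so no per-connection table is needed:

```python
# Stateless response generation: the reply depends only on the current
# packet's destination port and payload, never on stored connection state.
RESPONDERS = {
    80: lambda payload: b"HTTP/1.0 404 Not Found\r\n\r\n",  # illustrative banner
    25: lambda payload: b"220 sink ESMTP\r\n",              # illustrative banner
}

def respond(dst_port, payload):
    """Return a response for one incoming packet, or None to stay silent.
    Statelessness is what lets a sink cover a class-A-sized telescope."""
    handler = RESPONDERS.get(dst_port)
    return handler(payload) if handler else None

print(respond(80, b"GET / HTTP/1.0\r\n\r\n"))
```

Even this minimal elicitation of a next message from the scanner yields response payloads that help discriminate attack types, which is the advantage the abstract describes.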

200 citations

Proceedings ArticleDOI
20 Oct 2008
TL;DR: A context-aware clustering methodology is described that is applied to DNS query-responses to generate the desired aggregates and enables the analysis to be scaled to expose the desired level of detail of each traffic type, and to expose their time varying characteristics.
Abstract: The Domain Name System (DNS) is one of the most widely used services in the Internet. In this paper, we consider the question of how DNS traffic monitoring can provide an important and useful perspective on network traffic in an enterprise. We approach this problem by considering three classes of DNS traffic: canonical (i.e., RFC-intended behaviors), overloaded (e.g., black-list services), and unwanted (i.e., queries that will never succeed). We describe a context-aware clustering methodology that is applied to DNS query-responses to generate the desired aggregates. Our method enables the analysis to be scaled to expose the desired level of detail of each traffic type, and to expose their time-varying characteristics. We implement our method in a tool we call TreeTop, which can be used to analyze and visualize DNS traffic in real time. We demonstrate the capabilities of our methodology and the utility of TreeTop using a set of DNS traces that we collected from our campus network over a period of three months. Our evaluation highlights both the coarse and fine levels of detail that can be revealed by our method. Finally, we show preliminary results on how DNS analysis can be coupled with general network traffic monitoring to provide a useful perspective for network management and operations.
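One ingredient of such aggregation, rolling query names up by domain suffix so the level of detail can be dialed up or down, can be sketched as below. This is a deliberate simplification with a hypothetical helper name; TreeTop's context-aware clustering is richer than a fixed-depth suffix rollup:

```python
from collections import Counter

def rollup(query_names, depth=2):
    """Aggregate DNS query names by their last `depth` labels,
    e.g. 'a.b.example.com' -> 'example.com' at depth 2.
    Raising `depth` exposes finer detail; lowering it coarsens the view."""
    counts = Counter()
    for name in query_names:
        labels = name.rstrip(".").split(".")
        counts[".".join(labels[-depth:])] += 1
    return counts

queries = ["www.example.com", "mail.example.com", "cdn.example.org"]
print(rollup(queries))
```

In a real analysis the depth would vary per branch of the name tree, expanding only the aggregates that carry enough traffic to matter.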

72 citations

Patent
29 Aug 2005
TL;DR: In this article, the active responder provides a response based only on the previous statement from the malicious source, which in most cases is sufficient to promote additional communication with the malicious sources, presenting a complete record of the transaction for analysis and signature extraction.
Abstract: A monitor of malicious network traffic attaches to unused addresses and monitors communications with an active responder whose constrained-state awareness makes it highly scalable. In a preferred embodiment, the active responder provides a response based only on the previous statement from the malicious source, which in most cases is sufficient to promote additional communication with the malicious source, presenting a complete record of the transaction for analysis and possible signature extraction.

50 citations


Cited by
Journal ArticleDOI
01 Apr 2004
TL;DR: This paper presents two taxonomies for classifying attacks and defenses in distributed denial-of-service (DDoS) and provides researchers with a better understanding of the problem and the current solution space.
Abstract: Distributed denial-of-service (DDoS) is a rapidly growing problem. The multitude and variety of both the attacks and the defense approaches is overwhelming. This paper presents two taxonomies for classifying attacks and defenses, and thus provides researchers with a better understanding of the problem and the current solution space. The attack classification criteria were selected to highlight commonalities and important features of attack strategies that define challenges and dictate the design of countermeasures. The defense taxonomy classifies the body of existing DDoS defenses based on their design decisions; it then shows how these decisions dictate the advantages and deficiencies of proposed solutions.

1,866 citations

Proceedings ArticleDOI
22 Aug 2005
TL;DR: It is argued that the distributions of packet features observed in flow traces reveals both the presence and the structure of a wide range of anomalies, and that using feature distributions, anomalies naturally fall into distinct and meaningful clusters that can be used to automatically classify anomalies and to uncover new anomaly types.
Abstract: The increasing practicality of large-scale flow capture makes it possible to conceive of traffic analysis methods that detect and identify a large and diverse set of anomalies. However, the challenge of effectively analyzing this massive data source for anomaly diagnosis is as yet unmet. We argue that the distributions of packet features (IP addresses and ports) observed in flow traces reveal both the presence and the structure of a wide range of anomalies. Using entropy as a summarization tool, we show that the analysis of feature distributions leads to significant advances on two fronts: (1) it enables highly sensitive detection of a wide range of anomalies, augmenting detections by volume-based methods, and (2) it enables automatic classification of anomalies via unsupervised learning. We show that using feature distributions, anomalies naturally fall into distinct and meaningful clusters. These clusters can be used to automatically classify anomalies and to uncover new anomaly types. We validate our claims on data from two backbone networks (Abilene and Geant) and conclude that feature distributions show promise as a key element of a fairly general network anomaly diagnosis framework.
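The entropy summarization the abstract relies on is Shannon entropy computed over a traffic-feature distribution. A minimal sketch, using made-up port data: a scan hammering a single destination port collapses the port distribution's entropy, while a sweep across many ports inflates it, and either shift is detectable even when traffic volume barely changes:

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

normal_ports = [80, 443, 53, 22, 80, 443, 8080, 25]  # fabricated mix
scan_ports = [445] * 8                               # one port hammered
print(entropy(normal_ports), entropy(scan_ports))    # scan entropy is 0
```

Tracking one entropy value per feature (source IP, destination IP, source port, destination port) reduces each time bin to a handful of numbers, which is what makes the approach practical at backbone scale.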

1,228 citations

Proceedings ArticleDOI
30 Aug 2004
TL;DR: A general method based on a separation of the high-dimensional space occupied by a set of network traffic measurements into disjoint subspaces corresponding to normal and anomalous network conditions to diagnose anomalies is proposed.
Abstract: Anomalies are unusual and significant changes in a network's traffic levels, which can often span multiple links. Diagnosing anomalies is critical for both network operators and end users. It is a difficult problem because one must extract and interpret anomalous patterns from large amounts of high-dimensional, noisy data. In this paper we propose a general method to diagnose anomalies. This method is based on a separation of the high-dimensional space occupied by a set of network traffic measurements into disjoint subspaces corresponding to normal and anomalous network conditions. We show that this separation can be performed effectively by Principal Component Analysis. Using only simple traffic measurements from links, we study volume anomalies and show that the method can: (1) accurately detect when a volume anomaly is occurring; (2) correctly identify the underlying origin-destination (OD) flow which is the source of the anomaly; and (3) accurately estimate the amount of traffic involved in the anomalous OD flow. We evaluate the method's ability to diagnose (i.e., detect, identify, and quantify) both existing and synthetically injected volume anomalies in real traffic from two backbone networks. Our method consistently diagnoses the largest volume anomalies, and does so with a very low false alarm rate.
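A toy version of the subspace separation might look like the following. It assumes a single "normal" principal direction found by power iteration on fabricated link measurements; real deployments retain several components and far larger matrices, so this is a sketch of the idea, not the paper's method:

```python
from math import sqrt

def top_component(rows, iters=100):
    """Dominant principal direction of the (uncentered) measurement
    matrix X, via power iteration on X^T X."""
    dim = len(rows[0])
    v = [1.0] * dim
    for _ in range(iters):
        dots = [sum(a * b for a, b in zip(row, v)) for row in rows]
        w = [sum(d * row[j] for d, row in zip(dots, rows)) for j in range(dim)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def residual_norms(rows):
    """Distance of each measurement vector from the 'normal' subspace
    spanned by the top component; spikes mark candidate anomalies."""
    v = top_component(rows)
    out = []
    for row in rows:
        c = sum(a * b for a, b in zip(row, v))
        out.append(sqrt(sum((a - c * b) ** 2 for a, b in zip(row, v))))
    return out

# Four time bins follow the usual traffic pattern; one (index 3) does not.
links = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [5, 0, 0], [4, 8, 12]]
print(residual_norms(links))
```

The separation is the whole trick: normal traffic concentrates in a few principal components, so energy left in the residual subspace is a sensitive anomaly signal.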

1,157 citations

Proceedings ArticleDOI
06 Nov 2002
TL;DR: This paper reports results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures, and shows that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic.
Abstract: Identifying anomalies rapidly and accurately is critical to the efficient operation of large computer networks. Accurately characterizing important classes of anomalies greatly facilitates their identification; however, the subtleties and complexities of anomalous traffic can easily confound this process. In this paper we report results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures. Data for this study consists of IP flow and SNMP measurements collected over a six month period at the border router of a large university. Our results show that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic. Specifically, we show that a pseudo-spline filter tuned at specific aggregation levels will expose distinct characteristics of each class of anomaly. We show that an effective way of exposing anomalies is via the detection of a sharp increase in the local variance of the filtered data. We evaluate traffic anomaly signals at different points within a network based on topological distance from the anomaly source or destination. We show that anomalies can be exposed effectively even when aggregated with a large amount of additional traffic. We also compare the difference between the same traffic anomaly signals as seen in SNMP and IP flow data, and show that the more coarse-grained SNMP data can also be used to expose anomalies effectively.

919 citations

Proceedings ArticleDOI
17 May 2004
TL;DR: In this article, the authors identify the application level signatures by examining some available documentations, and packet-level traces, and then utilize the identified signatures to develop online filters that can efficiently and accurately track the P2P traffic even on high-speed network links.
Abstract: The ability to accurately identify the network traffic associated with different P2P applications is important to a broad range of network operations including application-specific traffic engineering, capacity planning, provisioning, service differentiation, etc. However, traditional traffic-to-application mapping techniques, such as disambiguation based on default server TCP or UDP network ports, are highly inaccurate for some P2P applications. In this paper, we provide an efficient approach for identifying P2P application traffic through application-level signatures. We first identify the application-level signatures by examining some available documentation and packet-level traces. We then utilize the identified signatures to develop online filters that can efficiently and accurately track P2P traffic even on high-speed network links. We examine the performance of our application-level identification approach using five popular P2P protocols. Our measurements show that our technique achieves less than 5% false positive and false negative ratios in most cases. We also show that our approach only requires the examination of the very first few packets (less than 10 packets) to identify a P2P connection, which makes our approach highly scalable. Our technique can significantly improve P2P traffic volume estimates over what pure network-port-based approaches provide. For instance, we were able to identify 3 times as much traffic for the popular Kazaa P2P protocol, compared to the traditional port-based approach.
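The first-packets signature matching can be sketched as below. The byte patterns and protocol labels are invented placeholders, not the signatures identified in the paper; what the sketch shows is why the approach scales: only a fixed, small prefix of each connection is ever inspected:

```python
# Hypothetical application-level signatures (placeholders, not real ones).
SIGNATURES = {
    "fake-p2p-a": b"\x13FakeProtoA",
    "fake-p2p-b": b"GETMETA ",
}

def classify(first_packets, max_packets=10):
    """Return the first matching protocol label, inspecting at most
    `max_packets` payloads; the paper reports fewer than 10 packets
    usually suffice to identify a P2P connection."""
    for payload in first_packets[:max_packets]:
        for label, sig in SIGNATURES.items():
            if sig in payload:
                return label
    return None  # unmatched: fall back to other heuristics or "unknown"

print(classify([b"\x13FakeProtoA\x00rest-of-handshake"]))
```

Since the per-connection cost is bounded, the filter can run online on high-speed links, which port-based classification cannot improve on for applications that hop ports.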

856 citations