
Statistical approaches to DDoS attack detection and response

22 Apr 2003-Vol. 1, pp 303-314
TL;DR: Methods to identify DDoS attacks by computing entropy and frequency-sorted distributions of selected packet attributes and how the detectors can be extended to make effective response decisions are presented.
Abstract: The nature of the threats posed by distributed denial of service (DDoS) attacks on large networks, such as the Internet, demands effective detection and response methods. These methods must be deployed not only at the edge but also at the core of the network. This paper presents methods to identify DDoS attacks by computing entropy and frequency-sorted distributions of selected packet attributes. The DDoS attacks show anomalies in the characteristics of the selected packet attributes. The detection accuracy and performance are analyzed using live traffic traces from a variety of network environments ranging from points in the core of the Internet to those inside an edge network. The results indicate that these methods can be effective against current attacks and suggest directions for improving detection of more stealthy attacks. We also describe our detection-response prototype and how the detectors can be extended to make effective response decisions.

Summary (5 min read)

1. Introduction

  • Powerful DDoS toolkits are available to potential attackers, and essential networks are ill prepared for defense.
  • False positives can lead to inappropriate responses that cause denial of service to legitimate users.
  • The PHAD component clusters observed values and then compares the size of the clusters to accepted thresholds to determine anomalies.

2. Detection Algorithms

  • The authors’ detection algorithms measure statistical properties of specific fields in the packet headers at various points in the Internet.
  • If a detector captures 1000 consecutive packets at a peering point and computes the frequency of occurrence of each unique source IP address in those 1000 packets, then the detector will have a model of the distribution of the source address.
  • Further computations with this distribution allow us to measure the randomness or uniformity of the addresses as well as the “goodness-of-fit” of the distribution with respect to prior measurements.

2.1. Entropy

  • Let an information source have n independent symbols each with probability of choice pi.
  • The authors have observed through experimentation that while a network is not under attack, the entropy values for various header fields each fall in a narrow range.
  • Isolate the term in the summation corresponding to the probability of the symbol acquired from shifting the window.
  • Using the values computed in step 6, add the two terms missing from the entropy summation back in and compare this new entropy value to the previous entropy computations.
  • Increasing W will reduce the variation in entropy and may reduce the rate of false positives resulting from brief and presumably insignificant anomalies.

2.2. Chi-Square Statistic

  • Pearson’s chi-square (χ²) test is used for distribution comparison in cases where the measurements involved are discrete values.
  • Hence, comparison with the chi-square distribution is of limited utility.
  • Apply exponential decay to the stored frequency for v based on its age (time since last update).
  • Group the attribute values into bins based on frequency.
  • Each time the current-traffic bin frequencies are computed, the average is updated as follows: first, exponential decay is applied to the stored bin-frequency averages, using a significantly longer half-life than is used for the current-traffic profile.

3. Detector Evaluation

  • In order to evaluate thoroughly the potential effectiveness of DDoS detection methods such as those described in Section 2, the authors must address the following questions.
  • Ideally, a detector should pick up not only attacks generated by tools found “in the wild” to date, but also more stealthy attacks using more sophisticated tools wielded by attackers familiar with the detection method and detector’s network environment.
  • Characteristics of the monitored network traffic will vary significantly depending on where detectors are deployed.
  • The remainder of this section describes attempts to answer these questions for the entropy and chi-square DDoS detection methods.

3.1. Prototype Implementation

  • To evaluate the DDoS attack detection methods described in Section 2 under realistic conditions, the authors implemented prototype detector modules as plug-ins for Snort, the popular, open-source network intrusion detection system [13], [14].
  • In addition to real-time traffic monitoring, Snort supports off-line processing of previously captured network traffic, making it possible to conduct reproducible detection experiments with traffic data from a variety of environments.
  • The two detectors can be individually enabled and configured in the snort.conf configuration file, and can trigger alarms through Snort’s modular alerting facility.
  • The chi-square detector logs the periodically computed chi-square statistics for each of the specified packet attributes, along with the current and baseline bin frequency values used to compute those statistics.
  • This data can be useful for manual or automatic detector tuning and alert threshold setting.

3.2. Network Trace Data

  • This allows us to determine how stable the traffic statistics monitored by the detectors are in those environments, and how effectively the detectors can identify DDoS attack traffic in different contexts.
  • To test the effects of DDoS attacks, the authors simulate these attacks by overlaying the kind of attack traffic generated by some existing DDoS attack tools onto the traces at various concentrations [10].
  • The traces used were drawn from a variety of network environments, as described below, and most have IP addresses that have been transformed via an unknown but one-to-one function for privacy purposes.
  • This trace, from July 2000, includes five consecutive days of IP headers sent through the New Zealand Internet Exchange (NZIX), a peering point for several major New Zealand ISPs and the University of Waikato; throughput ranges roughly from 4 to 12 Mbits/s. One 24-hour weekday trace was used for experimentation.

3.3. Detection Example

  • To illustrate the effects of an attack on the entropy and chi-square statistics, the authors examined a 1,000,000-packet excerpt from the NZIX data set with a simulated DDoS attack comprising 25% of all packets, starting at packet number 700,000 and ending at packet number 800,000.
  • Before the attack begins, source address entropy measurements fall entirely within the range 7.0-7.5.
  • Any maximum-entropy threshold setting between 7.5 and 8.75 would detect this attack without generating any false-positives in this example.
  • In Figure 2, the bin frequency profile for a source address chi-square detector ([…] 4096, and the remainder) is displayed for the same example. The six colored regions represent the percent[…]

  • When the attack begins at packet 700,000, the total frequency of bin 6 (representing packets whose source addresses are least frequently seen) grows noticeably, as the authors would expect since the source addresses in the attack traffic are drawn from a uniform, rather than power-law, distribution.
  • The chi-square values for this trace are shown in Figure 3, using a baseline profile taken from the previous day’s traffic.
  • Any chi-square threshold between 1500 and 5000 would catch the attack without generating false positives.
  • An attack in which the source addresses were fixed or drawn from a small set would produce similarly dramatic results for both entropy and chi-square detectors.

3.4. Distribution of Statistics

  • The authors now look more closely at the distribution of chi-square and entropy measurements for legitimate traffic traces and for the same traces with different kinds of simulated DDoS attack traffic overlaid.
  • The simulated attack traffic thus has the same source-address frequency distribution as the legitimate traffic, but uses a different set of source addresses.
  • This way, an attacker armed with this knowledge of the detector environment could produce attack traffic that would produce little change in the entropy observed at the detector.
  • Figure 7 shows the distribution of chi-square values for source address under normal and stealthy DDoS attack conditions. Tables 1 and 2 show the results of running similar experiments on the different traffic traces described in Section 3.2, with a variety of attack and detector combinations.
  • The authors contend that the prospects for detecting stealthy attacks are not as bleak as they might appear, for several reasons.

3.5. Detector Performance

  • Since the authors are proposing to use these detection methods in high-speed core routers, it is imperative that they have low computational cost, especially for the operations that must be carried out for each packet.
  • The prototype Snort detector implementation exhibits adequate performance for its purposes: on a 1GHz Pentium-III-based machine, a Snort process running a single chi-square detector observing source addresses can process 240,000-270,000 packets per second (pps) offline.
  • A single-attribute entropy detector can manage about 294,000 pps, while adding six others yields 130,000 pps.
  • These speeds are roughly in the OC3 range.
  • The authors expect to achieve improved performance by implementing some optimizations that approximate the true frequency profile while reducing or eliminating floating-point operations in the packet-handling code.

4. Response

  • The authors’ defense approach involves response modules that use a characterization of the attack provided by the detection module to take defensive measures.
  • The response module classifies individual packets as benign or suspect based on the attack characteristics provided by the detector.
  • Once identified, the suspect packets are subjected to rate limiting or packet-filtering methods based on the intensity of the attack or pre-defined response policies.
  • In the case of stealthy DDoS attacks, the response module should communicate with the detector and share the data structures and statistical models maintained by the detector to identify the attack packets with high confidence; the prototype described below does not yet offer such coordination.

4.1. Prototype DDoS Response Module

  • It uses netfilter and Linux Advanced Routing and Traffic Control to filter and rate-limit packets [15],[4].
  • Currently, the response module implements three packet-filtering rules.
  • The random filter rule can be applied to the IP header fields, TCP source and destination ports, UDP source and destination ports, and ICMP type and code fields.
  • Since this simple approach allows all packets with a given value to pass after the threshold is reached for that value, an attacker could choose a distribution of attack packets that limits the filter’s effectiveness.
  • The clear option is used to remove filter rules.

4.2. Extending detectors to recommend response

  • Both chi-square and entropy DDoS detectors can be extended to provide attack characterization information that can be used to target packet-filtering or rate-limiting responses to mitigate the effects of DDoS attacks.
  • In order to determine the most anomalous bin, the detector need only find the largest terms in the chi-square sum.
  • Conversely, an unusually high entropy value suggests that the low-frequency values are causing trouble, so the detector might suggest that packets having high-frequency values be given preferential treatment.
  • Second, the authors modified the Snort-based chi-square and entropy detectors to issue rate-limiting directives to the iptables-based response module described in Section 4.1.
  • This approach would allow response decisions to take full advantage of the information already collected by the detector.

4.3. DDoS Response Module Evaluation

  • The current response prototype is an initial implementation of the response system.
  • Initial experimental results have indicated that the response prototype blocks substantial DDoS attack traffic generated by the Stacheldraht attack tool.
  • The random rule has the basic drawback of dropping the first few packets of every new good connection.
  • These two rules could potentially increase the false negatives.
  • The allow rule could allow through some of the DDoS attack traffic that matches the rule, increasing the false positives.

5. Summary and Future Extensions

  • The focus thus far has been on detection and response algorithms and the implementation of these algorithms in software.
  • Against today’s relatively unsophisticated DDoS toolkits, their prototype detector is able to determine that the network is under attack and deploy accurate filtering rules.
  • The filtering effort is immediate and reduces the impact of the attack downstream almost instantly.
  • Another approach to providing more narrowly targeted response while avoiding computationally expensive analysis would be to enable detectors to dynamically tune themselves and “drill down” to investigate detected anomalies more closely.
  • The Linux implementation of this system has been appropriate for demonstration environments and evaluation of alternative detection approaches.

7. References

  • [1] D. Dittrich, “The ‘Stacheldraht’ Distributed Denial of Service Attack Tool”, http://staff.washington.edu/dittrich/misc/stacheldraht.analysis, 1999.
  • D. Knuth, The Art of Computer Programming: Seminumerical Algorithms, Third edition, Vol. 2, Addison-Wesley, Reading, Massachusetts, 1997.
  • O. Pomerantz, “Linux Kernel Module Programming Guide”, http://www.tldp.org/LDP/lkmpg/mpg.html.
  • M. Roesch, “Snort - Lightweight Intrusion Detection for Networks”, Proceedings of the 13th Systems Administration Conference (LISA'99), USENIX Association, 1999, pp. 229-238, http://www.snort.org/docs/lisapaper.txt.
  • Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX’03).

Statistical Approaches to DDoS Attack Detection and Response
This research was supported by DARPA under contract N66001-01-C-8048.
Laura Feinstein, Dan Schnackenberg
The Boeing Company, Phantom Works
Laura.C.Feinstein@boeing.com
Daniel.D.Schnackenberg@boeing.com
Ravindra Balupari, Darrell Kindred
Network Associates Laboratories
Ravindra_Balupari@nai.com
Darrell_Kindred@nai.com
Abstract
The nature of the threats posed by Distributed Denial of
Service (DDoS) attacks on large networks, such as the
Internet, demands effective detection and response
methods. These methods must be deployed not only at
the edge but also at the core of the network. This paper
presents methods to identify DDoS attacks by comput-
ing entropy and frequency-sorted distributions of
selected packet attributes. The DDoS attacks show
anomalies in the characteristics of the selected packet
attributes. The detection accuracy and performance are
analyzed using live traffic traces from a variety of
network environments ranging from points in the core
of the Internet to those inside an edge network. The
results indicate that these methods can be effective
against current attacks and suggest directions for
improving detection of more stealthy attacks. We also
describe our detection-response prototype and how the
detectors can be extended to make effective response
decisions.
1. Introduction
Powerful DDoS toolkits are available to potential
attackers, and essential networks are ill prepared for
defense. The security community has long known that
DDoS attacks are possible, but only in the past three
years have such attacks become popular with hackers.
As ominous as the threat is today, it will only worsen as
tools are built to evade defenses. Soon, DDoS floods
will appear that are difficult to distinguish from legiti-
mate traffic, and packet rates from individual flood
sources will be low enough to escape notice by local
administrators. To meet the increasing need for detec-
tion and response, researchers face these major issues:
  • A stand-alone router on the attack path should
    automatically recognize that the network is under
    attack and adjust its traffic flow to ease the attack
    impact downstream.
  • The detection and response techniques should be
    adaptable to a wide range of network environments,
    preferably without significant manual tuning.
  • Attack detection should be as accurate as possible.
    False positives can lead to inappropriate responses
    that cause denial of service to legitimate users.
    False negatives result in attacks going unnoticed.
  • Attack response should employ intelligent packet
    discard mechanisms to reduce the downstream impact
    of the flood while preserving and routing the
    non-attack packets.
  • The detection method should be effective against a
    variety of attack tools available today and also robust
    against future attempts by attackers to evade
    detection.
These are demanding goals, but we contend that
there are several reasons to believe that satisfactory
detection and response methods can be designed. DDoS
traffic generated by today’s tools often has packet-crafting
characteristics that make it possible to distinguish from
normal traffic. For example, in some configurations
the Stacheldraht attack tool crafts packets so
that the source port is random and the destination port
is sequentially increased from one packet to the next
[1],[10]. Future DDoS tools may include improvements
to packet crafting. However, we claim that these tools
are unlikely to model legitimate traffic closely enough
to produce crafted packets that do not distort statistical
measurements of the composition of the traffic. Our
hypothesis is that relatively simple statistical measures
can be used to discriminate DDoS traffic from legiti-
mate traffic in core routers with sufficient accuracy to
mitigate the effect of the attack downstream.
Research conducted by other organizations suggests
that statistical measurements and statistical processing
are an effective approach to the DDoS problem. The
EMERALD project at SRI International uses intrusion
Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX’03)
0-7695-1897-4/03 $17.00 © 2003 IEEE

detection signatures with Bayesian inference to detect
distributed attacks [12].
Researchers at Florida Institute of Technology have
created an intrusion detection system (IDS) that is non-
stationary and models probabilities based on time since
the last event rather than on average rate [6]. This IDS
operates on many of the same fields our detector
monitors and has similar training requirements to set up
initial thresholds and baselines. The system has two
components, PHAD and ALAD. PHAD operates on the
packet header while ALAD operates on an incoming
server TCP connection. The PHAD component clusters
observed values and then compares the size of the
clusters to accepted thresholds to determine anomalies.
Mazu Networks uses a similar architecture to PHAD
and our chi-square detector. The Mazu system collects
network statistics through a monitoring device and
similarly sorts the collected items into buckets [3]. An
algorithm determines whether buckets should be
divided or combined and a threshold detects anomalies
depending on the number and size of the buckets.
We have imposed some significant constraints on
our DDoS defense development: no explicit coordina-
tion (e.g., pushback [7]) between defending network
components, no built-in knowledge of applications or
protocols, and no instrumentation at end hosts. These
approaches are being actively explored in other re-
search, and we believe that the techniques described
here can complement these others in a comprehensive
DDoS defense solution.
2. Detection Algorithms
Our detection algorithms measure statistical proper-
ties of specific fields in the packet headers at various
points in the Internet. For instance, if a detector cap-
tures 1000 consecutive packets at a peering point and
computes the frequency of occurrence of each unique
source IP address in those 1000 packets, then the
detector will have a model of the distribution of the
source address. Further computations with this distribu-
tion allow us to measure the randomness or uniformity
of the addresses as well as the “goodness-of-fit” of the
distribution with respect to prior measurements.
2.1. Entropy
Let an information source have n independent symbols,
each with probability of choice p_i. Then, the
entropy H is defined as [17]:
$$H = -\sum_{i=1}^{n} p_i \log_2 p_i$$
Hence, entropy can be computed on a sample of con-
secutive packets. Comparing the value for entropy of
some sample of packet header fields to that of another
sample of packet header fields from the same peering
point provides a mechanism for detecting changes in
the randomness. We have observed through experimen-
tation that while a network is not under attack, the
entropy values for various header fields each fall in a
narrow range. While the network is under attack with
current attack tools, these entropy values exceed these
ranges in a detectable manner.
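As a minimal sketch of this definition (not the authors' implementation), the following Python function estimates the entropy of one header field over a sample of packets, taking each p_i to be the observed relative frequency of a value. The function name `sample_entropy` and the example addresses are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def sample_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # H = -sum(p_i * log2(p_i)), with p_i = count_i / total
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical samples of source addresses (illustrative only).
skewed = ["10.0.0.1"] * 6 + ["10.0.0.2"] * 2   # dominated by one source
uniform = [f"10.0.0.{i}" for i in range(8)]    # every source distinct

print(sample_entropy(skewed))
print(sample_entropy(uniform))
```

A sample spread uniformly over many distinct values gives a higher entropy than a sample dominated by a few values, which is the property the detector relies on when it compares a window against the usual range.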
The algorithm to compute entropy can be optimized
to perform only a few simple computations per packet.
In our implementation, the entropy of a source will be
calculated through a sliding window of fixed width, W.
The probability value p_i in this algorithm is actually the
frequency of occurrence of each unique symbol divided
by the total number of symbols in the sample. The
process of computing entropy of W packets is as fol-
lows:
1. Compute the entropy of the first W packets with
reference to a specific header parameter (e.g.
source IP address).
2. Isolate the term in the summation corresponding to
the probability of the first symbol in the window
(label this symbol with i=1) and also the value for
the corresponding probability (p_{i=1}).
3. Slide the window so the new first term was previ-
ously the second term and the next W-1 consecu-
tive terms are contained in the window.
4. Isolate the term in the summation corresponding to
the probability of the symbol acquired from shift-
ing the window.
5. Subtract off the terms isolated in steps 2 and 4
from the value computed in step 1.
6. Recompute the affected probabilities for the
current window of data. That is, recompute p_{i=1} and
the probability of the symbol that was added by
sliding the window.
7. Using the values computed in step 6, add the two
terms missing from the entropy summation back in
and compare this new entropy value to the previous
entropy computations.
8. Repeat steps 2-7 to determine subsequent entropy
values.
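The steps above can be sketched in Python as follows. This is an illustrative reading of the procedure rather than the authors' code: it assumes probabilities are counts divided by the fixed window width W, and the names `entropy_term` and `sliding_entropy` are hypothetical. Only the two terms affected by a slide (the symbol that leaves and the symbol that arrives) are subtracted, recomputed, and added back.

```python
import math
from collections import Counter, deque

def entropy_term(count, width):
    """One summand -p * log2(p) of the entropy sum, with p = count / width."""
    if count == 0:
        return 0.0
    p = count / width
    return -p * math.log2(p)

def sliding_entropy(values, width):
    """Yield the entropy of every full window of `width` consecutive values."""
    window = deque()
    counts = Counter()
    entropy = 0.0
    for value in values:
        if len(window) < width:
            # Step 1: fill the first window, then compute its entropy in full.
            window.append(value)
            counts[value] += 1
            if len(window) == width:
                entropy = sum(entropy_term(c, width) for c in counts.values())
                yield entropy
            continue
        # Steps 2-4: slide the window; one symbol leaves and one arrives.
        leaving = window.popleft()
        window.append(value)
        if leaving != value:
            # Step 5: subtract the two terms that are about to change.
            entropy -= entropy_term(counts[leaving], width)
            entropy -= entropy_term(counts[value], width)
            # Step 6: recompute the affected counts (and hence probabilities).
            counts[leaving] -= 1
            counts[value] += 1
            # Step 7: add the two updated terms back in.
            entropy += entropy_term(counts[leaving], width)
            entropy += entropy_term(counts[value], width)
        yield entropy
```

Each slide costs a constant number of operations regardless of W. Because the running value is updated with floating-point additions, a long-running detector might periodically recompute the sum in full to limit accumulated rounding error; zero-count entries are also left in `counts` here for simplicity.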
A sophisticated attacker would likely attempt to de-
feat the detection algorithm by creating stealthy traffic
floods that mimic the legitimate traffic the detector
would expect. An attacker who knew that the entropy
of various packet attributes was being monitored could
build an attack tool that generates floods with tunable
entropy levels. Through guesswork, penetration, or trial
and error, the attacker could determine typical entropy
levels seen at the detector and tune the flood to match.
This may not be as easy as it sounds, particularly if
there are multiple detectors deployed between the flood
sources and the targets, as the typical entropy values
seen by detectors in different network environments are
likely to differ. Stealthy attacks are explored further in
Section 3.4.
The window size, W, is a tunable parameter that con-
trols how much smoothing of short-term fluctuations
the detector will do. Increasing W will reduce the
variation in entropy and may reduce the rate of false-
positives resulting from brief and presumably insignifi-
cant anomalies. However, W should be kept small
enough that attacks are detected quickly. We have
found that a window size of 10,000 packets is a reason-
able compromise in the network environments we have
explored.
2.2. Chi-Square Statistic
Pearson’s chi-square (χ²) test is used for distribution
comparison in cases where the measurements
involved are discrete values. For example, it could be
used to test the distribution of TCP SYN flag values (0
or 1) or protocol numbers. The test works best when the
number of possible values is small. In particular, a rule
of thumb is that the expected number of packets in a
sample having each possible value be at least five.
However, this can often be achieved through binning,
that is combining a set or range of possible values and
treating them as one. For example, the chi-square test
can be applied to service ports by considering four
values: HTTP, FTP, DNS, and “other.” Similarly,
packet lengths can be binned into ranges such as 0-64
bytes, 65-128 bytes, 129-255 bytes, etc.
For a sample of N packets, let B be the number of
available bins. Define N_i as the number of packets
whose value falls in the ith bin and n_i as the expected
number of packets in the ith bin under the typical
distribution. Then the chi-square statistic is computed
as follows:
$$\chi^2 = \sum_{i=1}^{B} \frac{(N_i - n_i)^2}{n_i}$$
When the N_i and n_i values are large and the N measurements
are independent and drawn from the expected
distribution, this value follows the well-known chi-
square distribution with B-1 degrees of freedom. These
assumptions (in particular, independence) do not
typically hold for packet field values even under normal
conditions. Hence, comparison with the chi-square
distribution is of limited utility. However, the chi-
square statistic does provide a useful measure of the
deviation of a current traffic profile from the baseline.
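A minimal sketch of the statistic as a deviation measure, assuming observed and expected counts are already grouped into the same bins. The function name `chi_square` and all bin counts below are hypothetical, loosely following the four service-port bins mentioned above.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum((N_i - n_i)^2 / n_i) over all bins."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts per 1000 packets for the bins HTTP, FTP, DNS, other.
baseline = [600.0, 50.0, 150.0, 200.0]   # expected counts n_i
normal = [590.0, 55.0, 160.0, 195.0]     # observed counts N_i, quiet period
flood = [300.0, 25.0, 75.0, 600.0]       # observed counts N_i, "other" inflated

print(chi_square(normal, baseline))
print(chi_square(flood, baseline))
```

With these made-up numbers the quiet sample scores below 2 while the flooded sample scores 1000, so a simple threshold on the statistic separates them without any appeal to the chi-square distribution itself.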
A current-traffic profile, mapping packet attribute
values to frequencies, is maintained as follows:
1. For each packet that arrives, extract the value, v,
of the desired attribute (e.g., source address).
2. Apply exponential decay to the stored frequency
for v based on its age (time since last update).
The stored frequency is multiplied by
$\exp\left(\ln(0.5) \cdot \mathrm{age} / \mathrm{halflife}\right)$.
3. Increment the frequency for v and store the cur-
rent time (or packet count) as its last-update
time.
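The three per-packet steps can be sketched as a small Python class. This is an illustration under stated assumptions, not the Snort plug-in itself: the class name `DecayingProfile`, its methods, and the use of a plain dictionary are hypothetical, and time may be a clock value or a packet count.

```python
import math

class DecayingProfile:
    """Maps attribute values to exponentially decayed frequencies."""

    def __init__(self, halflife):
        self.halflife = halflife
        self.freq = {}          # value -> decayed frequency
        self.last_update = {}   # value -> time (or packet count) of last update

    def _decay_factor(self, age):
        # exp(ln(0.5) * age / halflife): 1.0 at age 0, 0.5 after one half-life
        return math.exp(math.log(0.5) * age / self.halflife)

    def observe(self, value, now):
        """Decay the stored frequency for `value`, increment it, record the time."""
        age = now - self.last_update.get(value, now)
        self.freq[value] = self.freq.get(value, 0.0) * self._decay_factor(age) + 1.0
        self.last_update[value] = now

    def decayed(self, now):
        """Return all frequencies decayed to time `now` without modifying state."""
        return {v: f * self._decay_factor(now - self.last_update[v])
                for v, f in self.freq.items()}

profile = DecayingProfile(halflife=10.0)
profile.observe("192.0.2.1", now=0.0)
profile.observe("192.0.2.1", now=10.0)   # first count has decayed to one half
print(profile.freq["192.0.2.1"])
```

Decaying lazily, only when a value is seen again or when the profile is read, keeps the per-packet cost to one dictionary lookup and one exponential.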
Periodically, this current-traffic profile is compared
with a baseline profile using the chi-square statistic, as
follows:
1. Apply exponential decay to the stored current-
traffic frequencies, as above.
2. Group the attribute values into bins based on
frequency. For example, the 16 most common
values might go in one bin, the next 64 in an-
other, the next 256 in another, and the rest in
another.
3. Calculate the total frequency for each bin.
4. Calculate the chi-square statistic, comparing
these bin-frequency totals with the bin-
frequency values in the baseline profile.
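A sketch of the frequency-sorted binning and comparison in steps 2 to 4, assuming the decayed frequencies are available as a dictionary. The helper names `bin_totals` and `chi_square`, the tiny bin sizes, and the example numbers are hypothetical; the default bin sizes mirror the 16/64/256/rest example in the text.

```python
def bin_totals(freqs, bin_sizes=(16, 64, 256)):
    """Sort values by frequency and return the total frequency of each bin.

    `bin_sizes` lists how many of the most common values go in each leading
    bin; one final bin collects all remaining values.
    """
    ordered = sorted(freqs.values(), reverse=True)
    totals = []
    start = 0
    for size in bin_sizes:
        totals.append(sum(ordered[start:start + size]))
        start += size
    totals.append(sum(ordered[start:]))  # remainder bin
    return totals

def chi_square(observed, expected):
    """Pearson chi-square statistic over matching bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical decayed frequencies for five source addresses.
current = {"a": 50, "b": 20, "c": 15, "d": 10, "e": 5}
current_bins = bin_totals(current, bin_sizes=(1, 2))   # top 1, next 2, rest
baseline_bins = [60.0, 30.0, 10.0]                     # made-up baseline profile

print(current_bins)
print(chi_square(current_bins, baseline_bins))
```

Because bins are defined by frequency rank rather than by specific values, the comparison depends on the shape of the distribution and not on which addresses happen to be active.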
The baseline profile can be maintained as decaying
averages of the current-traffic bin frequencies. Each
time the current-traffic bin frequencies are computed,
the average is updated as follows:
1. Exponential decay is applied to the stored bin-
frequency averages, using a significantly longer
half-life than is used for the current-traffic pro-
file.
2. The new set of bin frequencies is multiplied by
$1 - \exp\left(\ln(0.5) \cdot \mathrm{age} / \mathrm{baseline\_halflife}\right)$
and the result is added to the decayed average.
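The baseline update can be sketched as a weighted blend, assuming the new bin frequencies are weighted by 1 − exp(ln(0.5) · age / baseline_halflife) so that the two weights sum to one. The function name `update_baseline` and the example values are hypothetical.

```python
import math

def update_baseline(baseline, current, age, baseline_halflife):
    """Blend new bin frequencies into a decaying-average baseline.

    The stored averages are weighted by d = exp(ln(0.5) * age / baseline_halflife)
    and the new bin frequencies by 1 - d.
    """
    d = math.exp(math.log(0.5) * age / baseline_halflife)
    return [d * b + (1.0 - d) * c for b, c in zip(baseline, current)]

# After exactly one half-life, old and new values carry equal weight.
print(update_baseline([100.0, 0.0], [0.0, 100.0], age=10.0, baseline_halflife=10.0))
```

A long baseline half-life makes the baseline slow to absorb a sudden change, which is what allows the current-traffic profile to diverge from it during an attack.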
The user can tune the detector by modifying the fol-
lowing parameters: traffic profile half-life, baseline
profile half-life, bin definitions and hash function
range. Values in the current-traffic profile whose
frequencies decay below a certain threshold can be
purged without substantially affecting the chi-square
computation. This purging reduces memory consump-
tion and processing requirements. For packet attributes
such as IP addresses that have a very large range, a
hash of the attribute's value may be used instead of the
value itself in order to reduce memory consumption and
processing requirements in the worst case (many
distinct values). When the baseline frequency value for
a given bin is very low, the chi-square statistic may be
excessively influenced by that bin's value. Ideally, the
bins will be defined such that this is unlikely, but as a
fallback, low-value bins can be automatically merged
with adjoining bins prior to computing the chi-square
statistic.
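One possible way to implement the merging fallback is sketched below. The paper does not specify the merging rule, so this is an assumption: bins are accumulated left to right until the baseline total reaches a minimum, and any leftover is folded into the last emitted bin. The name `merge_low_bins` and the threshold value are hypothetical.

```python
def merge_low_bins(baseline, current, min_expected):
    """Merge adjacent bins until every baseline bin is at least `min_expected`."""
    merged_b, merged_c = [], []
    acc_b = acc_c = 0.0
    for b, c in zip(baseline, current):
        acc_b += b
        acc_c += c
        if acc_b >= min_expected:
            merged_b.append(acc_b)
            merged_c.append(acc_c)
            acc_b = acc_c = 0.0
    if acc_b > 0.0 or acc_c > 0.0:
        if merged_b:
            # Fold the trailing low-value bins into the last emitted bin.
            merged_b[-1] += acc_b
            merged_c[-1] += acc_c
        else:
            merged_b.append(acc_b)
            merged_c.append(acc_c)
    return merged_b, merged_c

# Hypothetical bin frequencies; bins 2, 3 and 5 have very low baseline values.
print(merge_low_bins([100.0, 2.0, 1.0, 50.0, 3.0],
                     [90.0, 10.0, 5.0, 40.0, 11.0],
                     min_expected=5.0))
```

Merging is applied to both profiles with the same grouping, so the chi-square sum still compares like with like.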
It is unlikely that an outside attacker without access
to the detector itself or a large fraction of its network
neighbors will know the exact characteristics of net-
work traffic typically seen by the detector. Therefore,
we hypothesize that the attack traffic will differ from
typical traffic in measurable ways.
3. Detector Evaluation
In order to evaluate thoroughly the potential effec-
tiveness of DDoS detection methods such as those
described in Section 2, we must address the following
questions.
How well can the method distinguish attack condi-
tions from normal conditions? To answer this question,
we must determine what kinds of DDoS attacks the
method can detect, and what fraction of the monitored
traffic the attacks must comprise in order to be detected.
Ideally, a detector should pick up not only attacks
generated by tools found “in the wild” to date, but also
more stealthy attacks using more sophisticated tools
wielded by attackers familiar with the detection method
and detector’s network environment. Finally, we must
assess the frequency and consequences of false-
positives, ordinary fluctuations in legitimate traffic
interpreted by the detector as attacks.
To what network environments and platforms is the
method best suited? Characteristics of the monitored
network traffic will vary significantly depending on
where detectors are deployed. The protocols used,
diversity of addresses seen, typical session durations,
response latency, and daily volume fluctuations will
differ dramatically among LAN environments, edge
routers, and core routers. A detection method effective
in one of these environments may fare poorly in others.
In addition, if the method is to be applied in core
routers, its per-packet computational requirements and
memory usage must be modest in order to make real-
time processing at high bandwidths practical (see
Section 3.5).
Once an attack is detected, can the detector characterize
the attack traffic sufficiently to produce a targeted
response that mitigates the attack’s effects?
Detection alone may be useful for alerting human
administrators to attacks in progress or notifying
upstream (closer to attack sources) devices that some-
thing should be done. However, many DDoS attacks
today are only two minutes in duration [8], so the
ability to generate automated responses, at least as a
preliminary measure, is important. A detection method
that can effectively describe the nature of the attack will
make such automated response more practical.
The remainder of this section describes attempts to
answer these questions for the entropy and chi-square
DDoS detection methods.
3.1. Prototype Implementation
To evaluate the DDoS attack detection methods de-
scribed in Section 2 under realistic conditions, we
implemented prototype detector modules as plug-ins for
Snort, the popular, open-source network intrusion
detection system [13], [14]. In addition to real-time
traffic monitoring, Snort supports off-line processing of
previously captured network traffic, making it possible
to conduct reproducible detection experiments with
traffic data from a variety of environments.
The chi-square and entropy detectors were built as
Snort preprocessors, operating on every IP datagram
received by Snort prior to stream reassembly and other
packet manipulation. The two detectors can be individually
enabled and configured in the snort.conf
configuration file, and can trigger alarms through
Snort’s modular alerting facility.
In addition to issuing alerts, these plug-ins record
data to log files in the Snort log directory. The entropy
detector logs periodically computed entropy values for
each packet attribute specified in the initialization file
(e.g., source and destination IP addresses and
TCP/UDP ports, datagram length, and TCP window
size). The chi-square detector logs the periodically
computed chi-square statistics for each of the specified
packet attributes, along with the current and baseline
bin frequency values used to compute those statistics.
This data can be useful for manual or automatic detec-
tor tuning and alert threshold setting.
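As an illustration of how logged current and baseline bin
frequencies relate to the logged statistic, the following
is a minimal sketch. It assumes the standard Pearson
chi-square form and that the baseline is given as relative
frequencies; the function name and the handling of
zero-frequency baseline bins are hypothetical and are not
taken from the prototype.

```python
def chi_square(current_counts, baseline_freqs):
    """Pearson chi-square of observed bin counts against baseline frequencies.

    current_counts: packets observed in each bin for the current traffic.
    baseline_freqs: fraction of baseline traffic that fell in each bin.
    """
    if len(current_counts) != len(baseline_freqs):
        raise ValueError("bin lists must have the same length")
    total = sum(current_counts)
    if total == 0:
        return 0.0
    stat = 0.0
    for observed, freq in zip(current_counts, baseline_freqs):
        expected = total * freq
        # Assumption: bins with zero baseline frequency are skipped
        # rather than treated as infinitely surprising.
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    return stat
```

A statistic near zero means the current bin profile matches
the baseline; larger values indicate a shift that could be
compared against an alert threshold.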
3.2. Network Trace Data
A critical element of evaluating these detectors is
exposing them to traffic from a variety of network
environments. This allows us to determine how stable
the traffic statistics monitored by the detectors are in
those environments, and how effectively the detectors
can identify DDoS attack traffic in different contexts.
For this purpose, we obtained several publicly avail-
able network traces as well as some traces collected
specifically for our experiments. These traces are not
known to contain substantial DDoS attacks, so we treat
them as consisting of legitimate traffic. To test the
effects of DDoS attacks, we simulate these attacks by
overlaying the kind of attack traffic generated by some
existing DDoS attack tools onto the traces at various
concentrations [10]. Ideally, we would make use of
traces containing identifiable periods during which
actual DDoS attacks were in progress, but few of these
are publicly available.
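The overlay procedure can be sketched roughly as follows.
This is only an illustration under stated assumptions: a
trace is reduced to a list of source addresses, attack
packets carry uniformly random 32-bit source addresses,
and "concentration" means the fraction of packets in the
attack interval that belong to the attack. The function
and parameter names are hypothetical and do not describe
the tool cited in [10].

```python
import random


def overlay_attack(legit_sources, start, end, concentration, seed=0):
    """Return a merged list of (source_address, is_attack) pairs.

    Attack packets are interleaved with the legitimate packets whose index
    lies in [start, end), so that they make up roughly `concentration` of
    the merged stream within that interval.
    """
    if not 0.0 <= concentration < 1.0:
        raise ValueError("concentration must be in [0, 1)")
    rng = random.Random(seed)
    # Attack packets to emit per legitimate packet inside the interval.
    per_legit = concentration / (1.0 - concentration)
    credit = 0.0
    merged = []
    for index, src in enumerate(legit_sources):
        if start <= index < end:
            credit += per_legit
            while credit >= 1.0:
                # Assumed attack model: spoofed, uniformly random source.
                merged.append((rng.getrandbits(32), True))
                credit -= 1.0
        merged.append((src, False))
    return merged
```

Because the legitimate packets keep their original order,
the same trace can be replayed at several concentrations
and the detector outputs compared directly.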
The traces used were drawn from a variety of net-
work environments, as described below, and most have
IP addresses that have been transformed via an un-
known but one-to-one function for privacy purposes.
Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX’03)
0-7695-1897-4/03 $17.00 © 2003 IEEE

This address re-mapping is irrelevant to the currently
implemented detectors, since they make no assumptions
about relationships between different IP addresses.
The following traces were used:
NZIX. This trace, from July 2000, includes five
consecutive days of IP headers sent through the
New Zealand Internet Exchange (NZIX), a peer-
ing point for several major New Zealand ISPs
and the University of Waikato; throughput
ranges roughly from 4 to 12 Mbit/s. Two
six-hour periods were used for detector
experimentation.
Bell Labs. This trace contains one week of IP
headers observed outside the firewall for Bell
Labs, a 9 Mbit/s connection serving a staff of
about 450. One full day of this traffic was used
for experimentation.
University. This trace was collected from the
Stocker Engineering and Technology network at
Ohio University. It contains all the packets en-
tering and leaving the network, with throughput
ranging from 8 to 16 Mbit/s. Three sets of data,
each having around 30,000,000 packets, were
collected at different times during a day for ex-
perimentation.
Small Company. This trace contains one week
of network traffic observed outside the firewall
of a small technology company in the United
States. The connection served a staff of about
200 users in the company. One 24-hour week-
day trace was used for experimentation.
3.3. Detection Example
To illustrate the effects of an attack on the entropy
and chi-square statistics, we examined a 1,000,000-
packet excerpt from the NZIX data set with a simulated
DDoS attack comprising 25% of all packets, starting at
packet number 700,000 and ending at packet number
800,000. (Packets in this excerpt are numbered from
200,000 to 1,200,000.) In this attack, IP source ad-
dresses are chosen at random from a uniform distribu-
tion; we will focus on source-address-based detection.
Figure 1 shows the output (entropy values) of an
entropy detector examining the IP source address
packet attribute with a window size of 10,000 packets.
Before the attack begins, source address entropy meas-
urements fall entirely within the range 7.0-7.5. During
the attack, the entropy increases by approximately 1.5.
Any maximum-entropy threshold setting between 7.5
and 8.75 would detect this attack without generating
any false positives in this example.
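The thresholding described above can be sketched as
follows. This is a minimal illustration, not the prototype
code: it assumes Shannon entropy with base-2 logarithms,
consecutive non-overlapping windows, and a hypothetical
default threshold of 8.0 chosen from inside the 7.5 to 8.75
range quoted in this example.

```python
import math
from collections import Counter


def window_entropy(values):
    """Shannon entropy (base 2) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_alerts(sources, window_size=10_000, max_entropy=8.0):
    """Yield (window_index, entropy, alert) for consecutive full windows.

    `sources` is a list of source addresses in arrival order; an alert is
    raised when the window entropy exceeds `max_entropy`.
    """
    last_start = len(sources) - window_size
    for w, start in enumerate(range(0, last_start + 1, window_size)):
        h = window_entropy(sources[start:start + window_size])
        yield w, h, h > max_entropy
```

An attack with uniformly random source addresses adds many
addresses that each appear once, which flattens the
distribution and pushes the window entropy upward, as seen
in this example.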
Figure 1: Entropy for a brief DDoS attack
Figure 2: Bin frequencies for a brief attack
Figure 3: Chi-square values for a brief attack
In Figure 2, the bin frequency profile for a source
address chi-square detector (current traffic half-life is
20,000 packets; bins defined as most frequent source
address, next 4 most frequent, next 16, next 256, next
4096, and the remainder) is displayed for the same
example. The six colored regions represent the percent-