Adaptive network intrusion detection system using a hybrid approach

TL;DR: An adaptive network intrusion detection system that uses a two-stage architecture: in the first stage a probabilistic classifier detects potential anomalies in the traffic, and in the second stage HMM-based traffic models are used to narrow down the potential attack IP addresses.

Adaptive Network Intrusion Detection System using
a Hybrid Approach
R Rangadurai Karthick, Department of Computer Science and Engineering, IIT Madras, India (ranga@cse.iitm.ac.in)
Vipul P. Hattiwale, Department of Computer Science and Engineering, IIT Madras, India (vipul.hattiwale@gmail.com)
Balaraman Ravindran, Department of Computer Science and Engineering, IIT Madras, India (ravi@cse.iitm.ac.in)
Abstract—Any activity aimed at disrupting a service, making a resource unavailable, or gaining unauthorized access can be termed an intrusion. Examples include buffer overflow attacks, flooding attacks, system break-ins, etc. Intrusion detection systems (IDSs) play a key role in detecting such malicious activities and enable administrators to secure network systems. Two key criteria should be met by an IDS for it to be effective: (i) the ability to detect unknown attack types, and (ii) a very low misclassification rate.
In this paper we describe an adaptive network intrusion detection system that uses a two stage architecture. In the first stage a probabilistic classifier is used to detect potential anomalies in the traffic. In the second stage an HMM based traffic model is used to narrow down the potential attack IP addresses. Various design choices that were made to make this system practical, and difficulties faced in integrating it with existing models, are also described. We show empirically that this system achieves good performance.
I. INTRODUCTION
Any attempt made to gain unauthorized access to a computer or to disrupt the availability of a service or resource is termed an intrusion. An Intrusion Detection System (IDS) is software or a system built to detect intrusions. In general, the detection mechanisms used by IDSs can be classified into two major categories.
1) Signature based detection: models are built from well known attack types, i.e., from already known attack patterns.
2) Anomaly based detection: a model is built from normal traffic, and any deviation from this profile is considered anomalous.
Anomaly based techniques are preferred over signature based techniques owing to their ability to detect novel intrusions; signature based techniques can only detect attacks whose patterns are already known.
The key aspects that we considered in building an anomaly based IDS are:
Choice of attributes: The model is intended to be deployed at a web server or network gateway, where the inflow of traffic is huge. We chose to use information available from a packet's header as the features for building the model. This way we do not incur much overhead on the server, and the IDS does not become a bottleneck.
Handling infrequent patterns: Not all normal network traffic follows a uniform flow pattern. Any proposed model should be able to handle normal traffic that is infrequent. Our model uses boosting techniques to learn over these infrequent patterns in order to classify them correctly.
False alarm rate: The main drawback of anomaly based detection is a high false alarm rate. The boosting technique used in the proposed model addresses this problem and yields a very low false alarm rate.
We use a Hidden Markov Model (HMM), a generative model, for modeling the input data. The model profiles TCP based communication channels for intrusions. Any normal TCP connection goes through three phases during its lifetime: connection establishment, data transmission, and connection termination. This mode of communication has an inherently sequential nature, which makes it convenient to model with an HMM, since an HMM can exploit the sequential structure of TCP traffic.
A brief description of the HMM based approach is as follows. The first step is source separation of traffic, performed on both training and testing traffic in order to preserve the sequence information of TCP connections. An HMM is used to profile source separated clean traffic, and the model thus built is used to classify test traffic. This approach had a high attack detection rate but also a high false positive rate. A high false positive rate corresponds to flagging legitimate traffic as attack and cannot be accepted when designing such systems. Hence various design choices, such as port based separation and cascading of HMMs, were considered for traffic profiling. These approaches increased the accuracy of the classifier while keeping the false positive rate very low. The intrusion detection data set released by DARPA [3] is used to train and test the HMM models.
The HMM based model had a few shortcomings when we tried to implement it in real time. This led us to look for alternative methods that could be combined with HMMs to make the system work in real time. Vijayasarathy et al. [2] proposed a Naive Bayes (NB) based model for profiling traffic. This approach handles the skewness in network traffic, i.e., the fact that the amount of anomalous traffic to a server is very low compared to that of clean traffic. The NB based model runs fast, close to line speed, and computes the probability of occurrence of groups of incoming packets, called windows.
The NB model is used for online classification, and the HMM model is used for offline analysis of traffic. Traffic flagged as anomalous by the NB model is fed into an offline HMM that computes the probability of each connection present in the window. Combining the NB and HMM models we thus form a hybrid model, wherein the NB model computes the probability of occurrence of windows and the HMM computes the probability of each connection within a flagged window. The output of the HMM, a list of attacking IPs, can then be used to update the firewall on which IPs to block. The HMM can thus be used to generate an IP blacklist, which makes the hybrid model more efficient. A minimal sketch of this two-stage flow is given below.
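To make the division of labour between the two stages concrete, the following Python sketch shows how such a hybrid pipeline could be wired together. The helper names (nb_window_probability, hmm_connection_log_probability), the thresholds, the window size, and the packet layout are hypothetical placeholders, not functions from the paper or any library; the sketch only illustrates the control flow described above, assuming both models have already been trained.

```python
from collections import defaultdict

WINDOW_SIZE = 100          # packets per window (assumed value)
NB_THRESHOLD = 1e-6        # windows below this NB probability are suspicious (assumed)
HMM_THRESHOLD = -50.0      # connections below this log-probability are attacks (assumed)

def hybrid_classify(packets, nb_window_probability, hmm_connection_log_probability):
    """Two-stage hybrid classification.

    Stage 1 (online): the NB model scores fixed-size windows of packets.
    Stage 2 (offline): connections inside suspicious windows are scored by the HMM,
    and the source IPs of low-probability connections are returned as a blacklist.
    """
    blacklist = set()
    for start in range(0, len(packets), WINDOW_SIZE):
        window = packets[start:start + WINDOW_SIZE]
        if nb_window_probability(window) >= NB_THRESHOLD:
            continue  # window looks normal; no offline analysis needed

        # Source separate the flagged window into per-connection flag sequences.
        connections = defaultdict(list)
        for pkt in window:
            connections[(pkt["src_ip"], pkt["dst_ip"])].append(pkt["tcp_flags"])

        # The HMM narrows the alarm down to individual attacking IPs.
        for (src_ip, _dst_ip), flag_sequence in connections.items():
            if hmm_connection_log_probability(flag_sequence) < HMM_THRESHOLD:
                blacklist.add(src_ip)
    return blacklist
```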
The rest of the paper is organized as follows. A brief description of HMMs is presented in Section 2. Section 3 describes our proposed HMM model, and preliminary results obtained are presented in Section 4. Section 5 explains the problems we anticipated in implementing this system in real time. Section 6 describes the hybrid model, and Section 7 describes the various experiments and the results obtained. Section 8 describes related work done by the research community in this area.
II. HIDDEN MARKOV MODEL
An HMM is a generative model that can model data which is sequential in nature. It is used to model data for which the i.i.d. assumption is too restrictive, as in speech processing applications. A detailed tutorial on HMMs is available in [1].
Markov Property: Consider a system with N states in which transitions among states occur at discrete time instants t = 1, 2, 3, · · · . A process is Markovian if the conditional probability of future states, given the present state and past states, depends only upon the present state. In order to predict the future state, the process by which the current state was reached does not matter, i.e.,
Pr[q_t = S_i | q_{t-1} = S_j, q_{t-2} = S_k, · · · ] = Pr[q_t = S_i | q_{t-1} = S_j]    (1)
We have used an HMM that follows the above first order Markov property.
In an HMM, the states and their transitions are not visible. Instead an output symbol, drawn from a discrete set of symbols, is emitted during every transition. This sequence of symbols is the set of observables used to train the HMM. Figure 1 illustrates this.
Definition of an HMM:
An HMM λ is a five-tuple, i.e., λ = [N, M, A, B, π]. The parameters of the model are:
N, the number of states in the model, S = {S_1, S_2, · · · , S_N}.
M, the number of observation symbols, V = {V_1, V_2, · · · , V_M}.
Fig. 1. HMM Architecture
A, the state transition probability matrix, A = {a_ij}, where
a_ij = Pr[q_{t+1} = S_j | q_t = S_i],  1 ≤ i, j ≤ N    (2)
It is an N×N matrix.
B, the observation symbol probability matrix, B = {b_j(k)}, where
b_j(k) = Pr[v_k at t | q_t = S_j],  1 ≤ j ≤ N,  1 ≤ k ≤ M    (3)
It is an N×M matrix.
π, the initial state probability distribution, π = {π_i}, where
π_i = Pr[q_1 = S_i],  1 ≤ i ≤ N    (4)
It is a 1×N vector.
Algorithms for HMMs: The following two algorithms are used to train and apply an HMM (a minimal sketch of the forward computation follows the list).
1) The Baum-Welch algorithm is used to learn the parameters of the model, {A, B, π}, from input data.
2) The Forward-Backward algorithm is used to compute the probability of an observation sequence given the model, P[O|λ].
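As an illustration of how P[O|λ] can be computed, the following is a minimal NumPy sketch of the scaled forward pass for a discrete-symbol HMM. It is not code from the paper; the toy parameter values are assumptions chosen only so the function can be run as-is.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Compute log P[O | lambda] for a discrete observation sequence using the
    scaled forward algorithm.

    obs : sequence of symbol indices (the observables)
    pi  : (N,) initial state probabilities
    A   : (N, N) transition matrix, A[i, j] = Pr[q_{t+1} = S_j | q_t = S_i]
    B   : (N, M) emission matrix, B[j, k] = Pr[V_k | q_t = S_j]
    """
    alpha = pi * B[:, obs[0]]              # initialisation (t = 1)
    log_likelihood = np.log(alpha.sum())
    alpha /= alpha.sum()
    for symbol in obs[1:]:                 # induction (t = 2 .. T)
        alpha = (alpha @ A) * B[:, symbol]
        scale = alpha.sum()                # rescale to avoid numerical underflow
        log_likelihood += np.log(scale)
        alpha /= scale
    return log_likelihood


if __name__ == "__main__":
    # Toy 2-state, 7-symbol model (symbols 0-6 mirror the TCP flag IDs used in Section III).
    pi = np.array([0.9, 0.1])
    A = np.array([[0.8, 0.2],
                  [0.3, 0.7]])
    B = np.array([[0.30, 0.25, 0.25, 0.10, 0.05, 0.03, 0.02],
                  [0.02, 0.03, 0.05, 0.10, 0.25, 0.25, 0.30]])
    print(forward_log_likelihood([0, 1, 2, 3, 2, 4], pi, A, B))
```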
III. DESIGN CHOICES
Web servers in general use the Transmission Control Protocol (TCP) for communication between clients and the server. TCP is a state based protocol, i.e., any TCP connection progresses through a set of state transitions during its lifetime. This inherently stateful and temporal nature of TCP traffic can be captured well by an HMM based classifier, which led us to use the HMM as the basic building block of our system design. In the remainder of this section, we describe the parameters used to build our model and the other design considerations that shaped our model design.

A. Choosing Parameters
The key aspect in building an HMM is deciding the states and symbols used to build the model. Choosing the right set of attributes for a model is very important, as this step ensures effective usage of the available data. For our experiments, we use the TCP header information present in packets as features.
The states of the model are called hidden or latent variables and are used to describe the underlying distribution generating the data. In our approach, the states of the HMM do not correspond to actual TCP states; they are chosen so that the HMM best explains the traffic, and they have no direct physical significance. For example, network traffic can be assumed to consist of traffic from legitimate and malicious users, and a transition from one state to another can be viewed as a switch between traffic from malicious and legitimate users.
Next we had to decide what to use as the symbols of our HMM. We use TCP flags as the symbols, following Vijayasarathy et al. [2]. The other parameters of the HMM (π, A, and B) are estimated using the Baum-Welch algorithm; a sketch of this training step is given below.
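The paper does not name a particular implementation of Baum-Welch; as one possible realisation, the sketch below uses the hmmlearn library, where the class for discrete symbol sequences is CategoricalHMM in recent releases (MultinomialHMM in older ones). The flag sequences and the choice of 9 states here are illustrative assumptions.

```python
import numpy as np
from hmmlearn import hmm  # pip install hmmlearn

# Each training stream is a sequence of TCP-flag symbol IDs (0-6, see Section III-B).
streams = [
    [0, 1, 2, 3, 2, 4, 2],   # e.g. SYN, SYN/ACK, ACK, PUSH/ACK, ACK, FIN/ACK, ACK
    [0, 1, 2, 2, 3, 2, 4, 2],
    [0, 1, 2, 3, 3, 2, 4, 2],
]

# hmmlearn expects one concatenated column vector plus the length of each stream.
X = np.concatenate(streams).reshape(-1, 1)
lengths = [len(s) for s in streams]

# Baum-Welch estimation of pi, A and B for a 9-state model (state count chosen empirically).
model = hmm.CategoricalHMM(n_components=9, n_iter=100, random_state=0)
model.fit(X, lengths)

# Forward algorithm: log P[O | lambda] for a new source separated stream.
test_stream = np.array([[0], [1], [2], [4], [2]])
print(model.score(test_stream))
```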
B. Initial Approach
Building an anomaly based classifier involves two phases: training and testing. During the training phase, the classifier profiles clean traffic, i.e., a traffic stream devoid of any malicious traffic. During the testing phase, traffic that was not used during training is used to measure the performance of the model. The classifier flags any traffic that deviates from the clean traffic profile as suspicious. The intuition behind this approach is that clean traffic and malicious traffic are not generated from the same distribution.
The training phase of our algorithm begins by source separating the training traffic into separate streams. All packets between a unique source/destination IP pair constitute a stream. Each stream consists of the series of TCP flags used in the packets throughout the connection. A single HMM is then used to learn the characteristics of all streams to the server.
The HMM takes these TCP flags as observables, and the other parameters of the model are computed from them. Upon analyzing the traffic data, we found that only a few flags are used in most TCP communication. We associated a number with each flag, so that a connection's sequence of flags is converted into a sequence of numbers, and the HMM is trained over this sequence of numbers (a sketch of this encoding is given after the list below). The frequently used TCP flags and the unique IDs used in our modeling are as follows.
SYN - 0
SYN/ACK - 1
ACK - 2
PUSH/ACK - 3
FIN/ACK - 4
RST - 5
other TCP flags - 6
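The following Python fragment sketches this source separation and flag-to-symbol encoding. The packet representation (a dict with src_ip, dst_ip, and tcp_flags fields) is an assumption made for illustration, not the format used by the authors; with a capture library the same mapping would simply be applied to each packet's TCP flags field.

```python
from collections import defaultdict

# Symbol IDs for the frequently used TCP flag combinations (Section III-B).
FLAG_TO_SYMBOL = {
    "SYN": 0,
    "SYN/ACK": 1,
    "ACK": 2,
    "PUSH/ACK": 3,
    "FIN/ACK": 4,
    "RST": 5,
}
OTHER_SYMBOL = 6  # any other TCP flag combination

def source_separate(packets):
    """Group packets into streams keyed by (source IP, destination IP) and
    encode each stream as a sequence of flag symbol IDs."""
    streams = defaultdict(list)
    for pkt in packets:
        symbol = FLAG_TO_SYMBOL.get(pkt["tcp_flags"], OTHER_SYMBOL)
        streams[(pkt["src_ip"], pkt["dst_ip"])].append(symbol)
    return streams

# Example: a single benign-looking connection.
packets = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.1", "tcp_flags": "SYN"},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.1", "tcp_flags": "ACK"},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.1", "tcp_flags": "FIN/ACK"},
]
print(source_separate(packets))  # {('10.0.0.5', '10.0.0.1'): [0, 2, 4]}
```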
The same procedure is followed in the testing phase: the TCP flag sequence is converted into a sequence of numbers, and the probability of occurrence of this sequence under the model is computed. Since the states of the model do not correspond to actual TCP states, the number of states can be chosen empirically. The testing phase of this approach is depicted in Figure 2.
Fig. 2. Initial Approach: incoming traffic to the server is source separated into streams per unique IP pair; each stream is tested against the HMM model and classified as attack or legitimate traffic.
The DARPA intrusion detection data set [3] is used for training and testing our HMM model. Preliminary results obtained with this approach were not satisfactory. The model had a very high false positive rate, i.e., clean traffic streams were also being flagged as attacks. The classifier did not succeed in discriminating between good traffic and bad traffic. This low performance might be attributed to using just one HMM to learn the entire clean traffic profile: a single HMM could not capture all the characteristics of the clean traffic used for training.
C. Alternate Design
In order to overcome the above shortcoming, we performed source separation on the training/testing traffic first according to the destination ports of the server and then upon source/destination IP address. Instead of using a single HMM to learn all traffic coming to a server, we used separate HMMs for each frequently occurring server port. The reasoning behind this approach is that traffic belonging to different applications does not all behave in the same way. For instance, different traffic streams belonging to a particular application port, say port 25 (SMTP), are more similar to one another than to traffic at port 20 (FTP). This approach improved the results drastically, i.e., the model had higher accuracy and a lower false positive rate compared to the single HMM approach.
The implementation details of this model are as follows. Training traffic to the server is first separated based upon the destination port number of packets. Traffic to each frequently occurring port is then source separated and used to train a separate model for that port. Ports which carry heavy traffic, such as the ports for HTTP, telnet, FTP, etc., are modeled with separate HMMs, while traffic to the remaining infrequent ports is modeled by a single shared model. The testing phase proceeds the same way: testing traffic is first separated based upon ports and then source separated and tested by the corresponding HMM model for that port. Figure 3 depicts this approach, and a sketch of the per-port routing follows the figure caption.
Fig. 3. Layered Model
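The per-port layering described above amounts to keeping one model per frequent service port plus a fallback model for everything else, as in the sketch below. The port set, the use of hmmlearn, and the data layout are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np
from hmmlearn import hmm

FREQUENT_PORTS = {80, 23, 21, 25}   # HTTP, telnet, FTP, SMTP (assumed set)
OTHER = "other"                     # shared model for all infrequent ports

def train_per_port_models(streams_by_port, n_states=9):
    """streams_by_port maps a destination port (or OTHER) to a list of
    flag-symbol sequences; one discrete HMM is trained per key."""
    models = {}
    for port, streams in streams_by_port.items():
        X = np.concatenate(streams).reshape(-1, 1)
        lengths = [len(s) for s in streams]
        m = hmm.CategoricalHMM(n_components=n_states, n_iter=100, random_state=0)
        m.fit(X, lengths)
        models[port] = m
    return models

def route(port):
    """Pick which model a stream belongs to."""
    return port if port in FREQUENT_PORTS else OTHER

def score_stream(models, port, symbols):
    """Log-probability of a source separated stream under its port's model."""
    return models[route(port)].score(np.asarray(symbols).reshape(-1, 1))
```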
Even though the port wise separation approach gave better results than the single model approach, the false positive rate was still high: almost 10% of the training traffic was flagged as attack. Any practical system designed to detect intrusions should have a low false positive rate, i.e., the rate at which a legitimate user is wrongly classified as an attacker should be very low. The model is able to classify most of the frequently occurring positive traffic correctly, but it is not able to correctly classify positive traffic that is infrequent: infrequent traffic that was clean was also flagged as attack. We made this model our basic classifier, and it required us to explore other strategies to improve the performance of this base classifier.
D. Cascaded HMMs
The positive traffic wrongly classified by the above approach consisted of traffic streams that were not very frequent; these streams received very low probabilities in the training phase of the above approach. In order to overcome this high false positive rate, we employed a multi-stage combination of models to improve the base classifier's performance, cascading base classifiers into several layers. Figure 4 describes the cascading of models.
Implementation details of this approach: low probability legitimate streams that are flagged as suspicious by all the base classifiers are fed as input to a separate HMM, which trains on all the infrequently occurring streams and builds a model. Traffic streams that have low probabilities under this model are in turn fed into the next layer of HMMs for training.
Fig. 4. Cascaded HMM design
The above process of adding new HMMs, i.e., cascading HMMs, can be continued until the addition of a new model brings no further improvement in accuracy; a sketch of this cascade is given below.
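The following sketch illustrates one way such a cascade could be built: each layer is trained on the streams that the previous layer scored below a threshold, and at test time a stream is flagged as anomalous only if every layer assigns it a low probability. The train_hmm helper, thresholds, and stopping rule are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
from hmmlearn import hmm

LOG_PROB_THRESHOLD = -25.0   # assumed per-stream log-probability cut-off
MIN_STREAMS = 5              # assumed minimum number of streams needed to train a layer
MAX_LAYERS = 3               # the paper reports results with two layers

def train_hmm(streams, n_states=9):
    X = np.concatenate(streams).reshape(-1, 1)
    model = hmm.CategoricalHMM(n_components=n_states, n_iter=100, random_state=0)
    model.fit(X, [len(s) for s in streams])
    return model

def train_cascade(streams):
    """Train successive HMM layers on the low-probability leftovers of the previous layer."""
    layers, remaining = [], streams
    while len(remaining) >= MIN_STREAMS and len(layers) < MAX_LAYERS:
        model = train_hmm(remaining)
        layers.append(model)
        remaining = [s for s in remaining
                     if model.score(np.asarray(s).reshape(-1, 1)) < LOG_PROB_THRESHOLD]
    return layers

def is_anomalous(layers, stream):
    """A stream is flagged as an attack only if every layer scores it below threshold."""
    x = np.asarray(stream).reshape(-1, 1)
    return all(layer.score(x) < LOG_PROB_THRESHOLD for layer in layers)
```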
The usage of traffic streams from different protocols in the first layer of the cascade might seem counter-intuitive, since we perform protocol based traffic separation before feeding traffic into the per-protocol HMMs. We observed that most training traffic connections to frequently occurring ports were correctly classified by their respective HMM, and the number of connections wrongly flagged as anomalous was very small. This was not the case with the HMM for infrequent port traffic: the connections that occur infrequently were the ones wrongly flagged by the initial HMMs. Since these connections were in any case infrequent within their respective protocols, combining them did not reduce the performance of the model; instead, it improved the accuracy of the cascaded HMM.
The model can be extended to have separate levels of cascading for each protocol. Since the data available for training and testing was limited, we used a single combined layer of cascading for all protocols.
IV. PRELIMINARY RESULTS
Building any classifier involves two phases: training and testing. The training phase in our approach involves learning the parameters of the model from a clean traffic trace; the HMM profiles this data and uses this information to test incoming traffic. During the testing phase, traffic that was not used for training is tested against the learnt model. To build a classifier we need labeled data for training and testing; the data sets released by DARPA [3] were used to train and test our classifier.
Experiments
The experiments that were conducted are described as
follows.

# states | Connection Separation | Separate Models for Protocols | Boosting | Accuracy (%) | False Alarm Rate (%)
5        | Just IP               | No                            | No       | 81.75        | 19.63
9        | Just IP               | No                            | No       | 85.14        | 15.05
5        | IP & Port             | Yes                           | No       | 91.49        | 9.49
9        | IP & Port             | Yes                           | No       | 92.27        | 8.49
5        | IP & Port             | Yes                           | Yes      | 96.96        | 2.89
9        | IP & Port             | Yes                           | Yes      | 97.1         | 2.71
TABLE I
RESULTS ON DARPA DATA SET
1) Single HMM model: Training traffic is separated according to source/destination IP pair and trained with a single HMM. In the testing phase, source separated connections are tested against the learnt model. The performance of this model was poor, since it had a very high false positive rate. A probable reason for the failure of the model is that a single HMM could not capture all possible traffic characteristics. The high false positive rate can be alleviated by the following approach.
2) Multiple HMM models: We performed source separation on both the IP addresses and the port information of source and destination. Separate HMMs were used to train/test connections pertaining to different protocols: protocols with a large amount of incoming traffic were each trained separately, while the remaining infrequent ports were handled by a single shared model. This approach reduced the false positive rate, and we made this type of source separation the basic step for building our HMMs.
3) Cascading of HMMs: In order to improve the performance of the above approach, we employed boosting: HMM models were cascaded into several layers to model low probability traffic. The results reported for our experiments use two layers of HMMs for cascading.
We used two days of clean traffic data from the DARPA data set for training, and the traffic from the remaining days was used for testing the learnt model; this way we do not overfit in the training process. Table I describes the performance of our model on a particular server in the DARPA data.
Number of states for the model
The number of states to be used for the HMM is determined experimentally; a sketch of this selection loop is given below. Using 9 or 10 states gave us good results on the DARPA data set. We tried using a larger number of states, and the results obtained were similar and did not improve the performance any further. Hence we report results using 9 states for the HMM model.
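The paper does not spell out the search procedure; a straightforward way to reproduce it would be the hold-out loop sketched below, which trains one model per candidate state count and keeps the count whose model scores held-out clean streams highest. The candidate list, validation split, and scoring criterion are assumptions.

```python
import numpy as np
from hmmlearn import hmm

def pick_state_count(train_streams, val_streams, candidates=(5, 7, 9, 11)):
    """Train one discrete HMM per candidate state count and keep the count whose
    model gives the highest total log-likelihood on held-out clean streams."""
    X = np.concatenate(train_streams).reshape(-1, 1)
    lengths = [len(s) for s in train_streams]
    best_n, best_score = None, -np.inf
    for n in candidates:
        model = hmm.CategoricalHMM(n_components=n, n_iter=100, random_state=0)
        model.fit(X, lengths)
        score = sum(model.score(np.asarray(s).reshape(-1, 1)) for s in val_streams)
        if score > best_score:
            best_n, best_score = n, score
    return best_n
```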
Attacks detected by the HMM
The following attacks present in the DARPA data set were detected by the HMM model.
neptune - Syn flood denial of service attack on one or
more ports.
ipsweep - Surveillance sweep performing ping on multi-
ple host addresses.
portsweep - Surveillance sweep through many ports to
determine which services are active on a single host.
satan - Network probing tool that exploits well-known weaknesses.
nmap - Network mapping using the nmap tool.
Auckland Data Set
We also tried our cascaded HMM experiments on the Auckland IV [4] data set. In the training phase, the HMM is trained with clean HTTP traffic from the DARPA data. For testing, HTTP traffic to various servers in the Auckland data set was considered. The Auckland data set is not labeled, so the testing results had to be cross checked manually. The HTTP sequences flagged as anomalous were of the following types:
Reset attacks
Short connections
Connections that were too short were flagged as anomalous by the model; the reason for a very short connection could be an abrupt end of the connection. An HMM with just 5 states was sufficient for classifying the Auckland data set.
V. MOTIVATION FOR HYBRID APPROACH
The goal of this work is to implement models that can function effectively in real time. When we attempted to implement the above model in a real-time system, it exhibited the following pitfalls [6].
Source separation of incoming traffic is the first and foremost step in our design; in this way, the model keeps track of all incoming IP addresses. However, the problem of IP address spoofing could tax our proposed model. Assume an incoming packet has a spoofed IP address. The server replies to it and allocates resources for this IP address. It is highly unlikely that a connection established by a spoofed IP address would proceed any further, so the server is made to wait until the timeout period before reclaiming the allocated resources. Attackers can repeat this scenario and exhaust the resources of the server.
The second issue to consider is the typical length of a connection. The DARPA data used for training and testing our model had information about entire connections, but in reality we have no way of telling when a connection will end. The computations performed assumed complete end to end connection data, which is impossible to obtain in practice. If this model were to be implemented in a server, the server would have to keep a separate buffer for each new incoming connection, which again would end up using all of the server's available buffer space to store packets. We cannot decide on how much buffer


References
Lawrence R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, Feb. 1989.
Conference paper, Jun. 2001: an overview of real-time data mining-based intrusion detection systems, with an architecture of sensors, detectors, a data warehouse, and model generation components.
Conference paper, Jan. 2003: an approach using hidden Markov models to detect complex multi-step Internet attacks, compared against decision trees and neural networks.
Conference paper, Mar. 2008: an evaluation supporting the usefulness of the DARPA data set for IDS evaluation, using Snort, Cisco IDS, PHAD, and ALAD.
Conference paper, Jun. 2005: a Markovian model of incoming HTTP requests for detecting attacks against web servers.
