
Showing papers on "Intrusion detection system published in 2006"


Proceedings ArticleDOI
21 Mar 2006
TL;DR: A taxonomy of different types of attacks on machine learning techniques and systems, a variety of defenses against those attacks, and an analytical model giving a lower bound on attacker's work function are provided.
Abstract: Machine learning systems offer unparalleled flexibility in dealing with evolving input in a variety of applications, such as intrusion detection systems and spam e-mail filtering. However, machine learning algorithms themselves can be a target of attack by a malicious adversary. This paper provides a framework for answering the question, "Can machine learning be secure?" Novel contributions of this paper include a taxonomy of different types of attacks on machine learning techniques and systems, a variety of defenses against those attacks, a discussion of ideas that are important to security for machine learning, an analytical model giving a lower bound on attacker's work function, and a list of open problems.

853 citations


Book ChapterDOI
20 Sep 2006
TL;DR: The nepenthes platform, as discussed by the authors, is a framework for large-scale collection of information on self-replicating malware in the wild that emulates only the vulnerable parts of a service.
Abstract: Up to now, there is little empirically backed quantitative and qualitative knowledge about self-replicating malware publicly available. This hampers research in these topics because many counter-strategies against malware, e.g., network- and host-based intrusion detection systems, need hard empirical data to take full effect. We present the nepenthes platform, a framework for large-scale collection of information on self-replicating malware in the wild. The basic principle of nepenthes is to emulate only the vulnerable parts of a service. This leads to an efficient and effective solution that offers many advantages compared to other honeypot-based solutions. Furthermore, nepenthes offers a flexible deployment solution, leading to even better scalability. Using the nepenthes platform we and several other organizations were able to greatly broaden the empirical basis of data available about self-replicating malware and provide thousands of samples of previously unknown malware to vendors of host-based IDS/anti-virus systems. This greatly improves the detection rate of this kind of threat.

508 citations


Proceedings ArticleDOI
21 May 2006
TL;DR: This paper evaluates a new type of malicious software that gains qualitatively more control over a system, which is called a virtual-machine based rootkit (VMBR), and implements a defense strategy suitable for protecting systems against this threat.
Abstract: Attackers and defenders of computer systems both strive to gain complete control over the system. To maximize their control, both attackers and defenders have migrated to low-level, operating system code. In this paper, we assume the perspective of the attacker, who is trying to run malicious software and avoid detection. By assuming this perspective, we hope to help defenders understand and defend against the threat posed by a new class of rootkits. We evaluate a new type of malicious software that gains qualitatively more control over a system. This new type of malware, which we call a virtual-machine based rootkit (VMBR), installs a virtual-machine monitor underneath an existing operating system and hoists the original operating system into a virtual machine. Virtual-machine based rootkits are hard to detect and remove because their state cannot be accessed by software running in the target system. Further, VMBRs support general-purpose malicious services by allowing such services to run in a separate operating system that is protected from the target system. We evaluate this new threat by implementing two proof-of-concept VMBRs. We use our proof-of-concept VMBRs to subvert Windows XP and Linux target systems, and we implement four example malicious services using the VMBR platform. Last, we use what we learn from our proof-of-concept VMBRs to explore ways to defend against this new threat. We discuss possible ways to detect and prevent VMBRs, and we implement a defense strategy suitable for protecting systems against this threat.

500 citations


Proceedings ArticleDOI
03 Apr 2006
TL;DR: A stream-based overlay network (SBON) is described, a layer between a stream-processing system and the physical network that manages operator placement for stream-processing systems, which permits decentralized, large-scale multi-query optimization decisions.
Abstract: To use their pool of resources efficiently, distributed stream-processing systems push query operators to nodes within the network. Currently, these operators, ranging from simple filters to custom business logic, are placed manually at intermediate nodes along the transmission path to meet application-specific performance goals. Determining placement locations is challenging because network and node conditions change over time and because streams may interact with each other, opening venues for reuse and repositioning of operators. This paper describes a stream-based overlay network (SBON), a layer between a stream-processing system and the physical network that manages operator placement for stream-processing systems. Our design is based on a cost space, an abstract representation of the network and on-going streams, which permits decentralized, large-scale multi-query optimization decisions. We present an evaluation of the SBON approach through simulation, experiments on PlanetLab, and an integration with Borealis, an existing stream-processing engine. Our results show that an SBON consistently improves network utilization, provides low stream latency, and enables dynamic optimization at low engineering cost.

458 citations


Book ChapterDOI
20 Sep 2006
TL;DR: Anagram is presented, a content anomaly detector that models a mixture of high-order n-grams (n > 1) designed to detect anomalous and “suspicious” network packet payloads and is demonstrated that Anagram can identify anomalous traffic with high accuracy and low false positive rates.
Abstract: In this paper, we present Anagram, a content anomaly detector that models a mixture of high-order n-grams (n > 1) designed to detect anomalous and “suspicious” network packet payloads. By using higher-order n-grams, Anagram can detect significant anomalous byte sequences and generate robust signatures of validated malicious packet content. The Anagram content models are implemented using highly efficient Bloom filters, reducing space requirements and enabling privacy-preserving cross-site correlation. The sensor models the distinct content flow of a network or host using a semi-supervised training regimen. Previously known exploits, extracted from the signatures of an IDS, are likewise modeled in a Bloom filter and are used during training as well as detection time. We demonstrate that Anagram can identify anomalous traffic with high accuracy and low false positive rates. Anagram's high-order n-gram analysis technique is also resilient against simple mimicry attacks that blend exploits with “normal” appearing byte padding, such as the blended polymorphic attack recently demonstrated in [1]. We discuss randomized n-gram models, which further raise the bar and make it more difficult for attackers to build precise packet structures to evade Anagram even if they know the distribution of the local site content flow. Finally, Anagram's speed and high detection rate make it valuable not only as a standalone sensor, but also as a network anomaly flow classifier in an instrumented fault-tolerant host-based environment; this enables significant cost amortization and the possibility of a “symbiotic” feedback loop that can improve accuracy and reduce false positive rates over time.

391 citations
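
The core mechanism Anagram describes, hashing payload n-grams into a Bloom filter and scoring a packet by the fraction of never-before-seen n-grams, can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the filter size, the SHA-256-based hashing, and n = 5 are arbitrary choices made here for the sketch.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash positions over a fixed-size bit array."""
    def __init__(self, size_bits=8192, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: bytes):
        # Derive k positions from salted SHA-256 digests of the item.
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:4], "big") % self.size

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

def ngrams(payload: bytes, n: int):
    return (payload[i:i + n] for i in range(len(payload) - n + 1))

def train(model: BloomFilter, payload: bytes, n: int = 5):
    """Record every n-gram of a normal payload in the filter."""
    for g in ngrams(payload, n):
        model.add(g)

def anomaly_score(model: BloomFilter, payload: bytes, n: int = 5) -> float:
    """Fraction of n-grams never seen during training (0.0 = all familiar)."""
    grams = list(ngrams(payload, n))
    if not grams:
        return 0.0
    unseen = sum(1 for g in grams if g not in model)
    return unseen / len(grams)
```

After training on normal traffic, a payload made of familiar content scores 0.0, while random bytes score near 1.0; a real deployment would threshold this score per packet.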


Proceedings Article
31 Jul 2006
TL;DR: This paper presents a new approach to strengthen policy enforcement by augmenting security policies with information about the trustworthiness of data used in security-sensitive operations, and evaluates this technique using 9 available exploits involving several popular software packages containing the above types of vulnerabilities.
Abstract: Policy-based confinement, employed in SELinux and specification-based intrusion detection systems, is a popular approach for defending against exploitation of vulnerabilities in benign software. Conventional access control policies employed in these approaches are effective in detecting privilege escalation attacks. However, they are unable to detect attacks that "hijack" legitimate access privileges granted to a program, e.g., an attack that subverts an FTP server to download the password file. (Note that an FTP server would normally need to access the password file for performing user authentication.) Some of the common attack types reported today, such as SQL injection and cross-site scripting, involve such subversion of legitimate access privileges. In this paper, we present a new approach to strengthen policy enforcement by augmenting security policies with information about the trustworthiness of data used in security-sensitive operations. We evaluated this technique using 9 available exploits involving several popular software packages containing the above types of vulnerabilities. Our technique successfully defeated these exploits.

390 citations


Journal ArticleDOI
TL;DR: This paper develops efficient adaptive sequential and batch-sequential methods for an early detection of attacks that lead to changes in network traffic, such as denial-of-service attacks, worm-based attacks, port-scanning, and man-in-the-middle attacks.
Abstract: Large-scale computer network attacks in their final stages can readily be identified by observing very abrupt changes in the network traffic. In the early stage of an attack, however, these changes are hard to detect and difficult to distinguish from usual traffic fluctuations. Rapid response, a minimal false-alarm rate, and the capability to detect a wide spectrum of attacks are the crucial features of intrusion detection systems. In this paper, we develop efficient adaptive sequential and batch-sequential methods for an early detection of attacks that lead to changes in network traffic, such as denial-of-service attacks, worm-based attacks, port-scanning, and man-in-the-middle attacks. These methods employ a statistical analysis of data from multiple layers of the network protocol to detect very subtle traffic changes. The algorithms are based on change-point detection theory and utilize a thresholding of test statistics to achieve a fixed rate of false alarms while allowing us to detect changes in statistical models as soon as possible. There are three attractive features of the proposed approach. First, the developed algorithms are self-learning, which enables them to adapt to various network loads and usage patterns. Secondly, they allow for the detection of attacks with a small average delay for a given false-alarm rate. Thirdly, they are computationally simple and thus can be implemented online. Theoretical frameworks for detection procedures are presented. We also give the results of the experimental study with the use of a network simulator testbed as well as real-life testing for TCP SYN flooding attacks.

319 citations
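
Change-point detection of the kind described above is typically built on CUSUM-style statistics: accumulate deviations of an observed rate from its baseline and raise an alarm when the cumulative sum crosses a threshold. Below is a minimal one-sided CUSUM sketch; the baseline mean, drift allowance, and threshold are hypothetical numbers chosen for illustration, not values from the paper.

```python
def cusum(samples, mean, drift, threshold):
    """One-sided CUSUM change-point detector.

    samples:   observed values (e.g. SYN packets per second)
    mean:      baseline rate learned from normal traffic
    drift:     slack subtracted per step, absorbing normal fluctuation
    threshold: alarm level for the cumulative statistic

    Returns the index of the first sample at which the cumulative
    positive deviation exceeds the threshold, or None if no alarm.
    """
    s = 0.0
    for i, x in enumerate(samples):
        s = max(0.0, s + (x - mean - drift))
        if s > threshold:
            return i
    return None
```

With a baseline of 100 SYN/s, normal fluctuation around that rate never accumulates, while a sustained jump to 150+ SYN/s (a hypothetical SYN flood) triggers an alarm within two samples of the change.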


01 Jan 2006
TL;DR: It is believed that model-based monitoring, which has the potential for detecting unknown attacks, is more feasible for control networks than for general enterprise networks.
Abstract: In a model-based intrusion detection approach for protecting SCADA networks, we construct models that characterize the expected/acceptable behavior of the system, and detect attacks that cause violations of these models. Process control networks tend to have static topologies, regular traffic patterns, and a limited number of applications and protocols running on them. Thus, we believe that model-based monitoring, which has the potential for detecting unknown attacks, is more feasible for control networks than for general enterprise networks. To this end, we describe three model-based techniques that we have developed and a prototype implementation of them for monitoring Modbus TCP networks.

314 citations
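
The model-based idea for control networks can be caricatured as a whitelist of acceptable protocol behavior per device. The sketch below checks Modbus function codes against a per-server policy; the function codes are standard Modbus (3 = Read Holding Registers, 6 = Write Single Register, 16 = Write Multiple Registers), but the server names and which codes each should see are a hypothetical site policy, not the paper's models.

```python
# Hypothetical specification-based model: which Modbus function codes
# each server is expected to receive on this (invented) network.
ALLOWED = {
    "plc-1": {3, 6},   # HMI may read registers and write single registers
    "plc-2": {3},      # historian traffic should be read-only
}

def check(server: str, function_code: int):
    """Return None if a request fits the model, else a violation message."""
    allowed = ALLOWED.get(server)
    if allowed is None:
        return f"unknown server {server}"
    if function_code not in allowed:
        return f"disallowed function code {function_code} for {server}"
    return None
```

Because control traffic is so regular, even this crude model flags unknown devices and unexpected write operations, which is the intuition behind why model-based monitoring suits SCADA networks.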


Proceedings ArticleDOI
Zhichun Li, Manan Sanghi, Yan Chen, Ming-Yang Kao, B. Chavez
21 May 2006
TL;DR: Hamsa is proposed, a network-based automated signature generation system for polymorphic worms which is fast, noise-tolerant and attack-resilient, and significantly outperforms Polygraph in terms of efficiency, accuracy, and attack resilience.
Abstract: Zero-day polymorphic worms pose a serious threat to the security of Internet infrastructures. Given their rapid propagation, it is crucial to detect them at edge networks and automatically generate signatures in the early stages of infection. Most existing approaches for automatic signature generation need host information and are thus not applicable for deployment on high-speed network links. In this paper, we propose Hamsa, a network-based automated signature generation system for polymorphic worms which is fast, noise-tolerant and attack-resilient. Essentially, we propose a realistic model to analyze the invariant content of polymorphic worms which allows us to make analytical attack-resilience guarantees for the signature generation algorithm. Evaluation based on a range of polymorphic worms and polymorphic engines demonstrates that Hamsa significantly outperforms Polygraph (J. Newsome et al., 2005) in terms of efficiency, accuracy, and attack resilience.

313 citations


Proceedings Article
31 Jul 2006
TL;DR: This paper introduces a new class of polymorphic attacks, called polymorphic blending attacks, that can effectively evade byte frequency-based network anomaly IDS by carefully matching the statistics of the mutated attack instances to the normal profiles.
Abstract: A very effective means to evade signature-based intrusion detection systems (IDS) is to employ polymorphic techniques to generate attack instances that do not share a fixed signature. Anomaly-based intrusion detection systems provide good defense because existing polymorphic techniques can make the attack instances look different from each other, but cannot make them look like normal. In this paper we introduce a new class of polymorphic attacks, called polymorphic blending attacks, that can effectively evade byte frequency-based network anomaly IDS by carefully matching the statistics of the mutated attack instances to the normal profiles. The proposed polymorphic blending attacks can be viewed as a subclass of the mimicry attacks. We take a systematic approach to the problem and formally describe the algorithms and steps required to carry out such attacks. We not only show that such attacks are feasible but also analyze the hardness of evasion under different circumstances. We present detailed techniques using PAYL, a byte frequency-based anomaly IDS, as a case study and demonstrate that these attacks are indeed feasible. We also provide some insight into possible countermeasures that can be used as defense.

280 citations
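
The byte frequency-based models that a blending attack must match (PAYL-style 1-gram profiles) are simple to sketch: learn the mean byte-frequency distribution of normal payloads, then score new payloads by their distance from it. The L1 distance used here is a simplification of PAYL's Mahalanobis-like measure, and the payloads are invented for illustration.

```python
def byte_freq(payload: bytes):
    """Relative frequency of each of the 256 byte values in a payload."""
    counts = [0] * 256
    for b in payload:
        counts[b] += 1
    total = len(payload) or 1
    return [c / total for c in counts]

def profile(training_payloads):
    """Mean byte-frequency profile of normal traffic."""
    profiles = [byte_freq(p) for p in training_payloads]
    n = len(profiles)
    return [sum(p[i] for p in profiles) / n for i in range(256)]

def distance(profile_mean, payload):
    """L1 distance to the normal profile (simplified anomaly score)."""
    f = byte_freq(payload)
    return sum(abs(f[i] - profile_mean[i]) for i in range(256))
```

A blending attack, in these terms, pads and encodes its payload until `distance` falls below the detector's threshold, which is exactly why the paper analyzes how hard that matching problem is.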


Journal ArticleDOI
TL;DR: It is shown that the analysis of system call arguments and the use of Bayesian classification improves detection accuracy and resilience against evasion attempts and a tool is described based on this approach.
Abstract: Intrusion detection systems (IDSs) are used to detect traces of malicious activities targeted against the network and its resources. Anomaly-based IDSs build models of the expected behavior of applications by analyzing events that are generated during the applications' normal operation. Once these models have been established, subsequent events are analyzed to identify deviations, on the assumption that anomalies represent evidence of an attack. Host-based anomaly detection systems often rely on system call sequences to characterize the normal behavior of applications. Recently, it has been shown how these systems can be evaded by launching attacks that execute legitimate system call sequences. The evasion is possible because existing techniques do not take into account all available features of system calls. In particular, system call arguments are not considered. We propose two primary improvements upon existing host-based anomaly detectors. First, we apply multiple detection models to system call arguments. Multiple models allow the arguments of each system call invocation to be evaluated from several different perspectives. Second, we introduce a sophisticated method of combining the anomaly scores from each model into an overall aggregate score. The combined anomaly score determines whether an event is part of an attack. Individual anomaly scores are often contradicting and, therefore, a simple weighted sum cannot deliver reliable results. To address this problem, we propose a technique that uses Bayesian networks to perform system call classification. We show that the analysis of system call arguments and the use of Bayesian classification improves detection accuracy and resilience against evasion attempts. In addition, the paper describes a tool based on our approach and provides a quantitative evaluation of its performance in terms of both detection effectiveness and overhead. A comparison with four related approaches is also presented.

Proceedings ArticleDOI
01 Jan 2006
TL;DR: This work proposes a distributed, cluster-based anomaly detection algorithm that achieves accuracy comparable to a centralized scheme with a significant reduction in communication overhead.
Abstract: Identifying misbehaviors is an important challenge for monitoring, fault diagnosis and intrusion detection in wireless sensor networks. A key problem is how to minimise the communication overhead and energy consumption in the network when identifying misbehaviors. Our approach to this problem is based on a distributed, cluster-based anomaly detection algorithm. We minimise the communication overhead by clustering the sensor measurements and merging clusters before sending a description of the clusters to the other nodes. In order to evaluate our distributed scheme, we implemented our algorithm in a simulation based on the sensor data gathered from the Great Duck Island project. We demonstrate that our scheme achieves accuracy comparable to that of a centralised scheme with a significant reduction in communication overhead.
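
The cluster-then-merge idea can be illustrated on one-dimensional measurements: each node summarizes its readings as fixed-width clusters (centroid, count), ships only those summaries, and a receiver merges them and flags sparse clusters as anomalies. The width and count threshold below are arbitrary; the paper's algorithm works on multi-dimensional sensor data.

```python
def cluster(points, width):
    """Fixed-width clustering: each cluster is a (centroid, count) pair."""
    clusters = []
    for x in points:
        for i, (c, n) in enumerate(clusters):
            if abs(x - c) <= width:
                # Fold the point into the nearest existing cluster.
                clusters[i] = ((c * n + x) / (n + 1), n + 1)
                break
        else:
            clusters.append((x, 1))
    return clusters

def merge(cluster_sets, width):
    """Merge cluster summaries from several nodes (the data sent over the radio)."""
    merged = []
    for c, n in (cl for cs in cluster_sets for cl in cs):
        for i, (mc, mn) in enumerate(merged):
            if abs(c - mc) <= width:
                merged[i] = ((mc * mn + c * n) / (mn + n), mn + n)
                break
        else:
            merged.append((c, n))
    return merged

def anomalous(clusters, min_count):
    """Sparse clusters (few supporting measurements) are flagged as anomalies."""
    return [c for c, n in clusters if n < min_count]
```

Only cluster summaries cross the network, which is where the communication savings over shipping raw measurements come from.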

Proceedings ArticleDOI
18 Dec 2006
TL;DR: This paper proposes a new approach to construct high speed payload-based anomaly IDS intended to be accurate and hard to evade, and uses a feature clustering algorithm originally proposed for text classification problems to reduce the dimensionality of the feature space.
Abstract: Unsupervised or unlabeled learning approaches for network anomaly detection have been recently proposed. In particular, recent work on unlabeled anomaly detection focused on high speed classification based on simple payload statistics. For example, PAYL, an anomaly IDS, measures the occurrence frequency in the payload of n-grams. A simple model of normal traffic is then constructed according to this description of the packets' content. It has been demonstrated that anomaly detectors based on payload statistics can be "evaded" by mimicry attacks using byte substitution and padding techniques. In this paper we propose a new approach to construct high speed payload-based anomaly IDS intended to be accurate and hard to evade. We propose a new technique to extract the features from the payload. We use a feature clustering algorithm originally proposed for text classification problems to reduce the dimensionality of the feature space. Accuracy and hardness of evasion are obtained by constructing our anomaly-based IDS using an ensemble of one-class SVM classifiers that work on different feature spaces.

Proceedings ArticleDOI
14 Oct 2006
TL;DR: A game theoretic framework to analyze the interactions between pairs of attacking/defending nodes using a Bayesian formulation and shows that the dynamic game produces energy-efficient monitoring strategies for the defender, while improving the overall hybrid detection power.
Abstract: In wireless ad hoc networks, although defense strategies such as intrusion detection systems (IDSs) can be deployed at each mobile node, significant constraints are imposed in terms of the energy expenditure of such systems. In this paper, we propose a game theoretic framework to analyze the interactions between pairs of attacking/defending nodes using a Bayesian formulation. We study the achievable Nash equilibrium for the attacker/defender game in both static and dynamic scenarios. The dynamic Bayesian game is a more realistic model, since it allows the defender to consistently update his belief on his opponent's maliciousness as the game evolves. A new Bayesian hybrid detection approach is suggested for the defender, in which a lightweight monitoring system is used to estimate his opponent's actions, and a heavyweight monitoring system acts as a last resort of defense. We show that the dynamic game produces energy-efficient monitoring strategies for the defender, while improving the overall hybrid detection power.
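
The defender's evolving belief about an opponent's maliciousness is a straightforward Bayes' rule update on each observed action. The sketch below uses invented likelihoods (e.g. a malicious node drops packets with probability 0.7, a regular node with probability 0.1); the paper derives its values from the game model rather than assuming them.

```python
def update_belief(prior, observation, p_obs_given_malicious, p_obs_given_regular):
    """Bayes' rule: posterior probability that the opponent is malicious
    after observing one action (e.g. 'drop' or 'forward')."""
    num = p_obs_given_malicious[observation] * prior
    den = num + p_obs_given_regular[observation] * (1 - prior)
    return num / den
```

Repeated suspicious observations drive the belief up quickly, which is what lets the defender keep the lightweight monitor running and invoke the heavyweight one only when the belief is high.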

Patent
Paul Benjamin
21 Feb 2006
TL;DR: In this patent, a system and method for predicting and preventing unauthorized intrusion in a computer configuration is presented, where a vulnerability assessment component is provided that is operable to execute a command over the communication network, and a data monitoring utility operates to monitor data transmitted over a communication network as the vulnerability assessment component executes commands.
Abstract: The present invention provides a system and method for predicting and preventing unauthorized intrusion in a computer configuration. Preferably, the invention comprises a communication network to which at least two computing devices connect, wherein at least one of the computing devices is operable to receive data transmitted by the other computing device. The invention further comprises a database that is accessible over the network and operable to store information related to the network. A vulnerability assessment component is provided that is operable to execute a command over the communication network, and a data monitoring utility operates to monitor data transmitted over the communication network as the vulnerability assessment component executes commands. Also, an intrusion detection component is included that is operable to provide a simulated copy of the network, to generate a first data transmission on the simulated copy of the network that represents a second data transmission on the communication network, and to compare the first data transmission with a second data transmission. The vulnerability assessment component preferably interfaces with the intrusion detection component to define rules associated with the first and second data transmissions, to store the rules in the database, and to retrieve the rules from the database in order to predict and prevent unauthorized intrusion in the computer configuration.

Proceedings ArticleDOI
11 Dec 2006
TL;DR: This paper applies an efficient data mining algorithm, random forests, in anomaly based NIDSs, and presents a modification of the random forests outlier detection algorithm that is comparable to previously reported unsupervised anomaly detection approaches evaluated over the KDD'99 dataset.
Abstract: Anomaly detection is a critical issue in Network Intrusion Detection Systems (NIDSs). Most anomaly based NIDSs employ supervised algorithms, whose performances highly depend on attack-free training data. However, this kind of training data is difficult to obtain in real-world network environments. Moreover, with changing network environments or services, patterns of normal traffic will change. This leads to a high false positive rate for supervised NIDSs. Unsupervised outlier detection can overcome the drawbacks of supervised anomaly detection. Therefore, we apply an efficient data mining algorithm, random forests, in anomaly based NIDSs. Without attack-free training data, the random forests algorithm can detect outliers in datasets of network traffic. In this paper, we discuss our framework of anomaly based network intrusion detection. In the framework, patterns of network services are built by the random forests algorithm over traffic data. Intrusions are detected by determining outliers related to the built patterns. We present the modification on the outlier detection algorithm of random forests. We also report our experimental results over the KDD'99 dataset. The results show that the proposed approach is comparable to previously reported unsupervised anomaly detection approaches evaluated over the KDD'99 dataset.
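
The paper scores outliers via random-forest proximities. A related unsupervised idea that is easy to show in pure Python is isolation by random partitioning: points that can be separated from the rest in few random splits are anomalous. This is an illustrative substitute in the same spirit, not the paper's proximity-based algorithm, and it is shown on one-dimensional data.

```python
import random

def build_tree(data, depth=0, max_depth=8):
    """Random-partition tree: recursively split at a uniform random value."""
    if depth >= max_depth or len(data) <= 1:
        return len(data)                      # leaf stores the sample count
    lo, hi = min(data), max(data)
    if lo == hi:
        return len(data)
    split = random.uniform(lo, hi)
    left = [x for x in data if x < split]
    right = [x for x in data if x >= split]
    return (split, build_tree(left, depth + 1, max_depth),
                   build_tree(right, depth + 1, max_depth))

def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf of the tree."""
    if not isinstance(tree, tuple):
        return depth
    split, left, right = tree
    return path_length(left if x < split else right, x, depth + 1)

def avg_path_length(forest, x):
    """Shorter average isolation path = easier to isolate = more anomalous."""
    return sum(path_length(t, x) for t in forest) / len(forest)
```

On traffic measurements clustered around a normal value, an extreme point is split off near the root of almost every tree, so its average path length is much shorter than an inlier's.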

Proceedings ArticleDOI
13 Feb 2006
TL;DR: The general guidelines for applying IDS to static sensor networks are discussed, and a novel technique to optimally watch over the communications of the sensors’ neighborhood on certain scenarios is introduced.
Abstract: The research of Intrusion Detection Systems (IDS) is a mature area in wired networks, and has also attracted much attention in wireless ad hoc networks recently. Nevertheless, there is no previous work reported in the literature about IDS architectures in wireless sensor networks. In this paper, we discuss the general guidelines for applying IDS to static sensor networks, and introduce a novel technique to optimally watch over the communications of the sensors’ neighborhood on certain scenarios.

Journal ArticleDOI
TL;DR: A tunable algorithm is presented for distributed outlier detection in dynamic mixed-attribute data sets, which are prone to concept drift and therefore require dynamic models of the data.
Abstract: Efficiently detecting outliers or anomalies is an important problem in many areas of science, medicine and information technology. Applications range from data cleaning to clinical diagnosis, from detecting anomalous defects in materials to fraud and intrusion detection. Over the past decade, researchers in data mining and statistics have addressed the problem of outlier detection using both parametric and non-parametric approaches in a centralized setting. However, there are still several challenges that must be addressed. First, most approaches to date have focused on detecting outliers in a continuous attribute space. However, almost all real-world data sets contain a mixture of categorical and continuous attributes. Categorical attributes are typically ignored or incorrectly modeled by existing approaches, resulting in a significant loss of information. Second, there have not been any general-purpose distributed outlier detection algorithms. Most distributed detection algorithms are designed with a specific domain (e.g. sensor networks) in mind. Third, the data sets being analyzed may be streaming or otherwise dynamic in nature. Such data sets are prone to concept drift, and models of the data must be dynamic as well. To address these challenges, we present a tunable algorithm for distributed outlier detection in dynamic mixed-attribute data sets.

Proceedings ArticleDOI
10 May 2006
TL;DR: This paper uses RIPPER as the underlying rule classifier and implements a combination of oversampling (both by replication and synthetic generation) and undersampling techniques and proposes a clustering based methodology for oversamplings by generating synthetic instances.
Abstract: An approach to combating network intrusion is the development of systems applying machine learning and data mining techniques. Many IDS (Intrusion Detection Systems) suffer from a high rate of false alarms and missed intrusions. We want to be able to improve the intrusion detection rate at a reduced false positive rate. The focus of this paper is rule-learning, using RIPPER, on highly imbalanced intrusion datasets with an objective to improve the true positive rate (intrusions) without significantly increasing the false positives. We use RIPPER as the underlying rule classifier. To counter imbalance in data, we implement a combination of oversampling (both by replication and synthetic generation) and undersampling techniques. We also propose a clustering based methodology for oversampling by generating synthetic instances. We evaluate our approaches on two intrusion datasets — destination and actual packets based — constructed from actual Notre Dame traffic, giving a flavor of real-world data with its idiosyncrasies. Using ROC analysis, we show that oversampling by synthetic generation of minority (intrusion) class outperforms oversampling by replication and RIPPER's loss ratio method. Additionally, we establish that our clustering based approach is more suitable for detecting intrusions and is able to provide additional improvement over just synthetic generation of instances.
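
Oversampling by synthetic generation (SMOTE-style interpolation between a minority sample and one of its nearest minority neighbours) can be sketched as below; the paper's clustering-based variant, which generates instances per cluster, is not shown. The points and parameters here are invented for illustration.

```python
import random

def synthetic_oversample(minority, k=3, n_new=10):
    """SMOTE-style synthetic minority oversampling.

    For each new instance: pick a minority sample, find its k nearest
    minority neighbours (Euclidean), pick one, and interpolate a point
    at a random position on the segment between them.
    """
    out = []
    for _ in range(n_new):
        a = random.choice(minority)
        neighbours = sorted(
            (x for x in minority if x is not a),
            key=lambda x: sum((xi - ai) ** 2 for xi, ai in zip(x, a)))[:k]
        b = random.choice(neighbours)
        t = random.random()
        out.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return out
```

Because every synthetic point lies between two real minority points, the new instances stay inside the minority region instead of merely duplicating it, which is why synthetic generation tends to beat replication in the ROC comparisons described above.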

Proceedings ArticleDOI
18 Apr 2006
TL;DR: Argos is built upon a fast x86 emulator which tracks network data throughout execution to identify their invalid use as jump targets, function addresses, instructions, etc, and is able to generate accurate network intrusion detection signatures for the exploits that are immune to payload mutations.
Abstract: As modern operating systems and software become larger and more complex, they are more likely to contain bugs, which may allow attackers to gain illegitimate access. A fast and reliable mechanism to discern and generate vaccines for such attacks is vital for the successful protection of networks and systems. In this paper we present Argos, a containment environment for worms as well as human orchestrated attacks. Argos is built upon a fast x86 emulator which tracks network data throughout execution to identify their invalid use as jump targets, function addresses, instructions, etc. Furthermore, system call policies disallow the use of network data as arguments to certain calls. When an attack is detected, we perform 'intelligent' process- or kernel-aware logging of the corresponding emulator state for further offline processing. In addition, our own forensics shellcode is injected, replacing the malevolent shellcode, to gather information about the attacked process. By correlating the data logged by the emulator with the data collected from the network, we are able to generate accurate network intrusion detection signatures for the exploits that are immune to payload mutations. The entire process can be automated and has few if any false positives, thus rapid global scale deployment of the signatures is possible.

Journal ArticleDOI
TL;DR: This paper extends a sensor network simulator to generate routing attacks in wireless sensor networks and demonstrates that the intrusion detection scheme is able to achieve high detection accuracy with a low false positive rate for a variety of simulated routing attacks.
Abstract: Security is a critical challenge for creating robust and reliable sensor networks. For example, routing attacks have the ability to disconnect a sensor network from its central base station. In this paper, we present a method for intrusion detection in wireless sensor networks. Our intrusion detection scheme uses a clustering algorithm to build a model of normal traffic behavior, and then uses this model of normal traffic to detect abnormal traffic patterns. A key advantage of our approach is that it is able to detect attacks that have not previously been seen. Moreover, our detection scheme is based on a set of traffic features that can potentially be applied to a wide range of routing attacks. In order to evaluate our intrusion detection scheme, we have extended a sensor network simulator to generate routing attacks in wireless sensor networks. We demonstrate that our intrusion detection scheme is able to achieve high detection accuracy with a low false positive rate for a variety of simulated routing attacks.

Proceedings Article
31 Jul 2006
TL;DR: This paper discusses the design and implementation of a NIDS extension to perform dynamic application-layer protocol analysis and demonstrates the power of the enhancement with three examples: reliable detection of applications not using their standard ports, payload inspection of FTP data transfers, and detection of IRC-based botnet clients and servers.
Abstract: Many network intrusion detection systems (NIDS) rely on protocol-specific analyzers to extract the higher-level semantic context from a traffic stream. To select the correct kind of analysis, traditional systems exclusively depend on well-known port numbers. However, based on our experience, increasingly significant portions of today's traffic are not classifiable by such a scheme. Yet for a NIDS, this traffic is very interesting, as a primary reason for not using a standard port is to evade security and policy enforcement monitoring. In this paper, we discuss the design and implementation of a NIDS extension to perform dynamic application-layer protocol analysis. For each connection, the system first identifies potential protocols in use and then activates appropriate analyzers to verify the decision and extract higher-level semantics. We demonstrate the power of our enhancement with three examples: reliable detection of applications not using their standard ports, payload inspection of FTP data transfers, and detection of IRC-based botnet clients and servers. Prototypes of our system currently run at the border of three large-scale operational networks. Due to its success, the bot detection is already integrated into dynamic inline blocking of production traffic at one of the sites.
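The first stage, guessing candidate protocols from payload content rather than port numbers, can be sketched with a few signature probes. The patterns below are our drastically simplified stand-ins; the system described uses full protocol analyzers to verify each guess:

```python
# Toy sketch of port-independent protocol identification: each detector
# inspects the first payload bytes of a connection instead of trusting
# the destination port. Patterns are simplified illustrations.

import re

DETECTORS = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]\r\n"),
    "irc":  re.compile(rb"^(NICK|USER|PASS) ", re.MULTILINE),
    "ftp":  re.compile(rb"^220[ -].*FTP", re.IGNORECASE),
}

def identify(payload):
    """Return candidate protocols for a connection's initial payload."""
    return sorted(name for name, sig in DETECTORS.items() if sig.search(payload))

# HTTP on a non-standard port is still recognized by content:
print(identify(b"GET /index.html HTTP/1.1\r\nHost: x\r\n\r\n"))  # ['http']
# An IRC bot connecting out on, say, port 443 likewise matches 'irc':
print(identify(b"NICK botnet123\r\nUSER a b c :d\r\n"))          # ['irc']
```

In the full system, a matching detector only activates the corresponding analyzer, which then confirms or rejects the classification as it parses the stream.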

Proceedings ArticleDOI
21 May 2006
TL;DR: A new and general class of attacks whereby a worm can combine polymorphism and misleading behavior to intentionally pollute the dataset of suspicious flows during its propagation and successfully mislead the automatic signature generation process is described.
Abstract: Several syntactic-based automatic worm signature generators, e.g., Polygraph, have recently been proposed. These systems typically assume that a set of suspicious flows are provided by a flow classifier, e.g., a honeynet or an intrusion detection system, that often introduces "noise" due to difficulties and imprecision in flow classification. The algorithms for extracting the worm signatures from the flow data are designed to cope with the noise. It has been reported that these systems can handle a fairly high noise level, e.g., 80% for Polygraph. In this paper, we show that if noise is introduced deliberately to mislead a worm signature generator, a much lower noise level, e.g., 50%, can already prevent the system from reliably generating useful worm signatures. Using Polygraph as a case study, we describe a new and general class of attacks whereby a worm can combine polymorphism and misleading behavior to intentionally pollute the dataset of suspicious flows during its propagation and successfully mislead the automatic signature generation process. This study suggests that unless an accurate and robust flow classification process is in place, automatic syntactic-based signature generators are vulnerable to such noise injection attacks.
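The pollution effect can be demonstrated with a toy conjunction-signature generator. This is a drastic simplification of Polygraph's token extraction, and the flow contents are invented for illustration:

```python
def conjunction_signature(flows, min_len=4):
    """Toy Polygraph-style conjunction signature: all substrings of
    length min_len that appear in every flow of the suspicious pool."""
    first = flows[0]
    tokens = {first[i:i + min_len] for i in range(len(first) - min_len + 1)}
    for flow in flows[1:]:
        tokens = {t for t in tokens if t in flow}
    return tokens

# Two worm instances sharing the true invariant "<EXPLOIT>":
worm = [b"AAAA<EXPLOIT>BBBB", b"CCCC<EXPLOIT>DDDD"]
print(b"PLOI" in conjunction_signature(worm))       # True: invariant extracted

# The worm deliberately gets crafted "noise" flows -- which lack the
# invariant -- classified as suspicious alongside its real flows:
noise = [b"AAAAXXXXXXXXBBBB", b"CCCCYYYYYYYYDDDD"]
print(sorted(conjunction_signature(worm + noise)))  # []: no usable signature left
```

With only 50% injected noise, the intersection over the polluted pool no longer contains the exploit invariant, so no useful signature can be generated, which is the essence of the attack described above.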

Patent
03 Mar 2006
TL;DR: In this article, an intrusion detection system is described that includes a traffic sniffer for extracting network packets from passing network traffic, a traffic parser configured to extract individual data from defined packet fields of the network packets, and a traffic logger configured to store individual packet fields in a database.
Abstract: An intrusion detection system (IDS). An IDS which has been configured in accordance with the present invention can include a traffic sniffer for extracting network packets from passing network traffic; a traffic parser configured to extract individual data from defined packet fields of the network packets; and, a traffic logger configured to store individual packet fields of the network packets in a database. A vector builder can be configured to generate multi-dimensional vectors from selected features of the stored packet fields. Notably, at least one self-organizing clustering module can be configured to process the multi-dimensional vectors to produce a self-organized map of clusters. Subsequently, an anomaly detector can detect anomalous correlations between individual ones of the clusters in the self-organized map based upon at least one configurable correlation metric. Finally, a classifier can classify detected anomalous correlations as either an alarm condition or normal behavior.

Journal ArticleDOI
TL;DR: A comprehensive classification of security policy conflicts that might potentially exist in a single security device or between different network devices in enterprise networks is presented and the high probability of creating such conflicts even by expert system administrators and network practitioners is shown.
Abstract: Network security policies are essential elements in Internet security devices that provide traffic filtering, integrity, confidentiality, and authentication. Network security perimeter devices such as firewalls, IPSec, and IDS/IPS devices operate based on locally configured policies. However, configuring network security policies remains a complex and error-prone task due to rule dependency semantics and the interaction between policies in the network. This complexity is likely to increase as the network size increases. A successful deployment of a network security system requires global analysis of policy configurations of all network security devices in order to avoid policy conflicts and inconsistency. Policy conflicts may cause serious security breaches and network vulnerability such as blocking legitimate traffic, permitting unwanted traffic, and insecure data transmission. This article presents a comprehensive classification of security policy conflicts that might potentially exist in a single security device (intrapolicy conflicts) or between different network devices (interpolicy conflicts) in enterprise networks. We also show the high probability of creating such conflicts even by expert system administrators and network practitioners.
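One intrapolicy conflict class in such classifications is shadowing, where an earlier rule masks a later one. The sketch below reduces rules to a single numeric match field for brevity (real rules match protocol, addresses, and ports), so the rule model is our own simplification:

```python
# Sketch of detecting one intrapolicy conflict class -- shadowing --
# in an ordered, first-match filtering policy.

from collections import namedtuple

Rule = namedtuple("Rule", "lo hi action")      # match range plus allow/deny

def subsumes(a, b):
    """True if rule a matches every packet that rule b matches."""
    return a.lo <= b.lo and b.hi <= a.hi

def find_shadowing(policy):
    """Return (i, j) pairs where rule j is shadowed by earlier rule i:
    rule i matches everything rule j matches but takes the opposite
    action, so rule j can never take effect."""
    conflicts = []
    for i, earlier in enumerate(policy):
        for j in range(i + 1, len(policy)):
            later = policy[j]
            if subsumes(earlier, later) and earlier.action != later.action:
                conflicts.append((i, j))
    return conflicts

policy = [
    Rule(0, 255, "deny"),     # deny the whole range first...
    Rule(10, 20, "allow"),    # ...so this intended exception never fires
]
print(find_shadowing(policy))   # [(0, 1)]
```

The example shows how easy it is for an administrator to create a dead rule: the intended exception in rule 1 is unreachable, precisely the kind of misconfiguration the article's analysis is meant to surface.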

Journal ArticleDOI
TL;DR: A nonparametric multichannel detection test that can be effectively applied to detect a wide variety of attacks such as denial-of-service attacks, worm-based attacks, port-scanning, and man-in-the-middle attacks is proposed.

Proceedings ArticleDOI
J. van Lunteren1
23 Apr 2006
TL;DR: A novel scheme for pattern-matching, called BFPM, that exploits a hardware-based programmable state-machine technology to achieve deterministic processing rates that are independent of input and pattern characteristics on the order of 10 Gb/s for FPGA and at least 20 Gb/s for ASIC implementations.
Abstract: New generations of network intrusion detection systems create the need for advanced pattern-matching engines. This paper presents a novel scheme for pattern-matching, called BFPM, that exploits a hardware-based programmable state-machine technology to achieve deterministic processing rates that are independent of input and pattern characteristics on the order of 10 Gb/s for FPGA and at least 20 Gb/s for ASIC implementations. BFPM supports dynamic updates and is one of the most storage-efficient schemes in the industry, supporting two thousand patterns extracted from Snort with a total of 32 K characters in only 128 KB of memory.
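A software analogue of such a programmable pattern-matching state machine is an Aho-Corasick automaton: it performs exactly one state transition per input character, which is the property that gives hardware engines like the one described a deterministic processing rate independent of input and pattern characteristics. This sketch is ours, not BFPM itself:

```python
# Aho-Corasick multi-pattern matcher: one state transition per input
# character, regardless of how many patterns are loaded.

from collections import deque

def build(patterns):
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:                      # build the trie
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    queue = deque(goto[0].values())         # root's children fail to the root
    while queue:                            # BFS to set failure links
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]
    return goto, fail, out

def scan(text, machine):
    goto, fail, out = machine
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)              # one transition per character
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits

machine = build(["/etc/passwd", "cmd.exe"])
print(scan("GET /cgi-bin/../../etc/passwd HTTP/1.0", machine))
```

Hardware schemes compress the transition tables far more aggressively than this dictionary-based version, which is how two thousand patterns can fit in 128 KB.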

Proceedings ArticleDOI
20 Apr 2006
TL;DR: The proposed framework of the hybrid system combines the misuse detection and anomaly detection components in which the random forests algorithm is applied and can improve the detection performance of the NIDSs, where only anomaly or misuse detection technique is used.
Abstract: Intrusion detection is important in network security. Most current network intrusion detection systems (NIDSs) employ either misuse detection or anomaly detection. However, misuse detection cannot detect unknown intrusions, and anomaly detection usually has high false positive rate. To overcome the limitations of both techniques, we incorporate both anomaly and misuse detection into the NIDS. In this paper, we present our framework of the hybrid system. The system combines the misuse detection and anomaly detection components in which the random forests algorithm is applied. We discuss the advantages of the framework and also report our experimental results over the KDD'99 dataset. The results show that the proposed approach can improve the detection performance of the NIDSs, where only anomaly or misuse detection technique is used.
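The framework's control flow can be sketched as follows. The paper applies the random forests algorithm in both components; here each component is a trivial stand-in (a toy signature table and a range test) so the example stays self-contained:

```python
# Sketch of the hybrid pipeline: misuse detection labels known attacks;
# anomaly detection then screens whatever misuse detection passed.
# Both detectors below are deliberately trivial placeholders.

KNOWN_ATTACK_SIGNATURES = {("tcp", 0): "null-port scan"}   # toy signature DB

def misuse_detect(conn):
    """Return the attack label if the connection matches a known signature."""
    return KNOWN_ATTACK_SIGNATURES.get((conn["proto"], conn["dst_port"]))

def anomaly_detect(conn, profile, tolerance=3.0):
    """Flag connections far outside the profiled normal byte range."""
    lo, hi = profile["bytes_range"]
    span = hi - lo
    return conn["bytes"] < lo - tolerance * span or conn["bytes"] > hi + tolerance * span

def classify(conn, profile):
    label = misuse_detect(conn)
    if label:
        return ("misuse", label)           # known intrusion
    if anomaly_detect(conn, profile):
        return ("anomaly", None)           # unknown but suspicious
    return ("normal", None)

profile = {"bytes_range": (100, 2000)}
print(classify({"proto": "tcp", "dst_port": 0, "bytes": 500}, profile))
print(classify({"proto": "tcp", "dst_port": 80, "bytes": 900_000}, profile))
print(classify({"proto": "tcp", "dst_port": 80, "bytes": 800}, profile))
```

The division of labor is the point: the misuse component gives precise labels for known attacks, while the anomaly component catches intrusions the signature set has never seen, addressing the complementary weaknesses noted above.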

Journal Article
TL;DR: An algorithm to detect spoofing by leveraging the sequence number field in the link-layer header of IEEE 802.11 frames is proposed, and it is demonstrated how it can detect various spoofing without modifying the APs or wireless stations.
Abstract: The exponential growth in the deployment of IEEE 802.11-based wireless LAN (WLAN) in enterprises and homes makes WLAN an attractive target for attackers. Attacks that exploit vulnerabilities at the IP layer or above can be readily addressed by intrusion detection systems designed for wired networks. However, attacks exploiting link-layer protocol vulnerabilities require a different set of intrusion detection mechanism. Most link-layer attacks in WLANs are denial of service attacks and work by spoofing either access points (APs) or wireless stations. Spoofing is possible because the IEEE 802.11 standard does not provide per-frame source authentication, but can be effectively prevented if a proper authentication is added into the standard. Unfortunately, it is unlikely that commercial WLANs will support link-layer source authentication that covers both management and control frames in the near future. Even if it is available in next-generation WLANs equipments, it cannot protect the large installed base of legacy WLAN devices. This paper proposes an algorithm to detect spoofing by leveraging the sequence number field in the link-layer header of IEEE 802.11 frames, and demonstrates how it can detect various spoofing without modifying the APs or wireless stations. The false positive rate of the proposed algorithm is zero, and the false negative rate is close to zero. In the worst case, the proposed algorithm can detect a spoofing activity, even though it can only detect some but not all spoofed frames.
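The sequence-number idea can be sketched in a few lines. IEEE 802.11 sequence numbers are a 12-bit per-station counter, so a frame claiming a source address whose counter jumps far out of order suggests a second sender using that address. The thresholds below are illustrative, not the paper's exact state machine:

```python
# Simplified sketch of spoofing detection via 802.11 sequence numbers.

SEQ_MOD = 4096                            # 12-bit sequence number space

def seq_gap(prev, cur):
    """Signed modular gap from prev to cur, in (-2048, 2048]."""
    gap = (cur - prev) % SEQ_MOD
    return gap - SEQ_MOD if gap > SEQ_MOD // 2 else gap

class SpoofMonitor:
    def __init__(self, fwd_threshold=200):
        self.last = {}                    # source MAC -> last sequence number
        self.fwd_threshold = fwd_threshold

    def observe(self, src, seq):
        """Return True if this frame looks spoofed."""
        prev = self.last.get(src)
        self.last[src] = seq
        if prev is None:
            return False
        gap = seq_gap(prev, seq)
        # Small positive gaps (loss, bursts) are normal; a backward jump
        # or a huge forward jump means two senders share this address.
        return gap <= 0 or gap > self.fwd_threshold

mon = SpoofMonitor()
print(mon.observe("ap:01", 100))   # False: first frame, nothing to compare
print(mon.observe("ap:01", 101))   # False: normal increment
print(mon.observe("ap:01", 2500))  # True: attacker's own counter intrudes
```

Because the monitor is purely passive, it works without modifying APs or stations, which is what makes the approach deployable on the legacy installed base discussed above.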

Proceedings ArticleDOI
21 Mar 2006
TL;DR: This paper provides a novel information-theoretic analysis of IDS and proposes a new metric, CI D (Intrusion Detection Capability), which is defined as the ratio of the mutual information between the IDS input and output to the entropy of the input.
Abstract: A fundamental problem in intrusion detection is what metric(s) can be used to objectively evaluate an intrusion detection system (IDS) in terms of its ability to correctly classify events as normal or intrusive. Traditional metrics (e.g., true positive rate and false positive rate) measure different aspects, but no single metric seems sufficient to measure the capability of intrusion detection systems. The lack of a single unified metric makes it difficult to fine-tune and evaluate an IDS. In this paper, we provide an in-depth analysis of existing metrics. Specifically, we analyze a typical cost-based scheme [6], and demonstrate that this approach is very confusing and ineffective when the cost factor is not carefully selected. In addition, we provide a novel information-theoretic analysis of IDS and propose a new metric that highly complements cost-based analysis. When examining the intrusion detection process from an information-theoretic point of view, intuitively, we should have less uncertainty about the input (event data) given the IDS output (alarm data). Thus, our new metric, C_ID (Intrusion Detection Capability), is defined as the ratio of the mutual information between the IDS input and output to the entropy of the input. C_ID has the desired property that: (1) it takes into account all the important aspects of detection capability naturally, i.e., true positive rate, false positive rate, positive predictive value, negative predictive value, and base rate; (2) it objectively provides an intrinsic measure of intrusion detection capability; and (3) it is sensitive to IDS operation parameters such as true positive rate and false positive rate, which can demonstrate the effect of subtle changes of intrusion detection systems. We propose C_ID as an appropriate performance measure to maximize when fine-tuning an IDS. The obtained operation point is the best that can be achieved by the IDS in terms of its intrinsic ability to classify input data. We use numerical examples as well as experiments of actual IDSs on various data sets to show that by using C_ID, we can choose the best (optimal) operating point for an IDS and objectively compare different IDSs.
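The proposed metric can be computed directly from the base rate and the detector's operating point. A worked sketch (the numbers are our own, purely illustrative):

```python
# C_ID = I(X;Y) / H(X) for binary ground truth X (intrusion or not,
# with base rate b) and binary IDS verdict Y (alarm or not), with the
# detector modeled by its true and false positive rates.

import math

def h(ps):
    """Shannon entropy (bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

def c_id(base_rate, tpr, fpr):
    b = base_rate
    # Joint distribution over (X, Y): order (1,1), (1,0), (0,1), (0,0).
    joint = [b * tpr, b * (1 - tpr), (1 - b) * fpr, (1 - b) * (1 - fpr)]
    px = [b, 1 - b]
    py = [joint[0] + joint[2], joint[1] + joint[3]]     # P(alarm), P(no alarm)
    mutual_info = h(px) + h(py) - h(joint)              # I(X;Y)
    return mutual_info / h(px)

# At a 1% base rate, lowering the false positive rate from 10% to 0.1%
# greatly increases how much the alarm stream tells us about the input:
good = c_id(0.01, 0.99, 0.001)
weak = c_id(0.01, 0.99, 0.10)
print(good > weak)   # True
```

This illustrates property (1) above: a single number between 0 and 1 that degrades with either a worse false positive rate or a more punishing base rate, rather than requiring the two rates to be weighed by hand.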