Proceedings ArticleDOI

Network attacks identification using consistency based feature selection and self organizing maps

25 Sep 2014, pp. 162-166
TL;DR: An anomaly detection model is proposed that deploys consistency-based feature selection, the J48 decision tree and a self-organizing map (SOM); experiments on the KDD99 data set show that each of the features selected by the integrated mechanism was able to identify the attacks in the data set.
Abstract: Anomaly detection has become a major research area with the rapid growth of computer networks. An intrusion detection model should be able to visualize high-dimensional data while offering high processing speed and an accurate detection rate. Integrated intrusion detection models combine the advantages of a low false positive rate and a shorter detection time. Hence this paper proposes an anomaly detection model that deploys consistency-based feature selection, the J48 decision tree and a self-organizing map (SOM). Experimental analysis was carried out on the KDD99 data set, and each of the features selected using the integrated mechanism was able to identify the attacks in the data set.
Keywords: Self Organizing Map, Consistency based Feature Selection, Intrusion Detection Systems.
Citations
Journal ArticleDOI
TL;DR: The experimental results show that the proposed method outperforms existing methods that apply commonly used flow statistical features, and that it discriminates between different types of video traffic more effectively, especially from the QoS perspective.

63 citations

Proceedings ArticleDOI
01 Nov 2014
TL;DR: A novel method of integrating principal component analysis (PCA) and support vector machine (SVM) by optimizing the kernel parameters using automatic parameter selection technique is proposed, which reduces the training and testing time to identify intrusions thereby improving the accuracy.
Abstract: Intrusion detection systems (IDS) play a major role in detecting the attacks that occur in the computer or networks. Anomaly intrusion detection models detect new attacks by observing the deviation from profile. However there are many problems in the traditional IDS such as high false alarm rate, low detection capability against new network attacks and insufficient analysis capacity. The use of machine learning for intrusion models automatically increases the performance with an improved experience. This paper proposes a novel method of integrating principal component analysis (PCA) and support vector machine (SVM) by optimizing the kernel parameters using automatic parameter selection technique. This technique reduces the training and testing time to identify intrusions thereby improving the accuracy. The proposed method was tested on KDD data set. The datasets were carefully divided into training and testing considering the minority attacks such as U2R and R2L to be present in the testing set to identify the occurrence of unknown attack. The results indicate that the proposed method is successful in identifying intrusions. The experimental results show that the classification accuracy of the proposed method outperforms other classification techniques using SVM as the classifier and other dimensionality reduction or feature selection techniques. Minimum resources are consumed as the classifier input requires reduced feature set and thereby minimizing training and testing overhead time.

57 citations

Journal ArticleDOI
TL;DR: This survey comprehensively compares the primitive components and properties of SOM-based intrusion detection, contrasting static-layered (HSOM) and dynamic-layered (GHSOM) architectures to identify existing challenges and future research directions.
Abstract: This paper describes a focused literature survey of self-organizing maps (SOM) in support of intrusion detection. Specifically, the SOM architecture can be divided into two categories, i.e., static-layered architectures and dynamic-layered architectures. The former, Hierarchical Self-Organizing Maps (HSOM), can effectively reduce the computational overheads and efficiently represent the hierarchy of data. The latter, Growing Hierarchical Self-Organizing Maps (GHSOM), is quite effective for online intrusion detection with low computing latency, dynamic self-adaptability, and self-learning. The ultimate goal of the SOM architecture is to accurately represent the topological relationship of data to identify any anomalous attack. The overall goal of this survey is to comprehensively compare the primitive components and properties of SOM-based intrusion detection. By comparing the two SOM-based intrusion detection systems, we can clearly understand the existing challenges of SOM-based intrusion detection systems and indicate future research directions.

55 citations

Journal ArticleDOI
TL;DR: An experimental-based review of neural-based methods applied to intrusion detection issues, including deep-based approaches or weightless neural networks, which feature surprising outcomes and quantifies the value of neural networks when state-of-the-art datasets are used to train the models.
Abstract: The use of Machine Learning (ML) techniques in Intrusion Detection Systems (IDS) has taken a prominent role in the network security management field, due to the substantial number of sophisticated attacks that often pass undetected through classic IDSs. These are typically aimed at recognizing attacks based on a specific signature, or at detecting anomalous events. However, deterministic, rule-based methods often fail to differentiate particular (rarer) network conditions (as in peak traffic during specific network situations) from actual cyber attacks. In this article we provide an experimental-based review of neural-based methods applied to intrusion detection issues. Specifically, we i) offer a complete view of the most prominent neural-based techniques relevant to intrusion detection, including deep-based approaches or weightless neural networks, which feature surprising outcomes; ii) evaluate novel datasets (updated w.r.t. the obsolete KDD99 set) through a designed-from-scratch Python-based routine; iii) perform experimental analyses including time complexity and performance (accuracy and F-measure), considering both single-class and multi-class problems, and identifying trade-offs between resource consumption and performance. Our evaluation quantifies the value of neural networks, particularly when state-of-the-art datasets are used to train the models. This leads to interesting guidelines for security managers and computer network practitioners who are looking at the incorporation of neural-based ML into IDS.

38 citations


Cites methods from "Network attacks identification usin..."

  • ...In [21] and [22] the authors adopt neural-based methods exploiting Self-Organizing Maps....


Journal ArticleDOI
TL;DR: In this article, the authors provide an experimental-based review of neural-based methods applied to intrusion detection issues, including deep-based approaches or weightless neural networks, and evaluate novel datasets (updated w.r.t. the obsolete KDD99 set).
Abstract: The use of Machine Learning (ML) techniques in Intrusion Detection Systems (IDS) has taken a prominent role in the network security management field, due to the substantial number of sophisticated attacks that often pass undetected through classic IDSs. These are typically aimed at recognising attacks based on a specific signature, or at detecting anomalous events. However, deterministic, rule-based methods often fail to differentiate particular (rarer) network conditions (as in peak traffic during specific network situations) from actual cyber attacks. In this paper we provide an experimental-based review of neural-based methods applied to intrusion detection issues. Specifically, we i) offer a complete view of the most prominent neural-based techniques relevant to intrusion detection, including deep-based approaches or weightless neural networks, which feature surprising outcomes; ii) evaluate novel datasets (updated w.r.t. the obsolete KDD99 set) through a designed-from-scratch Python-based routine; iii) perform experimental analyses including time complexity and performance (accuracy and F-measure), considering both single-class and multi-class problems, and identifying trade-offs between resource consumption and performance. Our evaluation quantifies the value of neural networks, particularly when state-of-the-art datasets are used to train the models. This leads to interesting guidelines for security managers and computer network practitioners who are looking at the incorporation of neural-based ML into IDS.

27 citations

References
Journal ArticleDOI
TL;DR: An empirical study is conducted to examine the pros and cons of these search methods, give some guidelines on choosing a search method, and compare the classifier error rates before and after feature selection.

846 citations

Journal ArticleDOI
TL;DR: The principal interest of this work is to benchmark the performance of the proposed hybrid IDS architecture by using the KDD Cup 99 Data Set, the benchmark dataset used by IDS researchers.
Abstract: In this paper, we propose a novel Intrusion Detection System (IDS) architecture utilizing both anomaly and misuse detection approaches. This hybrid Intrusion Detection System architecture consists of an anomaly detection module, a misuse detection module and a decision support system combining the results of these two detection modules. The proposed anomaly detection module uses a Self-Organizing Map (SOM) structure to model normal behavior. Deviation from the normal behavior is classified as an attack. The proposed misuse detection module uses the J.48 decision tree algorithm to classify various types of attacks. The principal interest of this work is to benchmark the performance of the proposed hybrid IDS architecture by using the KDD Cup 99 Data Set, the benchmark dataset used by IDS researchers. A rule-based Decision Support System (DSS) is also developed for interpreting the results of both anomaly and misuse detection modules. Simulation results of both anomaly and misuse detection modules based on the KDD 99 Data Set are given. It is observed that the proposed hybrid approach gives better performance over individual approaches.
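
The idea of modelling normal behavior with a SOM and treating deviation as an attack can be illustrated with a small sketch. This is not the authors' implementation: the grid size, learning-rate schedule, toy two-dimensional data and the threshold rule (largest quantization error seen on the normal training data) are all assumptions chosen for illustration, and the names `train_som` and `quantization_error` are hypothetical.

```python
import math
import random


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a tiny SOM; returns a dict mapping grid position -> weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each unit's weights from a randomly chosen training sample.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # shrinking neighbourhood
            # Best matching unit: the unit whose weights are closest to x.
            bmu = min(weights, key=lambda u: math.dist(weights[u], x))
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def quantization_error(weights, x):
    """Distance from x to its best matching unit."""
    return min(math.dist(w, x) for w in weights.values())


# Toy "normal" traffic: points clustered around (0.2, 0.2). Purely synthetic.
rng = random.Random(42)
normal = [[0.2 + rng.uniform(-0.05, 0.05), 0.2 + rng.uniform(-0.05, 0.05)]
          for _ in range(40)]

weights = train_som(normal)
# Assumed threshold rule: the largest error observed on normal training data.
threshold = max(quantization_error(weights, x) for x in normal)

suspect = [0.9, 0.9]
is_attack = quantization_error(weights, suspect) > threshold
print("flagged as anomaly:", is_attack)
```

A sample far from every map unit has a large quantization error and is flagged; in practice the threshold would be tuned on held-out data rather than taken directly from the training set.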

460 citations

Book ChapterDOI
01 Jan 2008
TL;DR: The self-organizing map's unsupervised learning process is applied to give a comprehensive view of ecological data through ordination and classification, and the network structure and learning algorithm are discussed to reveal the adaptive convergence of connection weights among computation nodes (i.e., neurons).
Abstract: Ecological data are considered difficult to analyze because numerous biological and environmental factors are involved in ecological processes in a complex manner. The self-organizing map (SOM) has been an efficient alternative tool for analyzing ecological data without a priori knowledge. The unsupervised learning process was applied to provide a comprehensive view on ecological data through the use of ordination and classification. The SOM extracts information from multidimensional data and maps it onto two- or three-dimensional space. The network structure and learning algorithm are discussed to reveal the adaptive convergence of connection weights among computation nodes (i.e., neurons). Examples are provided to demonstrate the environmental impact gradient and sample unit clustering. SOM visualization is also presented to show profiles of the corresponding taxa and environmental variables.

239 citations

Book ChapterDOI
18 Apr 2000
TL;DR: This work focuses on one feature-selection measure called consistency, studying its properties in comparison with other major measures and the different ways of using it in the search for feature subsets.
Abstract: Feature selection is an effective technique in dealing with dimensionality reduction for classification task, a main component of data mining. It searches for an "optimal" subset of features. The search strategies under consideration are one of the three: complete, heuristic, and probabilistic. Existing algorithms adopt various measures to evaluate the goodness of feature subsets. This work focuses on one measure called consistency. We study its properties in comparison with other major measures and different ways of using this measure in search of feature subsets. We conduct an empirical study to examine the pros and cons of these different search methods using consistency. Through this extensive exercise, we aim to provide a comprehensive view of this measure and its relations with other measures and a guideline of the use of this measure with different search strategies facing a new application.

217 citations


"Network attacks identification usin..." refers background in this paper

  • ...Consider the following example: an inconsistency results if there are two records (0 1 a) and (0 1 b) with different class labels (a and b), and (2) the inconsistency count for a particular pattern is the number of times it appears in the data minus the largest count among the different class labels [15]....
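
The inconsistency count quoted above can be sketched in a few lines. This is a minimal illustration of the consistency measure, not code from either paper: the function name `inconsistency_rate`, the record layout and the toy data are assumptions. For each pattern of selected feature values, the count is the number of matching records minus the size of the largest class among them; the rate is the sum of these counts divided by the number of records.

```python
from collections import Counter, defaultdict


def inconsistency_rate(records, feature_idx):
    """records: list of (feature_tuple, class_label); feature_idx: selected columns."""
    if not records:
        return 0.0
    # Group class labels by the pattern of values on the selected features.
    by_pattern = defaultdict(Counter)
    for features, label in records:
        pattern = tuple(features[i] for i in feature_idx)
        by_pattern[pattern][label] += 1
    # Per-pattern inconsistency count: occurrences minus the majority-class count.
    total = sum(sum(counts.values()) - max(counts.values())
                for counts in by_pattern.values())
    return total / len(records)


# Toy data: records (0 1 a) and (0 1 b) share a pattern but differ in class.
records = [
    ((0, 1), "a"),
    ((0, 1), "b"),
    ((0, 1), "a"),
    ((1, 0), "b"),
]

print(inconsistency_rate(records, [0, 1]))  # both features kept
print(inconsistency_rate(records, []))      # no features kept
```

With both features kept, the pattern (0, 1) occurs three times with a majority class of size two, giving a count of 1 and a rate of 0.25. With no features kept, all four records collapse into one pattern with two records per class, giving a rate of 0.5. A consistency-based search prefers small feature subsets whose rate stays at or below a chosen tolerance.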


Book ChapterDOI
26 Jul 2013

146 citations