Author

Sylvain Gombault

Bio: Sylvain Gombault is an academic researcher from École nationale supérieure des télécommunications de Bretagne. The author has contributed to research in topics: Intrusion detection system & Anomaly-based intrusion detection system. The author has an h-index of 13 and has co-authored 40 publications receiving 494 citations. Previous affiliations of Sylvain Gombault include Orange S.A. & European University of Brittany.

Papers
Proceedings ArticleDOI
14 Dec 2009
TL;DR: The systematic evaluation shows that attribute normalization substantially improves detection performance, and that the statistical normalization scheme is the best choice when the data set is large.
Abstract: Anomaly intrusion detection is an important issue in computer network security. As a data-preprocessing step, attribute normalization is essential to detection performance. However, many anomaly detection methods do not normalize attributes before training and detection. Few methods normalize the attributes at all, and the question of which normalization method is more effective remains open. In this paper, we introduce four different schemes of attribute normalization to preprocess the data for anomaly intrusion detection. Three methods, k-NN, PCA and SVM, are then employed on the normalized data to compare the detection results. KDD Cup 1999 data are used to evaluate the normalization schemes and the detection methods. The systematic evaluation shows that attribute normalization substantially improves detection performance, and that the statistical normalization scheme is the best choice when the data set is large.
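The statistical normalization the paper favors is commonly implemented as z-score scaling: subtract each attribute's mean and divide by its standard deviation. A minimal sketch, assuming the standard formulation (the paper's exact scheme is not reproduced on this page); the data values are illustrative:

```python
import numpy as np

def zscore_normalize(X, mean=None, std=None):
    """Statistical (z-score) normalization: subtract the per-attribute
    mean and divide by the per-attribute standard deviation.
    If mean/std are given, reuse them (e.g. apply training statistics
    to test data)."""
    if mean is None:
        mean = X.mean(axis=0)
    if std is None:
        std = X.std(axis=0)
        std = np.where(std == 0, 1.0, std)  # constant attributes: avoid /0
    return (X - mean) / std, mean, std

# Illustrative data: attributes on very different scales
X_train = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
X_test = np.array([[2.0, 300.0]])

# Fit statistics on training data, then reuse them on test data
Xn_train, mu, sigma = zscore_normalize(X_train)
Xn_test, _, _ = zscore_normalize(X_test, mu, sigma)
```

Reusing the training mean and standard deviation on the test set matters for detection: normalizing the test set with its own statistics would leak information and distort distances used by k-NN, PCA and SVM.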

92 citations

Journal ArticleDOI
TL;DR: With only the 10 most important features selected from the original 41, attack detection accuracy remains nearly the same or even improves with both BN and C4.5 classifiers.
Abstract: Efficiently processing massive data is a major issue in high-speed network intrusion detection, as network traffic has become increasingly large and complex. In this work, instead of constructing a large number of features from massive network traffic, the authors aim to select the most important features and use them to detect intrusions quickly and effectively. The authors first employed several techniques, namely information gain (IG), a wrapper with Bayesian networks (BN), and decision trees (C4.5), to select important subsets of features for network intrusion detection based on KDD'99 data. The authors then validated the feature selection schemes in a real network test bed to detect distributed denial-of-service attacks. The feature selection schemes are extensively evaluated on the two data sets. The empirical results demonstrate that with only the 10 most important features selected from the original 41, attack detection accuracy remains nearly the same or even improves with both BN and C4.5 classifiers. Constructing fewer features also improves the efficiency of network intrusion detection.
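The information-gain ranking used above scores each feature by how much knowing it reduces the entropy of the class label. A minimal sketch for a discrete feature (the toy data is illustrative, not drawn from KDD'99):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG(X) = H(Y) - sum_v p(X=v) * H(Y | X=v) for a discrete feature X."""
    n = len(labels)
    cond = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        cond += (len(subset) / n) * entropy(subset)
    return entropy(labels) - cond

# Toy data: the feature perfectly predicts the label, so IG = H(Y) = 1 bit
feature = ["tcp", "tcp", "udp", "udp"]
labels = ["attack", "attack", "normal", "normal"]
print(information_gain(feature, labels))  # 1.0
```

Ranking all 41 KDD features by this score and keeping the top 10 is the filter half of the paper's scheme; the wrapper half instead scores candidate subsets by the accuracy of the BN or C4.5 classifier trained on them.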

44 citations

Journal ArticleDOI
TL;DR: The testing results show that the LOGtfidf weight gives better detection performance than plain frequency and other types of weights, and that the simple NN and PCA methods achieve better masquerade detection results than the other 7 methods in the literature.

43 citations

Proceedings ArticleDOI
01 Oct 2008
TL;DR: A system that extracts only a few important attributes from network traffic for DDoS attack detection in real computer networks is introduced; empirical results show that using only the 9 most important attributes, detection accuracy remains the same or even improves compared with using all 41 attributes with Bayesian networks and C4.5.
Abstract: DDoS attacks are major threats in current computer networks, yet they are difficult to detect quickly. In this paper, we introduce a system that extracts only a few important attributes from network traffic for DDoS attack detection in real computer networks. We collect a large set of DDoS attack traffic by implementing various DDoS attacks, as well as normal data during normal usage. Information gain and chi-square methods are used to rank the importance of the 41 attributes extracted from the network traffic with our programs. Bayesian networks and C4.5 are then employed to detect attacks and to determine how many attributes are appropriate for fast detection. Empirical results show that using only the 9 most important attributes, detection accuracy remains the same or even improves compared with using all 41 attributes with the Bayesian networks and C4.5 methods. Using only a few attributes also improves efficiency in terms of attribute construction, model training and intrusion detection.
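The chi-square ranking mentioned here tests, per attribute, how far the attribute-by-class contingency table deviates from what independence would predict. A minimal sketch for discrete attributes (the toy data is illustrative):

```python
import numpy as np

def chi_square_score(feature_values, labels):
    """Chi-square statistic between one discrete attribute and the class
    label: sum of (observed - expected)^2 / expected over the cells of
    the attribute-by-class contingency table. Higher = more dependent."""
    fv, lb = np.asarray(feature_values), np.asarray(labels)
    n = len(lb)
    score = 0.0
    for v in np.unique(fv):
        for c in np.unique(lb):
            observed = np.sum((fv == v) & (lb == c))
            expected = np.sum(fv == v) * np.sum(lb == c) / n
            score += (observed - expected) ** 2 / expected
    return score

# Toy ranking: a predictive attribute scores higher than an irrelevant one
labels = [0, 0, 1, 1]
relevant = [0, 0, 1, 1]    # perfectly tracks the label
irrelevant = [0, 1, 0, 1]  # independent of the label
```

Sorting the 41 extracted attributes by this score (or by information gain) and truncating the list is what lets the system keep only the 9 attributes worth computing on live traffic.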

38 citations

Proceedings ArticleDOI
29 Jun 2008
TL;DR: This paper employed information gain, a wrapper with Bayesian networks, and decision trees to select key subsets of attributes for network intrusion detection based on KDD Cup 1999 data, and used the selected 10 attributes to detect DDoS attacks in real environments.
Abstract: Extracting attributes from network traffic is the first step of network intrusion detection. However, the question of which attributes are most effective for detection remains open. In this paper, we employed information gain, a wrapper with Bayesian networks (BN), and decision trees (C4.5) to select key subsets of attributes for network intrusion detection based on KDD Cup 1999 data. We then used the selected 10 attributes to detect DDoS attacks in real environments. The empirical results based on KDD Cup 1999 data as well as DDoS attack data show that using only the 10 attributes, detection accuracy remains nearly the same or even improves compared with using all 41 attributes with both BN and C4.5 classifiers. Using a small subset of attributes also improves the efficiency of intrusion detection.

38 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal ArticleDOI
Hervé Debar1, Marc Dacier1, Andreas Wespi1
TL;DR: A taxonomy of intrusion-detection systems is introduced that highlights the various aspects of this area and is illustrated by numerous examples from past and current projects.

882 citations

Proceedings ArticleDOI
10 Oct 2010
TL;DR: This work presents a lightweight method for DDoS attack detection based on traffic flow features, in which the extraction of such information is made with a very low overhead compared to traditional approaches.
Abstract: Distributed denial-of-service (DDoS) attacks became one of the main Internet security problems over the last decade, threatening public web servers in particular. Although the DDoS mechanism is widely understood, its detection is a very hard task because of the similarities between normal traffic and useless packets, sent by compromised hosts to their victims. This work presents a lightweight method for DDoS attack detection based on traffic flow features, in which the extraction of such information is made with a very low overhead compared to traditional approaches. This is possible due to the use of the NOX platform which provides a programmatic interface to facilitate the handling of switch information. Other major contributions include the high rate of detection and very low rate of false alarms obtained by flow analysis using Self Organizing Maps.
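The Self Organizing Map behind that flow analysis can be sketched in a short training loop: each traffic-flow feature vector pulls its best-matching map unit (and that unit's grid neighbours) toward itself, with learning rate and neighbourhood radius decaying over time. This is a generic SOM sketch; the grid size, decay schedule and data below are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Minimal Self-Organizing Map. For each sample: find the best-matching
    unit (BMU), then pull every unit toward the sample, weighted by a
    Gaussian of its grid distance to the BMU. Learning rate and radius
    decay linearly over epochs."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w),
                                  indexing="ij"), axis=-1)  # (h, w, 2)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)
        sigma = sigma0 * (1 - epoch / epochs) + 1e-3  # keep radius positive
        for x in data:
            dists = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dists), (h, w))
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
            influence = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]
            weights += lr * influence * (x - weights)
    return weights

# Illustrative stand-in for per-flow feature vectors: two tight clusters
rng = np.random.default_rng(1)
data = np.vstack([np.zeros((20, 3)), np.ones((20, 3))])
data = data + 0.01 * rng.standard_normal(data.shape)
W = train_som(data)
```

For detection, normal flows are expected to land on (or near) units fitted to normal traffic, so a large distance from a new flow to its BMU flags an anomalous flow.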

689 citations

Journal ArticleDOI
Hervé Debar1, Marc Dacier1, Andreas Wespi1
TL;DR: This paper extends the taxonomy beyond real-time intrusion detection to include additional aspects of security monitoring, such as vulnerability assessment, and introduces a taxonomy of intrusion-detection systems that highlights the various aspects of this area.
Abstract: Intrusion-detection systems aim at detecting attacks against computer systems and networks, or in general against information systems. Indeed, it is difficult to provide provably secure information systems and to maintain them in such a secure state during their lifetime and utilization. Sometimes, legacy or operational constraints do not even allow the definition of a fully secure information system. Therefore, intrusion-detection systems have the task of monitoring the usage of such systems to detect the apparition of insecure states. They detect attempts and active misuse, either by legitimate users of the information systems or by external parties, to abuse their privileges or exploit security vulnerabilities. In a previous paper [Computer Networks 31, 805–822 (1999)], we introduced a taxonomy of intrusion-detection systems that highlights the various aspects of this area. This paper extends the taxonomy beyond real-time intrusion detection to include additional aspects of security monitoring, such as vulnerability assessment.

371 citations