Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization

doi:10.5220/0006639801080116

Open Access Proceedings Article

Iman Sharafaldin, +2 more

- pp 108-116

TLDR

The paper produces a reliable, publicly available dataset that contains benign traffic and seven common attack network flows and meets real-world criteria. It also evaluates a comprehensive set of network traffic features and machine learning algorithms to indicate the best set of features for detecting certain attack categories.

Abstract:

With the exponential growth in the size of computer networks and developed applications, the potential damage that attacks can cause has increased significantly. Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs) are among the most important defense tools against sophisticated and ever-growing network attacks. For lack of adequate datasets, anomaly-based approaches in intrusion detection systems are difficult to deploy, analyze, and evaluate accurately. A number of datasets, such as DARPA98, KDD99, ISC2012, and ADFA13, have been used by researchers to evaluate the performance of their proposed intrusion detection and intrusion prevention approaches. Based on our study of eleven datasets available since 1998, many of them are out of date and unreliable. Some lack traffic diversity and volume, some do not cover a variety of attacks, others anonymize packet information and payload and so cannot reflect current trends, and others lack a feature set and metadata. This paper produces a reliable dataset that contains benign traffic and seven common attack network flows, meets real-world criteria, and is publicly available. The paper then evaluates the performance of a comprehensive set of network traffic features and machine learning algorithms to indicate the best set of features for detecting certain attack categories.
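The idea of finding the best features per attack category can be illustrated with a minimal sketch. It is not the paper's method: it swaps in a simple Fisher-style separation score, runs on synthetic data, and the feature names (`flow_duration`, `fwd_packets`, `bwd_packet_len_mean`) and attack labels (`dos`, `port_scan`) are hypothetical placeholders rather than values taken from the dataset.

```python
import random
import statistics

# Hypothetical flow feature names, chosen for illustration only.
FEATURES = ["flow_duration", "fwd_packets", "bwd_packet_len_mean"]


def make_flows(n, means, label, rng):
    """Generate n synthetic flows whose features are Gaussian around `means`."""
    return [([rng.gauss(m, 1.0) for m in means], label) for _ in range(n)]


def fisher_score(benign_vals, attack_vals):
    """Squared gap between class means divided by the summed class variances."""
    mean_gap = statistics.fmean(attack_vals) - statistics.fmean(benign_vals)
    spread = statistics.pvariance(attack_vals) + statistics.pvariance(benign_vals)
    return mean_gap ** 2 / spread if spread > 0 else 0.0


def rank_features(flows, attack_label):
    """Rank features by how well each separates `attack_label` from benign."""
    benign = [x for x, y in flows if y == "benign"]
    attack = [x for x, y in flows if y == attack_label]
    scores = {
        name: fisher_score([x[i] for x in benign], [x[i] for x in attack])
        for i, name in enumerate(FEATURES)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
flows = (
    make_flows(200, [5.0, 5.0, 5.0], "benign", rng)
    # Synthetic "dos" flows differ from benign mainly in fwd_packets.
    + make_flows(200, [5.0, 9.0, 5.5], "dos", rng)
    # Synthetic "port_scan" flows differ from benign mainly in flow_duration.
    + make_flows(200, [9.0, 5.0, 5.5], "port_scan", rng)
)

for label in ("dos", "port_scan"):
    print(label, [name for name, _ in rank_features(flows, label)])
```

Ranking separately for each attack label reflects the point that different attack categories may be best detected by different features. A real evaluation would use labelled flows from the dataset and compare several machine learning algorithms rather than a single score.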

Citations

Journal Article

Unsupervised feature selection and cluster center initialization based arbitrary shaped clusters for intrusion detection

Mahendra Prasad, +2 more

- 01 Dec 2020 -

Computers & Security

TL;DR: A clustering method based on unsupervised feature selection and cluster center initialization for intrusion detection that outperforms basic clustering, taking fewer iterations to form final clusters and providing better accuracy.

Journal Article

An Empirical Evaluation of Deep Learning for Network Anomaly Detection

Ritesh K. Malaiya, +5 more

- 23 Sep 2019 -

IEEE Access

TL;DR: This study designs and examines deep learning models constructed based on Fully Connected Networks, Variational AutoEncoder (VAE), and Sequence-to-Sequence (Seq2Seq) structures, and confirms the feasibility of deep learning-based network anomaly detection.

Posted Content

New Directions in Automated Traffic Analysis

Jordan Holland, +3 more

- 06 Aug 2020 -

arXiv: Cryptography and Security

TL;DR: In this article, the authors introduce nPrint, a tool that generates a unified packet representation that is amenable for representation learning and model training, and integrate nPrint with Automated Machine Learning (AutoML) to automate many aspects of traffic analysis.

Journal Article

Empirical study on multiclass classification‐based network intrusion detection

Wisam Elmasry, +2 more

TL;DR: A comprehensive empirical study on network intrusion detection as a multiclass classification task, which aims not just to detect a suspicious connection but also to assign it the correct type, showing a significant improvement in the detection of network attacks with the recommended approach.

Journal Article

A Survey on Device Behavior Fingerprinting: Data Sources, Techniques, Application Scenarios, and Datasets

Pedro Miguel Sánchez Sánchez, +5 more

- 07 Aug 2020 -

arXiv: Cryptography and Security

TL;DR: This paper presents a comprehensive review of the device types, behavioral data, and processing and evaluation techniques used by the most recent and representative research works dealing with two major scenarios: device identification and device misbehavior detection.

References

Proceedings Article

A detailed analysis of the KDD CUP 99 data set

Mahbod Tavallaee, +3 more

TL;DR: A new data set is proposed, NSL-KDD, which consists of selected records of the complete KDD data set and does not suffer from any of the mentioned shortcomings.

Journal Article

Testing Intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory

John McHugh

- 01 Nov 2000 -

ACM Transactions on Information and Syst...

TL;DR: The purpose of this article is to attempt to identify the shortcomings of the Lincoln Lab effort in the hope that future efforts of this kind will be placed on a sounder footing.

Journal Article

Toward developing a systematic approach to generate benchmark datasets for intrusion detection

Ali Shiravi, +3 more

- 01 May 2012 -

Computers & Security

TL;DR: The intent for this dataset is to assist various researchers in acquiring datasets of this kind for testing, evaluation, and comparison purposes, through sharing the generated datasets and profiles.

Proceedings Article

Characterization of Tor Traffic using Time based Features.

Arash Habibi Lashkari, +3 more

TL;DR: A time analysis on Tor traffic flows is presented, captured between the client and the entry node, to detect the application type: Browsing, Chat, Streaming, Mail, Voip, P2P or File Transfer.

Proceedings Article

Generation of a new IDS test dataset: Time to retire the KDD collection

Gideon Creech, +1 more

TL;DR: A new publicly available dataset is introduced which is representative of modern attack structure and methodology and is contrasted with the legacy datasets, and the performance difference of commonly used intrusion detection algorithms is highlighted.

IEEE Communications Surveys and Tutorial...

Outside the Closed World: On Using Machine Learning for Network Intrusion Detection

Robin Sommer, +1 more

Related Papers (5)

A detailed analysis of the KDD CUP 99 data set

UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set)

A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection

Outside the Closed World: On Using Machine Learning for Network Intrusion Detection

Scikit-learn: Machine Learning in Python