A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection

doi:10.1109/COMST.2015.2494502

Journal Article

Anna L. Buczak, +1 more

- 22 Jan 2016 -

IEEE Communications Surveys and Tutorial...

- Vol. 18, Iss: 2, pp 1153-1176

TLDR

The complexity of ML/DM algorithms is addressed, a discussion of challenges for using ML/DM for cyber security is presented, and some recommendations on when to use a given method are provided.

Abstract:

This survey paper describes a focused literature survey of machine learning (ML) and data mining (DM) methods for cyber analytics in support of intrusion detection. Short tutorial descriptions of each ML/DM method are provided. Based on the number of citations or the relevance of an emerging method, papers representing each method were identified, read, and summarized. Because data are so important in ML/DM approaches, some well-known cyber data sets used in ML/DM are described. The complexity of ML/DM algorithms is addressed, discussion of challenges for using ML/DM for cyber security is presented, and some recommendations on when to use a given method are provided.

Citations

Proceedings Article

A Discretized Extended Feature Space (DEFS) Model to Improve the Anomaly Detection Performance in Network Intrusion Detection Systems

Roberto Saia, +4 more

TL;DR: A novel Discretized Extended Feature Space (DEFS) model is introduced that offers a twofold advantage: first, through a discretization process it reduces the number of event patterns by grouping those with similar feature values, which mitigates issues in classifying unknown events; second, it balances this discretization by extending each event pattern with meta-information that characterizes it well.

Proceedings Article

On the Analysis of Network Measurements Through Machine Learning: The Power of the Crowd

Pedro Casas

TL;DR: Results suggest that both neural networks and decision-tree-based models provide in general better results in terms of accuracy and prediction, with a much smaller computation overhead for decision trees as compared to models based on neural networks or support vector machines.

Book Chapter

STAN: Synthetic Network Traffic Generation with Generative Neural Models

Shengzhe Xu, +3 more

TL;DR: In this paper, the authors proposed Synthetic network Traffic Generation with Autoregressive Neural Models (STAN), a tool to generate realistic synthetic network traffic datasets for subsequent downstream applications.

Book Chapter

Requirements for Training and Evaluation Dataset of Network and Host Intrusion Detection System

Petteri Nevavuori, +1 more

TL;DR: Requirements for a state-of-the-art NHIDS dataset are presented, for use in the research and development of NHIDS that apply machine learning and artificial intelligence.

Journal Article

Investigating the Effect of Traffic Sampling on Machine Learning-Based Network Intrusion Detection Approaches

- 01 Jan 2022 -

IEEE Access

TL;DR: Jumabek et al. explored the impact of packet sampling on the performance and efficiency of ML-based NIDSs and found that shorter malicious flows are likely to go unnoticed even with mild sampling rates such as 1/10 and 1/100.

References

Journal Article

Random Forests

Leo Breiman

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.

Book

Fuzzy sets

Lotfi A. Zadeh

TL;DR: A separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.

Journal Article

Maximum likelihood from incomplete data via the EM algorithm

Arthur P. Dempster, +2 more

- 01 Sep 1977 -

Journal of the royal statistical society...

Book

The Nature of Statistical Learning Theory

Vladimir Vapnik

TL;DR: Contents: setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

Journal Article

Collective dynamics of small-world networks

Duncan J. Watts, +1 more

- 04 Jun 1998 -

Nature

TL;DR: Simple network models that can be tuned through the middle ground between regular and random networks are explored: regular networks are ‘rewired’ to introduce increasing amounts of disorder, and the resulting systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs.

Computers & Security

UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set)

Nour Moustafa, +1 more

Related Papers (5)

A detailed analysis of the KDD CUP 99 data set

Outside the Closed World: On Using Machine Learning for Network Intrusion Detection

Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization

Anomaly-based network intrusion detection: Techniques, systems and challenges

UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set)