Proceedings Article

Malware traffic classification using convolutional neural network for representation learning

TL;DR: This paper presents a new taxonomy of traffic classification from an artificial intelligence perspective and proposes a malware traffic classification method that uses a convolutional neural network, treating traffic data as images and feeding raw traffic directly to the classifier.
Abstract: Traffic classification is the first step for network anomaly detection or network-based intrusion detection systems and plays an important role in network security. In this paper we first present a new taxonomy of traffic classification from an artificial intelligence perspective, and then propose a malware traffic classification method that uses a convolutional neural network and treats traffic data as images. The method needs no hand-designed features; it takes raw traffic directly as the classifier's input. To the best of our knowledge, this is the first application of a representation learning approach to malware traffic classification using raw traffic data. Through eight experiments, we determined that the best traffic representation is the session with all layers. The method is validated in two scenarios that include three types of classifiers, and the experimental results show that it can satisfy the accuracy requirements of practical applications.
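The idea of treating traffic data as images can be illustrated with a minimal sketch. It assumes that each traffic unit is cut or zero-padded to a fixed length and that each byte becomes one grayscale pixel; the 28 x 28 size and the function name `bytes_to_image` are assumptions made for illustration and are not stated in the abstract.

```python
from typing import List


def bytes_to_image(raw: bytes, side: int = 28) -> List[List[int]]:
    """Map the first side*side bytes of a traffic unit onto a square grayscale grid.

    The fixed size (28 x 28 by default) is an assumption for this sketch.
    """
    size = side * side
    # Truncate long inputs and pad short ones with 0x00 so every image has the same shape.
    fixed = raw[:size].ljust(size, b"\x00")
    # Each byte (0-255) becomes one pixel intensity, filled row by row.
    return [list(fixed[r * side:(r + 1) * side]) for r in range(side)]


if __name__ == "__main__":
    sample = bytes(range(256)) * 2  # placeholder bytes standing in for one traffic unit
    image = bytes_to_image(sample)
    print(len(image), len(image[0]))
```

Because the grid is built from raw bytes alone, no hand-designed feature is computed at any point; a convolutional classifier would consume the grid the same way it consumes a small grayscale picture.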
Citations
Journal Article
TL;DR: This paper bridges the gap between deep learning and mobile and wireless networking research by presenting a comprehensive survey of the crossovers between the two areas, and provides an encyclopedic review of mobile and wireless networking research based on deep learning, categorized by domain.
Abstract: The rapid uptake of mobile devices and the rising popularity of mobile applications and services pose unprecedented demands on mobile and wireless networking infrastructure. Upcoming 5G systems are evolving to support exploding mobile traffic volumes, real-time extraction of fine-grained analytics, and agile management of network resources, so as to maximize user experience. Fulfilling these tasks is challenging, as mobile environments are increasingly complex, heterogeneous, and evolving. One potential solution is to resort to advanced machine learning techniques, in order to help manage the rise in data volumes and algorithm-driven applications. The recent success of deep learning underpins new and powerful tools that tackle problems in this space. In this paper, we bridge the gap between deep learning and mobile and wireless networking research, by presenting a comprehensive survey of the crossovers between the two areas. We first briefly introduce essential background and state-of-the-art in deep learning techniques with potential applications to networking. We then discuss several techniques and platforms that facilitate the efficient deployment of deep learning onto mobile systems. Subsequently, we provide an encyclopedic review of mobile and wireless networking research based on deep learning, which we categorize by different domains. Drawing from our experience, we discuss how to tailor deep learning to mobile environments. We complete this survey by pinpointing current challenges and open future directions for research.

975 citations


Cites methods from "Malware traffic classification usin..."

  • ...[200] Malware traffic classification Traffic extracted from 9 types of malware Supervised CNN Oulehla et al....

    [...]

  • ...[200] Malware traffic classification CNN SGD First work to use representation learning for malware classification from raw traffic Aceto et al....

    [...]

  • ...CNNs have also been used to identify malware traffic, where work in [200] regards traffic data as images and unusual patterns that malware traffic exhibit are classified by representation learning....

    [...]

  • ...Network Security [165], [200], [286], [320]–[344], [344]–[350] Signal Processing [273], [308], [310], [351]–[367] Emerging Applications [368]–[372]...

    [...]

Journal Article
TL;DR: This survey report describes key literature surveys on machine learning (ML) and deep learning (DL) methods for network analysis of intrusion detection and provides a brief tutorial description of each ML/DL method.
Abstract: With the development of the Internet, cyber-attacks are changing rapidly and the cyber security situation is not optimistic. This survey report describes key literature surveys on machine learning (ML) and deep learning (DL) methods for network analysis of intrusion detection and provides a brief tutorial description of each ML/DL method. Papers representing each method were indexed, read, and summarized based on their temporal or thermal correlations. Because data are so important in ML/DL methods, we describe some of the commonly used network datasets used in ML/DL, discuss the challenges of using ML/DL for cybersecurity and provide suggestions for research directions.

676 citations

Proceedings Article
22 Jul 2017
TL;DR: Across the four experiments, with the best traffic representation and the fine-tuned model, 11 of 12 evaluation metrics outperform the state-of-the-art method, which indicates the effectiveness of the proposed method.
Abstract: Traffic classification plays an important and basic role in network management and cyberspace security. With the widespread use of encryption techniques in network applications, encrypted traffic has recently become a great challenge for traditional traffic classification methods. In this paper we propose an end-to-end encrypted traffic classification method with one-dimensional convolutional neural networks. This method integrates feature extraction, feature selection and classifier into a unified end-to-end framework, intending to automatically learn the nonlinear relationship between raw input and expected output. To the best of our knowledge, this is the first application of an end-to-end method to the encrypted traffic classification domain. The method is validated with the public ISCX VPN-nonVPN traffic dataset. Across the four experiments, with the best traffic representation and the fine-tuned model, 11 of 12 evaluation metrics outperform the state-of-the-art method, which indicates the effectiveness of the proposed method.

496 citations


Cites methods or result from "Malware traffic classification usin..."

  • ...In summary, the best type of traffic representation is session + all layers, and this conclusion is consistent with our previous findings of [6]....

    [...]

  • ...The preprocess tool is USTC-TL2016 developed by our team in [6], consisting of four steps: traffic split, traffic clean, image generation and IDX conversion....

    [...]

  • ...Our team proposed a malware traffic classification method with 2D-CNN [6]....

    [...]

  • ...Recently, there are some researches about applying CNN in network traffic analysis, such as malware classification by our team [6]....

    [...]

  • ...Based on the traffic classification taxonomy proposed by [6], they are both machine learning approaches, whose general workflow is as follows: firstly hand-designing the traffic features (e....

    [...]
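Two of the preprocessing steps named in the excerpts above, traffic split and IDX conversion, can be sketched as follows. This is an illustration under stated assumptions, not the authors' tool: the packet record layout, `session_key`, `split_sessions`, and `to_idx` are hypothetical names, a session is taken to be the bidirectional conversation sharing one 5-tuple, and the IDX header follows the layout used by the MNIST files (magic number, then big-endian dimension sizes).

```python
import struct
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical packet record: (src_ip, src_port, dst_ip, dst_port, protocol, data).
# For an "all layers" representation, `data` would hold the bytes of every layer
# of the packet rather than the application payload alone.
Packet = Tuple[str, int, str, int, str, bytes]


def session_key(pkt: Packet) -> Tuple:
    """Direction-independent key: both directions of a conversation share one session."""
    a, b = sorted([(pkt[0], pkt[1]), (pkt[2], pkt[3])])
    return (a, b, pkt[4])


def split_sessions(packets: List[Packet]) -> Dict[Tuple, bytes]:
    """Traffic split: concatenate packet bytes per session in arrival order."""
    sessions = defaultdict(bytearray)
    for pkt in packets:
        sessions[session_key(pkt)] += pkt[5]
    return {key: bytes(buf) for key, buf in sessions.items()}


def to_idx(images: List[bytes], rows: int, cols: int) -> bytes:
    """IDX conversion: pack equally sized unsigned-byte images into one IDX3 blob."""
    for img in images:
        if len(img) != rows * cols:
            raise ValueError("every image must contain exactly rows * cols bytes")
    # 0x00000803: two zero bytes, 0x08 = unsigned byte data, 0x03 = three dimensions.
    header = struct.pack(">IIII", 0x00000803, len(images), rows, cols)
    return header + b"".join(images)


if __name__ == "__main__":
    packets = [
        ("10.0.0.1", 1234, "10.0.0.2", 80, "TCP", b"GET /"),
        ("10.0.0.2", 80, "10.0.0.1", 1234, "TCP", b"200 OK"),
        ("10.0.0.3", 5555, "10.0.0.2", 53, "UDP", b"query"),
    ]
    print(len(split_sessions(packets)))
```

Sorting the two endpoints before building the key is what distinguishes a session from a unidirectional flow in this sketch: request and response packets land in the same bucket.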

Journal Article
TL;DR: This paper proposes a novel IDS called the hierarchical spatial-temporal features-based intrusion detection system (HAST-IDS), which first learns the low-level spatial features of network traffic using deep convolutional neural networks (CNNs) and then learns high-level temporal features using long short-term memory networks.
Abstract: The development of an anomaly-based intrusion detection system (IDS) is a primary research direction in the field of intrusion detection. An IDS learns normal and anomalous behavior by analyzing network traffic and can detect unknown and new attacks. However, the performance of an IDS is highly dependent on feature design, and designing a feature set that can accurately characterize network traffic is still an ongoing research issue. Anomaly-based IDSs also have the problem of a high false alarm rate (FAR), which seriously restricts their practical applications. In this paper, we propose a novel IDS called the hierarchical spatial-temporal features-based intrusion detection system (HAST-IDS), which first learns the low-level spatial features of network traffic using deep convolutional neural networks (CNNs) and then learns high-level temporal features using long short-term memory networks. The entire process of feature learning is completed by the deep neural networks automatically; no feature engineering techniques are required. The automatically learned traffic features effectively reduce the FAR. The standard DARPA1998 and ISCX2012 data sets are used to evaluate the performance of the proposed system. The experimental results show that the HAST-IDS outperforms other published approaches in terms of accuracy, detection rate, and FAR, which successfully demonstrates its effectiveness in both feature learning and FAR reduction.

398 citations


Cites methods or result from "Malware traffic classification usin..."

  • ...Combining the previous research results [13], [48], we conclude that deep neural networks can automatically learn features directly from raw network traffic data and achieve good results in the field of intrusion detection or network anomaly detection....

    [...]

  • ...[13] used a CNN to learn the spatial features of network traffic and achieved malware traffic classification using the image classification method....

    [...]

Journal Article
TL;DR: Different state-of-the-art DL techniques from (standard) TC are reproduced, dissected, and set into a systematic framework for comparison, including also a performance evaluation workbench, to propose deep learning classifiers based on automatically extracted features, able to cope with encrypted traffic, and reflecting their complex traffic patterns.
Abstract: The massive adoption of hand-held devices has led to the explosion of mobile traffic volumes traversing home and enterprise networks, as well as the Internet. Traffic classification (TC), i.e., the set of procedures for inferring (mobile) applications generating such traffic, has become nowadays the enabler for highly valuable profiling information (with certain privacy downsides), other than being the workhorse for service differentiation/blocking. Nonetheless, the design of accurate classifiers is exacerbated by the rising adoption of encrypted protocols (such as TLS), hindering the suitability of (effective) deep packet inspection approaches. Also, the fast-expanding set of apps and the moving-target nature of mobile traffic make design solutions with usual machine learning, based on manually and expert-originated features, outdated and unable to keep pace. For these reasons deep learning (DL) is here proposed, for the first time, as a viable strategy to design practical mobile traffic classifiers based on automatically extracted features, able to cope with encrypted traffic, and reflecting their complex traffic patterns. To this end, different state-of-the-art DL techniques from (standard) TC are here reproduced, dissected (highlighting critical choices), and set into a systematic framework for comparison, including also a performance evaluation workbench. The latter outcome, although declined in the mobile context, has the applicability appeal to the wider umbrella of encrypted TC tasks. Finally, the performance of these DL classifiers is critically investigated based on an exhaustive experimental validation (based on three mobile datasets of real human users’ activity), highlighting the related pitfalls, design guidelines, and challenges.

359 citations

References
Journal Article
28 May 2015-Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

46,982 citations

Book
18 Nov 2016
TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; it is used in applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Journal Article
TL;DR: Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.
Abstract: The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks. This motivates longer term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation, and manifold learning.

11,201 citations


"Malware traffic classification usin..." refers background in this paper

  • ...Representation learning is a rapidly developing machine learning approach that automatically learns features from raw data and, to a certain extent, solves the problem of hand-designing features [4]....

    [...]

Posted Content
TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

10,447 citations

Proceedings Article
08 Jul 2009
TL;DR: A new data set, NSL-KDD, is proposed; it consists of selected records of the complete KDD data set and does not suffer from any of the mentioned shortcomings.
Abstract: During the last decade, anomaly detection has attracted the attention of many researchers to overcome the weakness of signature-based IDSs in detecting novel attacks, and KDDCUP'99 is the most widely used data set for the evaluation of these systems. Having conducted a statistical analysis on this data set, we found two important issues which highly affect the performance of evaluated systems and result in a very poor evaluation of anomaly detection approaches. To solve these issues, we have proposed a new data set, NSL-KDD, which consists of selected records of the complete KDD data set and does not suffer from any of the mentioned shortcomings.

3,300 citations


"Malware traffic classification usin..." refers background in this paper

  • ...The following part introduces how to select traffic granularity and packet layers in our approach....

    [...]

  • ...Part I contains ten types of malware traffic from a public website, collected in a real network environment by CTU researchers from 2011 to 2015 [16]....

    [...]