scispace - formally typeset
Search or ask a question
Author

Xinzheng He

Bio: Xinzheng He is an academic researcher. The author has contributed to research in topics: Deep learning & Statistical classification. The author has an hindex of 1, co-authored 3 publications receiving 715 citations.

Papers
More filters
Journal ArticleDOI
TL;DR: The experimental results show that RNN-IDS is very suitable for modeling a classification model with high accuracy and that its performance is superior to that of traditional machine learning classification methods in both binary and multiclass classification.
Abstract: Intrusion detection plays an important role in ensuring information security, and the key technology is to accurately identify various attacks in the network. In this paper, we explore how to model an intrusion detection system based on deep learning, and we propose a deep learning approach for intrusion detection using recurrent neural networks (RNN-IDS). Moreover, we study the performance of the model in binary classification and multiclass classification, and the number of neurons and different learning rate impacts on the performance of the proposed model. We compare it with those of J48, artificial neural network, random forest, support vector machine, and other machine learning methods proposed by previous researchers on the benchmark data set. The experimental results show that RNN-IDS is very suitable for modeling a classification model with high accuracy and that its performance is superior to that of traditional machine learning classification methods in both binary and multiclass classification. The RNN-IDS model improves the accuracy of the intrusion detection and provides a new research method for intrusion detection.

1,123 citations

Journal ArticleDOI
TL;DR: The experimental results show that the characteristics of the network traffic under the busy state can be well in line with the nature of Pareto distribution, and the network bandwidth utilization calculated in this way is also consistent with the actual situation, so that the model established in this paper can well describe the characteristic of network traffic.
Abstract: The establishment of the network traffic model has an important significance on the research of network. On the basis of existing researches, this paper proposes the improved Semi-Markov network traffic model, and re-divides the state of network traffic into idle state, normal state and busy state, and also researches the busy state of network traffic by the use of Pareto distribution, and later researches the state transition relation of Semi-Markov process on this basis. The experimental results show that the characteristics of the network traffic under the busy state can be well in line with the nature of Pareto distribution, and the network bandwidth utilization calculated in this way is also consistent with the actual situation, so that the model established in this paper can well describe the characteristics of network traffic.

1 citations

Journal ArticleDOI
TL;DR: Based on the problem that supervised machine learning requires labeled samples and fails to identify unknown traffic, the author innovatively integrates wavelet transform and K-means algorithm of unsupervised machine learning by combining the advantage of wavelet Transform in solving multi-fractal network traffic and proposes a traffic identification method based on wavelet spectrum of scatter factor and improved K-measures.
Abstract: Based on the problem that supervised machine learning requires labeled samples and fails to identify unknown traffic, the author innovatively integrates wavelet transform and K-means algorithm of unsupervised machine learning by combining the advantage of wavelet transform in solving multi-fractal network traffic and proposes a traffic identification method based on wavelet spectrum of scatter factor and improved K-means. This method represents each stream sequence with wavelet spectrum of scatter factor, which is taken as the input of clustering algorithm. The author carries out a cluster analysis with GA K-means algorithm. The experimental result suggests that this method has an obvious superiority in stability and accuracy of classification.

Cited by
More filters
Journal ArticleDOI
TL;DR: A highly scalable and hybrid DNNs framework called scale-hybrid-IDS-AlertNet is proposed which can be used in real-time to effectively monitor the network traffic and host-level events to proactively alert possible cyberattacks.
Abstract: Machine learning techniques are being widely used to develop an intrusion detection system (IDS) for detecting and classifying cyberattacks at the network-level and the host-level in a timely and automatic manner. However, many challenges arise since malicious attacks are continually changing and are occurring in very large volumes requiring a scalable solution. There are different malware datasets available publicly for further research by cyber security community. However, no existing study has shown the detailed analysis of the performance of various machine learning algorithms on various publicly available datasets. Due to the dynamic nature of malware with continuously changing attacking methods, the malware datasets available publicly are to be updated systematically and benchmarked. In this paper, a deep neural network (DNN), a type of deep learning model, is explored to develop a flexible and effective IDS to detect and classify unforeseen and unpredictable cyberattacks. The continuous change in network behavior and rapid evolution of attacks makes it necessary to evaluate various datasets which are generated over the years through static and dynamic approaches. This type of study facilitates to identify the best algorithm which can effectively work in detecting future cyberattacks. A comprehensive evaluation of experiments of DNNs and other classical machine learning classifiers are shown on various publicly available benchmark malware datasets. The optimal network parameters and network topologies for DNNs are chosen through the following hyperparameter selection methods with KDDCup 99 dataset. All the experiments of DNNs are run till 1,000 epochs with the learning rate varying in the range [0.01-0.5]. The DNN model which performed well on KDDCup 99 is applied on other datasets, such as NSL-KDD, UNSW-NB15, Kyoto, WSN-DS, and CICIDS 2017, to conduct the benchmark. Our DNN model learns the abstract and high-dimensional feature representation of the IDS data by passing them into many hidden layers. Through a rigorous experimental testing, it is confirmed that DNNs perform well in comparison with the classical machine learning classifiers. Finally, we propose a highly scalable and hybrid DNNs framework called scale-hybrid-IDS-AlertNet which can be used in real-time to effectively monitor the network traffic and host-level events to proactively alert possible cyberattacks.

847 citations

Journal ArticleDOI
TL;DR: This survey report describes key literature surveys on machine learning (ML) and deep learning (DL) methods for network analysis of intrusion detection and provides a brief tutorial description of each ML/DL method.
Abstract: With the development of the Internet, cyber-attacks are changing rapidly and the cyber security situation is not optimistic. This survey report describes key literature surveys on machine learning (ML) and deep learning (DL) methods for network analysis of intrusion detection and provides a brief tutorial description of each ML/DL method. Papers representing each method were indexed, read, and summarized based on their temporal or thermal correlations. Because data are so important in ML/DL methods, we describe some of the commonly used network datasets used in ML/DL, discuss the challenges of using ML/DL for cybersecurity and provide suggestions for research directions.

676 citations

Posted Content
TL;DR: A structured and comprehensive overview of research methods in deep learning-based anomaly detection, grouped state-of-the-art research techniques into different categories based on the underlying assumptions and approach adopted.
Abstract: Anomaly detection is an important problem that has been well-studied within diverse research areas and application domains. The aim of this survey is two-fold, firstly we present a structured and comprehensive overview of research methods in deep learning-based anomaly detection. Furthermore, we review the adoption of these methods for anomaly across various application domains and assess their effectiveness. We have grouped state-of-the-art research techniques into different categories based on the underlying assumptions and approach adopted. Within each category we outline the basic anomaly detection technique, along with its variants and present key assumptions, to differentiate between normal and anomalous behavior. For each category, we present we also present the advantages and limitations and discuss the computational complexity of the techniques in real application domains. Finally, we outline open issues in research and challenges faced while adopting these techniques.

522 citations

Journal ArticleDOI
01 Feb 2020
TL;DR: A survey of deep learning approaches for cyber security intrusion detection, the datasets used, and a comparative study to evaluate the efficiency of several methods are presented.
Abstract: In this paper, we present a survey of deep learning approaches for cybersecurity intrusion detection, the datasets used, and a comparative study. Specifically, we provide a review of intrusion detection systems based on deep learning approaches. The dataset plays an important role in intrusion detection, therefore we describe 35 well-known cyber datasets and provide a classification of these datasets into seven categories; namely, network traffic-based dataset, electrical network-based dataset, internet traffic-based dataset, virtual private network-based dataset, android apps-based dataset, IoT traffic-based dataset, and internet-connected devices-based dataset. We analyze seven deep learning models including recurrent neural networks, deep neural networks, restricted Boltzmann machines, deep belief networks, convolutional neural networks, deep Boltzmann machines, and deep autoencoders. For each model, we study the performance in two categories of classification (binary and multiclass) under two new real traffic datasets, namely, the CSE-CIC-IDS2018 dataset and the Bot-IoT dataset. In addition, we use the most important performance indicators, namely, accuracy, false alarm rate, and detection rate for evaluating the efficiency of several methods.

464 citations

Journal ArticleDOI
TL;DR: This paper mainly focus on the application of deep learning architectures to three major applications, namely (i) wild animal detection, (ii) small arm detection and (iii) human being detection.
Abstract: Deep learning has developed as an effective machine learning method that takes in numerous layers of features or representation of the data and provides state-of-the-art results. The application of deep learning has shown impressive performance in various application areas, particularly in image classification, segmentation and object detection. Recent advances of deep learning techniques bring encouraging performance to fine-grained image classification which aims to distinguish subordinate-level categories. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we provide a detailed review of various deep architectures and model highlighting characteristics of particular model. Firstly, we described the functioning of CNN architectures and its components followed by detailed description of various CNN models starting with classical LeNet model to AlexNet, ZFNet, GoogleNet, VGGNet, ResNet, ResNeXt, SENet, DenseNet, Xception, PNAS/ENAS. We mainly focus on the application of deep learning architectures to three major applications, namely (i) wild animal detection, (ii) small arm detection and (iii) human being detection. A detailed review summary including the systems, database, application and accuracy claimed is also provided for each model to serve as guidelines for future work in the above application areas.

435 citations