Book Chapter

DDAM: Detecting DDoS Attacks Using Machine Learning Approach

TL;DR: The experimental results on the real-time dataset confirm that the proposed machine learning approach can effectively detect network anomalies with a high detection rate and a low false positive rate.
Abstract: Dealing with the Distributed Denial of Service (DDoS) attack is a continuing challenge in the field of network security. An Intrusion Detection System (IDS) is one of the solutions to detect the DDoS attack. The IDS should always be updated with the attack disincentive to preserve the network security service. In this paper, we propose a new approach for anomaly detection using machine learning to secure the network and to determine the attack patterns. The major contribution is to create a real-time dataset and to use the naive Bayes algorithm as a classifier for detection, comparing its performance with existing classifiers such as the random forest and J48 algorithms. The experimental results on the real-time dataset confirm that the proposed machine learning approach can effectively detect network anomalies with a high detection rate and a low false positive rate.
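The two metrics named in the abstract can be made concrete with a small sketch. It assumes the usual definitions, detection rate = TP / (TP + FN) and false positive rate = FP / (FP + TN); the function names and the "attack"/"normal" labels are illustrative and do not come from the paper.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count (tp, fp, tn, fn), treating `positive` as the attack class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1  # attack correctly flagged
            else:
                fp += 1  # normal traffic flagged as attack
        else:
            if truth == positive:
                fn += 1  # attack missed
            else:
                tn += 1  # normal traffic correctly passed
    return tp, fp, tn, fn


def detection_metrics(tp, fp, tn, fn):
    """Return (detection_rate, false_positive_rate) from the four counts."""
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = ["attack", "attack", "normal", "normal", "attack"]
    y_pred = ["attack", "normal", "normal", "attack", "attack"]
    print(detection_metrics(*confusion_counts(y_true, y_pred)))
```

A classifier comparison such as naive Bayes versus random forest would compute these two numbers per classifier on the same test split.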
Citations
Journal Article
TL;DR: A novel hybrid framework based on a data stream approach for detecting DDoS attacks with incremental learning is proposed, using naive Bayes, random forest, decision tree, multilayer perceptron (MLP), and k-nearest neighbors (K-NN) classifiers on the proxy side to improve results.

74 citations

Book Chapter
01 Jan 2021
TL;DR: A DDoS attack was performed using the ping of death technique and detected with a machine learning technique in the WEKA tool; 99.76% of the samples were correctly classified.
Abstract: Numerous attacks are performed on network infrastructures. These include attacks on network availability, confidentiality and integrity. The distributed denial-of-service (DDoS) attack is a persistent attack that affects the availability of the network. A Command and Control (C&C) mechanism is used to perform this kind of attack. Various researchers have proposed different methods based on machine learning to detect these attacks. In this paper, a DDoS attack was performed using the ping of death technique and detected with a machine learning technique in the WEKA tool. The NSL-KDD dataset was used in this experiment. The random forest algorithm was used to classify the normal and attack samples. 99.76% of the samples were correctly classified.

32 citations

Journal Article
TL;DR: A novel technique for feature selection is introduced, which combines five feature selection techniques as a stack, and the best accuracy of 99.87% is achieved with the XGBoost classifier after selecting the best eleven features from the KDD dataset.
Abstract: Wireless sensor networks (WSNs) are developing at an incredible pace because of their cost-effective solutions for applications like military and medical. A WSN consists of a large number of nodes that suffer from constraints like limited computation capacity and limited battery capacity. WSNs face many attacks, one of which is the distributed denial of service attack. Many studies have shown that reducing redundancy among the relevant features of a dataset can make a model more accurate and efficient. In this paper, correlation-based feature selection, principal component analysis, linear discriminant analysis, recursive feature elimination, and univariate feature selection are used for feature selection. Results are compared after selecting features using these techniques. A novel technique for feature selection is introduced, which combines five feature selection techniques as a stack. After implementing the feature selection techniques, the model is trained with five machine learning algorithms, namely SVM, perceptron, K-nearest neighbor, stochastic gradient descent, and XGBoost. Finally, the model is evaluated with the help of K-fold cross-validation. Among all the techniques, the best accuracy of 99.87% is achieved with the XGBoost classifier after selecting the best eleven features from the KDD dataset.

18 citations

Journal Article
TL;DR: It is revealed that an adequate tuning of hyper-parameters and the way of pre-processing data input have a significant impact on the attack detection rate.

13 citations

Journal Article
TL;DR: In this paper, a new model based on random forest and synthetic minority over-sampling technique (RF-SMOTE) was proposed to detect the attacks in an IoT network.
Abstract: In recent decades, the internet of things (IoT) has been a growing technology in smart applications, where it is highly susceptible to security breaches due to the resource-constrained nature of IoT. Among the available security breaches in IoT, Mirai, denial of service, user to root attack, remote to local attack, and probe attack disrupt networks in several ways, such as saturating link bandwidth and consuming server resources. The installation of antivirus software cannot be guaranteed, because IoT devices are equipped only with lightweight operating systems. So, intrusion detection systems are developed for the detection of IoT attacks. In this research, a new model is proposed based on random forest and synthetic minority over-sampling technique (RF-SMOTE) to detect the attacks in an IoT network. The experimental analysis for IoT attack detection is performed on the NSL-KDD dataset and the network-based detection of IoT (N-BaIoT) dataset, which are well-known datasets for IoT attack detection. In the experimental phase, the proposed RF-SMOTE model showed a minimum of 0.14% and a maximum of 14.25% improvement in accuracy on the NSL-KDD dataset for the binary class. In addition, the proposed model showed, on average, a minimum of 0.04% and a maximum of 7.35% improvement in accuracy on the NSL-KDD dataset for four classes. Additionally, the proposed RF-SMOTE model showed a minimum of 0.01% and a maximum of 0.04% improvement in accuracy on the N-BaIoT dataset relative to existing models such as the decision tree and shallow models.
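The over-sampling half of RF-SMOTE can be illustrated with a generic sketch of SMOTE's core step: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function name `smote`, the parameters `k` and `seed`, and the toy points are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (needs at least two points).
    k: number of nearest minority neighbours to choose from (assumed value).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the chosen base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    # Hypothetical minority-class feature vectors.
    attacks = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote(attacks, n_new=3):
        print(point)
```

In a pipeline like the one described, the synthetic points would be added to the training set only, before fitting the random forest, so that the test data stays untouched.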

12 citations

References
Journal Article
TL;DR: This paper provides a comprehensive survey of anomaly detection systems and hybrid intrusion detection systems of the recent past and present and discusses recent technological trends in anomaly detection and identifies open problems and challenges in this area.

1,433 citations

Proceedings Article
01 Jan 1996
TL;DR: The simple Bayesian classifier (SBC) is commonly thought to assume that attributes are independent given the class, but this is apparently contradicted by the surprisingly good performance it exhibits in many domains that contain clear attribute dependences.
Abstract: The simple Bayesian classifier (SBC) is commonly thought to assume that attributes are independent given the class, but this is apparently contradicted by the surprisingly good performance it exhibits in many domains that contain clear attribute dependences. No explanation for this has been proposed so far. In this paper we show that the SBC does not in fact assume attribute independence, and can be optimal even when this assumption is violated by a wide margin. The key to this finding lies in the distinction between classification and probability estimation: correct classification can be achieved even when the probability estimates used contain large errors. We show that the previously-assumed region of optimality of the SBC is a second-order infinitesimal fraction of the actual one. This is followed by the derivation of several necessary and several sufficient conditions for the optimality of the SBC. For example, the SBC is optimal for learning arbitrary conjunctions and disjunctions, even though they violate the independence assumption. The paper also reports empirical evidence of the SBC's competitive performance in domains containing substantial degrees of attribute dependence.
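The distinction between classification and probability estimation can be seen in a toy case that is not taken from the paper: a single binary attribute is duplicated, so the two copies are perfectly dependent. A naive Bayes model counts the evidence twice and overstates the posterior, yet it still picks the same class. The prior and likelihood values below are assumed for illustration.

```python
def posterior(prior1, lik1, lik0, copies=1):
    """Posterior of class 1 when the same evidence is multiplied in `copies` times.

    copies=1 gives the exact posterior for one attribute; copies=2 mimics
    naive Bayes treating a duplicated attribute as two independent ones.
    """
    score1 = prior1 * lik1 ** copies
    score0 = (1 - prior1) * lik0 ** copies
    return score1 / (score1 + score0)


if __name__ == "__main__":
    # Assumed values: P(c=1) = 0.5, P(x=1 | c=1) = 0.8, P(x=1 | c=0) = 0.3.
    exact = posterior(0.5, 0.8, 0.3, copies=1)
    naive = posterior(0.5, 0.8, 0.3, copies=2)
    print(f"exact posterior: {exact:.4f}")
    print(f"naive posterior: {naive:.4f}")
    # Both estimates fall on the same side of 0.5, so the predicted class agrees.
    print("same class:", (exact > 0.5) == (naive > 0.5))
```

The probability estimate is distorted by the violated independence assumption, but the decision depends only on which class scores highest.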

798 citations

Proceedings Article
12 Nov 2010
TL;DR: The analysis clearly shows that PCA has the potential to perform feature selection and is able to select a number of important individuals from all the feature components and the devised algorithm is not only subject to the nature of PCA but also computationally efficient.
Abstract: Principal component analysis (PCA) has been widely applied in the area of computer science. It is well-known that PCA is a popular transform method and the transform result is not directly related to a sole feature component of the original sample. However, in this paper, we try to apply principal components analysis (PCA) to feature selection. The proposed method addresses the feature selection issue well, from a viewpoint of numerical analysis. The analysis clearly shows that PCA has the potential to perform feature selection and is able to select a number of important individuals from all the feature components. Our method assumes that different feature components of original samples have different effects on the feature extraction result and exploits the eigenvectors of the covariance matrix of PCA to evaluate the significance of each feature component of the original sample. When evaluating the significance of the feature components, the proposed method takes a number of eigenvectors into account. Then it uses a reasonable scheme to perform feature selection. The devised algorithm is not only subject to the nature of PCA but also computationally efficient. The experimental results on face recognition show that the proposed method can greatly reduce the dimensionality of the original samples without decreasing the recognition accuracy.

284 citations

01 Jan 2007
TL;DR: It is observed that the proposed technique performs better in terms of false positive rate, cost, and computational time when applied to KDD’99 data sets compared to a back propagation neural network based approach.
Abstract: With the tremendous growth of network-based services and sensitive information on networks, network security is more important than ever. Intrusion poses a serious security risk in a network environment. The ever-growing number of new intrusion types poses a serious problem for their detection. The human labelling of the available network audit data instances is usually tedious, time consuming and expensive. In this paper, we apply one of the efficient data mining algorithms called naive Bayes for anomaly based network intrusion detection. Experimental results on the KDD cup’99 data set show the novelty of our approach in detecting network intrusion. It is observed that the proposed technique performs better in terms of false positive rate, cost, and computational time when applied to KDD’99 data sets compared to a back propagation neural network based approach.

219 citations

Journal Article
TL;DR: This paper shows how to apply the naive Bayes methodology to numeric prediction tasks by modeling the probability distribution of the target value with kernel density estimators, and compares it to linear regression, locally weighted linear regression, and a method that produces “model trees” (decision trees with linear regression functions at the leaves).
Abstract: Despite its simplicity, the naive Bayes learning scheme performs well on most classification tasks, and is often significantly more accurate than more sophisticated methods. Although the probability estimates that it produces can be inaccurate, it often assigns maximum probability to the correct class. This suggests that its good performance might be restricted to situations where the output is categorical. It is therefore interesting to see how it performs in domains where the predicted value is numeric, because in this case, predictions are more sensitive to inaccurate probability estimates. This paper shows how to apply the naive Bayes methodology to numeric prediction (i.e., regression) tasks by modeling the probability distribution of the target value with kernel density estimators, and compares it to linear regression, locally weighted linear regression, and a method that produces “model trees”—decision trees with linear regression functions at the leaves. Although we exhibit an artificial dataset for which naive Bayes is the method of choice, on real-world datasets it is almost uniformly worse than locally weighted linear regression and model trees. The comparison with linear regression depends on the error measure: for one measure naive Bayes performs similarly, while for another it is worse. We also show that standard naive Bayes applied to regression problems by discretizing the target value performs similarly badly. We then present empirical evidence that isolates naive Bayes' independence assumption as the culprit for its poor performance in the regression setting. These results indicate that the simplistic statistical assumption that naive Bayes makes is indeed more restrictive for regression than for classification.

172 citations