Showing papers by "Antonio Sánchez-Esguevillas" published in 2021


Journal Article
TL;DR: In this article, the authors extend the classic RBF neural network by including it as a policy network in an offline reinforcement learning algorithm, where all parameters of the RBF are learned end-to-end by gradient descent without external optimization.
Abstract: Network intrusion detection focuses on classifying network traffic as either normal or carrying an attack. The classification is based on information extracted from the network flow packets. This is a complex classification problem with unbalanced datasets and noisy data. This work extends the classic radial basis function (RBF) neural network by including it as a policy network in an offline reinforcement learning algorithm. With this approach, all parameters of the radial basis functions (along with the network weights) are learned end-to-end by gradient descent without external optimization. We further explore how additional dense hidden layers and the number of radial basis kernels influence the results. This novel approach is applied to five prominent intrusion detection datasets (NSL-KDD, UNSW-NB15, AWID, CICIDS2017 and CICDDOS2019), achieving better performance metrics than alternative state-of-the-art models. Each dataset provides different restrictions and challenges, allowing a better validation of results. Analysis of the results shows that the proposed architectures are excellent candidates for designing classifiers with the constraints imposed by network intrusion detection. We discuss the importance of dataset imbalance and how the proposed methods may be critically important for unbalanced datasets.

20 citations
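
The RBF policy network described in the abstract above can be sketched as a forward pass. This is a minimal NumPy illustration, not the authors' implementation: the Gaussian kernel, the log-width parameterization, and all names (`rbf_layer`, `policy_logits`, `centers`, `log_widths`) are assumptions, and neither the gradient-descent training nor the offline reinforcement learning loop is shown.

```python
import numpy as np

def rbf_layer(x, centers, log_widths):
    """Gaussian RBF activations (assumed kernel form).

    x: (batch, d) inputs, centers: (k, d), log_widths: (k,).
    Returns an array of shape (batch, k).
    """
    # Squared Euclidean distance between every sample and every center
    diff = x[:, None, :] - centers[None, :, :]       # (batch, k, d)
    sq_dist = np.sum(diff ** 2, axis=-1)             # (batch, k)
    # Widths are stored as logs so they stay positive under gradient updates
    widths = np.exp(log_widths)                      # (k,)
    return np.exp(-sq_dist / (2.0 * widths ** 2))

def policy_logits(x, centers, log_widths, W, b):
    """Linear read-out on top of the RBF activations: one logit per class."""
    return rbf_layer(x, centers, log_widths) @ W + b

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))          # 4 flows, 3 features (toy data)
centers = rng.normal(size=(5, 3))    # 5 radial basis kernels
log_widths = np.zeros(5)             # all widths start at 1.0
W = rng.normal(size=(5, 2))          # 2 classes: normal / attack
b = np.zeros(2)

logits = policy_logits(x, centers, log_widths, W, b)
actions = logits.argmax(axis=1)      # greedy action = predicted class
```

In an end-to-end setting, `centers`, `log_widths`, `W` and `b` would all be trainable parameters updated together by an automatic-differentiation framework.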


Journal Article
23 Apr 2021 - Sensors
TL;DR: This work proposes a quantile regression neural network based on a novel constrained weighted quantile loss (CWQLoss) and applies it to probabilistic short- and medium-term electric-load forecasting, of special interest for smart grid operations; the method is shown to achieve the best results when an additive ensemble neural network is used as the base model.
Abstract: This work proposes a quantile regression neural network based on a novel constrained weighted quantile loss (CWQLoss) and its application to probabilistic short- and medium-term electric-load forecasting, of special interest for smart grid operations. The method allows any point forecast neural network based on a multivariate multi-output regression model to be expanded to become a quantile regression model. CWQLoss extends the pinball loss to more than one quantile by creating a weighted average for all predictions in the forecast window and across all quantiles. The pinball loss for each quantile is evaluated separately. The proposed method imposes additional constraints on the quantile values and their associated weights. It is shown that these restrictions are important to have a stable and efficient model. Quantile weights are learned end-to-end by gradient descent along with the network weights. The proposed model achieves two objectives: (a) produce probabilistic (quantile and interval) forecasts with an associated probability for the predicted target values, and (b) generate point forecasts by adopting the forecast for the median (0.5 quantile). We provide specific metrics for point and probabilistic forecasts to evaluate the results considering both objectives. A comprehensive comparison is performed between a selection of classic and advanced forecasting models and the proposed quantile forecasting model. We consider different scenarios for the duration of the forecast window (1 hour, 1 day, 1 week, and 1 month), with the proposed model achieving the best results in almost all scenarios. Additionally, we show that the proposed method obtains the best results when an additive ensemble neural network is used as the base model. The experimental results are drawn from real loads of a medium-sized city in Spain.

15 citations
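
The weighted multi-quantile loss described in the abstract above can be illustrated with the standard pinball loss. The sketch below is an assumption-laden approximation, not the published CWQLoss: the abstract does not state the exact constraints, so the softmax normalization of the weights is a stand-in, and the names `pinball_loss`, `weighted_quantile_loss` and `weight_logits` are hypothetical.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Standard pinball (quantile) loss for one quantile q in (0, 1)."""
    err = y_true - y_pred
    return float(np.mean(np.maximum(q * err, (q - 1.0) * err)))

def weighted_quantile_loss(y_true, y_pred, quantiles, weight_logits):
    """Weighted average of per-quantile pinball losses.

    y_true: (batch, horizon); y_pred: (batch, horizon, n_quantiles).
    Assumption: weights are kept positive and summing to one via softmax.
    """
    logits = np.asarray(weight_logits, dtype=float)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Each quantile's pinball loss is evaluated separately, then combined
    losses = np.array([
        pinball_loss(y_true, y_pred[..., i], q)
        for i, q in enumerate(quantiles)
    ])
    return float(np.sum(weights * losses))

quantiles = [0.1, 0.5, 0.9]
y_true = np.array([[1.0, 2.0]])            # one sample, 2-step forecast window
y_pred = np.zeros((1, 2, 3))               # toy predictions for 3 quantiles
loss = weighted_quantile_loss(y_true, y_pred, quantiles, np.zeros(3))
median_forecast = y_pred[..., quantiles.index(0.5)]   # point forecast
```

Taking the 0.5-quantile slice, as in the last line, corresponds to using the median as the point forecast.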


Journal Article
TL;DR: This work proposes to replace the original network addresses by new features based on a set of distances defined between different components of the source and destination IP and Port addresses, which significantly increase the prediction performance of most classifiers for the detection of network intrusions.
Abstract: Including high-dimensional categorical predictors in a machine learning model is a major challenge. This is particularly true of the IP and Port addresses of network connections when they are used as predictors (features) in machine learning models. These features are particularly important for network intrusion detection, as many attacks exploit information about IP/Port addresses. The sparsity and high dimensionality of these features make them difficult to include in the models, so in many cases they are discarded despite the useful information they carry. This work proposes to replace the original network addresses with new features based on a set of distances defined between different components of the source and destination IP and Port addresses. These distances incorporate information on the probability of co-occurrence of source and destination addresses. The distances are calculated using a dense, low-dimensional vector representation (embedding) of the different network address components. The embeddings are obtained with a neural network, which requires few computational resources, plus an additional hash function that collapses the extremely large range of IP and Port values, making the model implementation feasible. A self-supervised learning framework under a hierarchical model is used to train the encoding network. The novel features can be used to predict future co-occurrence of source and destination network addresses, and, when applied as features in a supervised model, they significantly increase the prediction performance of most classifiers for the detection of network intrusions. We demonstrate this prediction improvement over two modern network intrusion datasets: CICIDS2017 and CICDDoS2019.

15 citations
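
The hash-then-embed idea in the abstract above can be sketched as follows. Everything specific here is an assumption for illustration: the CRC32 hash, the bucket count, the Euclidean distance, and the names `bucket`, `embed` and `address_distance` are hypothetical, and the random `embedding_table` merely stands in for embeddings that the paper trains with a self-supervised network.

```python
import zlib
import numpy as np

NUM_BUCKETS = 1024   # hypothetical size of the collapsed address space
EMBED_DIM = 8        # hypothetical embedding dimension

def bucket(component: str) -> int:
    """Collapse an address component (IP part, port, ...) into a fixed range."""
    return zlib.crc32(component.encode("utf-8")) % NUM_BUCKETS

# Stand-in for a trained embedding matrix (random here, learned in practice)
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(NUM_BUCKETS, EMBED_DIM))

def embed(component: str) -> np.ndarray:
    """Dense, low-dimensional vector for a hashed address component."""
    return embedding_table[bucket(component)]

def address_distance(src: str, dst: str) -> float:
    """Euclidean distance between source and destination embeddings."""
    return float(np.linalg.norm(embed(src) - embed(dst)))

# Distances like these would replace the raw addresses as model features
d_ip = address_distance("ip:192.168.1.10", "ip:10.0.0.5")
d_port = address_distance("port:51234", "port:443")
features = np.array([d_ip, d_port])
```

Because the hash maps an unbounded set of values into `NUM_BUCKETS` rows, distinct components can collide; the bucket count trades memory against collision rate.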


Journal Article
TL;DR: This work explores the influence of critical parameters when performing time-series forecasting, such as rolling window length, k-step ahead forecast length, and number/nature of features used to characterize the information used as predictors.
Abstract: This work brings together and applies a broad selection of the most novel forecasting techniques, with origins and applications in other fields, to the short-term electric load forecasting problem. We present a comparison study of classic machine learning and deep learning techniques, recent methods for data-driven analysis of dynamical models (dynamic mode decomposition), and deep learning ensemble models, all applied to short-term load forecasting. This work explores the influence of critical parameters when performing time-series forecasting, such as rolling window length, k-step ahead forecast length, and number/nature of features used to characterize the information used as predictors. The deep learning architectures considered include 1D/2D convolutional and recurrent neural networks and their combination, Seq2seq with and without attention mechanisms, and recent ensemble models based on gradient boosting principles. Three groups of models stand out from the rest according to the forecast scenario: (a) deep learning ensemble models for average results, (b) simple linear regression and Seq2seq models for very short-term forecasts, and (c) combinations of convolutional/recurrent models and deep learning ensemble models for longer-term forecasts.

13 citations
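
The rolling window length and k-step ahead forecast length mentioned in the abstract above determine how a load series is cut into training pairs. The following is a minimal sketch under assumed names (`make_windows`, `window`, `horizon`), using a univariate toy series rather than the paper's data or feature set.

```python
import numpy as np

def make_windows(series, window, horizon):
    """Slice a 1-D series into (past window, k-step-ahead target) pairs.

    Returns X of shape (n, window) and Y of shape (n, horizon),
    where n = len(series) - window - horizon + 1.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series too short for this window and horizon")
    # Each row of X holds `window` past values; Y holds the next `horizon`
    X = np.stack([series[i:i + window] for i in range(n)])
    Y = np.stack([series[i + window:i + window + horizon] for i in range(n)])
    return X, Y

load = np.arange(10.0)                       # toy hourly load values 0..9
X, Y = make_windows(load, window=4, horizon=2)
# X[0] is [0, 1, 2, 3] and Y[0] is [4, 5]
```

Lengthening `window` gives the model more history per sample but fewer samples; lengthening `horizon` makes the task a longer-term forecast.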