Open Access Journal Article

Web Traffic Time Series Forecasting Using LSTM Neural Networks with Distributed Asynchronous Training

TLDR
An architecture is introduced that collects source data and forecasts the time series of Wikipedia page views in a supervised way, which the authors present as a significant step forward in time series prediction for web traffic forecasting.
Abstract
Evaluating web traffic on a web server is critical for web service providers: without a proper demand forecast, customers may face lengthy waiting times and abandon the website. The task is challenging, however, because it requires reliable predictions about the arbitrary nature of human behavior. We introduce an architecture that collects source data and forecasts the page-view time series in a supervised way. Based on the Wikipedia page views dataset proposed in a Kaggle competition in 2017, we created an updated version covering the years 2018–2020. This dataset is processed, and its features and hidden patterns are extracted, to inform the design of an advanced recurrent neural network, the Long Short-Term Memory (LSTM) network. The model is trained in a distributed fashion, following the data-parallelism paradigm and using the Downpour training strategy. Predictions for the seven dominant languages in the dataset are accurate, with the loss function and measurement error in reasonable ranges. Although the analyzed time series have poorly defined seasonality and trend patterns, the predictions are good, which indicates that analyzing hidden patterns and extracting features before designing the model improves its accuracy. Distributed training also improves the model's accuracy markedly. Because predicting web traffic as precisely as possible normally requires large datasets, we designed the forecasting system to remain accurate with limited data. We tested the proposed model on the new Wikipedia page views dataset we created and obtained a highly accurate prediction: on average, the mean absolute error of the predictions relative to the original values is below 30. This represents a significant step forward in the field of time series prediction for web traffic forecasting.
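The data-parallel, Downpour-style training mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a linear autoregressive model stands in for the LSTM, plain SGD replaces the adaptive learning rates used in the original Downpour work, threads stand in for separate machines, and the data is a synthetic weekly pattern rather than Wikipedia page views. All names (`make_windows`, `ParameterServer`, `train_downpour`, `fetch_every`) are hypothetical.

```python
import math
import threading


def make_windows(series, lookback):
    """Turn a series into supervised (past window, next value) pairs."""
    return [(series[i:i + lookback], series[i + lookback])
            for i in range(len(series) - lookback)]


def predict(weights, window):
    """Linear autoregressive prediction (stand-in for the LSTM)."""
    return sum(w * x for w, x in zip(weights, window))


def mean_absolute_error(weights, samples):
    """Average absolute difference between predictions and targets."""
    return sum(abs(predict(weights, x) - y) for x, y in samples) / len(samples)


class ParameterServer:
    """Holds the shared weights; workers never wait for each other."""

    def __init__(self, n_weights):
        self.weights = [0.0] * n_weights
        self._lock = threading.Lock()

    def fetch(self):
        with self._lock:
            return list(self.weights)

    def push(self, grads, lr):
        with self._lock:
            for j, g in enumerate(grads):
                self.weights[j] -= lr * g


def worker(server, shard, lr, epochs, fetch_every):
    """One model replica: trains on its own data shard with possibly stale weights."""
    local = server.fetch()
    for step in range(epochs * len(shard)):
        if step % fetch_every == 0:
            local = server.fetch()  # refresh the local copy only occasionally
        window, target = shard[step % len(shard)]
        error = predict(local, window) - target
        grads = [error * x for x in window]  # gradient of 0.5 * error**2
        server.push(grads, lr)  # asynchronous update of the shared weights


def train_downpour(samples, n_workers=4, lr=0.01, epochs=50, fetch_every=5):
    """Split the data across replicas and train them concurrently."""
    server = ParameterServer(len(samples[0][0]))
    shards = [samples[k::n_workers] for k in range(n_workers)]
    threads = [threading.Thread(target=worker,
                                args=(server, shard, lr, epochs, fetch_every))
               for shard in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.weights


if __name__ == "__main__":
    # Synthetic series with a 7-day cycle (assumption, not the paper's data).
    series = [0.5 + 0.4 * math.sin(2 * math.pi * t / 7) for t in range(200)]
    samples = make_windows(series, lookback=7)
    weights = train_downpour(samples)
    print("MAE:", mean_absolute_error(weights, samples))
```

Each replica sees only its own shard of the windows (data parallelism) and pushes gradients computed from a local copy of the weights that may be a few steps out of date, which is the trade-off Downpour accepts in exchange for not synchronizing workers.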


Citations
Journal Article

Hybrid Model for Time Series of Complex Structure with ARIMA Components

TL;DR: A technique to identify HMTS is described, operations to detect anomalies are proposed, and HMTS is compared with the nonlinear autoregressive neural network NARX, which confirmed the efficiency of HMTS.
Journal Article

Wavelet LSTM for Fault Forecasting in Electrical Power Grids

TL;DR: In this paper, the authors proposed a failure prediction model for the first year of the pandemic in Brazil (2020) to verify the feasibility of using time series forecasting models for fault prediction.
Journal Article

False Data Injection Attack Detection in Smart Grid Using Energy Consumption Forecasting

TL;DR: This work presents a novel forecasting-aided anomaly detection method that uses a CNN-LSTM-based autoencoder with a sequence-to-sequence architecture to combat false data injection attacks, and presents an adaptive optimal threshold based on consumption patterns to identify abnormal behaviour.
Journal Article

Fault Diagnosis of Electric Motors Using Deep Learning Algorithms and Its Application: A Review

TL;DR: In this paper, four traditional types of deep learning models are discussed and summarized: deep belief networks (DBN), autoencoders (AE), convolutional neural networks (CNN), and recurrent neural networks (RNN).
Journal Article

A comparative study of series hybrid approaches to model and predict the vehicle operating states

TL;DR: In this article, a threshold-based anomaly detection approach was developed based on the residual errors of the WNN-ARIMA model, which can help to make more accurate decisions regarding the maintenance of the vehicle.
References
Proceedings Article

Large Scale Distributed Deep Networks

TL;DR: This paper considers the problem of training a deep network with billions of parameters using tens of thousands of CPU cores and develops two algorithms for large-scale distributed training, Downpour SGD and Sandblaster L-BFGS, which increase the scale and speed of deep network training.
Posted Content

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.
Journal Article

25 years of time series forecasting

TL;DR: A review of the past 25 years of research into time series forecasting can be found in this paper, where the authors highlight results published in journals managed by the International Institute of Forecasters.
Journal Article

Statistical and Machine Learning forecasting methods: Concerns and ways forward.

TL;DR: It is found that popular ML methods are dominated in post-sample accuracy by statistical methods across both accuracy measures used and for all forecasting horizons examined, and that their computational requirements are considerably greater than those of statistical methods.
Proceedings Article

A dual-stage attention-based recurrent neural network for time series prediction

TL;DR: This paper proposes a dual-stage attention-based recurrent neural network (DA-RNN) that captures long-term temporal dependencies appropriately and selects the relevant driving series to make predictions.