
Showing papers on "Recurrent neural network published in 2021"


Journal ArticleDOI
TL;DR: It is concluded that RNNs are capable of modelling seasonality directly if the series in the dataset possess homogeneous seasonal patterns; otherwise, a deseasonalisation step is recommended.

450 citations


Journal ArticleDOI
TL;DR: Clustered FL (CFL), as discussed by the authors, exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions; it can be viewed as a postprocessing method that always achieves performance greater than or equal to that of conventional FL by allowing clients to arrive at more specialized models.
Abstract: Federated learning (FL) is currently the most widely adopted framework for collaborative training of (deep) machine learning models under privacy constraints. Despite its popularity, it has been observed that FL yields suboptimal results if the local clients’ data distributions diverge. To address this issue, we present clustered FL (CFL), a novel federated multitask learning (FMTL) framework, which exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions. In contrast to existing FMTL approaches, CFL does not require any modifications to the FL communication protocol to be made, is applicable to general nonconvex objectives (in particular, deep neural networks), does not require the number of clusters to be known a priori, and comes with strong mathematical guarantees on the clustering quality. CFL is flexible enough to handle client populations that vary over time and can be implemented in a privacy-preserving way. As clustering is only performed after FL has converged to a stationary point, CFL can be viewed as a postprocessing method that will always achieve performance greater than or equal to that of conventional FL by allowing clients to arrive at more specialized models. We verify our theoretical analysis in experiments with deep convolutional and recurrent neural networks on commonly used FL data sets.
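
The paper's core mechanism groups clients by the similarity of their weight updates once FL has converged. The following is a minimal sketch of that idea, assuming flattened client update vectors and a simplified greedy bi-partition heuristic rather than the paper's exact splitting criterion; all names and the toy data are illustrative:

```python
import numpy as np

def cosine_similarity_matrix(updates):
    """Pairwise cosine similarity between flattened client weight updates."""
    U = np.stack([u / (np.linalg.norm(u) + 1e-12) for u in updates])
    return U @ U.T

def bipartition_clients(updates):
    """Greedy bi-partition: seed with the most dissimilar pair of clients,
    then assign every other client to the more similar seed."""
    S = cosine_similarity_matrix(updates)
    n = len(updates)
    i, j = np.unravel_index(np.argmin(S), S.shape)   # most dissimilar pair
    c1, c2 = [i], [j]
    for k in range(n):
        if k in (i, j):
            continue
        (c1 if S[k, i] >= S[k, j] else c2).append(k)
    return c1, c2

# toy example: two groups of clients with opposing update directions
rng = np.random.default_rng(0)
base = rng.normal(size=100)
updates = [base + 0.1 * rng.normal(size=100) for _ in range(5)] + \
          [-base + 0.1 * rng.normal(size=100) for _ in range(5)]
print(bipartition_clients(updates))
```

In a full CFL-style loop this split would be applied recursively to each cluster, which would then continue federated training separately.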

234 citations


Journal ArticleDOI
05 Feb 2021
TL;DR: The most common deep learning architectures that are currently being successfully applied to predict time series are described, highlighting their advantages and limitations.
Abstract: Time series forecasting has become a very active field of research, and interest has only increased in recent years. Deep neural networks have proved to be powerful and are achieving high accuracy in many application fields. For these reasons, they are one of the most widely used methods of machine learning to solve problems dealing with big data nowadays. In this work, the time series forecasting problem is initially formulated along with its mathematical fundamentals. Then, the most common deep learning architectures that are currently being successfully applied to predict time series are described, highlighting their advantages and limitations. Particular attention is given to feedforward networks, recurrent neural networks (including Elman, long short-term memory, gated recurrent units, and bidirectional networks), and convolutional neural networks. Practical aspects, such as the setting of values for hyper-parameters and the choice of the most suitable frameworks, for the successful application of deep learning to time series are also provided and discussed. Several fruitful research fields in which the architectures analyzed have obtained good performance are reviewed. As a result, research gaps have been identified in the literature for several domains of application, which is expected to inspire new and better forms of knowledge.
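
As a concrete illustration of the recurrent architectures discussed in this survey, the sketch below shows a minimal one-step-ahead LSTM forecaster in PyTorch; the layer sizes, window length, and class name are illustrative choices, not values prescribed by the paper:

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """One-step-ahead forecaster: encode a window of past values with an LSTM,
    then map the last hidden state to the next value."""
    def __init__(self, n_features=1, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                 # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # prediction for the next time step

model = LSTMForecaster()
window = torch.randn(32, 24, 1)           # 32 series windows of length 24
next_value = model(window)                # shape (32, 1)
```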

207 citations


Journal ArticleDOI
15 Jul 2021-PLOS ONE
TL;DR: A taxonomy is proposed that outlines the four families of time series data augmentation, namely transformation-based methods, pattern mixing, generative models, and decomposition methods, and their application to time series classification with neural networks is surveyed.
Abstract: In recent times, deep artificial neural networks have achieved many successes in pattern recognition. Part of this success can be attributed to the reliance on big data to increase generalization. However, in the field of time series recognition, many datasets are often very small. One method of addressing this problem is through the use of data augmentation. In this paper, we survey data augmentation techniques for time series and their application to time series classification with neural networks. We propose a taxonomy and outline the four families in time series data augmentation, including transformation-based methods, pattern mixing, generative models, and decomposition methods. Furthermore, we empirically evaluate 12 time series data augmentation methods on 128 time series classification datasets with six different types of neural networks. Through the results, we are able to analyze the characteristics, advantages and disadvantages, and recommendations of each data augmentation method. This survey aims to help in the selection of time series data augmentation for neural network applications.
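
For the transformation-based family mentioned in the taxonomy, the following sketch illustrates three common augmentations (jittering, scaling, and window slicing) on a univariate series; the parameter values are illustrative defaults, not the settings benchmarked in the paper:

```python
import numpy as np

def jitter(x, sigma=0.03):
    """Add Gaussian noise to every time step (transformation-based)."""
    return x + np.random.normal(0.0, sigma, size=x.shape)

def scale(x, sigma=0.1):
    """Multiply the whole series by a random factor drawn near 1."""
    return x * np.random.normal(1.0, sigma)

def window_slice(x, ratio=0.9):
    """Crop a random contiguous window and stretch it back to the original length."""
    n = len(x)
    w = int(n * ratio)
    start = np.random.randint(0, n - w + 1)
    sliced = x[start:start + w]
    return np.interp(np.linspace(0, w - 1, n), np.arange(w), sliced)

series = np.sin(np.linspace(0, 6 * np.pi, 128))
augmented = [jitter(series), scale(series), window_slice(series)]
```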

198 citations


Journal ArticleDOI
TL;DR: This study presents the state of practice of DL in geotechnical engineering, depicts the statistical trend of the published papers, and describes four major algorithms: the feedforward neural network, recurrent neural network, convolutional neural network, and generative adversarial network.
Abstract: With the advent of the big data era, deep learning (DL) has become an essential research subject in the field of artificial intelligence (AI). Compared with traditional machine learning (ML) methods, DL algorithms are characterized by powerful feature learning and expression capabilities, which has attracted researchers worldwide to their increasingly wide applications. Because DL has been widely adopted in various research topics in geotechnical engineering, a comprehensive review summarizing its applications is desirable. Consequently, this study presents the state of practice of DL in geotechnical engineering and depicts the statistical trend of the published papers. Four major algorithms, namely the feedforward neural network (FNN), recurrent neural network (RNN), convolutional neural network (CNN), and generative adversarial network (GAN), along with their geotechnical applications, are elaborated. In addition, a thorough summary is compiled containing the published literature, the corresponding reference cases, the adopted DL algorithms, and the related geotechnical topics. Finally, the challenges and perspectives of the future development of DL in geotechnical engineering are presented and discussed.

194 citations


Journal ArticleDOI
TL;DR: A graph network is introduced and an optimized graph convolution recurrent neural network is proposed for traffic prediction, in which the spatial information of the road network is represented as a graph; the proposed method outperforms state-of-the-art traffic prediction methods.
Abstract: Traffic prediction is a core problem in intelligent transportation systems and has broad applications in transportation management and planning; the main challenge in this field is how to efficiently explore the spatial and temporal information of traffic data. Recently, various deep learning methods, such as the convolutional neural network (CNN), have shown promising performance in traffic prediction. However, such methods sample traffic data on regular grids as the input of the CNN, which destroys the spatial structure of the road network. In this paper, we introduce a graph network and propose an optimized graph convolution recurrent neural network for traffic prediction, in which the spatial information of the road network is represented as a graph. Additionally, in contrast to most current methods, which use a simple and empirical spatial graph, the proposed method learns an optimized graph in a data-driven way during the training phase, revealing the latent relationships among road segments from the traffic data. Lastly, the proposed method is evaluated on three real-world case studies, and the experimental results show that it outperforms state-of-the-art traffic prediction methods.
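
A minimal sketch of the general idea of combining a learned adjacency matrix with graph convolution and a recurrent layer is shown below (PyTorch); it is not the authors' architecture, and the dimensions, normalization, and module names are illustrative assumptions:

```python
import torch
import torch.nn as nn

class LearnedGraphConvGRU(nn.Module):
    """Sketch: a learnable (softmax-normalized) adjacency matrix feeds a graph
    convolution at every time step, and a GRU models the temporal dynamics."""
    def __init__(self, num_nodes, in_dim, hidden_dim):
        super().__init__()
        self.adj_logits = nn.Parameter(torch.randn(num_nodes, num_nodes))  # learned graph
        self.gc = nn.Linear(in_dim, hidden_dim)
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):                       # x: (batch, time, nodes, in_dim)
        A = torch.softmax(self.adj_logits, dim=-1)
        b, t, n, f = x.shape
        h = torch.relu(torch.einsum('ij,btjf->btif', A, self.gc(x)))  # graph conv per step
        h = h.permute(0, 2, 1, 3).reshape(b * n, t, -1)               # one sequence per node
        out, _ = self.gru(h)
        return self.head(out[:, -1]).view(b, n)  # next-step prediction for every node

model = LearnedGraphConvGRU(num_nodes=20, in_dim=2, hidden_dim=32)
pred = model(torch.randn(8, 12, 20, 2))          # (batch=8, nodes=20)
```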

164 citations


Journal ArticleDOI
TL;DR: An attention segmental recurrent neural network (ASRNN) that relies on a hierarchical attention neural semi-Markov conditional random field (semi-CRF) model is proposed for sequence labeling; it uses a hierarchical structure to incorporate character-level and word-level information and applies an attention mechanism to both levels.
Abstract: Natural language processing (NLP) is useful for handling text and speech, and sequence labeling plays an important role by automatically analyzing a sequence (text) to assign category labels to each part. However, the performance of these conventional models depends greatly on hand-crafted features and task-specific knowledge, which are time-consuming to engineer. Several conditional random field (CRF)-based models for sequence labeling have been presented, but the major limitation is how to use neural networks for extracting useful representations for each unit or segment in the input sequence. In this paper, we propose an attention segmental recurrent neural network (ASRNN) that relies on a hierarchical attention neural semi-Markov conditional random fields (semi-CRF) model for the task of sequence labeling. Our model uses a hierarchical structure to incorporate character-level and word-level information and applies an attention mechanism to both levels. This enables our method to differentiate more important information from less important information when constructing the segmental representation. We evaluated our model on three sequence labeling tasks, including named entity recognition (NER), chunking, and reference parsing. Experimental results show that the proposed model benefited from the hierarchical structure, and it achieved competitive and robust performance on all three sequence labeling tasks.
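
To illustrate attention over character-level states of the kind used when building segmental representations, the sketch below shows a generic attention-pooling module over the outputs of a character-level BiGRU; it is a simplified stand-in, not the hierarchical semi-CRF model proposed in the paper, and all sizes are illustrative:

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Score each element of a sequence and build a weighted sum, so more
    informative characters (or words) contribute more to the representation."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h, mask=None):          # h: (batch, length, dim)
        scores = self.score(h).squeeze(-1)    # (batch, length)
        if mask is not None:
            scores = scores.masked_fill(~mask, float('-inf'))
        weights = torch.softmax(scores, dim=-1)
        return torch.einsum('bl,bld->bd', weights, h)  # (batch, dim)

char_rnn = nn.GRU(input_size=32, hidden_size=64, batch_first=True, bidirectional=True)
chars = torch.randn(16, 10, 32)               # 16 words, 10 characters each
states, _ = char_rnn(chars)
word_repr = AttentionPooling(dim=128)(states) # character-level word representation
```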

163 citations


Journal ArticleDOI
TL;DR: The scope of this work is the development of a data-driven capacity estimation model, based on recurrent neural networks with long short-term memory capability, for cells under real-world working conditions; the model achieves a low mean absolute percentage error in the best case and is extremely robust when handling input noise.

157 citations


Journal ArticleDOI
TL;DR: A framework is developed to transform complex-valued signal waveforms into images with statistical significance, termed contour stellar images (CSIs), which can convey deep-level statistical information from the raw wireless signal waveforms while being represented in an image data format; the investigation validates that CSI is a promising method to bridge the gap between signal recognition and DL.
Abstract: The rapid development of communication systems poses unprecedented challenges, e.g., handling exploding wireless signals in a real-time and fine-grained manner. Recent advances in data-driven machine learning algorithms, especially deep learning (DL), show great potential to address these challenges. However, waveforms in the physical layer may not be suitable for the prevalent classical DL models, such as the convolutional neural network (CNN) and recurrent neural network (RNN), which mainly accept formats of images, time series, and text data in the application layer. Therefore, it is of considerable interest to bridge the gap between signal waveforms and DL-amenable data formats. In this article, we develop a framework to transform complex-valued signal waveforms into images with statistical significance, termed contour stellar images (CSIs), which can convey deep-level statistical information from the raw wireless signal waveforms while being represented in an image data format. We also explore several potential application scenarios and present effective CSI-based solutions to address the signal recognition challenges. Our investigation validates that CSI is a promising method to bridge the gap between signal recognition and DL.

152 citations


Journal ArticleDOI
TL;DR: This study proposes an automatic feature learning neural network that utilizes raw vibration signals as inputs, and uses two convolutional neural networks with different kernel sizes to automatically extract different frequency signal characteristics from raw data.
Abstract: Intelligent fault diagnosis methods based on signal analysis have been widely used for bearing fault diagnosis. These methods use a pre-determined transformation (such as empirical mode decomposition, fast Fourier transform, or discrete wavelet transform) to convert time-series signals into frequency-domain signals, so the performance of the diagnostic system relies significantly on the extracted features. However, extracting signal characteristics is fairly time-consuming and depends on specialized signal processing knowledge. Although some studies have developed highly accurate algorithms, the diagnostic results rely heavily on large data sets and unreliable human analysis. This study proposes an automatic feature learning neural network that utilizes raw vibration signals as inputs and uses two convolutional neural networks with different kernel sizes to automatically extract different frequency signal characteristics from the raw data. A long short-term memory network is then used to identify the fault type from the learned features. The data are down-sampled before being input into the network, greatly reducing the number of parameters. The experiments show that the proposed method not only achieves 98.46% average accuracy, exceeding some state-of-the-art intelligent algorithms based on prior knowledge, but also performs better in noisy environments.
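
The following sketch mirrors the described pipeline — two 1-D convolution branches with different kernel sizes over the raw vibration signal followed by an LSTM classifier — but the exact kernel sizes, channel counts, and down-sampling settings are illustrative assumptions rather than the paper's configuration:

```python
import torch
import torch.nn as nn

class DualKernelCNNLSTM(nn.Module):
    """Two 1-D convolution branches with different kernel sizes extract
    complementary frequency-scale features from the raw vibration signal,
    and an LSTM classifies the fault type from the concatenated features."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.branch_small = nn.Sequential(nn.Conv1d(1, 16, kernel_size=7, stride=2, padding=3),
                                          nn.ReLU(), nn.MaxPool1d(2))
        self.branch_large = nn.Sequential(nn.Conv1d(1, 16, kernel_size=63, stride=2, padding=31),
                                          nn.ReLU(), nn.MaxPool1d(2))
        self.lstm = nn.LSTM(32, 64, batch_first=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):                          # x: (batch, 1, signal_length)
        feats = torch.cat([self.branch_small(x), self.branch_large(x)], dim=1)
        out, _ = self.lstm(feats.transpose(1, 2))  # (batch, time, channels=32)
        return self.fc(out[:, -1])

logits = DualKernelCNNLSTM()(torch.randn(8, 1, 2048))  # raw vibration segments
```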

151 citations


Journal ArticleDOI
Li Da1, Zhaosheng Zhang1, Peng Liu1, Zhenpo Wang1, Lei Zhang1 
TL;DR: A novel battery fault diagnosis method is presented by combining the long short-term memory recurrent neural network and the equivalent circuit model to achieve accurate fault diagnosis for potential battery cell failure and precise locating of thermal runaway cells.
Abstract: Battery fault diagnosis is essential for ensuring safe and reliable operation of electric vehicles. In this article, a novel battery fault diagnosis method is presented by combining the long short-term memory recurrent neural network and the equivalent circuit model. The modified adaptive boosting method is utilized to improve diagnosis accuracy, and a prejudging model is employed to reduce computational time and improve diagnosis reliability. Considering the influence of the driver behavior on battery systems, the proposed scheme is able to achieve potential failure risk assessment and accordingly to issue early thermal runaway warning. A large volume of real-world operation data is acquired from the National Monitoring and Management Center for New Energy Vehicles in China to examine its robustness, reliability, and superiority. The verification results show that the proposed method can achieve accurate fault diagnosis for potential battery cell failure and precise locating of thermal runaway cells.

Journal ArticleDOI
TL;DR: A novel data-driven approach is proposed for wind power forecasting by integrating data pre-processing & re-sampling, anomaly detection & treatment, feature engineering, and hyperparameter tuning based on gated recurrent deep learning models, which is systematically presented for the first time.

Proceedings ArticleDOI
06 Jun 2021
TL;DR: In this article, an RNN-free Transformer-based neural network for speech separation is proposed, which achieves state-of-the-art performance on the standard WSJ0-2/3mix datasets.
Abstract: Recurrent Neural Networks (RNNs) have long been the dominant architecture in sequence-to-sequence learning. RNNs, however, are inherently sequential models that do not allow parallelization of their computations. Transformers are emerging as a natural alternative to standard RNNs, replacing recurrent computations with a multi-head attention mechanism. In this paper, we propose the SepFormer, a novel RNN-free Transformer-based neural network for speech separation. The SepFormer learns short- and long-term dependencies with a multi-scale approach that employs transformers. The proposed model achieves state-of-the-art (SOTA) performance on the standard WSJ0-2/3mix datasets. It reaches an SI-SNRi of 22.3 dB on WSJ0-2mix and an SI-SNRi of 19.5 dB on WSJ0-3mix. The SepFormer inherits the parallelization advantages of Transformers and achieves a competitive performance even when downsampling the encoded representation by a factor of 8. It is thus significantly faster and less memory-demanding than the latest speech separation systems with comparable performance.
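
The SepFormer itself uses a dual-path, multi-scale arrangement of transformer blocks; the sketch below is a much-reduced illustration of the same RNN-free idea (learned encoder, transformer over encoded frames, per-source masks, transposed-convolution decoder), with all hyperparameters chosen arbitrarily for demonstration:

```python
import torch
import torch.nn as nn

class TinySeparator(nn.Module):
    """Minimal RNN-free separator: a 1-D conv encoder, a Transformer over the
    encoded frames, per-source masks, and a transposed-conv decoder."""
    def __init__(self, n_src=2, dim=128, n_heads=4, n_layers=4):
        super().__init__()
        self.encoder = nn.Conv1d(1, dim, kernel_size=16, stride=8, padding=4)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           dim_feedforward=4 * dim, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.mask_head = nn.Conv1d(dim, n_src * dim, kernel_size=1)
        self.decoder = nn.ConvTranspose1d(dim, 1, kernel_size=16, stride=8, padding=4)
        self.n_src, self.dim = n_src, dim

    def forward(self, mix):                          # mix: (batch, 1, samples)
        enc = self.encoder(mix)                      # (batch, dim, frames)
        ctx = self.transformer(enc.transpose(1, 2)).transpose(1, 2)
        masks = torch.sigmoid(self.mask_head(ctx))   # (batch, n_src*dim, frames)
        masks = masks.view(mix.size(0), self.n_src, self.dim, -1)
        return torch.stack([self.decoder(enc * m) for m in masks.unbind(dim=1)], dim=1)

sources = TinySeparator()(torch.randn(2, 1, 8000))   # (batch, n_src, 1, samples)
```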

Journal ArticleDOI
TL;DR: The results in this paper show that the EMD method can improve wind speed prediction performance when combined with LSTM; after decomposition, LSTM is suitable for predicting high-complexity subsequences and ARIMA is suited to effectively predicting low-complexity subsequences, based on their different sample entropies.

Journal ArticleDOI
Yanan Zhong1, Jianshi Tang1, Xinyi Li1, Bin Gao1, He Qian1, Huaqiang Wu1 
TL;DR: In this paper, a parallel dynamic memristor-based reservoir computing system was proposed by applying a controllable mask process, in which the critical parameters, including state richness, feedback strength and input scaling, can be tuned by changing the mask length and the range of input signal.
Abstract: Reservoir computing is a highly efficient network for processing temporal signals due to its low training cost compared to standard recurrent neural networks, and generating rich reservoir states is critical in the hardware implementation. In this work, we report a parallel dynamic memristor-based reservoir computing system by applying a controllable mask process, in which the critical parameters, including state richness, feedback strength and input scaling, can be tuned by changing the mask length and the range of the input signal. Our system achieves a low word error rate of 0.4% in spoken-digit recognition and a low normalized root mean square error of 0.046 in the time-series prediction of the Henon map, which outperforms most existing hardware-based reservoir computing systems and also software-based ones in the Henon map prediction task. Our work could pave the way towards high-efficiency memristor-based reservoir computing systems to handle more complex temporal tasks in the future.
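
A software analogue can illustrate the masked-reservoir idea: the scalar input is multiplied by a fixed random mask to drive a set of coupled virtual nodes, and only a linear readout is trained. This is a purely illustrative sketch, not a model of the memristor hardware, and all parameter values are assumptions:

```python
import numpy as np

def masked_reservoir_states(u, n_virtual=50, leak=0.3, feedback=0.8, seed=0):
    """Software analogue of a masked dynamic reservoir: each scalar input is
    multiplied by a random binary mask to create virtual nodes, which are
    updated with leaky nonlinear dynamics; only the readout is trained."""
    rng = np.random.default_rng(seed)
    mask = rng.choice([-1.0, 1.0], size=n_virtual)        # controllable mask
    x = np.zeros(n_virtual)
    states = []
    for u_t in u:
        pre = feedback * np.roll(x, 1) + mask * u_t        # ring-coupled virtual nodes
        x = (1 - leak) * x + leak * np.tanh(pre)
        states.append(x.copy())
    return np.asarray(states)                              # (time, n_virtual)

# ridge-regression readout on a toy next-step prediction task
u = np.sin(np.linspace(0, 20 * np.pi, 2000)) * np.cos(np.linspace(0, 3 * np.pi, 2000))
S, y = masked_reservoir_states(u[:-1]), u[1:]
W = np.linalg.solve(S.T @ S + 1e-6 * np.eye(S.shape[1]), S.T @ y)   # trained readout
print(np.sqrt(np.mean((S @ W - y) ** 2)))                            # training RMSE
```

Training only the readout is what keeps the cost low compared with backpropagating through a standard recurrent network.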

Journal ArticleDOI
TL;DR: This study demonstrates that a truncated system of only two latent-space dimensions can reproduce a sharp advecting shock profile for the viscous Burgers equation with very low viscosities, and a twelve-dimensional latent space can recreate the evolution of the inviscid shallow water equations.
Abstract: A common strategy for the dimensionality reduction of nonlinear partial differential equations (PDEs) relies on the use of the proper orthogonal decomposition (POD) to identify a reduced subspace and the Galerkin projection for evolving dynamics in this reduced space. However, advection-dominated PDEs are represented poorly by this methodology since the process of truncation discards important interactions between higher-order modes during time evolution. In this study, we demonstrate that encoding using convolutional autoencoders (CAEs) followed by a reduced-space time evolution by recurrent neural networks overcomes this limitation effectively. We demonstrate that a truncated system of only two latent space dimensions can reproduce a sharp advecting shock profile for the viscous Burgers equation with very low viscosities, and a six-dimensional latent space can recreate the evolution of the inviscid shallow water equations. Additionally, the proposed framework is extended to a parametric reduced-order model by directly embedding parametric information into the latent space to detect trends in system evolution. Our results show that these advection-dominated systems are more amenable to low-dimensional encoding and time evolution by a CAE and recurrent neural network combination than the POD-Galerkin technique.
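
The workflow described here — compress snapshots with a convolutional autoencoder, then evolve the latent trajectory with a recurrent network — can be sketched as follows; the grid size, layer widths, and two-dimensional latent space are illustrative, and a trained model would additionally need reconstruction and latent-prediction losses:

```python
import torch
import torch.nn as nn

class CAE(nn.Module):
    """1-D convolutional autoencoder: compress a PDE snapshot to a small latent vector."""
    def __init__(self, n_grid=256, latent=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv1d(1, 8, 5, stride=4, padding=2), nn.ReLU(),
                                 nn.Conv1d(8, 16, 5, stride=4, padding=2), nn.ReLU(),
                                 nn.Flatten(), nn.Linear(16 * (n_grid // 16), latent))
        self.dec = nn.Sequential(nn.Linear(latent, 16 * (n_grid // 16)),
                                 nn.Unflatten(1, (16, n_grid // 16)),
                                 nn.ConvTranspose1d(16, 8, 4, stride=4), nn.ReLU(),
                                 nn.ConvTranspose1d(8, 1, 4, stride=4))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

cae = CAE()
latent_rnn = nn.LSTM(input_size=2, hidden_size=32, batch_first=True)  # evolves latent states
head = nn.Linear(32, 2)                     # maps hidden states back to latent coordinates

snapshots = torch.randn(10, 1, 256)         # 10 snapshots on a 256-point grid
recon, z = cae(snapshots)                   # z: (10, 2) latent trajectory
z_seq = z.unsqueeze(0)                      # (1, time, 2) sequence for the LSTM
hidden, _ = latent_rnn(z_seq)
z_next = head(hidden)                       # predicted latent state at the next time step
```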

Journal ArticleDOI
TL;DR: This paper presents an anomaly detection method based on a sparse-coding-inspired Deep Neural Network (DNN), where a temporally-coherent term is used to preserve the similarity between similar frames; the resulting sRNN-AE is further improved so that the framework achieves real-time anomaly detection.
Abstract: This paper presents an anomaly detection method that is based on a sparse coding inspired Deep Neural Networks (DNN). Specifically, in light of the success of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC), where a temporally-coherent term is used to preserve the similarity between two similar frames. The optimization of sparse coefficients in TSC with the Sequential Iterative Soft-Thresholding Algorithm (SIATA) is equivalent to a special stacked Recurrent Neural Networks (sRNN) architecture. Further, to reduce the computational cost in alternatively updating the dictionary and sparse coefficients in TSC optimization and to alleviate hyperparameters selection in TSC, we stack one more layer on top of the TSC-inspired sRNN to reconstruct the inputs, and arrive at an sRNN-AE. We further improve sRNN-AE in the following aspects: i) rather than using a predefined similarity measurement between two frames, we propose to learn a data-dependent similarity measurement between neighboring frames in sRNN-AE to make it more suitable for anomaly detection; ii) to reduce computational costs in the inference stage, we reduce the depth of the sRNN in sRNN-AE and, consequently, our framework achieves real-time anomaly detection; iii) to improve computational efficiency, we conduct temporal pooling over the appearance features of several consecutive frames for summarizing information temporally, then we feed appearance features and temporally summarized features into a separate sRNN-AE for more robust anomaly detection. To facilitate anomaly detection evaluation, we also build a large-scale anomaly detection dataset which is even larger than the summation of all existing datasets for anomaly detection in terms of both the volume of data and the diversity of scenes. Extensive experiments on both a toy dataset under controlled settings and real datasets demonstrate that our method significantly outperforms existing methods, which validates the effectiveness of our sRNN-AE method for anomaly detection. Codes and data have been released at https://github.com/StevenLiuWen/sRNN_TSC_Anomaly_Detection .

Journal ArticleDOI
15 Feb 2021-Energy
TL;DR: A novel hybrid forecasting system is proposed in this paper that includes effective data decomposition techniques, recurrent neural network prediction algorithms, and an error decomposition correction method that decomposes the error to correct the previously predicted wind speed.

Journal ArticleDOI
TL;DR: The results show that the proposed approach achieves higher accuracy than the standalone offline long short-term memory network and five other online algorithms, and the time to learn from new samples is only a fraction of the time needed to re-train the offline model.

Posted Content
TL;DR: In this paper, the authors present a comprehensive survey of graph neural networks for traffic forecasting problems, including graph convolutional and graph attention networks, together with a list of open data and source resources.
Abstract: Traffic forecasting is important for the success of intelligent transportation systems. Deep learning models, including convolution neural networks and recurrent neural networks, have been extensively applied in traffic forecasting problems to model spatial and temporal dependencies. In recent years, to model the graph structures in transportation systems as well as contextual information, graph neural networks have been introduced and have achieved state-of-the-art performance in a series of traffic forecasting problems. In this survey, we review the rapidly growing body of research using different graph neural networks, e.g. graph convolutional and graph attention networks, in various traffic forecasting problems, e.g. road traffic flow and speed forecasting, passenger flow forecasting in urban rail transit systems, and demand forecasting in ride-hailing platforms. We also present a comprehensive list of open data and source resources for each problem and identify future research directions. To the best of our knowledge, this paper is the first comprehensive survey that explores the application of graph neural networks for traffic forecasting problems. We have also created a public GitHub repository where the latest papers, open data, and source resources will be updated.

Journal ArticleDOI
TL;DR: This paper proposes T-MGCN (Temporal Multi-Graph Convolutional Network), a deep learning framework for traffic flow prediction that identifies several kinds of semantic correlations and encodes the non-Euclidean spatial correlations and heterogeneous semantic correlations among roads into multiple graphs, which are then modeled by a multi-graph convolutional network.
Abstract: Traffic flow prediction plays an important role in ITS (Intelligent Transportation System). This task is challenging due to the complex spatial and temporal correlations (e.g., the constraints of road network and the law of dynamic change with time). Existing work tried to solve this problem by exploiting a variety of spatiotemporal models. However, we observe that more semantic pair-wise correlations among possibly distant roads are also critical for traffic flow prediction. To jointly model the spatial, temporal, semantic correlations with various global features in the road network, this paper proposes T-MGCN ( Temporal Multi-Graph Convolutional Network ), a deep learning framework for traffic flow prediction. First, we identify several kinds of semantic correlations, and encode the non-Euclidean spatial correlations and heterogeneous semantic correlations among roads into multiple graphs. These correlations are then modeled by a multi-graph convolutional network. Second, a recurrent neural network is utilized to learn dynamic patterns of traffic flow to capture the temporal correlations. Third, a fully connected neural network is utilized to fuse the spatiotemporal correlations with global features. We evaluate T-MGCN on two real-world traffic datasets and observe improvement by approximately 3% to 6% as compared to the state-of-the-art baseline.

Journal ArticleDOI
TL;DR: In this article, a semi-supervised support vector machine (SVM) is applied to brain image fusion digital twins, and its feature recognition, diagnosis, and forecasting performance is explored.
Abstract: The purpose of this work is to explore the feature recognition, diagnosis, and forecasting performance of Semi-Supervised Support Vector Machines (S3VMs) for brain image fusion Digital Twins (DTs). Because brain images contain many unlabeled data, both unlabeled and labeled data are used, and a semi-supervised support vector machine (SVM) is proposed. Meanwhile, the AlexNet model is improved, and brain images in real space are mapped to virtual space using digital twins. Moreover, a diagnosis and prediction model of brain image fusion digital twins based on the semi-supervised SVM and the improved AlexNet is constructed. Magnetic Resonance Imaging (MRI) data from the brain tumor department of a hospital are collected to test the performance of the constructed model through simulation experiments. Some state-of-the-art models are included for performance comparison: Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), AlexNet, and Multi-Layer Perceptron (MLP). Results demonstrate that the proposed model can provide a feature recognition and extraction accuracy of 92.52%, an improvement of at least 2.76% over the other models. Its training lasts for about 100 s, and the test takes about 0.68 s. The Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) of the proposed model are 4.91 and 5.59%, respectively. Regarding the assessment indicators of brain image segmentation and fusion, the proposed model can provide a 79.55% Jaccard coefficient, a 90.43% Positive Predictive Value (PPV), a 73.09% Sensitivity, and a 75.58% Dice Similarity Coefficient (DSC), remarkably better than the other models. Acceleration efficiency analysis suggests that the improved AlexNet model is suitable for processing massive brain image data with a higher speedup indicator. In summary, the constructed model can provide high accuracy, good acceleration efficiency, and excellent segmentation and recognition performance while ensuring low errors, which can provide an experimental basis for brain image feature recognition and digital diagnosis.

Journal ArticleDOI
TL;DR: A hybrid navigation strategy called the self-learning square-root cubature Kalman filter (SL-SRCKF) is proposed; it comprises two cycle filtering systems that work in a tightly coupled mode, which allows more accurate error correction results to be obtained during GPS outages.
Abstract: To improve the seamless navigation ability of an integrated Global Positioning System (GPS)/inertial navigation system in GPS-denied environments, a hybrid navigation strategy called the self-learning square-root cubature Kalman filter (SL-SRCKF) is proposed in this article. The SL-SRCKF process contains two innovative steps: 1) it provides the traditional SRCKF with a self-learning ability, which means that navigation system observations can be provided continuously, even during long-term GPS outages; and 2) the relationship between the current Kalman filter gains and the optimal estimation error is established, which means that the optimal estimation accuracy can be improved by error compensation. The superiority of the proposed SL-SRCKF strategy is verified via experimental results, and prominent advantages of this approach include: 1) the SL-SRCKF comprises two cycle filtering systems that work in a tightly coupled mode, and this allows more accurate error correction results to be obtained during GPS outages; 2) the system's error prediction ability is effectively improved by introducing a long short-term memory, which provides much better performance than other neural networks, such as random forest regression or the recursive neural network; and 3) under different (30, 60, and 100 s) GPS outage conditions, the long-term stability of SL-SRCKF is much better than that of other error correction approaches.

Journal ArticleDOI
TL;DR: A new artificial intelligence (AI)-based method for the detection of cyber-attacks in direct current (dc) microgrids and also the identification of the attacked distributed energy resource (DER) unit is proposed.
Abstract: Cyber-physical systems (CPSs) are vulnerable to cyber-attacks. Nowadays, the detection of cyber-attacks in microgrids, as examples of CPSs, has become an important topic due to their wide use in various practical applications from renewable energy plants to power distribution and electric transportation. In this article, we propose a new artificial intelligence (AI)-based method for the detection of cyber-attacks in direct current (dc) microgrids and also the identification of the attacked distributed energy resource (DER) unit. The proposed method works based on time-series analysis and a nonlinear auto-regressive exogenous model (NARX) neural network, which is a special type of recurrent neural network, for estimating dc voltages and currents. In the proposed method, we consider the effect of cyber-attacks known as false data injection attacks (FDIAs), which attempt to disrupt accurate voltage regulation and current sharing by corrupting voltage and current sensors. In the presented strategy, first, a dc microgrid is operated and controlled without any FDIAs to gather enough normal-operation data for the training of the NARX neural networks. It is worth mentioning that in the data generation process, load changes are also included so that the data sets for load-change and cyber-attack scenarios can be distinguished. Trained and fine-tuned NARX neural networks are exploited in an online manner to estimate the output dc voltages and currents of DER units in the dc microgrid. Then, based on the estimation error, the cyber-attack is detected. To show the effectiveness of the proposed method, offline digital time-domain simulation studies are performed on a test dc microgrid system in the MATLAB/Simulink environment, and the results are verified with real-time simulations using the OPAL-RT real-time digital simulator (RTDS).
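
A minimal sketch of the detection logic follows: a NARX-style estimator predicts the next sensor reading from delayed outputs and exogenous inputs, and a residual exceeding a threshold learned from normal operation flags a possible FDIA. The network structure, delay count, and threshold value are illustrative assumptions, not the paper's tuned settings:

```python
import torch
import torch.nn as nn

class NARX(nn.Module):
    """NARX-style estimator: predict the next sensor value from delayed past
    outputs y and delayed exogenous inputs u via a small feedforward network."""
    def __init__(self, n_delays=5, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * n_delays, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))
        self.n_delays = n_delays

    def forward(self, y_past, u_past):        # each: (batch, n_delays)
        return self.net(torch.cat([y_past, u_past], dim=-1)).squeeze(-1)

def detect_attack(model, y_past, u_past, y_measured, threshold=0.05):
    """Flag a sample as attacked when the estimation residual exceeds a threshold
    chosen from normal-operation data (the threshold value here is illustrative)."""
    with torch.no_grad():
        residual = (model(y_past, u_past) - y_measured).abs()
    return residual > threshold

model = NARX()
flags = detect_attack(model, torch.randn(8, 5), torch.randn(8, 5), torch.randn(8))
```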

Journal ArticleDOI
TL;DR: This paper proposes using a graph convolutional policy network based on reinforcement learning to generate dynamic graphs when the dynamic graphs are incomplete due to data sparsity, and demonstrates that the model can achieve stable and effective long-term predictions of traffic flow and can reduce the impact of data defects on prediction results.

Journal ArticleDOI
TL;DR: A categorization into LSTMs with optimized cell state representations and LSTMs with interacting cell states is proposed; sequence-to-sequence networks with partial conditioning outperform the other approaches and are best suited to fulfilling the requirements.

Journal ArticleDOI
TL;DR: Results show that PHEV loads can be accurately forecasted using the Q-learning technique under three different scenarios (smart, uncoordinated, and coordinated), demonstrating the effectiveness and advantages of the proposed Q-learning technique.
Abstract: The rapid growth of electric vehicles (EVs) can potentially lead power grids to face new challenges due to load profile changes. To this end, a new method is presented to forecast EV charging station loads with machine learning techniques. Plug-in hybrid EV (PHEV) charging can be categorized into three main techniques (smart, uncoordinated, and coordinated). To obtain a good prediction of future PHEV loads, the Q-learning technique, which is a kind of reinforcement learning, is used in this article for the different charging scenarios. The proposed Q-learning technique improves on the forecasting of conventional artificial intelligence techniques such as the recurrent neural network and the artificial neural network. Results show that PHEV loads can be accurately forecasted using the Q-learning technique under three different scenarios (smart, uncoordinated, and coordinated). The simulations of the three scenarios are run in the Keras open-source software to validate the effectiveness and advantages of the proposed Q-learning technique.

Journal ArticleDOI
TL;DR: The results reveal that the MLP can provide acceptable performance but is not robust, and the standard RNN can perform better but its robustness is slightly affected when there are significant time lags between PWP changes and rainfall.
Abstract: Knowledge of pore-water pressure (PWP) variation is fundamental for slope stability. A precise prediction of PWP is difficult due to complex physical mechanisms and in situ natural variability. To explore the applicability and advantages of recurrent neural networks (RNNs) on PWP prediction, three variants of RNNs, i.e., standard RNN, long short-term memory (LSTM) and gated recurrent unit (GRU) are adopted and compared with a traditional static artificial neural network (ANN), i.e., multi-layer perceptron (MLP). Measurements of rainfall and PWP of representative piezometers from a fully instrumented natural slope in Hong Kong are used to establish the prediction models. The coefficient of determination (R2) and root mean square error (RMSE) are used for model evaluations. The influence of input time series length on the model performance is investigated. The results reveal that MLP can provide acceptable performance but is not robust. The uncertainty bounds of RMSE of the MLP model range from 0.24 kPa to 1.12 kPa for the selected two piezometers. The standard RNN can perform better but the robustness is slightly affected when there are significant time lags between PWP changes and rainfall. The GRU and LSTM models can provide more precise and robust predictions than the standard RNN. The effects of the hidden layer structure and the dropout technique are investigated. The single-layer GRU is accurate enough for PWP prediction, whereas a double-layer GRU brings extra time cost with little accuracy improvement. The dropout technique is essential to overfitting prevention and improvement of accuracy.
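
A minimal sketch of the kind of single-layer GRU regressor with dropout evaluated in this study is shown below; the input features, window length, hidden size, and dropout rate are illustrative assumptions rather than the paper's calibrated values:

```python
import torch
import torch.nn as nn

class GRURegressor(nn.Module):
    """Single-layer GRU with dropout before the output head, mapping a window of
    past rainfall/PWP measurements to the pore-water pressure at the next step."""
    def __init__(self, n_features=2, hidden=32, p_drop=0.2):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.drop = nn.Dropout(p_drop)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                     # x: (batch, window, n_features)
        out, _ = self.gru(x)
        return self.head(self.drop(out[:, -1]))

model = GRURegressor()
pwp_next = model(torch.randn(16, 48, 2))      # 48-step windows of [rainfall, PWP]
```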

Journal ArticleDOI
TL;DR: A novel deep-learning-based method is proposed for daily traffic flow forecasting, where incorporating contextual factors and traffic flow patterns can be critical; it greatly outperforms existing benchmark methods, and its forecasting performance is robust under various scenarios.
Abstract: Traffic flow forecasting is an important problem for the successful deployment of intelligent transportation systems, and it has been studied for more than two decades. In recent years, deep learning methods have emerged as the benchmark tools for traffic flow forecasting due to their superior prediction performance. However, most studies are based on simple deep learning methods that cannot capture inter- and intra-day traffic patterns, nor the correlation between contextual factors, such as the weather, and the traffic flow. In this paper, we propose a novel deep-learning-based method for daily traffic flow forecasting where incorporating contextual factors and traffic flow patterns can be critical. First, a particular convolutional neural network (CNN) is deployed to extract inter- and intra-day traffic flow patterns. Then the extracted features are fed into long short-term memory (LSTM) units to learn the intra-day temporal evolution of traffic flow. Finally, contextual information from historical days is integrated to enhance the prediction performance. Through a real-data case study, we show that the proposed approach achieves over 90% prediction accuracy, greatly outperforming existing benchmark methods, and that its forecasting performance is robust under various scenarios.

Journal ArticleDOI
TL;DR: The most effective techniques emerging within this branch of research are presented to identify remaining challenges as well as to build upon this platform of work towards further novel techniques for handling irregular time series data.