scispace - formally typeset
Search or ask a question

Showing papers on "Recurrent neural network published in 2020"


Posted Content
TL;DR: This work proposes the convolution-augmented transformer for speech recognition, named Conformer, which significantly outperforms the previous Transformer and CNN based models achieving state-of-the-art accuracies.
Abstract: Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs). Transformer models are good at capturing content-based global interactions, while CNNs exploit local features effectively. In this work, we achieve the best of both worlds by studying how to combine convolution neural networks and transformers to model both local and global dependencies of an audio sequence in a parameter-efficient way. To this regard, we propose the convolution-augmented transformer for speech recognition, named Conformer. Conformer significantly outperforms the previous Transformer and CNN based models achieving state-of-the-art accuracies. On the widely used LibriSpeech benchmark, our model achieves WER of 2.1%/4.3% without using a language model and 1.9%/3.9% with an external language model on test/testother. We also observe competitive performance of 2.7%/6.3% with a small model of only 10M parameters.

1,270 citations


Journal ArticleDOI
TL;DR: Deep learning has been shown to be successful in a number of domains, ranging from acoustics, images, to natural language processing as discussed by the authors. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs.
Abstract: Deep learning has been shown to be successful in a number of domains, ranging from acoustics, images, to natural language processing. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, substantial research efforts have been devoted to applying deep learning methods to graphs, resulting in beneficial advances in graph analysis techniques. In this survey, we comprehensively review the different types of deep learning methods on graphs. We divide the existing methods into five categories based on their model architectures and training strategies: graph recurrent neural networks, graph convolutional networks, graph autoencoders, graph reinforcement learning, and graph adversarial methods. We then provide a comprehensive overview of these methods in a systematic manner mainly by following their development history. We also analyze the differences and compositions of different methods. Finally, we briefly outline the applications in which they have been used and discuss potential future research directions.

686 citations


Journal ArticleDOI
TL;DR: A novel deep learning framework, Traffic Graph Convolutional Long Short-Term Memory Neural Network (TGC-LSTM), to learn the interactions between roadways in the traffic network and forecast the network-wide traffic state and shows that the proposed model outperforms baseline methods on two real-world traffic state datasets.
Abstract: Traffic forecasting is a particularly challenging application of spatiotemporal forecasting, due to the time-varying traffic patterns and the complicated spatial dependencies on road networks. To address this challenge, we learn the traffic network as a graph and propose a novel deep learning framework, Traffic Graph Convolutional Long Short-Term Memory Neural Network (TGC-LSTM), to learn the interactions between roadways in the traffic network and forecast the network-wide traffic state. We define the traffic graph convolution based on the physical network topology. The relationship between the proposed traffic graph convolution and the spectral graph convolution is also discussed. An L1-norm on graph convolution weights and an L2-norm on graph convolution features are added to the model’s loss function to enhance the interpretability of the proposed model. Experimental results show that the proposed model outperforms baseline methods on two real-world traffic state datasets. The visualization of the graph convolution weights indicates that the proposed framework can recognize the most influential road segments in real-world traffic networks.

611 citations


Proceedings ArticleDOI
16 May 2020
TL;DR: Conformer as mentioned in this paper combines convolution neural networks and transformers to model both local and global dependencies of an audio sequence in a parameter-efficient way, achieving state-of-the-art accuracies.
Abstract: Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs). Transformer models are good at capturing content-based global interactions, while CNNs exploit local features effectively. In this work, we achieve the best of both worlds by studying how to combine convolution neural networks and transformers to model both local and global dependencies of an audio sequence in a parameter-efficient way. To this regard, we propose the convolution-augmented transformer for speech recognition, named Conformer. Conformer significantly outperforms the previous Transformer and CNN based models achieving state-of-the-art accuracies. On the widely used LibriSpeech benchmark, our model achieves WER of 2.1%/4.3% without using a language model and 1.9%/3.9% with an external language model on test/testother. We also observe competitive performance of 2.7%/6.3% with a small model of only 10M parameters.

607 citations


Proceedings ArticleDOI
04 May 2020
TL;DR: In this paper, a dual-path recurrent neural network (DPRNN) is proposed for modeling extremely long sequences. But the model is not effective for modeling such long sequences due to optimization difficulties, while one-dimensional CNNs cannot perform utterance-level sequence modeling when its receptive field is smaller than the sequence length.
Abstract: Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods. Unlike the time-frequency domain approaches, the time-domain separation systems often receive input sequences consisting of a huge number of time steps, which introduces challenges for modeling extremely long sequences. Conventional recurrent neural networks (RNNs) are not effective for modeling such long sequences due to optimization difficulties, while one-dimensional convolutional neural networks (1-D CNNs) cannot perform utterance-level sequence modeling when its receptive field is smaller than the sequence length. In this paper, we propose dual-path recurrent neural network (DPRNN), a simple yet effective method for organizing RNN layers in a deep structure to model extremely long sequences. DPRNN splits the long sequential input into smaller chunks and applies intra- and inter-chunk operations iteratively, where the input length can be made proportional to the square root of the original sequence length in each operation. Experiments show that by replacing 1-D CNN with DPRNN and apply sample-level modeling in the time-domain audio separation network (TasNet), a new state-of-the-art performance on WSJ0-2mix is achieved with a 20 times smaller model than the previous best system.

476 citations


Journal ArticleDOI
01 Feb 2020
TL;DR: A survey of deep learning approaches for cyber security intrusion detection, the datasets used, and a comparative study to evaluate the efficiency of several methods are presented.
Abstract: In this paper, we present a survey of deep learning approaches for cybersecurity intrusion detection, the datasets used, and a comparative study. Specifically, we provide a review of intrusion detection systems based on deep learning approaches. The dataset plays an important role in intrusion detection, therefore we describe 35 well-known cyber datasets and provide a classification of these datasets into seven categories; namely, network traffic-based dataset, electrical network-based dataset, internet traffic-based dataset, virtual private network-based dataset, android apps-based dataset, IoT traffic-based dataset, and internet-connected devices-based dataset. We analyze seven deep learning models including recurrent neural networks, deep neural networks, restricted Boltzmann machines, deep belief networks, convolutional neural networks, deep Boltzmann machines, and deep autoencoders. For each model, we study the performance in two categories of classification (binary and multiclass) under two new real traffic datasets, namely, the CSE-CIC-IDS2018 dataset and the Bot-IoT dataset. In addition, we use the most important performance indicators, namely, accuracy, false alarm rate, and detection rate for evaluating the efficiency of several methods.

464 citations


Journal ArticleDOI
Slawek Smyl1
TL;DR: A dynamic computational graph neural network system that enables a standard exponential smoothing model to be mixed with advanced long short term memory networks into a common framework is used and the result is a hybrid and hierarchical forecasting method.

423 citations


Journal ArticleDOI
TL;DR: A comprehensive review of LSTM’s formulation and training, relevant applications reported in the literature and code resources implementing this model for a toy example are presented.
Abstract: Long short-term memory (LSTM) has transformed both machine learning and neurocomputing fields. According to several online sources, this model has improved Google’s speech recognition, greatly improved machine translations on Google Translate, and the answers of Amazon’s Alexa. This neural system is also employed by Facebook, reaching over 4 billion LSTM-based translations per day as of 2017. Interestingly, recurrent neural networks had shown a rather discrete performance until LSTM showed up. One reason for the success of this recurrent network lies in its ability to handle the exploding/vanishing gradient problem, which stands as a difficult issue to be circumvented when training recurrent or very deep neural networks. In this paper, we present a comprehensive review that covers LSTM’s formulation and training, relevant applications reported in the literature and code resources implementing this model for a toy example.

412 citations


Proceedings ArticleDOI
Qian Zhang1, Han Lu1, Hasim Sak1, Anshuman Tripathi1, Erik McDermott1, Stephen Koo1, Shankar Kumar1 
07 Feb 2020
TL;DR: An end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system and shows that the full attention version of the model beats the-state-of-the art accuracy on the LibriSpeech benchmarks.
Abstract: In this paper we present an end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system. Transformer computation blocks based on self-attention are used to encode both audio and label sequences independently. The activations from both audio and label encoders are combined with a feed-forward layer to compute a probability distribution over the label space for every combination of acoustic frame position and label history. This is similar to the Recurrent Neural Network Transducer (RNN-T) model, which uses RNNs for information encoding instead of Transformer encoders. The model is trained with the RNN-T loss well-suited to streaming decoding. We present results on the LibriSpeech dataset showing that limiting the left context for self-attention in the Transformer layers makes decoding computationally tractable for streaming, with only a slight degradation in accuracy. We also show that the full attention version of our model beats the-state-of-the art accuracy on the LibriSpeech benchmarks. Our results also show that we can bridge the gap between full attention and limited attention versions of our model by attending to a limited number of future frames.

382 citations


Journal ArticleDOI
03 Apr 2020
TL;DR: In this article, the authors proposed EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings.
Abstract: Graph representation learning resurges as a trending research subject owing to the widespread use of deep learning for Euclidean data, which inspire various creative designs of neural networks in the non-Euclidean domain, particularly graphs. With the success of these graph neural networks (GNN) in the static setting, we approach further practical scenarios where the graph dynamically evolves. Existing approaches typically resort to node embeddings and use a recurrent neural network (RNN, broadly speaking) to regulate the embeddings and learn the temporal dynamics. These methods require the knowledge of a node in the full time span (including both training and testing) and are less applicable to the frequent change of the node set. In some extreme scenarios, the node sets at different time steps may completely differ. To resolve this challenge, we propose EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings. The proposed approach captures the dynamism of the graph sequence through using an RNN to evolve the GCN parameters. Two architectures are considered for the parameter evolution. We evaluate the proposed approach on tasks including link prediction, edge classification, and node classification. The experimental results indicate a generally higher performance of EvolveGCN compared with related approaches. The code is available at https://github.com/IBM/EvolveGCN.

327 citations


Journal ArticleDOI
TL;DR: The results show that the proposed model has higher robustness and better activity detection capability than some of the reported results, and can not only adaptively extract activity features, but also has fewer parameters and higher accuracy.
Abstract: In the past years, traditional pattern recognition methods have made great progress. However, these methods rely heavily on manual feature extraction, which may hinder the generalization model performance. With the increasing popularity and success of deep learning methods, using these techniques to recognize human actions in mobile and wearable computing scenarios has attracted widespread attention. In this paper, a deep neural network that combines convolutional layers with long short-term memory (LSTM) was proposed. This model could extract activity features automatically and classify them with a few model parameters. LSTM is a variant of the recurrent neural network (RNN), which is more suitable for processing temporal sequences. In the proposed architecture, the raw data collected by mobile sensors was fed into a two-layer LSTM followed by convolutional layers. In addition, a global average pooling layer (GAP) was applied to replace the fully connected layer after convolution for reducing model parameters. Moreover, a batch normalization layer (BN) was added after the GAP layer to speed up the convergence, and obvious results were achieved. The model performance was evaluated on three public datasets (UCI, WISDM, and OPPORTUNITY). Finally, the overall accuracy of the model in the UCI-HAR dataset is 95.78%, in the WISDM dataset is 95.85%, and in the OPPORTUNITY dataset is 92.63%. The results show that the proposed model has higher robustness and better activity detection capability than some of the reported results. It can not only adaptively extract activity features, but also has fewer parameters and higher accuracy.

Journal ArticleDOI
TL;DR: A supervised LSTM (SLSTM) network is proposed to learn quality-relevant hidden dynamics for soft sensor application, which is composed of basic SLSTM unit at each sampling instant.
Abstract: Soft sensor has been extensively utilized in industrial processes for prediction of key quality variables. To build an accurate virtual sensor model, it is very significant to model the dynamic and nonlinear behaviors of process sequential data properly. Recently, a long short-term memory (LSTM) network has shown great modeling ability on various time series, in which basic LSTM units can handle data nonlinearities and dynamics with a dynamic latent variable structure. However, the hidden variables in the basic LSTM unit mainly focus on describing the dynamics of input variables, which lack representation for the quality data. In this paper, a supervised LSTM (SLSTM) network is proposed to learn quality-relevant hidden dynamics for soft sensor application, which is composed of basic SLSTM unit at each sampling instant. In the basic SLSTM unit, the quality and input variables are simultaneously utilized to learn the dynamic hidden states, which are more relevant and useful for quality prediction. The effectiveness of the proposed SLSTM network is demonstrated on a penicillin fermentation process and an industrial debutanizer column.

Journal ArticleDOI
TL;DR: This learning method–called e-prop–approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning and suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence.
Abstract: Recurrently connected networks of spiking neurons underlie the astounding information processing capabilities of the brain. Yet in spite of extensive research, how they can learn through synaptic plasticity to carry out complex network computations remains unclear. We argue that two pieces of this puzzle were provided by experimental data from neuroscience. A mathematical result tells us how these pieces need to be combined to enable biologically plausible online network learning through gradient descent, in particular deep reinforcement learning. This learning method–called e-prop–approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence. Bellec et al. present a mathematically founded approximation for gradient descent training of recurrent neural networks without backwards propagation in time. This enables biologically plausible training of spike-based neural network models with working memory and supports on-chip training of neuromorphic hardware.

Proceedings ArticleDOI
20 Apr 2020
TL;DR: A novel spatial temporal graph neural network for traffic flow prediction, which can comprehensively capture spatial and temporal patterns and provides a sequential component to model the traffic flow dynamics which can exploit both local and global temporal dependencies.
Abstract: Traffic flow analysis, prediction and management are keystones for building smart cities in the new era. With the help of deep neural networks and big traffic data, we can better understand the latent patterns hidden in the complex transportation networks. The dynamic of the traffic flow on one road not only depends on the sequential patterns in the temporal dimension but also relies on other roads in the spatial dimension. Although there are existing works on predicting the future traffic flow, the majority of them have certain limitations on modeling spatial and temporal dependencies. In this paper, we propose a novel spatial temporal graph neural network for traffic flow prediction, which can comprehensively capture spatial and temporal patterns. In particular, the framework offers a learnable positional attention mechanism to effectively aggregate information from adjacent roads. Meanwhile, it provides a sequential component to model the traffic flow dynamics which can exploit both local and global temporal dependencies. Experimental results on various real traffic datasets demonstrate the effectiveness of the proposed framework.

Journal ArticleDOI
TL;DR: This study establishes that RNNs are a potent computational framework for the learning and forecasting of complex spatiotemporal systems.

Journal ArticleDOI
TL;DR: A comprehensive survey on recent progress in applying deep learning techniques for STDM is provided and existing literatures are classified based on the types of spatio-temporal data, the data mining tasks, and the deep learning models.
Abstract: With the fast development of various positioning techniques such as Global Position System (GPS), mobile devices and remote sensing, spatio-temporal data has become increasingly available nowadays. Mining valuable knowledge from spatio-temporal data is critically important to many real-world applications including human mobility understanding, smart transportation, urban planning, public safety, health care and environmental management. As the number, volume and resolution of spatio-temporal data increase rapidly, traditional data mining methods, especially statistics based methods for dealing with such data are becoming overwhelmed. Recently deep learning models such as recurrent neural network (RNN) and convolutional neural network (CNN) have achieved remarkable success in many domains, and are also widely applied in various spatio-temporal data mining (STDM) tasks such as predictive learning, anomaly detection and classification. In this paper, we provide a comprehensive review of recent progress in applying deep learning techniques for STDM. We first categorize the spatio-temporal data into five different types, and then briefly introduce the deep learning models that are widely used in STDM. Next, we classify existing literature based on the types of spatio-temporal data, the data mining tasks, and the deep learning models, followed by the applications of deep learning for STDM in different domains.

Journal ArticleDOI
TL;DR: The proposed CNN-RNN model, along with other popular methods such as random forest, deep fully connected neural networks (DFNN), and LASSO, was used to forecast corn and soybean yield across the entire Corn Belt for years 2016, 2017, and 2018 using historical data.
Abstract: Crop yield prediction is extremely challenging due to its dependence on multiple factors such as crop genotype, environmental factors, management practices, and their interactions. This paper presents a deep learning framework using convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for crop yield prediction based on environmental data and management practices. The proposed CNN-RNN model, along with other popular methods such as random forest (RF), deep fully connected neural networks (DFNN), and LASSO, was used to forecast corn and soybean yield across the entire Corn Belt (including 13 states) in the United States for years 2016, 2017, and 2018 using historical data. The new model achieved a root-mean-square-error (RMSE) 9% and 8% of their respective average yields, substantially outperforming all other methods that were tested. The CNN-RNN has three salient features that make it a potentially useful method for other crop yield prediction studies. (1) The CNN-RNN model was designed to capture the time dependencies of environmental factors and the genetic improvement of seeds over time without having their genotype information. (2) The model demonstrated the capability to generalize the yield prediction to untested environments without significant drop in the prediction accuracy. (3) Coupled with the backpropagation method, the model could reveal the extent to which weather conditions, accuracy of weather predictions, soil conditions, and management practices were able to explain the variation in the crop yields.

Posted Content
TL;DR: A new network structure simulating the complex-valued operation, called Deep Complex Convolution Recurrent Network (DCCRN), where both CNN and RNN structures can handle complex- valued operation.
Abstract: Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF-masks or speech spectrum, via a naive convolution neural network (CNN) or recurrent neural network (RNN). Some recent studies use complex-valued spectrogram as a training target but train in a real-valued network, predicting the magnitude and phase component or real and imaginary part, respectively. Particularly, convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure and long short-term memory (LSTM), which has been proven to be helpful for complex targets. In order to train the complex target more effectively, in this paper, we design a new network structure simulating the complex-valued operation, called Deep Complex Convolution Recurrent Network (DCCRN), where both CNN and RNN structures can handle complex-valued operation. The proposed DCCRN models are very competitive over other previous networks, either on objective or subjective metric. With only 3.7M parameters, our DCCRN models submitted to the Interspeech 2020 Deep Noise Suppression (DNS) challenge ranked first for the real-time-track and second for the non-real-time track in terms of Mean Opinion Score (MOS).

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a direction-aware attention mechanism in a spatial recurrent neural network (RNN) by introducing attention weights when aggregating spatial context features in the RNN.
Abstract: Shadow detection and shadow removal are fundamental and challenging tasks, requiring an understanding of the global image semantics. This paper presents a novel deep neural network design for shadow detection and removal by analyzing the spatial image context in a direction-aware manner. To achieve this, we first formulate the direction-aware attention mechanism in a spatial recurrent neural network (RNN) by introducing attention weights when aggregating spatial context features in the RNN. By learning these weights through training, we can recover direction-aware spatial context (DSC) for detecting and removing shadows. This design is developed into the DSC module and embedded in a convolutional neural network (CNN) to learn the DSC features at different levels. Moreover, we design a weighted cross entropy loss to make effective the training for shadow detection and further adopt the network for shadow removal by using a euclidean loss function and formulating a color transfer function to address the color and luminosity inconsistencies in the training pairs. We employed two shadow detection benchmark datasets and two shadow removal benchmark datasets, and performed various experiments to evaluate our method. Experimental results show that our method performs favorably against the state-of-the-art methods for both shadow detection and shadow removal.

Proceedings ArticleDOI
01 Aug 2020
TL;DR: Deep Complex Convolution Recurrent Network (DCCRN) as mentioned in this paper is a new network structure simulating the complex-valued operation, where both convolutional encoder-decoder (CED) and long short-term memory (LSTM) structures can handle complexvalued operation.
Abstract: Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF-masks or speech spectrum, via a naive convolution neural network (CNN) or recurrent neural network (RNN). Some recent studies use complex-valued spectrogram as a training target but train in a real-valued network, predicting the magnitude and phase component or real and imaginary part, respectively. Particularly, convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure and long short-term memory (LSTM), which has been proven to be helpful for complex targets. In order to train the complex target more effectively, in this paper, we design a new network structure simulating the complex-valued operation, called Deep Complex Convolution Recurrent Network (DCCRN), where both CNN and RNN structures can handle complex-valued operation. The proposed DCCRN models are very competitive over other previous networks, either on objective or subjective metric. With only 3.7M parameters, our DCCRN models submitted to the Interspeech 2020 Deep Noise Suppression (DNS) challenge ranked first for the real-time-track and second for the non-real-time track in terms of Mean Opinion Score (MOS).

Proceedings ArticleDOI
04 May 2020
TL;DR: This article proposed and evaluated transformer-based acoustic models (AMs) for hybrid speech recognition, including various positional embedding methods and an iterated loss to enable training deep transformers.
Abstract: We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech recognition. Several modeling choices are discussed in this work, including various positional embedding methods and an iterated loss to enable training deep transformers. We also present a preliminary study of using limited right context in transformer models, which makes it possible for streaming applications. We demonstrate that on the widely used Librispeech benchmark, our transformer-based AM outperforms the best published hybrid result by 19% to 26% relative when the standard n-gram language model (LM) is used. Combined with neural network LM for rescoring, our proposed approach achieves state-of-the-art results on Librispeech. Our findings are also confirmed on a much larger internal dataset.

Journal ArticleDOI
TL;DR: A novel hybrid model with the strength of fractional order derivative is presented with their dynamical features of deep learning, long-short term memory (LSTM) networks, to predict the abrupt stochastic variation of the financial market.
Abstract: Forecasting of fast fluctuated and high-frequency financial data is always a challenging problem in the field of economics and modelling. In this study, a novel hybrid model with the strength of fractional order derivative is presented with their dynamical features of deep learning, long-short term memory (LSTM) networks, to predict the abrupt stochastic variation of the financial market. Stock market prices are dynamic, highly sensitive, nonlinear and chaotic. There are different techniques for forecast prices in the time-variant domain and due to variability and uncertain behavior in stock prices, traditional methods, such as data mining, statistical approaches, and non-deep neural networks models are not suited for prediction and generalized forecasting stock prices. While autoregressive fractional integrated moving average (ARFIMA) model provides a flexible tool for classes of long-memory models. The advancement of machine learning-based deep non-linear modelling confirms that the hybrid model efficiently extracts profound features and model non-linear functions. LSTM networks are a special kind of recurrent neural network (RNN) that map sequences of input observations to output observations with capabilities of long-term dependencies. A novel ARFIMA-LSTM hybrid recurrent network is presented in which ARFIMA model-based filters having the linear tendencies better than ARIMA model in the data and passes the residual to the LSTM model that captures nonlinearity in the residual values with the help of exogenous dependent variables. The model not only minimizes the volatility problem but also overcome the over fitting problem of neural networks. The model is evaluated using PSX company data of the stock market based on RMSE, MSE and MAPE along with a comparison of ARIMA, LSTM model and generalized regression radial basis neural network (GRNN) ensemble method independently. The forecasting performance indicates the effectiveness of the proposed AFRIMA-LSTM hybrid model to improve around 80% accuracy on RMSE as compared to traditional forecasting counterparts.

Journal ArticleDOI
TL;DR: A comprehensive review study on the recent DL methods applied to the ECG signal for the classification purposes, which showed high accuracy in correct classification of Atrial Fibrillation, Supraventricular ECTopic Beats, and Ventricular Ectopic Beats using the GRU, CNN, and LSTM, respectively.
Abstract: Deep Learning (DL) has recently become a topic of study in different applications including healthcare, in which timely detection of anomalies on Electrocardiogram (ECG) can play a vital role in patient monitoring. This paper presents a comprehensive review study on the recent DL methods applied to the ECG signal for the classification purposes. This study considers various types of the DL methods such as Convolutional Neural Network (CNN), Deep Belief Network (DBN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). From the 75 studies reported within 2017 and 2018, CNN is dominantly observed as the suitable technique for feature extraction, seen in 52% of the studies. DL methods showed high accuracy in correct classification of Atrial Fibrillation (AF) (100%), Supraventricular Ectopic Beats (SVEB) (99.8%), and Ventricular Ectopic Beats (VEB) (99.7%) using the GRU/LSTM, CNN, and LSTM, respectively.

Journal ArticleDOI
TL;DR: This paper classified textual clinical reports into four classes by using classical and ensemble machine learning algorithms, and Logistic regression and Multinomial Naïve Bayes showed better results than other ML algorithms by having 96.2% testing accuracy.
Abstract: Technology advancements have a rapid effect on every field of life, be it medical field or any other field. Artificial intelligence has shown the promising results in health care through its decision making by analysing the data. COVID-19 has affected more than 100 countries in a matter of no time. People all over the world are vulnerable to its consequences in future. It is imperative to develop a control system that will detect the coronavirus. One of the solution to control the current havoc can be the diagnosis of disease with the help of various AI tools. In this paper, we classified textual clinical reports into four classes by using classical and ensemble machine learning algorithms. Feature engineering was performed using techniques like Term frequency/inverse document frequency (TF/IDF), Bag of words (BOW) and report length. These features were supplied to traditional and ensemble machine learning classifiers. Logistic regression and Multinomial Naive Bayes showed better results than other ML algorithms by having 96.2% testing accuracy. In future recurrent neural network can be used for better accuracy.

Journal ArticleDOI
TL;DR: Experimental results indicate that the proposed SBU-LSTM architecture, especially the two-layer BDLSTM network, can achieve superior performance for the network-wide traffic prediction in both accuracy and robustness and comprehensive comparison results show that the suggested data imputation mechanism in the RNN-based models can achieve outstanding prediction performance.
Abstract: Short-term traffic forecasting based on deep learning methods, especially recurrent neural networks (RNN), has received much attention in recent years. However, the potential of RNN-based models in traffic forecasting has not yet been fully exploited in terms of the predictive power of spatial–temporal data and the capability of handling missing data. In this paper, we focus on RNN-based models and attempt to reformulate the way to incorporate RNN and its variants into traffic prediction models. A stacked bidirectional and unidirectional LSTM network architecture (SBU-LSTM) is proposed to assist the design of neural network structures for traffic state forecasting. As a key component of the architecture, the bidirectional LSTM (BDLSM) is exploited to capture the forward and backward temporal dependencies in spatiotemporal data. To deal with missing values in spatial–temporal data, we also propose a data imputation mechanism in the LSTM structure (LSTM-I) by designing an imputation unit to infer missing values and assist traffic prediction. The bidirectional version of LSTM-I is incorporated in the SBU-LSTM architecture. Two real-world network-wide traffic state datasets are used to conduct experiments and published to facilitate further traffic prediction research. The prediction performance of multiple types of multi-layer LSTM or BDLSTM models is evaluated. Experimental results indicate that the proposed SBU-LSTM architecture, especially the two-layer BDLSTM network, can achieve superior performance for the network-wide traffic prediction in both accuracy and robustness. Further, comprehensive comparison results show that the proposed data imputation mechanism in the RNN-based models can achieve outstanding prediction performance when the model’s input data contains different patterns of missing values.

Journal ArticleDOI
TL;DR: The values of three performance evaluation indicators, MBE, MAPE, and RMSE, show that the proposed hybrid deep learning model exhibits superior performance in both forecasting accuracy and stability.

Journal ArticleDOI
TL;DR: The proposed method automatically selects the best forecasting model by considering different combinations of LSTM hyperparameters for a given time series using the grid search method, which has the ability to capture nonlinear patterns in time seriesData, while considering the inherent characteristics of non-stationary time series data.

Journal ArticleDOI
TL;DR: This article aims to build a model using Recurrent Neural Networks (RNN) and especially Long-Short Term Memory model (LSTM) to predict future stock market values.

Journal ArticleDOI
TL;DR: In this article, a prediction model that can be used with different types of RNN models on subgroups of similar time series, which are identified by time series clustering techniques is presented.
Abstract: With the advent of Big Data, nowadays in many applications databases containing large quantities of similar time series are available. Forecasting time series in these domains with traditional univariate forecasting procedures leaves great potentials for producing accurate forecasts untapped. Recurrent neural networks (RNNs), and in particular Long Short Term Memory (LSTM) networks, have proven recently that they are able to outperform state-of-the-art univariate time series forecasting methods in this context, when trained across all available time series. However, if the time series database is heterogeneous, accuracy may degenerate, so that on the way towards fully automatic forecasting methods in this space, a notion of similarity between the time series needs to be built into the methods. To this end, we present a prediction model that can be used with different types of RNN models on subgroups of similar time series, which are identified by time series clustering techniques. We assess our proposed methodology using LSTM networks, a widely popular RNN variant, together with various clustering algorithms, such as kMeans, DBScan, Partition Around Medoids (PAM), and Snob. Our method achieves competitive results on benchmarking datasets under competition evaluation procedures. In particular, in terms of mean sMAPE accuracy it consistently outperforms the baseline LSTM model, and outperforms all other methods on the CIF2016 forecasting competition dataset.

Journal ArticleDOI
TL;DR: A survey of battery state estimation methods based on ML approaches such as feedforward neural networks, recurrent neural networks (RNNs), support vector machines (SVM), radial basis functions (RBF), and Hamming networks is provided.
Abstract: The growing interest and recent breakthroughs in artificial intelligence and machine learning (ML) have actively contributed to an increase in research and development of new methods to estimate the states of electrified vehicle batteries. Data-driven approaches, such as ML, are becoming more popular for estimating the state of charge (SOC) and state of health (SOH) due to greater availability of battery data and improved computing power capabilities. This paper provides a survey of battery state estimation methods based on ML approaches such as feedforward neural networks (FNNs), recurrent neural networks (RNNs), support vector machines (SVM), radial basis functions (RBF), and Hamming networks. Comparisons between methods are shown in terms of data quality, inputs and outputs, test conditions, battery types, and stated accuracy to give readers a bigger picture view of the ML landscape for SOC and SOH estimation. Additionally, to provide insight into how to best approach with the comparison of different neural network structures, an FNN and long short-term memory (LSTM) RNN are trained fifty times each for 3000 epochs. The error is somewhat different for each training repetition due to the random initial values of the trainable parameters, demonstrating that it is important to train networks multiple times to achieve the best result. Furthermore, it is recommended that when performing a comparison among estimation techniques such as those presented in this review paper, the compared networks should have a similar number of learnable parameters and be trained and tested with identical data. Otherwise, it is difficult to make a general conclusion regarding the quality of a given estimation technique.