
Showing papers on "Autoencoder published in 2021"


Journal ArticleDOI
TL;DR: The structural principles, characteristics, and some classic models of deep learning, such as the stacked autoencoder, deep belief network, deep Boltzmann machine, and convolutional neural network, are described.

408 citations


Journal ArticleDOI
TL;DR: A convolutional autoencoder deep learning framework to support unsupervised image feature learning for lung nodules from unlabeled data, which needs only a small amount of labeled data for efficient feature learning.
Abstract: At present, computed tomography (CT) is widely used to assist disease diagnosis. In particular, computer aided diagnosis (CAD) based on artificial intelligence (AI) has recently exhibited its importance in intelligent healthcare. However, it is a great challenge to establish an adequate labeled dataset for CT analysis assistance, due to privacy and security issues. Therefore, this paper proposes a convolutional autoencoder deep learning framework to support unsupervised image feature learning for lung nodules from unlabeled data, which needs only a small amount of labeled data for efficient feature learning. Comprehensive experiments show that the proposed scheme is superior to other approaches, effectively solving the intrinsically labor-intensive problem of manual image labeling. Moreover, they verify that the proposed convolutional autoencoder approach can be extended to similarity measurement of lung nodule images. Notably, the features extracted through unsupervised learning are also applicable in other related scenarios.
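The feature-learning objective described above can be illustrated with a deliberately tiny sketch: a one-dimensional linear autoencoder trained by plain gradient descent to minimize reconstruction error on unlabeled data. All names, values, and the toy architecture are illustrative, not the paper's model.

```python
# Minimal sketch of unsupervised autoencoder training: fit encoder/decoder
# weights so reconstruction error on unlabeled samples is minimized.

def train_linear_autoencoder(xs, lr=0.01, epochs=200):
    w_enc, w_dec = 0.5, 0.5          # encoder and decoder weights
    for _ in range(epochs):
        for x in xs:
            z = w_enc * x            # encode: latent feature
            x_hat = w_dec * z        # decode: reconstruction
            err = x_hat - x          # reconstruction error
            # gradients of 0.5 * err**2 w.r.t. each weight
            w_dec -= lr * err * z
            w_enc -= lr * err * w_dec * x
    return w_enc, w_dec

xs = [0.2, 0.5, 0.9, 1.3]            # unlabeled "training data"
w_enc, w_dec = train_linear_autoencoder(xs)
recon_err = sum((w_dec * w_enc * x - x) ** 2 for x in xs)
print(recon_err)                     # small after training
```

After training, the encoder output z can serve as a learned feature for a downstream classifier trained on a small labeled subset, which is the division of labor the abstract describes.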

345 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: Wu et al. as discussed by the authors proposed a contrastive regularization (CR) based on contrastive learning to exploit both the information of hazy images and clear images as negative and positive samples, respectively.
Abstract: Single image dehazing is a challenging ill-posed problem due to severe information degeneration. However, existing deep learning based dehazing methods adopt only clear images as positive samples to guide the training of the dehazing network, while negative information goes unexploited. Moreover, most of them focus on strengthening the dehazing network by increasing depth and width, leading to significant computation and memory requirements. In this paper, we propose a novel contrastive regularization (CR) built upon contrastive learning to exploit the information of both hazy images and clear images, as negative and positive samples, respectively. CR ensures that the restored image is pulled closer to the clear image and pushed far away from the hazy image in the representation space. Furthermore, considering the trade-off between performance and memory storage, we develop a compact dehazing network based on an autoencoder-like (AE) framework. It involves an adaptive mixup operation and a dynamic feature enhancement module, which benefit the network by preserving information flow adaptively and expanding the receptive field to improve the network’s transformation capability, respectively. We term our dehazing network with autoencoder and contrastive regularization AECR-Net. Extensive experiments on synthetic and real-world datasets demonstrate that our AECR-Net surpasses the state-of-the-art approaches. The code is released at https://github.com/GlassyWu/AECR-Net.
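The pull-close/push-away idea behind contrastive regularization can be sketched with toy feature vectors: the loss is low when the restored image's features sit near the clear image (positive) and far from the hazy input (negative). The L1 distance and the ratio-style loss below are simplifications, not AECR-Net's actual formulation.

```python
# Sketch of a contrastive regularizer: distance to the positive sample over
# distance to the negative sample, so minimizing it pulls the restoration
# toward the clear image and pushes it away from the hazy one.

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def contrastive_reg(restored, clear, hazy, eps=1e-8):
    return l1(restored, clear) / (l1(restored, hazy) + eps)

clear = [1.0, 2.0, 3.0]              # toy "features" of the clear image
hazy  = [4.0, 5.0, 6.0]              # toy features of the hazy input
good  = [1.1, 2.1, 3.1]              # restoration near the clear image
bad   = [3.5, 4.5, 5.5]              # restoration still near the hazy input

print(contrastive_reg(good, clear, hazy))   # low loss
print(contrastive_reg(bad, clear, hazy))    # high loss
```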

311 citations


Journal ArticleDOI
TL;DR: A Long Short-Term Memory (LSTM) network-based method for forecasting multivariate time series data, and an LSTM autoencoder network-based method combined with a one-class support vector machine algorithm for detecting anomalies in sales, are suggested.
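The detection principle behind an autoencoder-based anomaly detector can be sketched without the LSTM itself: a model trained on normal windows reconstructs them well, so windows with large reconstruction error are flagged. The "model" below is a stand-in that reconstructs each window with its mean; the threshold is assumed, not from the paper.

```python
# Reconstruction-error anomaly detection, sketched with a stand-in model.

def reconstruct(window):
    m = sum(window) / len(window)    # stand-in for a trained autoencoder
    return [m] * len(window)

def anomaly_score(window):
    rec = reconstruct(window)
    return sum((a - b) ** 2 for a, b in zip(window, rec)) / len(window)

normal = [10.0, 10.2, 9.9, 10.1]
spiked = [10.0, 10.1, 25.0, 9.8]     # an anomalous sales spike

threshold = 1.0                      # in practice fit on normal data
print(anomaly_score(normal) > threshold)   # False
print(anomaly_score(spiked) > threshold)   # True
```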

191 citations


Journal ArticleDOI
TL;DR: A new LIB RUL prediction method based on an improved convolutional neural network (CNN) and long short-term memory (LSTM), namely Auto-CNN-LSTM, is proposed in this article, developed based on deep CNN and LSTM to mine deeper information in finite data.
Abstract: Integration of each aspect of the manufacturing process with the new generation of information technology, such as the Internet of Things, big data, and cloud computing, makes industrial manufacturing systems more flexible and intelligent. Industrial big data, recording all aspects of the industrial production process, contain the key value for industrial intelligence. For industrial manufacturing, an essential and widely used electronic device is the lithium-ion battery (LIB). Accurately predicting the remaining useful life (RUL) of LIBs is urgently needed to reduce unexpected maintenance and avoid accidents. Due to the insufficient amount of degradation data, the prediction accuracy of data-driven methods is greatly limited. Besides, the mathematical models established by model-driven methods to represent the degradation process are unstable because of external factors like temperature. To solve this problem, a new LIB RUL prediction method based on an improved convolutional neural network (CNN) and long short-term memory (LSTM), namely Auto-CNN-LSTM, is proposed in this article. The method is developed based on deep CNN and LSTM to mine deeper information in finite data. In this method, an autoencoder is utilized to augment the dimensions of the data for more effective training of the CNN and LSTM. In order to obtain continuous and stable output, a filter that smooths the predicted values is used. Compared with other commonly used methods, experiments on a real-world dataset demonstrate the effectiveness of the proposed method.
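The final smoothing step can be illustrated with a simple trailing moving average; the paper does not specify this exact filter or window size, so both are assumptions for the sketch.

```python
# Trailing moving-average filter to stabilize a noisy prediction sequence.

def moving_average(values, window=3):
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        chunk = values[lo:i + 1]     # last `window` values seen so far
        out.append(sum(chunk) / len(chunk))
    return out

raw = [100, 98, 99, 80, 95, 93]      # noisy RUL predictions, a dip at index 3
smoothed = moving_average(raw)
print(smoothed)
```

The smoothed sequence damps the outlying dip while tracking the overall trend, which is the "continuous and stable output" behavior the abstract motivates.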

191 citations


Journal ArticleDOI
TL;DR: Rasmussen et al. as discussed by the authors used deep variational autoencoders to encode sequence coabundance and k-mer distribution information before clustering, and achieved state-of-the-art performance.
Abstract: Despite recent advances in metagenomic binning, reconstruction of microbial species from metagenomics data remains challenging. Here we develop variational autoencoders for metagenomic binning (VAMB), a program that uses deep variational autoencoders to encode sequence coabundance and k-mer distribution information before clustering. We show that a variational autoencoder is able to integrate these two distinct data types without any previous knowledge of the datasets. VAMB outperforms existing state-of-the-art binners, reconstructing 29–98% and 45% more near-complete (NC) genomes on simulated and real data, respectively. Furthermore, VAMB is able to separate closely related strains up to 99.5% average nucleotide identity (ANI), and reconstructed 255 and 91 NC Bacteroides vulgatus and Bacteroides dorei sample-specific genomes as two distinct clusters from a dataset of 1,000 human gut microbiome samples. We use 2,606 NC bins from this dataset to show that species of the human gut microbiome have different geographical distribution patterns. VAMB can be run on standard hardware and is freely available at https://github.com/RasmussenLab/vamb . Metagenomics data are resolved into their constituent genomes using a new deep learning method.
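One of the two inputs VAMB encodes is the per-sequence k-mer distribution; computing a normalized k-mer frequency profile looks roughly like this (a sketch with k = 2 for brevity, while VAMB itself works with longer k-mers).

```python
# Normalized k-mer frequency profile of a DNA sequence.
from itertools import product

def kmer_profile(seq, k=2):
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in counts:             # skip k-mers with ambiguous bases
            counts[km] += 1
    total = sum(counts.values()) or 1
    return {km: c / total for km, c in counts.items()}

profile = kmer_profile("ACGTACGT")
print(profile["AC"])                 # 2 of the 7 k-mers in the sequence
```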

163 citations


Journal ArticleDOI
TL;DR: In this paper, a comprehensive review and analysis of the latest deep learning methods in different image fusion scenarios is provided, and representative methods in specific fusion tasks are evaluated qualitatively and quantitatively.

153 citations


Journal ArticleDOI
TL;DR: In this paper, a deep autoencoder based energy method (DAEM) is proposed for bending, vibration and buckling analysis of Kirchhoff plates, where the objective function is to minimize the total potential energy.
Abstract: In this paper, we present a deep autoencoder based energy method (DAEM) for the bending, vibration and buckling analysis of Kirchhoff plates. The DAEM exploits higher order continuity and integrates a deep autoencoder and the minimum total potential principle in one framework, yielding an unsupervised feature learning method. The DAEM is a specific type of feedforward deep neural network (DNN) and can also serve as a function approximator. With robust feature extraction capacity, the DAEM can more efficiently identify patterns behind the whole energy system, such as the field variables, natural frequency and critical buckling load factor studied in this paper. The objective function is to minimize the total potential energy. The DAEM performs unsupervised learning based on generated collocation points inside the physical domain so that the total potential energy is minimized at all points. For the vibration and buckling analysis, the loss function is constructed based on Rayleigh’s principle, and the fundamental frequency and the critical buckling load are extracted. A scaled hyperbolic tangent activation function for the underlying mechanical model is presented, which meets the continuity requirement and alleviates the vanishing/exploding gradient problems under bending. The DAEM is implemented using PyTorch and the L-BFGS optimizer. To further improve the computational efficiency and enhance the generality of this machine learning method, we employ transfer learning. A comprehensive study of the DAEM configuration is performed for several numerical examples with various geometries, load conditions, and boundary conditions.
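A scaled hyperbolic tangent activation of the kind mentioned above can be sketched as follows; the scale constants (LeCun's classic a = 1.7159, b = 2/3) are illustrative assumptions, not necessarily the paper's values.

```python
# Scaled tanh activation: smooth (infinitely differentiable), bounded, and
# approximately identity-like near the origin with these constants.
import math

def scaled_tanh(x, a=1.7159, b=2.0 / 3.0):
    return a * math.tanh(b * x)

print(scaled_tanh(0.0))              # 0.0
print(scaled_tanh(1.0))              # close to 1 with these constants
```

The higher-order smoothness of tanh is what makes such activations attractive for energy methods, where second derivatives of the network output enter the potential-energy functional.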

150 citations


Journal ArticleDOI
TL;DR: This model retrofits the workhorse unsupervised dimension reduction device from the machine learning literature – autoencoder neural networks – to incorporate information from covariates along with returns themselves, and delivers estimates of nonlinear conditional exposures and the associated latent factors.

146 citations


Journal ArticleDOI
TL;DR: In this article, an automated nanoporous materials discovery platform powered by a supramolecular variational autoencoder was proposed for the generative design of reticular materials, which can efficiently explore this space.
Abstract: Reticular frameworks are crystalline porous materials that form via the self-assembly of molecular building blocks in different topologies, with many having desirable properties for gas storage, separation, catalysis, biomedical applications and so on. The notable variety of building blocks makes reticular chemistry both promising and challenging for prospective materials design. Here we propose an automated nanoporous materials discovery platform powered by a supramolecular variational autoencoder for the generative design of reticular materials. We demonstrate the automated design process with a class of metal–organic framework (MOF) structures and the goal of separating carbon dioxide from natural gas or flue gas. Our model shows high fidelity in capturing MOF structural features. We show that the autoencoder has a promising optimization capability when jointly trained with multiple top adsorbent candidates identified for superior gas separation. MOFs discovered here are strongly competitive against some of the best-performing MOFs/zeolites ever reported. Reticular frameworks are crystalline porous materials with desirable properties such as gas separation, but their large design space presents a challenge. An automated nanoporous materials discovery platform powered by a supramolecular variational autoencoder can efficiently explore this space.

133 citations


Journal ArticleDOI
11 Jun 2021 - IRBM
TL;DR: The proposed hybrid model provides a more effective classification technique, combined with threshold-based segmentation for detection, and the overall accuracy of the hybrid CNN-SVM is reported.
Abstract: Objective In this research paper, brain MRI images are classified using a CNN on a public dataset to distinguish benign and malignant tumors. Materials and Methods Deep learning (DL) methods have become popular for image classification due to their good performance in the last few years. Convolutional Neural Networks (CNNs), with several methods, can extract features without handcrafted models and ultimately achieve better classification accuracy. The proposed hybrid model combines a CNN and a support vector machine (SVM) for classification, with threshold-based segmentation for detection. Result The findings of previous studies are based on different models with their accuracy: Rough Extreme Learning Machine (RELM), 94.233%; Deep CNN (DCNN), 95%; Deep Neural Network (DNN) with Discrete Wavelet Autoencoder (DWA), 96%; k-nearest neighbors (kNN), 96.6%; CNN, 97.5%. The overall accuracy of the hybrid CNN-SVM is 98.4959%. Conclusion Brain cancer is among the most dangerous diseases, with a high death rate; detection and classification of brain tumors, given the abnormal growth of cells and variation in shape, orientation, and location, is a challenging task in medical imaging. Magnetic resonance imaging (MRI) is a typical medical imaging method for brain tumor analysis. Conventional machine learning (ML) techniques categorize brain cancer based on handcrafted features chosen with radiologist expertise, which can lead to execution failures and reduce the effectiveness of an algorithm. In summary, the proposed hybrid model provides a more effective classification technique.

Journal ArticleDOI
TL;DR: A novel end-to-end Learned Point Cloud Geometry Compression framework, to efficiently compress the point cloud geometry using deep neural networks (DNN) based variational autoencoders (VAE), which exceeds the geometry-based point cloud compression (G-PCC) algorithm standardized by well-known Moving Picture Experts Group (MPEG).
Abstract: This paper presents a novel end-to-end Learned Point Cloud Geometry Compression (a.k.a., Learned-PCGC) system, leveraging stacked Deep Neural Network (DNN) based Variational AutoEncoders (VAE) to efficiently compress the Point Cloud Geometry (PCG). In this systematic exploration, the PCG is first voxelized and partitioned into non-overlapping 3D cubes, which are then fed into stacked 3D convolutions for compact latent feature and hyperprior generation. Hyperpriors are used to improve the conditional probability modeling of entropy-coded latent features. A Weighted Binary Cross-Entropy (WBCE) loss is applied in training, while adaptive thresholding is used in inference to remove false voxels and reduce distortion. Objectively, our method exceeds the Geometry-based Point Cloud Compression (G-PCC) algorithm standardized by the Moving Picture Experts Group (MPEG) with a significant performance margin, e.g., at least 60% BD-Rate (Bjontegaard Delta Rate) savings, on common test datasets and other public datasets. Subjectively, our method presents better visual quality, with smoother surface reconstruction and appealing details, in comparison to all existing MPEG standard compliant PCC methods. Our method requires about 2.5 MB of parameters in total, which is a fairly small size for practical implementation, even on embedded platforms. Additional ablation studies analyze a variety of aspects (e.g., thresholding, kernels, etc.) to examine the generalization and application capacity of our Learned-PCGC. We would like to make all materials publicly accessible at https://njuvision.github.io/PCGCv1/ for reproducible research.
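The Weighted Binary Cross-Entropy loss used in training weights occupied voxels differently from empty ones; a scalar sketch follows, where the positive-class weight of 3.0 is an assumption for illustration, not the paper's setting.

```python
# Weighted binary cross-entropy over per-voxel occupancy probabilities.
import math

def wbce(y_true, y_prob, pos_weight=3.0, eps=1e-12):
    # weight occupied voxels (y = 1) more heavily than empty ones (y = 0)
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)          # clamp for numerical safety
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 0]                # ground-truth voxel occupancy
y_prob = [0.9, 0.1, 0.8, 0.2]        # predicted occupancy probabilities
print(round(wbce(y_true, y_prob), 4))
```

With pos_weight = 1 this reduces to plain binary cross-entropy; raising it penalizes missed occupied voxels more, countering the extreme sparsity of voxelized point clouds.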

Journal ArticleDOI
TL;DR: A new deep autoencoder method that fuses discriminant information about multiple fault types is proposed for bearing fault diagnosis; it effectively improves diagnostic accuracy with acceptable time efficiency, and results of the Kruskal–Wallis test indicate that the proposed method has better numerical stability.

Journal ArticleDOI
TL;DR: In this paper, a comparative study of unsupervised anomaly detection in brain MRI is presented, where a single architecture, a single resolution and the same dataset(s) are used.

Journal ArticleDOI
Xiaofeng Yuan1, Chen Ou1, Yalin Wang1, Chunhua Yang1, Weihua Gui1 
TL;DR: A layer-wise data augmentation (LWDA) strategy is proposed for the pretraining of deep learning networks and soft sensor modeling and the proposed LWDA-SAE model is applied to predict the 10% and 50% boiling points of the aviation kerosene in an industrial hydrocracking process.
Abstract: In industrial processes, inferential sensors have been extensively applied for prediction of quality variables that are difficult to measure online directly by hard sensors. Deep learning is a recently developed technique for feature representation of complex data, which has great potentials in soft sensor modeling. However, it often needs a large number of representative data to train and obtain a good deep network. Moreover, layer-wise pretraining often causes information loss and generalization degradation of high hidden layers. This greatly limits the implementation and application of deep learning networks in industrial processes. In this article, a layer-wise data augmentation (LWDA) strategy is proposed for the pretraining of deep learning networks and soft sensor modeling. In particular, the LWDA-based stacked autoencoder (LWDA-SAE) is developed in detail. Finally, the proposed LWDA-SAE model is applied to predict the 10% and 50% boiling points of the aviation kerosene in an industrial hydrocracking process. The results show that the LWDA-SAE-based soft sensor is superior to multilayer perceptron, traditional SAE, and the SAE with data augmentation only for its input layer (IDA-SAE). Moreover, LWDA-SAE can converge at a faster speed with a lower learning error than the other methods.
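The layer-wise augmentation idea can be illustrated with a stand-in augmentation step: generating extra copies of a layer's input with small Gaussian noise added. The paper's exact augmentation scheme and noise level are not reproduced here; sigma and the copy count below are assumptions.

```python
# Stand-in data augmentation: perturb a sample with small Gaussian noise.
import random

def augment(sample, sigma=0.05, copies=3, seed=0):
    rng = random.Random(seed)        # seeded for reproducibility
    return [[x + rng.gauss(0.0, sigma) for x in sample]
            for _ in range(copies)]

aug = augment([1.0, 2.0, 3.0])
print(len(aug))                      # 3 augmented copies
```

Applying such a step to the input of each hidden layer during pretraining, rather than only to the raw input layer, is the distinction the abstract draws between LWDA-SAE and IDA-SAE.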

Journal ArticleDOI
TL;DR: This work formulates the 6D object pose tracking problem in the Rao-Blackwellized particle filtering framework, where the 3D rotation and the 3D translation of an object are decoupled, and achieves state-of-the-art results on two 6D pose estimation benchmarks.
Abstract: Tracking 6-D poses of objects from videos provides rich information to a robot in performing different tasks such as manipulation and navigation. In this article, we formulate the 6-D object pose tracking problem in the Rao–Blackwellized particle filtering framework, where the 3-D rotation and the 3-D translation of an object are decoupled. This factorization allows our approach, called PoseRBPF, to efficiently estimate the 3-D translation of an object along with the full distribution over the 3-D rotation. This is achieved by discretizing the rotation space in a fine-grained manner and training an autoencoder network to construct a codebook of feature embeddings for the discretized rotations. As a result, PoseRBPF can track objects with arbitrary symmetries while still maintaining adequate posterior distributions. Our approach achieves state-of-the-art results on two 6-D pose estimation benchmarks. We open-source our implementation at https://github.com/NVlabs/PoseRBPF .
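The codebook step described above amounts to comparing an observed embedding against embeddings of discretized rotations; a minimal nearest-neighbor sketch with made-up 3-D embeddings follows (PoseRBPF's real codebook entries are learned, high-dimensional, and used to score a full distribution rather than a single best match).

```python
# Nearest-neighbor lookup in a rotation codebook by cosine similarity.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

codebook = {                          # hypothetical discretized rotations
    "rot_000": [1.0, 0.0, 0.0],
    "rot_090": [0.0, 1.0, 0.0],
    "rot_180": [0.0, 0.0, 1.0],
}

observed = [0.9, 0.1, 0.05]           # embedding of the current observation
best = max(codebook, key=lambda k: cosine(observed, codebook[k]))
print(best)                           # rot_000
```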

Journal ArticleDOI
TL;DR: When the deep learning SDAE is applied to IoT convergence-based intrusion security detection, the detection load can be reduced, the detection effect can be improved, and operation is more secure and stable.
Abstract: In order to explore the application value of the deep learning denoising autoencoder (DAE) in Internet-of-Things (IoT) fusion security, in this study, a hierarchical intrusion security detection model, the stacked DAE support vector machine (SDAE-SVM), is constructed based on a three-layer autoencoder neural network. The sample data after dimension reduction are obtained by layer-by-layer pretraining and fine-tuning. Traditional deep learning algorithms [stacked noise autoencoder (SNAE), stacked autoencoder (SAE), stacked contractive autoencoder (SCAE), stacked sparse autoencoder (SSAE), deep belief network (DBN)] are introduced for comparative simulation with the model in this study. The results show that when the encoder in the model is a 4-layer network structure, the accuracy rate (Ac) of the model is the highest (97.83%), and the false-negative rate (Fn) (1.27%) and the false-positive rate (Fp) (3.21%) are the lowest. When the number of nodes in the first hidden layer is about 110, the model accuracy is about 98%. When comparing the model designed in this study with common feature dimension reduction methods, the Ac, Fn, and Fp of this model are the best, at 98.12%, 3.21%, and 1.27%, respectively. When compared with other deep learning algorithms of the same type, the recognition rate, Ac, error rate, and rejection rate show good results. Across multiple data sets, the recognition rate, Ac, error rate, and rejection rate of the model in this study are consistently better than those of the traditional deep learning algorithms. In conclusion, when the deep learning SDAE is applied to IoT convergence-based intrusion security detection, the detection load can be reduced, the detection effect can be improved, and operation is more secure and stable.

Journal ArticleDOI
TL;DR: The proposed unsupervised deep learning method, based on a deep autoencoder with a one-class support vector machine, uses only the measured acceleration response data acquired from intact or baseline structures as training data, which enables future structural damage to be detected.
Abstract: This article proposes an unsupervised deep learning–based approach to detect structural damage. Supervised deep learning methods have been proposed in recent years, but they require data from an in...

Proceedings ArticleDOI
01 Mar 2021
TL;DR: Li et al. as discussed by the authors proposed a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation from a monocular RGB-D image.
Abstract: In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image. Previous methods suffer from inefficient category-level pose feature extraction, which leads to low accuracy and inference speed. To tackle this problem, we propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. First, we design an orientation aware autoencoder with 3D graph convolution for latent feature extraction. Thanks to the shift and scale-invariance properties of 3D graph convolution, the learned latent feature is insensitive to point shift and object size. Then, to efficiently decode category-level rotation information from the latent feature, we propose a novel decoupled rotation mechanism that employs two decoders to complementarily access the rotation information. For translation and size, we estimate them by two residuals: the difference between the mean of object points and ground truth translation, and the difference between the mean size of the category and ground truth size, respectively. Finally, to increase the generalization ability of the FS-Net, we propose an on-line box-cage based 3D deformation mechanism to augment the training data. Extensive experiments on two benchmark datasets show that the proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation. Especially in category-level pose estimation, without extra synthetic data, our method outperforms existing methods by 6.3% on the NOCS-REAL dataset.
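The residual formulation for translation can be sketched directly: the estimate is the mean of the observed object points plus a predicted residual. The residual below is a made-up stand-in for the network's output, used only to show the arithmetic.

```python
# Residual translation estimate: mean of object points + predicted residual.

def estimate_translation(points, predicted_residual):
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [m + r for m, r in zip(mean, predicted_residual)]

points = [[0.0, 0.0, 1.0], [0.2, 0.0, 1.2], [0.1, 0.1, 0.8]]  # observed points
t = estimate_translation(points, [0.05, -0.02, 0.1])          # stand-in residual
print(t)
```

Predicting a small residual around a data-derived anchor (the point mean) is generally easier for a network than regressing the absolute translation, which is the motivation the abstract gives.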

Journal ArticleDOI
TL;DR: An electroencephalogram (EEG)-based remote pathology detection system that uses a deep convolutional network consisting of 1D and 2D convolutions and a fusion network is proposed, and its performance is found to be comparable with the performance obtained using only a local server.
Abstract: An electroencephalogram (EEG)-based remote pathology detection system is proposed in this study. The system uses a deep convolutional network consisting of 1D and 2D convolutions. Features from different convolutional layers are fused using a fusion network. Various types of networks are investigated; the types include a multilayer perceptron (MLP) with a varying number of hidden layers, and an autoencoder. Experiments are done using a publicly available EEG signal database that contains two classes: normal and abnormal. The experimental results demonstrate that the proposed system achieves greater than 89% accuracy using the convolutional network followed by the MLP with two hidden layers. The proposed system is also evaluated in a cloud-based framework, and its performance is found to be comparable with the performance obtained using only a local server.

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed a feature distance stack autoencoder (FD-SAE) for rolling bearing fault diagnosis, which has stronger feature extraction ability and faster network convergence speed.
Abstract: In recent years, the autoencoder has been widely used for the fault diagnosis of mechanical equipment because of its excellent performance in feature extraction and dimension reduction; however, the original autoencoder has only limited feature extraction ability due to the lack of label information. To solve this issue, this study proposes a feature distance stack autoencoder (FD-SAE) for rolling bearing fault diagnosis. Compared with existing methods, FD-SAE has stronger feature extraction ability and faster network convergence speed. By analyzing the characteristics of the original rolling bearing data, it is found that there are evident differences between normal data and faulty data. Therefore, a simple linear support vector machine (SVM) is used to separate normal data from faulty data, and then the proposed FD-SAE is used for fault classification. The novel combination of SVM and FD-SAE has a simple structure and low computational complexity. Finally, the proposed method is verified on the rolling bearing data set of Case Western Reserve University (CWRU).

Journal ArticleDOI
TL;DR: In this paper, a convolutional neural network autoencoder unmixing (CNNAEU) method was proposed to exploit the spatial and spectral structure of hyperspectral images.
Abstract: Blind hyperspectral unmixing is the process of expressing the measured spectrum of a pixel as a combination of a set of spectral signatures called endmembers and simultaneously determining their fractional abundances in the pixel. Most unmixing methods are strictly spectral and do not exploit the spatial structure of hyperspectral images (HSIs). In this article, we present a new spectral–spatial linear mixture model and an associated estimation method based on a convolutional neural network autoencoder unmixing (CNNAEU). The CNNAEU technique exploits the spatial and the spectral structure of HSIs both for endmember and abundance map estimation. As it works directly with patches of HSIs and does not use any pooling or upsampling layers, the spatial structure is preserved throughout and abundance maps are obtained as feature maps of a hidden convolutional layer. We compared the CNNAEU method to four conventional and three deep learning state-of-the-art unmixing methods using four real HSIs. Experimental results show that the proposed CNNAEU technique performs particularly well and consistently when it comes to endmembers’ extraction and outperforms all the comparison methods.
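The linear mixture model underlying unmixing expresses each pixel spectrum as an abundance-weighted sum of endmember spectra, with nonnegative abundances summing to one. A toy sketch with invented 4-band signatures:

```python
# Linear spectral mixing: pixel = sum of abundance * endmember spectrum,
# with abundances nonnegative and summing to one.

def mix(endmembers, abundances):
    assert all(a >= 0 for a in abundances)          # nonnegativity
    assert abs(sum(abundances) - 1.0) < 1e-9        # sum-to-one
    bands = len(endmembers[0])
    return [sum(a * e[b] for a, e in zip(abundances, endmembers))
            for b in range(bands)]

soil  = [0.4, 0.5, 0.6, 0.7]         # hypothetical endmember signatures
water = [0.1, 0.1, 0.05, 0.02]
pixel = mix([soil, water], [0.75, 0.25])
print(pixel)
```

Unmixing inverts this map: given pixels, recover the endmember signatures and per-pixel abundances, which is what the CNNAEU autoencoder learns jointly.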

Journal ArticleDOI
TL;DR: A semi-supervised fault diagnosis method called the hybrid classification autoencoder, which utilizes a softmax classifier to directly diagnose the health condition based on the encoded features from the autoencoder, is proposed in this paper.

Journal ArticleDOI
TL;DR: In this article, the authors proposed an end-member-guided unmixing network (EGU-Net), which is a two-stream Siamese deep network that learns an additional network from the pure or nearly pure endmembers to correct the weights of another unmixer by sharing network parameters and adding spectrally meaningful constraints.
Abstract: Over the past decades, enormous efforts have been made to improve the performance of linear or nonlinear mixing models for hyperspectral unmixing (HU), yet their ability to simultaneously generalize various spectral variabilities (SVs) and extract physically meaningful endmembers still remains limited due to the poor ability in data fitting and reconstruction and the sensitivity to various SVs. Inspired by the powerful learning ability of deep learning (DL), we attempt to develop a general DL approach for HU, by fully considering the properties of endmembers extracted from the hyperspectral imagery, called endmember-guided unmixing network (EGU-Net). Beyond the alone autoencoder-like architecture, EGU-Net is a two-stream Siamese deep network, which learns an additional network from the pure or nearly pure endmembers to correct the weights of another unmixing network by sharing network parameters and adding spectrally meaningful constraints (e.g., nonnegativity and sum-to-one) toward a more accurate and interpretable unmixing solution. Furthermore, the resulting general framework is not only limited to pixelwise spectral unmixing but also applicable to spatial information modeling with convolutional operators for spatial-spectral unmixing. Experimental results conducted on three different datasets with the ground truth of abundance maps corresponding to each material demonstrate the effectiveness and superiority of the EGU-Net over state-of-the-art unmixing algorithms. The codes will be available from the website: https://github.com/danfenghong/IEEE_TNNLS_EGU-Net.

Proceedings ArticleDOI
Zhihan Li1, Youjian Zhao1, Jiaqi Han1, Ya Su1, Rui Jiao1, Xidao Wen1, Dan Pei1 
14 Aug 2021
TL;DR: InterFusion as discussed by the authors proposes an unsupervised method that simultaneously models the inter-metric and temporal dependency for multivariate time series (MTS) anomaly detection and anomaly interpretation.
Abstract: Anomaly detection is a crucial task for monitoring the various statuses (i.e., metrics) of entities (e.g., manufacturing systems and Internet services), which are often characterized by multivariate time series (MTS). In practice, it's important to precisely detect the anomalies, as well as to interpret the detected anomalies by localizing a group of most anomalous metrics, to further assist failure troubleshooting. In this paper, we propose InterFusion, an unsupervised method that simultaneously models the inter-metric and temporal dependency for MTS. Its core idea is to model the normal patterns inside MTS data through a hierarchical Variational AutoEncoder with two stochastic latent variables, each of which learns low-dimensional inter-metric or temporal embeddings. Furthermore, we propose an MCMC-based method to obtain reasonable embeddings and reconstructions at anomalous parts for MTS anomaly interpretation. Our evaluation experiments are conducted on four real-world datasets from different industrial domains (three existing and one newly published dataset collected through our pilot deployment of InterFusion). InterFusion achieves an average anomaly detection F1-Score higher than 0.94 and anomaly interpretation performance of 0.87, significantly outperforming recent state-of-the-art MTS anomaly detection methods.
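Detection quality above is reported as an F1-Score; for reference, given binary anomaly labels and predictions, F1 combines precision and recall as below (a generic sketch, not the paper's exact segment-based evaluation protocol).

```python
# F1-Score from binary ground-truth labels and predictions.

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print(round(f1_score(y_true, y_pred), 4))   # 0.8
```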

Journal ArticleDOI
TL;DR: A deep learning-based Intrusion Detection System (IDS) for ITS, in particular to discover suspicious network activity in In-Vehicle Networks (IVN), vehicle-to-vehicle (V2V) communications and vehicle-to-infrastructure (V2I) networks.
Abstract: Intelligent Transportation Systems (ITS), especially Autonomous Vehicles (AVs), are vulnerable to security and safety issues that threaten people's lives. Unlike manual vehicles, the communications and computing components of AVs can be compromised using advanced hacking techniques, barring AVs from effective use in our daily lives. Once manual vehicles are connected to the Internet, forming the Internet of Vehicles (IoV), they can be exploited by cyber-attacks such as denial of service, sniffing, distributed denial of service, spoofing and replay attacks. In this article, we present a deep learning-based Intrusion Detection System (IDS) for ITS, in particular to discover suspicious network activity in In-Vehicle Networks (IVN), vehicle-to-vehicle (V2V) communications and vehicle-to-infrastructure (V2I) networks. A deep learning architecture based on a Long Short-Term Memory (LSTM) autoencoder is designed to recognize intrusive events from the central network gateways of AVs. The proposed IDS is evaluated using two benchmark datasets, i.e., the car hacking dataset for in-vehicle communications and the UNSW-NB15 dataset for external network communications. The experimental results demonstrate that our proposed system achieves over 99% accuracy for detecting all types of attacks on the car hacking dataset and 98% accuracy on the UNSW-NB15 dataset, outperforming eight other intrusion detection techniques.

Journal ArticleDOI
TL;DR: In this article, a contrastive self-supervised learning framework for anomaly detection on attributed networks is proposed, which exploits the local information from network data by sampling a novel type of contrastive instance pair, which can capture the relationship between each node and its neighboring substructure.
Abstract: Anomaly detection on attributed networks attracts considerable research interest due to the wide use of attributed networks in modeling a broad range of complex systems. Recently, deep learning-based anomaly detection methods have shown promising results over shallow approaches, especially on networks with high-dimensional attributes and complex structures. However, existing approaches, which employ a graph autoencoder as their backbone, do not fully exploit the rich information of the network, resulting in suboptimal performance. Furthermore, these methods do not directly target anomaly detection in their learning objective and fail to scale to large networks due to the full-graph training mechanism. To overcome these limitations, in this article, we present a novel Contrastive self-supervised Learning framework for Anomaly detection on attributed networks (CoLA for short). Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair, which can capture the relationship between each node and its neighboring substructure in an unsupervised way. Meanwhile, a well-designed graph neural network (GNN)-based contrastive learning model is proposed to learn informative embeddings from high-dimensional attributes and local structure and to measure the agreement of each instance pair via its output score. The multi-round predicted scores of the contrastive learning model are further used to evaluate the abnormality of each node with statistical estimation. In this way, the learning model is trained with a specific anomaly detection-aware target. Furthermore, since the input of the GNN module is batches of instance pairs instead of the full network, our framework can adapt to large networks flexibly. Experimental results show that our proposed framework outperforms the state-of-the-art baseline methods on all seven benchmark data sets.
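The contrastive instance-pair sampling described above can be illustrated in a toy form: a positive pair matches a node with its own neighboring substructure, a negative pair matches it with another node's substructure. This is a deliberately simplified sketch (CoLA samples substructures via random walks, not full one-hop neighborhoods), and the function and graph are hypothetical:

```python
import random

def sample_instance_pairs(adj, node, rng):
    """Sample one positive and one negative contrastive pair for `node`:
    positive = the node's own neighboring substructure,
    negative = the substructure of a randomly chosen other node.
    (Simplified: the paper samples substructures with random walks.)"""
    positive = (node, sorted(adj[node]))
    other = rng.choice([n for n in adj if n != node])
    negative = (node, sorted(adj[other]))
    return positive, negative

# A tiny attributed network, structure only (adjacency sets).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
pos, neg = sample_instance_pairs(adj, 0, random.Random(0))
print(pos)  # (0, [1, 2])
```

A GNN-based discriminator would then score each pair, and a node whose positive pairs score no better than its negative pairs across multiple sampling rounds is judged anomalous.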

Journal ArticleDOI
TL;DR: A hybrid method that combines handcrafted features with the encoding of an autoencoder to achieve high performance in seizure detection from EEG signals; the computational complexity of the features is also investigated.
Abstract: Epilepsy, a brain disease generally associated with seizures, has tremendous effects on people’s quality of life. Diagnosis of epileptic seizures is commonly performed on electroencephalography (EEG) signals, and by using computer-aided diagnosis systems (CADS), neurologists can diagnose epileptic seizure stages more accurately. In these systems, a mandatory stage is feature extraction, performed either by handcrafting features or by learning them, ordinarily with a deep neural network. While research in this field commonly shows the value of a limited group of features, an accurate comparison between the different suggested features is essential. In this article, first, a comparison of the importance of 50 different handcrafted features for seizure detection is presented. Additionally, the computational complexity of the features is investigated. Then the best features, selected by Fisher score, are used to classify signals on a benchmark dataset for evaluation. Additionally, a five-layer convolutional autoencoder is applied to learn features, allowing a complete comparison among feature extraction approaches. Finally, a hybrid method is employed, which combines handcrafted features with the autoencoder's encoding to reach high performance in seizure detection from EEG signals.
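The Fisher-score ranking used above to pick the best handcrafted features can be sketched for the two-class (seizure vs. normal) case: between-class separation of a feature's means divided by its within-class spread. The function names and toy values below are illustrative, not from the paper:

```python
import statistics

def fisher_score(feature_a, feature_b):
    """Fisher score of one feature for a two-class problem:
    squared difference of class means over the sum of class variances."""
    num = (statistics.mean(feature_a) - statistics.mean(feature_b)) ** 2
    den = statistics.pvariance(feature_a) + statistics.pvariance(feature_b)
    return num / den if den else float("inf")

def top_features(seizure, normal, k):
    # seizure/normal: per-feature lists of values; return the indices of
    # the k features with the highest Fisher scores.
    scores = [fisher_score(s, n) for s, n in zip(seizure, normal)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Feature 0 separates the classes well; feature 1 does not.
seizure = [[5.0, 5.2, 4.8], [1.0, 1.2, 0.8]]
normal  = [[1.0, 1.1, 0.9], [1.0, 0.9, 1.1]]
print(top_features(seizure, normal, 1))  # [0]
```

In the hybrid method, the top-ranked handcrafted features would then be concatenated with the convolutional autoencoder's encoding before classification.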

Journal ArticleDOI
TL;DR: This article proposes a smoothness-inducing sequential variational auto-encoder (VAE) model for the robust estimation and anomaly detection of multidimensional time series and shows the effectiveness of the model on both synthetic data sets and public real-world benchmarks.
Abstract: Deep generative models have demonstrated their effectiveness in learning latent representations and modeling complex dependencies of time series. In this article, we present a smoothness-inducing sequential variational auto-encoder (SISVAE) model for the robust estimation and anomaly detection of multidimensional time series. Our model is based on the VAE, and its backbone is a recurrent neural network that captures latent temporal structures of time series for both the generative model and the inference model. Specifically, our model parameterizes the mean and variance for each time stamp with flexible neural networks, resulting in a nonstationary model that can work without the assumption of constant noise commonly made by existing Markov models. However, such flexibility may make the model fragile to anomalies. To achieve robust density estimation, which can also benefit detection tasks, we propose a smoothness-inducing prior over possible estimations. The proposed prior works as a regularizer that penalizes nonsmooth reconstructions. Our model is learned efficiently with a novel stochastic gradient variational Bayes estimator. In particular, we study two decision criteria for anomaly detection: reconstruction probability and reconstruction error. We show the effectiveness of our model on both synthetic data sets and public real-world benchmarks.
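The smoothness-inducing regularizer described above can be illustrated in its simplest form: a penalty on differences between reconstructions at adjacent time stamps, so abrupt jumps (which anomalies would otherwise pull the model toward) are discouraged. This is a simplified stand-in for SISVAE's prior, with hypothetical names and values:

```python
def smoothness_penalty(means, lam=1.0):
    """Penalize squared differences between reconstructed means at
    adjacent time stamps (a simplified stand-in for the
    smoothness-inducing prior in SISVAE); lam weights the penalty."""
    return lam * sum((means[t + 1] - means[t]) ** 2
                     for t in range(len(means) - 1))

smooth = [0.0, 0.1, 0.2, 0.3]   # gradual trend: small penalty
jumpy  = [0.0, 1.0, 0.0, 1.0]   # abrupt jumps: large penalty
print(smoothness_penalty(smooth) < smoothness_penalty(jumpy))  # True
```

Added to the VAE's training objective, such a term pulls reconstructions toward smooth trajectories, which makes large reconstruction errors (or low reconstruction probabilities) at anomalous points easier to detect.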

Journal ArticleDOI
TL;DR: A method called ensemble semi-supervised gated stacked AE (ES2GSAE) in which different unlabeled datasets are used to train different submodels, ensuring their diversity so that unlabeled samples can be utilized more efficiently and help enhance model performance.
Abstract: Soft-sensing techniques are of great significance in industrial processes for monitoring and predicting key performance indicators. Owing to their effective nonlinear feature extraction and strong extensibility, the autoencoder (AE) and its extensions have been widely developed for industrial applications. Nevertheless, an AE commonly uses only the last hidden layer for regression modeling against the output, which wastes information, since the shallow layers are also abstractions of the input data. Besides, when there are excessive unlabeled samples, AE-based models are unlikely to make full use of them and may even degrade in performance. To deal with these issues, a method called ensemble semi-supervised gated stacked AE (ES2GSAE) is proposed in this article. Gate units are used to develop connections between the different hidden layers and the output layer, and they also help quantify the contribution of each hidden layer. Moreover, the idea of ensemble learning is combined with semi-supervised learning: different unlabeled datasets are used to train different submodels to ensure their diversity. In this way, unlabeled samples can be utilized more efficiently and help enhance model performance. The method's effectiveness and superiority are verified on a real industrial process by comparing it with other typical AE-based models.
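The gate units described above can be pictured as a learned weighting that lets every hidden layer, not just the last one, contribute to the regression output. The minimal sketch below uses a softmax over one logit per layer and scalar layer features; the weights and values are illustrative stand-ins, not the paper's learned parameters:

```python
import math

def gated_combination(layer_outputs, gate_logits):
    """Fuse hidden-layer representations with gate weights (softmax over
    one logit per layer), so shallow layers also contribute to the
    regression output, as in the gated stacked AE idea. Values here are
    illustrative, not learned."""
    exps = [math.exp(g) for g in gate_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = sum(w * h for w, h in zip(weights, layer_outputs))
    return fused, weights

# Three layers' (scalar) features; the deepest layer gets the largest gate.
fused, weights = gated_combination([0.2, 0.5, 0.9], [0.0, 0.0, 2.0])
print(round(fused, 3))  # ≈ 0.783
```

In ES2GSAE, such gated fusions would be produced by several submodels, each trained with a different unlabeled dataset, and their predictions combined as an ensemble.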