Showing papers on "Autoencoder" published in 2019


Proceedings ArticleDOI
01 Jan 2019
TL;DR: The proposed memory-augmented autoencoder, MemAE, makes no assumptions about the data type and is therefore applicable to different tasks; experiments on various datasets show its strong generalization and effectiveness.
Abstract: Deep autoencoders have been extensively used for anomaly detection. Trained on normal data, the autoencoder is expected to produce higher reconstruction error for abnormal inputs than for normal ones, which is adopted as a criterion for identifying anomalies. However, this assumption does not always hold in practice. It has been observed that sometimes the autoencoder "generalizes" so well that it can also reconstruct anomalies well, leading to missed detection of anomalies. To mitigate this drawback of autoencoder-based anomaly detectors, we propose to augment the autoencoder with a memory module and develop an improved autoencoder called the memory-augmented autoencoder, i.e. MemAE. Given an input, MemAE first obtains the encoding from the encoder and then uses it as a query to retrieve the most relevant memory items for reconstruction. At the training stage, the memory contents are updated and are encouraged to represent the prototypical elements of the normal data. At the test stage, the learned memory is fixed, and the reconstruction is obtained from a few selected memory records of the normal data. The reconstruction will thus tend to be close to a normal sample, so the reconstruction errors on anomalies are strengthened for anomaly detection. MemAE is free of assumptions on the data type and is thus general enough to be applied to different tasks. Experiments on various datasets demonstrate the excellent generalization and high effectiveness of the proposed MemAE.
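
The retrieval step described in this abstract (use the encoding as a query, pick the most relevant memory items, rebuild the latent from them) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the cosine similarity, the softmax weighting, the `shrink` threshold and the function names are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def address_memory(query, memory, shrink=0.0):
    """Return (weights, latent) for a query encoding.

    memory is a list of memory items, each the same length as query.
    shrink is a hypothetical sparsity threshold: weights at or below it
    are zeroed so that only a few memory items contribute.
    """
    sims = [cosine(query, item) for item in memory]
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]

    kept = [w if w > shrink else 0.0 for w in weights]
    norm = sum(kept)
    if norm > 0.0:
        weights = [w / norm for w in kept]  # renormalise the surviving weights

    # The latent passed to the decoder is a weighted sum of memory items.
    latent = [
        sum(w * item[j] for w, item in zip(weights, memory))
        for j in range(len(query))
    ]
    return weights, latent
```

Because the latent is always built from stored items, an anomalous query is pulled toward a stored "normal" prototype, which is the intuition the abstract gives for the larger reconstruction error on anomalies.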

888 citations


Journal ArticleDOI
TL;DR: A new DTL method is proposed, which uses a three-layer sparse auto-encoder to extract the features of raw data, and applies the maximum mean discrepancy term to minimize the discrepancy penalty between the features from training data and testing data.
Abstract: Fault diagnosis plays an important role in modern industry. With the development of smart manufacturing, data-driven fault diagnosis has become a hot topic. However, traditional methods have two shortcomings: 1) their performance depends on the good design of handcrafted features of data, but it is difficult to predesign these features and 2) they work well under a general assumption: the training data and testing data should be drawn from the same distribution, but this assumption fails in many engineering applications. Since deep learning (DL) can extract the hierarchical representation features of raw data, and transfer learning provides a good way to perform a learning task on different but related distribution datasets, deep transfer learning (DTL) has been developed for fault diagnosis. In this paper, a new DTL method is proposed. It uses a three-layer sparse auto-encoder to extract the features of raw data, and applies the maximum mean discrepancy term to minimize the discrepancy penalty between the features from training data and testing data. The proposed DTL is tested on the famous motor bearing dataset from Case Western Reserve University. The results show a good improvement, and DTL achieves higher prediction accuracies on most experiments than DL. The prediction accuracy of DTL, which is as high as 99.82%, is better than the results of other algorithms, including deep belief network, sparse filter, artificial neural network, support vector machine and some other traditional methods. What is more, two additional analytical experiments are conducted. The results show that a good unlabeled third dataset may be helpful to DTL, and a good linear relationship between the final prediction accuracies and their standard deviations has been observed.
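
For readers unfamiliar with the maximum mean discrepancy (MMD) term mentioned above, the following is a small sketch of a biased squared-MMD estimate between two feature sets. The Gaussian kernel, the `bandwidth` parameter and the function names are assumptions for illustration; the abstract does not say which kernel or estimator the paper uses.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD between two lists of feature vectors.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    """
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )
```

The value is zero when the two sets coincide and grows as the feature distributions drift apart, which is why it can serve as a penalty that pulls training-domain and testing-domain features together.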

760 citations


Journal ArticleDOI
TL;DR: A denoising method based on a deep count autoencoder network that scales linearly with the number of cells, and therefore is compatible with large data sets, is developed, and it is demonstrated that DCA denoising improves a diverse set of typical scRNA-seq data analyses using simulated and real datasets.
Abstract: Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at a cellular resolution. However, noise due to amplification and dropout may obstruct analyses, so scalable denoising methods for increasingly large but sparse scRNA-seq data are needed. We propose a deep count autoencoder network (DCA) to denoise scRNA-seq datasets. DCA takes the count distribution, overdispersion and sparsity of the data into account using a negative binomial noise model with or without zero-inflation, and nonlinear gene-gene dependencies are captured. Our method scales linearly with the number of cells and can, therefore, be applied to datasets of millions of cells. We demonstrate that DCA denoising improves a diverse set of typical scRNA-seq data analyses using simulated and real datasets. DCA outperforms existing methods for data imputation in quality and speed, enhancing biological discovery.
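
The negative binomial noise model "with or without zero-inflation" can be written down compactly. The sketch below uses the common mean/dispersion parameterisation (`mu`, `theta`) and a dropout probability `pi`; these names and the per-count formulation are assumptions for illustration and are not taken from the DCA code.

```python
import math


def nb_log_prob(x, mu, theta):
    """Log-probability of count x under a negative binomial with
    mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta)
        - math.lgamma(theta)
        - math.lgamma(x + 1.0)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a zero-inflated NB.

    pi is the probability of an extra ("dropout") zero; pi = 0 recovers
    the plain negative binomial.
    """
    nb = math.exp(nb_log_prob(x, mu, theta))
    if x == 0:
        return -math.log(pi + (1.0 - pi) * nb)
    return -math.log((1.0 - pi) * nb)
```

In an autoencoder of this kind the network would output `mu`, `theta` and `pi` per gene and cell, and the sum of these per-count losses would replace the usual squared reconstruction error.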

661 citations


Posted Content
TL;DR: It is demonstrated that a multi-scale hierarchical organization of VQ-VAE, augmented with powerful priors over the latent codes, is able to generate samples with quality that rivals that of state of the art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GAN's known shortcomings such as mode collapse and lack of diversity.
Abstract: We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher coherence and fidelity than possible before. We use simple feed-forward encoder and decoder networks, making our model an attractive candidate for applications where the encoding and/or decoding speed is critical. Additionally, VQ-VAE requires sampling an autoregressive model only in the compressed latent space, which is an order of magnitude faster than sampling in the pixel space, especially for large images. We demonstrate that a multi-scale hierarchical organization of VQ-VAE, augmented with powerful priors over the latent codes, is able to generate samples with quality that rivals that of state of the art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GAN's known shortcomings such as mode collapse and lack of diversity.
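
The "vector quantized" part of VQ-VAE refers to replacing each encoder output with its nearest entry in a learned codebook, so that the latent space becomes a grid of discrete codes. A minimal sketch of that lookup, assuming squared Euclidean distance and a hypothetical `quantize` helper (the hierarchical multi-scale model and the autoregressive priors from the paper are not shown):

```python
def quantize(vectors, codebook):
    """Map each vector to the index and value of its nearest codebook entry."""
    indices = []
    for v in vectors:
        # Squared Euclidean distance from v to every codebook entry.
        dists = [
            sum((a - b) ** 2 for a, b in zip(v, code))
            for code in codebook
        ]
        indices.append(dists.index(min(dists)))
    quantized = [codebook[i] for i in indices]
    return indices, quantized
```

The autoregressive prior mentioned in the abstract is then trained over the integer `indices` rather than over pixels, which is where the sampling speed-up comes from.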

593 citations


Journal ArticleDOI
TL;DR: A custom deep autoencoder network is designed to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented, and the governing equations and the associated coordinate system are simultaneously learned.
Abstract: The discovery of governing equations from scientific data has the potential to transform data-rich fields that lack well-characterized quantitative descriptions. Advances in sparse regression are currently enabling the tractable identification of both the structure and parameters of a nonlinear dynamical system from data. The resulting models have the fewest terms necessary to describe the dynamics, balancing model complexity with descriptive ability, and thus promoting interpretability and generalizability. This provides an algorithmic approach to Occam's razor for model discovery. However, this approach fundamentally relies on an effective coordinate system in which the dynamics have a simple representation. In this work, we design a custom deep autoencoder network to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented. Thus, we simultaneously learn the governing equations and the associated coordinate system. We demonstrate this approach on several example high-dimensional systems with low-dimensional behavior. The resulting modeling framework combines the strengths of deep neural networks for flexible representation and sparse identification of nonlinear dynamics (SINDy) for parsimonious models. This method places the discovery of coordinates and models on an equal footing.

507 citations


Journal ArticleDOI
TL;DR: A generative adversarial network, which is composed of a bidirectional long short-term memory and convolutional neural network, referred to as BiLSTM-CNN, to generate synthetic ECG data that agree with existing clinical data so that the features of patients with heart disease can be retained.
Abstract: Heart disease is a malignant threat to human health. Electrocardiogram (ECG) tests are used to help diagnose heart disease by recording the heart’s activity. However, automated medical-aided diagnosis with computers usually requires a large volume of labeled clinical data without patients' privacy to train the model, which is an empirical problem that still needs to be solved. To address this problem, we propose a generative adversarial network (GAN), which is composed of a bidirectional long short-term memory (LSTM) and convolutional neural network (CNN), referred to as BiLSTM-CNN, to generate synthetic ECG data that agree with existing clinical data so that the features of patients with heart disease can be retained. The model includes a generator and a discriminator, where the generator employs two layers of BiLSTM networks and the discriminator is based on convolutional neural networks. The 48 ECG records from individuals of the MIT-BIH database were used to train the model. We compared the performance of our model with two other generative models, the recurrent neural network autoencoder (RNN-AE) and the recurrent neural network variational autoencoder (RNN-VAE). The results showed that the loss function of our model converged to zero the fastest. We also evaluated the loss of the discriminator of GANs with different combinations of generator and discriminator. The results indicated that BiLSTM-CNN GAN could generate ECG data with high morphological similarity to real ECG recordings.

436 citations


Journal ArticleDOI
TL;DR: The three popular Deep Learning algorithms for Bearing fault diagnosis including Autoencoder, Restricted Boltzmann Machine, and Convolutional Neural Network are briefly introduced and their applications are reviewed through publications and research works on the area of bearing fault diagnosis.

379 citations


Proceedings Article
02 Jun 2019
TL;DR: In this article, the authors explore the use of vector quantized variational autoencoder (VQ-VAE) models for large scale image generation and demonstrate that a multi-scale hierarchical organization with powerful priors over the latent codes is able to generate samples with quality that rivals that of state of the art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GAN's known shortcomings such as mode collapse and lack of diversity.
Abstract: We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher coherence and fidelity than possible before. We use simple feed-forward encoder and decoder networks, making our model an attractive candidate for applications where the encoding and/or decoding speed is critical. Additionally, VQ-VAE requires sampling an autoregressive model only in the compressed latent space, which is an order of magnitude faster than sampling in the pixel space, especially for large images. We demonstrate that a multi-scale hierarchical organization of VQ-VAE, augmented with powerful priors over the latent codes, is able to generate samples with quality that rivals that of state of the art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GAN's known shortcomings such as mode collapse and lack of diversity.

345 citations


Proceedings ArticleDOI
13 May 2019
TL;DR: An end-to-end network that uses a bimodal variational autoencoder coupled with a binary classifier for the task of fake news detection, which outperforms state-of-the-art methods by margins as large as ~ 6% in accuracy and ~ 5% in F1 scores.
Abstract: In recent times, fake news and misinformation have had a disruptive and adverse impact on our lives. Given the prominence of microblogging networks as a source of news for most individuals, fake news now spreads at a faster pace and has a more profound impact than ever before. This makes detection of fake news an extremely important challenge. Fake news articles, just like genuine news articles, leverage multimedia content to manipulate user opinions but spread misinformation. A shortcoming of the current approaches for the detection of fake news is their inability to learn a shared representation of multimodal (textual + visual) information. We propose an end-to-end network, Multimodal Variational Autoencoder (MVAE), which uses a bimodal variational autoencoder coupled with a binary classifier for the task of fake news detection. The model consists of three main components, an encoder, a decoder and a fake news detector module. The variational autoencoder is capable of learning probabilistic latent variable models by optimizing a bound on the marginal likelihood of the observed data. The fake news detector then utilizes the multimodal representations obtained from the bimodal variational autoencoder to classify posts as fake or not. We conduct extensive experiments on two standard fake news datasets collected from popular microblogging websites: Weibo and Twitter. The experimental results show that across the two datasets, on average our model outperforms state-of-the-art methods by margins as large as ~ 6% in accuracy and ~ 5% in F1 scores.

344 citations


Journal ArticleDOI
TL;DR: A deep transfer learning (DTL) network based on sparse autoencoder (SAE) is presented, and a case study on remaining useful life (RUL) prediction of a cutting tool is performed to validate the effectiveness of the DTL method.
Abstract: Deep learning, with its ability to learn features and approximate nonlinear functions, has shown its effectiveness for machine fault prediction. However, how to transfer a deep network trained on historical failure data to the prediction of a new object has rarely been researched. In this paper, a deep transfer learning (DTL) network based on sparse autoencoder (SAE) is presented. In the DTL method, three transfer strategies, that is, weight transfer, transfer learning of hidden feature, and weight update, are used to transfer an SAE trained by historical failure data to a new object. By these strategies, prediction of the new object without supervised information for training is achieved. Moreover, the features learned by the deep transfer network for the new object share joint and similar characteristics with those of the historical failure data, which is beneficial to accurate prediction. A case study on remaining useful life (RUL) prediction of a cutting tool is performed to validate the effectiveness of the DTL method. An SAE network is first trained by run-to-failure data with RUL information of a cutting tool in an off-line process. The trained network is then transferred to a new tool under operation for on-line RUL prediction. The high accuracy of the prediction result shows the advantage of the DTL method for RUL prediction.

336 citations


Journal ArticleDOI
TL;DR: A network-based deep-learning approach for in silico drug repurposing by integrating 10 networks, termed deepDR, which learns high-level features of drugs from the heterogeneous networks by a multimodal deep autoencoder and infers candidates for approved drugs for which they were not originally approved.
Abstract: Motivation: Traditional drug discovery and development are often time-consuming and high risk. Repurposing/repositioning of approved drugs offers a relatively low-cost and high-efficiency approach toward rapid development of efficacious treatments. The emergence of large-scale, heterogeneous biological networks has offered unprecedented opportunities for developing in silico drug repositioning approaches. However, capturing highly non-linear, heterogeneous network structures by most existing approaches for drug repositioning has been challenging. Results: In this study, we developed a network-based deep-learning approach, termed deepDR, for in silico drug repurposing by integrating 10 networks: one drug-disease, one drug-side-effect, one drug-target and seven drug-drug networks. Specifically, deepDR learns high-level features of drugs from the heterogeneous networks by a multi-modal deep autoencoder. Then the learned low-dimensional representation of drugs together with clinically reported drug-disease pairs are encoded and decoded collectively via a variational autoencoder to infer candidates for approved drugs for which they were not originally approved. We found that deepDR revealed high performance [the area under receiver operating characteristic curve (AUROC) = 0.908], outperforming conventional network-based or machine learning-based approaches. Importantly, deepDR-predicted drug-disease associations were validated by the ClinicalTrials.gov database (AUROC = 0.826) and we showcased several novel deepDR-predicted approved drugs for Alzheimer's disease (e.g. risperidone and aripiprazole) and Parkinson's disease (e.g. methylphenidate and pergolide). Availability and implementation: Source code and data can be downloaded from https://github.com/ChengF-Lab/deepDR. Supplementary information: Supplementary data are available online at Bioinformatics.

Proceedings ArticleDOI
15 Jun 2019
TL;DR: In this article, a deep autoencoder with a parametric density estimator is used to learn the probability distribution underlying the latent representations with an autoregressive procedure, which effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors.
Abstract: Novelty detection is commonly referred to as the discrimination of observations that do not conform to a learned model of regularity. Despite its importance in different application settings, designing a novelty detector is utterly complex due to the unpredictable nature of novelties and their inaccessibility during the training procedure, factors which expose the unsupervised nature of the problem. In our proposal, we design a general unsupervised framework where we equip a deep autoencoder with a parametric density estimator that learns the probability distribution underlying the latent representations with an autoregressive procedure. We show that a maximum likelihood objective, optimized in conjunction with the reconstruction of normal samples, effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors. In addition to providing a very general formulation, extensive experiments of our model on publicly available datasets deliver on-par or superior performances if compared to state-of-the-art methods in one-class and in video anomaly detection settings. Differently from our competitors, we remark that our proposal does not make any assumption about the nature of the novelties, making our work easily applicable to disparate contexts.

Journal ArticleDOI
TL;DR: It is demonstrated that the novel MCNN and CCNN fusion methods outperform all the state-of-the-art machine learning and deep learning techniques for EEG classification.

Journal ArticleDOI
17 Jul 2019
TL;DR: Li et al. as discussed by the authors proposed an adaptive random gradient estimation strategy to balance query counts and distortion, and an autoencoder that is either trained offline with unlabeled data or a bilinear resizing operation for attack acceleration.
Abstract: Recent studies have shown that adversarial examples in state-of-the-art image classifiers trained by deep neural networks (DNN) can be easily generated when the target model is transparent to an attacker, known as the white-box setting. However, when attacking a deployed machine learning service, one can only acquire the input-output correspondences of the target model; this is the so-called black-box attack setting. The major drawback of existing black-box attacks is the need for excessive model queries, which may give a false sense of model robustness due to inefficient query designs. To bridge this gap, we propose a generic framework for query-efficient blackbox attacks. Our framework, AutoZOOM, which is short for Autoencoder-based Zeroth Order Optimization Method, has two novel building blocks towards efficient black-box attacks: (i) an adaptive random gradient estimation strategy to balance query counts and distortion, and (ii) an autoencoder that is either trained offline with unlabeled data or a bilinear resizing operation for attack acceleration. Experimental results suggest that, by applying AutoZOOM to a state-of-the-art black-box attack (ZOO), a significant reduction in model queries can be achieved without sacrificing the attack success rate and the visual quality of the resulting adversarial examples. In particular, when compared to the standard ZOO method, AutoZOOM can consistently reduce the mean query counts in finding successful adversarial examples (or reaching the same distortion level) by at least 93% on MNIST, CIFAR-10 and ImageNet datasets, leading to novel insights on adversarial robustness.
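
The "random gradient estimation" building block belongs to the family of zeroth-order methods: the attacker can only query the model's output, so the gradient is approximated from finite differences along random directions. The sketch below shows a generic averaged random-direction estimator under stated assumptions; the parameter names (`q`, `beta`, `scale`) and the default `scale = dimension` are illustrative choices, not necessarily the settings used by AutoZOOM.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in dim dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def estimate_gradient(f, x, q=10, beta=1e-3, scale=None, rng=None):
    """Estimate the gradient of a black-box function f at x.

    Uses q random directions and q + 1 queries to f in total.
    beta is the finite-difference step; scale rescales each directional
    estimate (assumed here to default to the input dimension).
    """
    rng = rng or random.Random(0)
    dim = len(x)
    scale = dim if scale is None else scale
    fx = f(x)  # one query at the current point
    grad = [0.0] * dim
    for _ in range(q):
        u = random_unit_vector(dim, rng)
        shifted = [xi + beta * ui for xi, ui in zip(x, u)]
        slope = (f(shifted) - fx) / beta  # directional finite difference
        for i in range(dim):
            grad[i] += scale * slope * u[i] / q
    return grad
```

Raising `q` lowers the variance of the estimate at the cost of more model queries, which is the query-count versus distortion trade-off the abstract refers to. Running the same estimator in a lower-dimensional space (the autoencoder latent or a resized image) shrinks `dim` and therefore the number of queries needed.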

Proceedings ArticleDOI
31 Mar 2019
TL;DR: In this article, a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner is proposed, where the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool.
Abstract: Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The mini-max game between the VAE and the adversarial network is played such that while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low dimensional latent space in large-scale settings and provides for a computationally efficient sampling method. Our code is available at https://github.com/sinhasam/vaal.

Journal ArticleDOI
TL;DR: A regularization scheme is introduced that forces the representations to focus on the phonetic content of the utterance and report performance comparable with the top entries in the ZeroSpeech 2017 unsupervised acoustic unit discovery task.
Abstract: We consider the task of unsupervised extraction of meaningful latent representations of speech by applying autoencoding neural networks to speech waveforms. The goal is to learn a representation able to capture high level semantic content from the signal, e.g. phoneme identities, while being invariant to confounding low level details in the signal such as the underlying pitch contour or background noise. Since the learned representation is tuned to contain only phonetic content, we resort to using a high capacity WaveNet decoder to infer information discarded by the encoder from previous samples. Moreover, the behavior of autoencoder models depends on the kind of constraint that is applied to the latent representation. We compare three variants: a simple dimensionality reduction bottleneck, a Gaussian Variational Autoencoder (VAE), and a discrete Vector Quantized VAE (VQ-VAE). We analyze the quality of learned representations in terms of speaker independence, the ability to predict phonetic content, and the ability to accurately reconstruct individual spectrogram frames. Moreover, for discrete encodings extracted using the VQ-VAE, we measure the ease of mapping them to phonemes. We introduce a regularization scheme that forces the representations to focus on the phonetic content of the utterance and report performance comparable with the top entries in the ZeroSpeech 2017 unsupervised acoustic unit discovery task.

Proceedings Article
16 Jan 2019
TL;DR: This paper investigates posterior collapse from the perspective of training dynamics and proposes an extremely simple modification to VAE training to reduce inference lag: depending on the model's current mutual information between latent variable and observation, the inference network is optimized before performing each model update.
Abstract: The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique. By using a neural inference network to approximate the model's posterior on latent variables, VAEs efficiently parameterize a lower bound on marginal data likelihood that can be optimized directly via gradient methods. In practice, however, VAE training often results in a degenerate local optimum known as "posterior collapse" where the model learns to ignore the latent variable and the approximate posterior mimics the prior. In this paper, we investigate posterior collapse from the perspective of training dynamics. We find that during the initial stages of training the inference network fails to approximate the model's true posterior, which is a moving target. As a result, the model is encouraged to ignore the latent encoding and posterior collapse occurs. Based on this observation, we propose an extremely simple modification to VAE training to reduce inference lag: depending on the model's current mutual information between latent variable and observation, we aggressively optimize the inference network before performing each model update. Despite introducing neither new model components nor significant complexity over basic VAE, our approach is able to avoid the problem of collapse that has plagued a large amount of previous work. Empirically, our approach outperforms strong autoregressive baselines on text and image benchmarks in terms of held-out likelihood, and is competitive with more complex techniques for avoiding collapse while being substantially faster.
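
"Posterior collapse" has a simple numerical signature: the KL divergence between the approximate posterior and the prior drops to zero, so the latent code carries no information about the input. The sketch below computes that KL term in closed form, assuming a diagonal Gaussian posterior and a standard normal prior (a common VAE setup, not something the abstract specifies); `is_collapsed` and its tolerance are hypothetical helpers for monitoring.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0
        for m, lv in zip(mu, log_var)
    )


def is_collapsed(mu, log_var, tol=1e-3):
    """Flag a posterior that is numerically indistinguishable from the prior."""
    return kl_to_standard_normal(mu, log_var) < tol
```

Tracking this quantity per batch during training is one inexpensive way to notice the "approximate posterior mimics the prior" failure mode that the abstract describes.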

Proceedings ArticleDOI
30 Apr 2019
TL;DR: A joint learning framework for discriminative embedding and spectral clustering is proposed, which can significantly outperform state-of-the-art clustering approaches and be more robust to noise.
Abstract: Clustering methods have recently attracted ever-increasing attention in learning and vision. Deep clustering combines embedding and clustering together to obtain an optimal embedding subspace for clustering, which can be more effective compared with conventional clustering methods. In this paper, we propose a joint learning framework for discriminative embedding and spectral clustering. We first devise a dual autoencoder network, which enforces the reconstruction constraint for the latent representations and their noisy versions, to embed the inputs into a latent space for clustering. As such the learned latent representations can be more robust to noise. Then the mutual information estimation is utilized to provide more discriminative information from the inputs. Furthermore, a deep spectral clustering method is applied to embed the latent representations into the eigenspace and subsequently cluster them, which can fully exploit the relationship between inputs to achieve optimal clustering results. Experimental results on benchmark datasets show that our method can significantly outperform state-of-the-art clustering approaches.

Journal ArticleDOI
TL;DR: The proposed CDSAE framework comprises two stages with different optimization objectives, which can learn discriminative low-dimensional feature mappings and train an effective classifier progressively, and imposes a local Fisher discriminant regularization on each hidden layer of stacked autoencoder (SAE) to train discriminative SAE (DSAE).
Abstract: As one of the fundamental research topics in remote sensing image analysis, hyperspectral image (HSI) classification has been extensively studied so far. However, how to discriminatively learn a low-dimensional feature space, in which the mapped features have small within-class scatter and big between-class separation, is still a challenging problem. To address this issue, this paper proposes an effective framework, named compact and discriminative stacked autoencoder (CDSAE), for HSI classification. The proposed CDSAE framework comprises two stages with different optimization objectives, which can learn discriminative low-dimensional feature mappings and train an effective classifier progressively. First, we impose a local Fisher discriminant regularization on each hidden layer of stacked autoencoder (SAE) to train discriminative SAE (DSAE) by minimizing reconstruction error. This stage can learn feature mappings, in which the pixels from the same land-cover class are mapped as nearly as possible and the pixels from different land-cover categories are separated by a large margin. Second, we learn an effective classifier and meanwhile update DSAE with a local Fisher discriminant regularization being embedded on the top of feature representations. Moreover, to learn a compact DSAE with as small number of hidden neurons as possible, we impose a diversity regularization on the hidden neurons of DSAE to balance the feature dimensionality and the feature representation capability. The experimental results on three widely-used HSI data sets and comprehensive comparisons with existing methods demonstrate that our proposed method is effective.

Journal ArticleDOI
TL;DR: This work reconstructs the high-dimensional features of Android applications (apps) and employs multiple CNNs to detect Android malware, and proposes a hybrid model based on deep autoencoder (DAE) and convolutional neural network (CNN), which shows powerful ability in feature extraction and malware detection.
Abstract: Android security incidents have occurred frequently in recent years. To improve the accuracy and efficiency of large-scale Android malware detection, in this work, we propose a hybrid model based on deep autoencoder (DAE) and convolutional neural network (CNN). First, to improve the accuracy of malware detection, we reconstruct the high-dimensional features of Android applications (apps) and employ multiple CNNs to detect Android malware. In the serial convolutional neural network architecture (CNN-S), we use ReLU, a non-linear function, as the activation function to increase sparseness and “dropout” to prevent over-fitting. The convolutional layer and pooling layer are combined with the full-connection layer to enhance feature extraction capability. Under these conditions, CNN-S shows powerful ability in feature extraction and malware detection. Second, to reduce the training time, we use deep autoencoder as a pre-training method of CNN. With the combination, deep autoencoder and CNN model (DAE-CNN) can learn more flexible patterns in a short time. We conduct experiments on 10,000 benign apps and 13,000 malicious apps. CNN-S demonstrates a significant improvement compared with traditional machine learning methods in Android malware detection. In detail, compared with SVM, the accuracy with the CNN-S model is improved by 5%, while the training time using DAE-CNN model is reduced by 83% compared with CNN-S model.


Journal ArticleDOI
TL;DR: Experimental results on several benchmark hyperspectral data sets have demonstrated that the proposed 3D-CAE is very effective in extracting spatial–spectral features and outperforms not only traditional unsupervised feature extraction algorithms but also many supervised feature extraction algorithms in classification applications.
Abstract: Feature learning technologies using convolutional neural networks (CNNs) have shown superior performance over traditional hand-crafted feature extraction algorithms. However, a large number of labeled samples are generally required for a CNN to learn effective features for a classification task, and these are hard to obtain for hyperspectral remote sensing images. Therefore, in this paper, an unsupervised spatial–spectral feature learning strategy is proposed for hyperspectral images using a 3-Dimensional (3D) convolutional autoencoder (3D-CAE). The proposed 3D-CAE consists of 3D or elementwise operations only, such as 3D convolution, 3D pooling, and 3D batch normalization, to maximally explore spatial–spectral structure information for feature extraction. A companion 3D convolutional decoder network is also designed to reconstruct the input patterns to the proposed 3D-CAE, by which all the parameters involved in the network can be trained without labeled training samples. As a result, effective features are learned in an unsupervised mode in which label information of pixels is not required. Experimental results on several benchmark hyperspectral data sets have demonstrated that our proposed 3D-CAE is very effective in extracting spatial–spectral features and outperforms not only traditional unsupervised feature extraction algorithms but also many supervised feature extraction algorithms in classification applications.

Journal ArticleDOI
TL;DR: A sensor-based, data-driven scheme that uses a deep learning tool and a similarity-based curve matching technique to estimate the RUL of a system, and that demonstrates the competitiveness of the proposed method for RUL estimation.

Journal ArticleDOI
15 Aug 2019-Methods
TL;DR: This review provides both the exoteric introduction of deep learning, and concrete examples and implementations of its representative applications in bioinformatics, and introduces deep learning in an easy-to-understand fashion.

Journal ArticleDOI
TL;DR: A new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design is proposed, indicating that both methods can be used complementarily.
Abstract: Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design. We applied the method in two scenarios: one to generate random drug-like compounds and another to generate target-biased compounds. Our results show that the method works well in both cases. Sampled compounds from the trained model can largely occupy the same chemical space as the training set and also generate a substantial fraction of novel compounds. Moreover, the drug-likeness score of compounds sampled from LatentGAN is also similar to that of the training set. Lastly, generated compounds differ from those obtained with a Recurrent Neural Network-based generative model approach, indicating that both methods can be used complementarily.

Posted Content
TL;DR: A pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner, learning an effective low-dimensional latent space in large-scale settings and providing a computationally efficient sampling method.
Abstract: Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Unlike conventional active learning algorithms, our approach is task agnostic, i.e., it does not depend on the performance of the task for which we are trying to acquire labeled data. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The mini-max game between the VAE and the adversarial network is played such that while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low dimensional latent space in large-scale settings and provides for a computationally efficient sampling method. Our code is available at this https URL.

Proceedings Article
Yue Yu, Jie Chen, Tian Gao, Mo Yu
24 May 2019
TL;DR: A deep generative model is proposed, and a variant of the structural constraint is applied to learn the DAG; on synthetic data sets the method learns more accurate graphs for nonlinearly generated samples, and on benchmark data sets with discrete variables the learned graphs are reasonably close to the global optima.
Abstract: Learning a faithful directed acyclic graph (DAG) from samples of a joint distribution is a challenging combinatorial problem, owing to the intractable search space superexponential in the number of graph nodes. A recent breakthrough formulates the problem as a continuous optimization with a structural constraint that ensures acyclicity (Zheng et al., 2018). The authors apply the approach to the linear structural equation model (SEM) and the least-squares loss function that are statistically well justified but nevertheless limited. Motivated by the widespread success of deep learning that is capable of capturing complex nonlinear mappings, in this work we propose a deep generative model and apply a variant of the structural constraint to learn the DAG. At the heart of the generative model is a variational autoencoder parameterized by a novel graph neural network architecture, which we coin DAG-GNN. In addition to the richer capacity, an advantage of the proposed model is that it naturally handles discrete variables as well as vector-valued ones. We demonstrate that on synthetic data sets, the proposed method learns more accurate graphs for nonlinearly generated samples; and on benchmark data sets with discrete variables, the learned graphs are reasonably close to the global optima. The code is available at this https URL.
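
To make the idea of a continuous acyclicity constraint concrete, here is a minimal NumPy sketch of a polynomial penalty of the form tr[(I + αA∘A)^d] − d, which is zero exactly when the weighted adjacency matrix A describes a DAG. The abstract does not spell out the constraint, so the formula, the default α = 1/d, and the function name `acyclicity_penalty` should be read as assumptions; Zheng et al. (2018) state their constraint with a matrix exponential instead.

```python
import numpy as np

def acyclicity_penalty(A, alpha=None):
    """Polynomial acyclicity penalty: tr[(I + alpha * A∘A)^d] - d.

    Zero for a DAG, positive when the graph contains a cycle.
    alpha defaults to 1/d, which is an illustrative choice.
    """
    A = np.asarray(A, dtype=float)
    d = A.shape[0]
    if alpha is None:
        alpha = 1.0 / d
    # Squaring elementwise makes every edge weight nonnegative.
    M = np.eye(d) + alpha * (A * A)
    return np.trace(np.linalg.matrix_power(M, d)) - d

# A 3-node DAG (strictly upper triangular) and a 3-node directed cycle.
dag = np.array([[0, 1, 1],
                [0, 0, 1],
                [0, 0, 0]])
cycle = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]])
dag_penalty = acyclicity_penalty(dag)
cycle_penalty = acyclicity_penalty(cycle)
```

Because the penalty is a smooth function of A, it can be driven to zero with gradient-based optimizers, which is what turns the combinatorial search into a continuous problem.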

Journal ArticleDOI
TL;DR: A fully convolutional neural network is used to create time-resolved three-dimensional dense segmentations of heart images that can efficiently predict human survival.
Abstract: Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using cardiac magnetic resonance imaging, to create time-resolved three-dimensional segmentations using a fully convolutional network trained on anatomical shape priors. This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. To handle right-censored survival outcomes, our network used a Cox partial likelihood loss function. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p = .0012) for our model C=0.75 (95% CI: 0.70 - 0.79) than the human benchmark of C=0.59 (95% CI: 0.53 - 0.65). This work demonstrates how a complex computer vision task using high-dimensional medical image data can efficiently predict human survival.
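
The Cox partial likelihood loss mentioned above can be sketched in NumPy as follows. This is a generic negative log partial likelihood with Breslow-style handling of ties, not the 4Dsurvival implementation; the function name, argument names, and the averaging over observed events are assumptions.

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative log Cox partial likelihood, averaged over observed events.

    risk  : predicted log-risk score per subject (e.g. a network output)
    time  : follow-up time per subject
    event : 1 if the event was observed, 0 if right-censored
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # at_risk[i, j] is True when subject j is still under observation at time[i].
    at_risk = time[None, :] >= time[:, None]
    # Numerically stable log of the summed exp(risk) over each risk set.
    shift = risk.max()
    weighted = np.exp(risk - shift)[None, :] * at_risk
    log_denominator = shift + np.log(weighted.sum(axis=1))
    # Censored subjects add no term of their own; they only appear in risk sets.
    per_subject = risk - log_denominator
    return -per_subject[event].sum() / max(event.sum(), 1)

# Two subjects with equal risk scores, both with observed events.
loss_two_events = cox_neg_log_partial_likelihood([0.0, 0.0], [1.0, 2.0], [1, 1])
# Same subjects, but the first one is censored.
loss_one_censored = cox_neg_log_partial_likelihood([0.0, 0.0], [1.0, 2.0], [0, 1])
```

The pairwise risk-set mask costs O(n²) memory, which is fine for a cohort of a few hundred patients; larger cohorts would normally sort by time and use cumulative sums.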

Journal ArticleDOI
TL;DR: In this paper, a method for learning low-dimensional approximations of nonlinear dynamical systems, based on neural network approximation of the underlying Koopman operator, is described.
Abstract: This paper describes a method for learning low-dimensional approximations of nonlinear dynamical systems, based on neural network approximations of the underlying Koopman operator. Extended Dynamic...

Journal ArticleDOI
TL;DR: A new technique for unsupervised unmixing based on a deep autoencoder network (DAEN), which can unmix data sets with outliers and a low signal-to-noise ratio and demonstrates very competitive performance.
Abstract: Spectral unmixing is a technique for remotely sensed image interpretation that expresses each (possibly mixed) pixel as a combination of pure spectral signatures (endmembers) and their fractional abundances. In this paper, we develop a new technique for unsupervised unmixing which is based on a deep autoencoder network (DAEN). Our newly developed DAEN consists of two parts. The first part of the network adopts stacked autoencoders (SAEs) to learn spectral signatures, so as to generate a good initialization for the unmixing process. In the second part of the network, a variational autoencoder (VAE) is employed to perform blind source separation, aimed at obtaining the endmember signatures and abundance fractions simultaneously. By taking advantage of the SAEs, the proposed approach is remarkably robust, as it can unmix data sets with outliers and a low signal-to-noise ratio. Moreover, the multiple hidden layers of the VAE ensure the required constraints (nonnegativity and sum-to-one) when estimating the abundances. The effectiveness of the proposed method is evaluated using both synthetic and real hyperspectral data. When compared with other unmixing methods, the proposed approach demonstrates very competitive performance.
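
The linear mixing picture and the two abundance constraints can be illustrated with a small NumPy sketch. A softmax is used here as one simple way to obtain nonnegative, sum-to-one abundances; the abstract does not say how DAEN enforces the constraints, so the softmax, the function names, and the toy endmember spectra are all assumptions.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax: outputs are nonnegative and each row sums to one."""
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mix_pixels(logits, endmembers):
    """Map unconstrained scores to abundances and form mixed pixel spectra.

    logits     : (n_pixels, n_endmembers) unconstrained scores
    endmembers : (n_endmembers, n_bands) pure spectral signatures
    """
    abundances = softmax(np.asarray(logits, dtype=float))
    # Each pixel is a convex combination of the endmember signatures.
    pixels = abundances @ np.asarray(endmembers, dtype=float)
    return abundances, pixels

# Hypothetical example: 2 pixels, 3 endmembers, 4 spectral bands.
toy_endmembers = np.array([[0.1, 0.2, 0.3, 0.4],
                           [0.9, 0.8, 0.7, 0.6],
                           [0.5, 0.5, 0.5, 0.5]])
toy_logits = np.array([[2.0, 0.0, -1.0],
                       [0.0, 0.0, 0.0]])
abundances, pixels = mix_pixels(toy_logits, toy_endmembers)
```

With equal scores, as in the second toy pixel, every endmember receives an abundance of one third and the mixed spectrum is the mean of the three signatures.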