Showing papers on "Autoencoder published in 2019"

Proceedings Article•DOI•

Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection

Dong Gong¹, Lingqiao Liu¹, Vuong Le², Budhaditya Saha², Moussa Reda Mansour, Svetha Venkatesh², Anton van den Hengel¹•Institutions (2)

University of Adelaide¹, Deakin University²

01 Jan 2019

TL;DR: The proposed memory-augmented autoencoder, MemAE, makes no assumptions about the data type and can therefore be applied to different tasks; experiments on various datasets show that it generalizes well and is highly effective.

Abstract: Deep autoencoders have been extensively used for anomaly detection. Trained on normal data, the autoencoder is expected to produce higher reconstruction error for abnormal inputs than for normal ones, which is adopted as a criterion for identifying anomalies. However, this assumption does not always hold in practice. It has been observed that sometimes the autoencoder "generalizes" so well that it can also reconstruct anomalies well, leading to missed detection of anomalies. To mitigate this drawback of autoencoder-based anomaly detectors, we propose to augment the autoencoder with a memory module and develop an improved autoencoder called the memory-augmented autoencoder, i.e., MemAE. Given an input, MemAE first obtains the encoding from the encoder and then uses it as a query to retrieve the most relevant memory items for reconstruction. At the training stage, the memory contents are updated and are encouraged to represent the prototypical elements of the normal data. At the test stage, the learned memory is fixed, and the reconstruction is obtained from a few selected memory records of the normal data. The reconstruction will thus tend to be close to a normal sample, so the reconstruction errors on anomalies are strengthened for anomaly detection. MemAE is free of assumptions on the data type and is thus general enough to be applied to different tasks. Experiments on various datasets demonstrate the excellent generalization and high effectiveness of the proposed MemAE.
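
The retrieval step described in this abstract (use the encoding as a query, pick a few relevant memory items, rebuild the latent code from them) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the selection rule, so the cosine similarity, the softmax weighting, the `shrink` threshold, and the function name `address_memory` below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def address_memory(query, memory, shrink=0.0):
    """Rebuild a latent code from memory items (hypothetical helper).

    query  : encoder output for one input (list of floats)
    memory : list of memory items, each the same length as query
    shrink : weights at or below this value are dropped (assumed rule)
    """
    sims = [cosine(query, item) for item in memory]
    # Numerically stable softmax over the similarities.
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Keep only the few most relevant items, then renormalize.
    kept = [w if w > shrink else 0.0 for w in weights]
    norm = sum(kept)
    if norm == 0.0:  # threshold removed everything: fall back to softmax
        kept, norm = weights, 1.0
    weights = [w / norm for w in kept]
    latent = [
        sum(w * item[d] for w, item in zip(weights, memory))
        for d in range(len(query))
    ]
    return latent, weights
```

Because the rebuilt latent code is a combination of stored items only, an input unlike anything in memory is pulled toward a normal prototype, which is the effect the abstract relies on to enlarge reconstruction error for anomalies.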

888 citations

Journal Article•DOI•

A New Deep Transfer Learning Based on Sparse Auto-Encoder for Fault Diagnosis

Long Wen¹, Liang Gao¹, Xinyu Li¹•Institutions (1)

Huazhong University of Science and Technology¹

01 Jan 2019-IEEE Transactions on Systems, Man, and Cybernetics

TL;DR: A new DTL method is proposed, which uses a three-layer sparse auto-encoder to extract the features of raw data and applies a maximum mean discrepancy term to minimize the discrepancy between the features from training data and testing data.

Abstract: Fault diagnosis plays an important role in modern industry. With the development of smart manufacturing, the data-driven fault diagnosis becomes hot. However, traditional methods have two shortcomings: 1) their performances depend on the good design of handcrafted features of data, but it is difficult to predesign these features and 2) they work well under a general assumption: the training data and testing data should be drawn from the same distribution, but this assumption fails in many engineering applications. Since deep learning (DL) can extract the hierarchical representation features of raw data, and transfer learning provides a good way to perform a learning task on the different but related distribution datasets, deep transfer learning (DTL) has been developed for fault diagnosis. In this paper, a new DTL method is proposed. It uses a three-layer sparse auto-encoder to extract the features of raw data, and applies the maximum mean discrepancy term to minimizing the discrepancy penalty between the features from training data and testing data. The proposed DTL is tested on the famous motor bearing dataset from the Case Western Reserve University. The results show a good improvement, and DTL achieves higher prediction accuracies on most experiments than DL. The prediction accuracy of DTL, which is as high as 99.82%, is better than the results of other algorithms, including deep belief network, sparse filter, artificial neural network, support vector machine and some other traditional methods. What is more, two additional analytical experiments are conducted. The results show that a good unlabeled third dataset may be helpful to DTL, and a good linear relationship between the final prediction accuracies and their standard deviations have been observed.
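
To make the maximum mean discrepancy (MMD) term concrete, here is a minimal sketch of its simplest form. The abstract does not state which kernel the authors use, so the linear kernel is an assumption here; with it, the squared MMD reduces to the squared distance between the mean feature vectors of the two domains. The function name `mmd_linear` is hypothetical.

```python
def mmd_linear(source, target):
    """Squared MMD with a linear kernel (assumed kernel choice).

    source, target : lists of feature vectors (lists of floats) taken
    from the training domain and the testing domain respectively.
    """
    dim = len(source[0])
    mean_s = [sum(row[d] for row in source) / len(source) for d in range(dim)]
    mean_t = [sum(row[d] for row in target) / len(target) for d in range(dim)]
    # Distance between the two domain means in feature space.
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))
```

Adding a term like this to the training loss penalizes features whose average differs between the two datasets, which is one way to read the "discrepancy penalty" in the abstract.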

760 citations

Journal Article•DOI•

Single-cell RNA-seq denoising using a deep count autoencoder

Gökcen Eraslan¹, Lukas M. Simon, Maria Mircea, Nikola S. Mueller, Fabian J. Theis¹•Institutions (1)

Technische Universität München¹

23 Jan 2019-Nature Communications

TL;DR: A denoising method based on a deep count autoencoder network is developed; it scales linearly with the number of cells and is therefore compatible with large datasets, and DCA denoising is shown to improve a diverse set of typical scRNA-seq data analyses on simulated and real datasets.

Abstract: Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at a cellular resolution. However, noise due to amplification and dropout may obstruct analyses, so scalable denoising methods for increasingly large but sparse scRNA-seq data are needed. We propose a deep count autoencoder network (DCA) to denoise scRNA-seq datasets. DCA takes the count distribution, overdispersion and sparsity of the data into account using a negative binomial noise model with or without zero-inflation, and nonlinear gene-gene dependencies are captured. Our method scales linearly with the number of cells and can, therefore, be applied to datasets of millions of cells. We demonstrate that DCA denoising improves a diverse set of typical scRNA-seq data analyses using simulated and real datasets. DCA outperforms existing methods for data imputation in quality and speed, enhancing biological discovery.
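
The noise model named in the abstract, a negative binomial with optional zero-inflation, can be written down directly. The sketch below uses the standard mean/dispersion parameterization; the parameter names `mu`, `theta`, and `pi` and both function names are assumptions for illustration, and this is not taken from the DCA code.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial.

    mu    : mean (> 0)
    theta : inverse dispersion (> 0); larger means closer to Poisson
    """
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability under a zero-inflated negative binomial.

    pi : probability (0 <= pi < 1) that the count is a dropout zero
    """
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero is either a dropout or a genuine NB zero.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb
```

In an autoencoder of this kind, the network would output one set of such parameters per gene and cell, and the training loss would be the negative of this log-probability summed over the count matrix.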

661 citations

Posted Content•

Generating Diverse High-Fidelity Images with VQ-VAE-2

Ali Razavi¹, Aaron van den Oord¹, Oriol Vinyals¹•Institutions (1)

Google¹

02 Jun 2019-arXiv: Learning

TL;DR: It is demonstrated that a multi-scale hierarchical organization of VQ-VAE, augmented with powerful priors over the latent codes, is able to generate samples with quality that rivals that of state of the art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GAN's known shortcomings such as mode collapse and lack of diversity.

Abstract: We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher coherence and fidelity than possible before. We use simple feed-forward encoder and decoder networks, making our model an attractive candidate for applications where the encoding and/or decoding speed is critical. Additionally, VQ-VAE requires sampling an autoregressive model only in the compressed latent space, which is an order of magnitude faster than sampling in the pixel space, especially for large images. We demonstrate that a multi-scale hierarchical organization of VQ-VAE, augmented with powerful priors over the latent codes, is able to generate samples with quality that rivals that of state of the art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GAN's known shortcomings such as mode collapse and lack of diversity.
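
The "compressed latent space" in this abstract comes from vector quantization: each encoder output is replaced by the index of its nearest codebook entry. The sketch below shows only that lookup, under the usual squared Euclidean distance; the function name `quantize` and the toy codebook are illustrative assumptions, and the hierarchical structure and autoregressive priors of VQ-VAE-2 are not shown.

```python
def quantize(vectors, codebook):
    """Map each vector to its nearest codebook entry.

    Returns (indices, quantized), where indices[i] is the position of
    the closest codebook entry to vectors[i] and quantized[i] is that
    entry itself.
    """
    indices = []
    for vec in vectors:
        dists = [
            sum((a - b) ** 2 for a, b in zip(vec, code)) for code in codebook
        ]
        indices.append(dists.index(min(dists)))
    return indices, [codebook[i] for i in indices]
```

A prior over images then only has to model the grid of integer indices rather than raw pixels, which is why sampling in the latent space is cheaper.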

593 citations

Journal Article•DOI•

Data-driven discovery of coordinates and governing equations

Kathleen Champion¹, Bethany Lusch², J. Nathan Kutz¹, Steven L. Brunton¹•Institutions (2)

University of Washington¹, Argonne National Laboratory²

05 Nov 2019-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A custom deep autoencoder network is designed to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented, and the governing equations and the associated coordinate system are simultaneously learned.

Abstract: The discovery of governing equations from scientific data has the potential to transform data-rich fields that lack well-characterized quantitative descriptions. Advances in sparse regression are currently enabling the tractable identification of both the structure and parameters of a nonlinear dynamical system from data. The resulting models have the fewest terms necessary to describe the dynamics, balancing model complexity with descriptive ability, and thus promoting interpretability and generalizability. This provides an algorithmic approach to Occam's razor for model discovery. However, this approach fundamentally relies on an effective coordinate system in which the dynamics have a simple representation. In this work, we design a custom deep autoencoder network to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented. Thus, we simultaneously learn the governing equations and the associated coordinate system. We demonstrate this approach on several example high-dimensional systems with low-dimensional behavior. The resulting modeling framework combines the strengths of deep neural networks for flexible representation and sparse identification of nonlinear dynamics (SINDy) for parsimonious models. This method places the discovery of coordinates and models on an equal footing.

507 citations

Journal Article•DOI•

Electrocardiogram generation with a bidirectional LSTM-CNN generative adversarial network.

Fei Zhu¹, Ye Fei¹, Yuchen Fu², Quan Liu¹, Bairong Shen³•Institutions (3)

Soochow University (Suzhou)¹, Changshu Institute of Technology², Sichuan University³

01 May 2019-Scientific Reports

TL;DR: A generative adversarial network composed of a bidirectional long short-term memory network and a convolutional neural network, referred to as BiLSTM-CNN, is proposed to generate synthetic ECG data that agree with existing clinical data so that the features of patients with heart disease are retained.

Abstract: Heart disease is a malignant threat to human health. Electrocardiogram (ECG) tests are used to help diagnose heart disease by recording the heart’s activity. However, automated medical-aided diagnosis with computers usually requires a large volume of labeled clinical data without patients' privacy to train the model, which is an empirical problem that still needs to be solved. To address this problem, we propose a generative adversarial network (GAN), which is composed of a bidirectional long short-term memory (LSTM) and convolutional neural network (CNN), referred to as BiLSTM-CNN, to generate synthetic ECG data that agree with existing clinical data so that the features of patients with heart disease can be retained. The model includes a generator and a discriminator, where the generator employs two layers of BiLSTM networks and the discriminator is based on convolutional neural networks. The 48 ECG records from individuals of the MIT-BIH database were used to train the model. We compared the performance of our model with two other generative models, the recurrent neural network autoencoder (RNN-AE) and the recurrent neural network variational autoencoder (RNN-VAE). The results showed that the loss function of our model converged to zero the fastest. We also evaluated the loss of the discriminator of GANs with different combinations of generator and discriminator. The results indicated that BiLSTM-CNN GAN could generate ECG data with high morphological similarity to real ECG recordings.

436 citations

Journal Article•DOI•

A survey on Deep Learning based bearing fault diagnosis

Duy-Tang Hoang¹, Hee-Jun Kang¹•Institutions (1)

University of Ulsan¹

28 Mar 2019-Neurocomputing

TL;DR: Three popular deep learning algorithms for bearing fault diagnosis, namely Autoencoder, Restricted Boltzmann Machine, and Convolutional Neural Network, are briefly introduced, and their applications are reviewed through publications and research works in the area of bearing fault diagnosis.

379 citations

Proceedings Article•

Generating Diverse High-Fidelity Images with VQ-VAE-2

Ali Razavi¹, Aaron van den Oord¹, Oriol Vinyals¹•Institutions (1)

Google¹

02 Jun 2019

TL;DR: In this article, the authors explore the use of vector quantized variational autoencoder (VQ-VAE) models for large scale image generation and demonstrate that a multi-scale hierarchical organization with powerful priors over the latent codes is able to generate samples with quality that rivals that of state of the art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GAN's known shortcomings such as mode collapse and lack of diversity.

345 citations

Proceedings Article•DOI•

MVAE: Multimodal Variational Autoencoder for Fake News Detection

Dhruv Khattar¹, Jaipal Singh Goud¹, Manish Gupta¹, Vasudeva Varma¹•Institutions (1)

International Institute of Information Technology, Hyderabad¹

13 May 2019

TL;DR: An end-to-end network that uses a bimodal variational autoencoder coupled with a binary classifier for the task of fake news detection, which outperforms state-of-the-art methods by margins as large as ~ 6% in accuracy and ~ 5% in F1 scores.

Abstract: In recent times, fake news and misinformation have had a disruptive and adverse impact on our lives. Given the prominence of microblogging networks as a source of news for most individuals, fake news now spreads at a faster pace and has a more profound impact than ever before. This makes detection of fake news an extremely important challenge. Fake news articles, just like genuine news articles, leverage multimedia content to manipulate user opinions but spread misinformation. A shortcoming of the current approaches for the detection of fake news is their inability to learn a shared representation of multimodal (textual + visual) information. We propose an end-to-end network, Multimodal Variational Autoencoder (MVAE), which uses a bimodal variational autoencoder coupled with a binary classifier for the task of fake news detection. The model consists of three main components, an encoder, a decoder and a fake news detector module. The variational autoencoder is capable of learning probabilistic latent variable models by optimizing a bound on the marginal likelihood of the observed data. The fake news detector then utilizes the multimodal representations obtained from the bimodal variational autoencoder to classify posts as fake or not. We conduct extensive experiments on two standard fake news datasets collected from popular microblogging websites: Weibo and Twitter. The experimental results show that across the two datasets, on average our model outperforms state-of-the-art methods by margins as large as ~ 6% in accuracy and ~ 5% in F1 scores.
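
The "bound on the marginal likelihood" mentioned in this abstract is the usual variational lower bound, whose negative is a reconstruction term plus a KL term. The sketch below assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math


def gaussian_kl(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def negative_elbo(x, x_hat, mu, log_var):
    """Reconstruction error plus KL term for a single example.

    The squared error stands in for the decoder's negative
    log-likelihood up to constants (an assumed decoder model).
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + gaussian_kl(mu, log_var)
```

In a model like the one described, a classification loss from the fake news detector would be added to this quantity so that the shared latent representation is trained for both reconstruction and detection.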

344 citations

Journal Article•DOI•

Deep Transfer Learning Based on Sparse Autoencoder for Remaining Useful Life Prediction of Tool in Manufacturing

Chuang Sun¹, Meng Ma², Zhibin Zhao¹, Shaohua Tian¹, Ruqiang Yan¹, Xuefeng Chen¹•Institutions (2)

Xi'an Jiaotong University¹, University of Massachusetts Lowell²

01 Apr 2019-IEEE Transactions on Industrial Informatics

TL;DR: A deep transfer learning (DTL) network based on a sparse autoencoder (SAE) is presented, and a case study on remaining useful life (RUL) prediction of a cutting tool is performed to validate the effectiveness of the DTL method.

Abstract: Deep learning, with its ability to perform feature learning and nonlinear function approximation, has shown its effectiveness for machine fault prediction. However, how to transfer a deep network trained on historical failure data to the prediction of a new object has rarely been studied. In this paper, a deep transfer learning (DTL) network based on a sparse autoencoder (SAE) is presented. In the DTL method, three transfer strategies, that is, weight transfer, transfer learning of hidden features, and weight update, are used to transfer an SAE trained on historical failure data to a new object. With these strategies, prediction for the new object is achieved without supervised information for training. Moreover, the features learned by the deep transfer network for the new object share joint and similar characteristics with those of the historical failure data, which is beneficial to accurate prediction. A case study on remaining useful life (RUL) prediction of a cutting tool is performed to validate the effectiveness of the DTL method. An SAE network is first trained on run-to-failure data with RUL information of a cutting tool in an off-line process. The trained network is then transferred to a new tool under operation for on-line RUL prediction. The highly accurate prediction result shows the advantage of the DTL method for RUL prediction.

336 citations

Journal Article•DOI•

deepDR: a network-based deep learning approach to in silico drug repositioning.

Xiangxiang Zeng¹, Siyi Zhu¹, Xiangrong Liu¹, Yadi Zhou², Ruth Nussinov³, Feixiong Cheng²,⁴,⁵•Institutions (5)

Xiamen University¹, Cleveland Clinic Lerner Research Institute², Tel Aviv University³, Case Western Reserve University⁴, Cleveland Clinic Lerner College of Medicine⁵

15 Dec 2019-Bioinformatics

TL;DR: A network-based deep learning approach for in silico drug repurposing, termed deepDR, integrates 10 networks, learns high-level features of drugs from the heterogeneous networks with a multimodal deep autoencoder, and infers candidate uses of approved drugs beyond those for which they were originally approved.

Abstract: Motivation: Traditional drug discovery and development are often time-consuming and high risk. Repurposing/repositioning of approved drugs offers a relatively low-cost and high-efficiency approach toward rapid development of efficacious treatments. The emergence of large-scale, heterogeneous biological networks has offered unprecedented opportunities for developing in silico drug repositioning approaches. However, capturing highly non-linear, heterogeneous network structures by most existing approaches for drug repositioning has been challenging. Results: In this study, we developed a network-based deep-learning approach, termed deepDR, for in silico drug repurposing by integrating 10 networks: one drug-disease, one drug-side-effect, one drug-target and seven drug-drug networks. Specifically, deepDR learns high-level features of drugs from the heterogeneous networks by a multi-modal deep autoencoder. Then the learned low-dimensional representation of drugs together with clinically reported drug-disease pairs are encoded and decoded collectively via a variational autoencoder to infer candidates for approved drugs for which they were not originally approved. We found that deepDR revealed high performance [the area under receiver operating characteristic curve (AUROC) = 0.908], outperforming conventional network-based or machine learning-based approaches. Importantly, deepDR-predicted drug-disease associations were validated by the ClinicalTrials.gov database (AUROC = 0.826) and we showcased several novel deepDR-predicted approved drugs for Alzheimer's disease (e.g. risperidone and aripiprazole) and Parkinson's disease (e.g. methylphenidate and pergolide). Availability and implementation: Source code and data can be downloaded from https://github.com/ChengF-Lab/deepDR. Supplementary information: Supplementary data are available online at Bioinformatics.

Proceedings Article•DOI•

Latent Space Autoregression for Novelty Detection

Davide Abati¹, Angelo Porrello¹, Simone Calderara¹, Rita Cucchiara•Institutions (1)

University of Modena and Reggio Emilia¹

15 Jun 2019

TL;DR: In this article, a deep autoencoder with a parametric density estimator is used to learn the probability distribution underlying the latent representations with an autoregressive procedure, which effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors.

Abstract: Novelty detection is commonly referred as the discrimination of observations that do not conform to a learned model of regularity. Despite its importance in different application settings, designing a novelty detector is utterly complex due to the unpredictable nature of novelties and its inaccessibility during the training procedure, factors which expose the unsupervised nature of the problem. In our proposal, we design a general unsupervised framework where we equip a deep autoencoder with a parametric density estimator that learns the probability distribution underlying the latent representations with an autoregressive procedure. We show that a maximum likelihood objective, optimized in conjunction with the reconstruction of normal samples, effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors. In addition to providing a very general formulation, extensive experiments of our model on publicly available datasets deliver on-par or superior performances if compared to state-of-the-art methods in one-class and in video anomaly detection settings. Differently from our competitors, we remark that our proposal does not make any assumption about the nature of the novelties, making our work easily applicable to disparate contexts.

Journal Article•DOI•

Deep Learning for EEG motor imagery classification based on multi-layer CNNs feature fusion

Syed Umar Amin¹, Mansour Alsulaiman¹, Ghulam Muhammad¹, Mohamed Amine Mekhtiche¹, M. Shamim Hossain¹•Institutions (1)

King Saud University¹

01 Dec 2019-Future Generation Computer Systems

TL;DR: It is demonstrated that the novel MCNN and CCNN fusion methods outperform all the state-of-the-art machine learning and deep learning techniques for EEG classification.

Journal Article•DOI•

AutoZOOM: Autoencoder-Based Zeroth Order Optimization Method for Attacking Black-Box Neural Networks

Chun-Chen Tu¹, Paishun Ting¹, Pin-Yu Chen², Sijia Liu², Huan Zhang³, Jinfeng Yi, Cho-Jui Hsieh³, Shin-Ming Cheng⁴•Institutions (4)

University of Michigan¹, IBM², University of California, Los Angeles³, National Taiwan University of Science and Technology⁴

17 Jul 2019

TL;DR: AutoZOOM, a generic framework for query-efficient black-box attacks, combines an adaptive random gradient estimation strategy that balances query counts and distortion with attack acceleration through either an autoencoder trained offline on unlabeled data or a bilinear resizing operation.

Abstract: Recent studies have shown that adversarial examples in state-of-the-art image classifiers trained by deep neural networks (DNN) can be easily generated when the target model is transparent to an attacker, known as the white-box setting. However, when attacking a deployed machine learning service, one can only acquire the input-output correspondences of the target model; this is the so-called black-box attack setting. The major drawback of existing black-box attacks is the need for excessive model queries, which may give a false sense of model robustness due to inefficient query designs. To bridge this gap, we propose a generic framework for query-efficient blackbox attacks. Our framework, AutoZOOM, which is short for Autoencoder-based Zeroth Order Optimization Method, has two novel building blocks towards efficient black-box attacks: (i) an adaptive random gradient estimation strategy to balance query counts and distortion, and (ii) an autoencoder that is either trained offline with unlabeled data or a bilinear resizing operation for attack acceleration. Experimental results suggest that, by applying AutoZOOM to a state-of-the-art black-box attack (ZOO), a significant reduction in model queries can be achieved without sacrificing the attack success rate and the visual quality of the resulting adversarial examples. In particular, when compared to the standard ZOO method, AutoZOOM can consistently reduce the mean query counts in finding successful adversarial examples (or reaching the same distortion level) by at least 93% on MNIST, CIFAR-10 and ImageNet datasets, leading to novel insights on adversarial robustness.
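
Zeroth-order optimization, as named in this abstract, estimates a gradient using only function evaluations (model queries). The sketch below is a generic averaged random-direction estimator, not the AutoZOOM implementation: the scaling by the dimension, the smoothing parameter `beta`, the argument names, and the function name `estimate_gradient` are assumptions, and the autoencoder or resizing step that shrinks the search dimension is not shown.

```python
import math
import random


def estimate_gradient(f, x, num_queries=10, beta=1e-3, rng=None):
    """Estimate the gradient of f at x from function values only.

    f           : black-box function taking a list of floats
    num_queries : number of random directions to average over
    beta        : finite-difference step size
    rng         : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    dim = len(x)
    fx = f(x)
    grad = [0.0] * dim
    for _ in range(num_queries):
        # Draw a direction uniformly on the unit sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]
        shifted = f([xi + beta * ui for xi, ui in zip(x, u)])
        # Directional finite difference, scaled by the dimension.
        scale = dim * (shifted - fx) / beta
        for i in range(dim):
            grad[i] += scale * u[i] / num_queries
    return grad
```

More directions per step give a less noisy estimate at the cost of more queries, which is the query-count versus distortion trade-off the abstract refers to. Estimating in a lower-dimensional space, as the abstract describes, reduces the dimension the estimator has to cover.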

Proceedings Article•DOI•

Variational Adversarial Active Learning

Samarth Sinha¹, Sayna Ebrahimi², Trevor Darrell²•Institutions (2)

University of Toronto¹, University of California, Berkeley²

31 Mar 2019

TL;DR: A pool-based semi-supervised active learning algorithm is proposed that implicitly learns the sampling mechanism in an adversarial manner: the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool.

Abstract: Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The mini-max game between the VAE and the adversarial network is played such that while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low dimensional latent space in large-scale settings and provides for a computationally efficient sampling method. Our code is available at https://github.com/sinhasam/vaal.

Journal Article•DOI•

Unsupervised Speech Representation Learning Using WaveNet Autoencoders

Jan Chorowski¹, Ron Weiss², Samy Bengio², Aaron van den Oord•Institutions (2)

University of Wrocław¹, Google²

01 Dec 2019-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: A regularization scheme is introduced that forces the representations to focus on the phonetic content of the utterance, and performance comparable with the top entries in the ZeroSpeech 2017 unsupervised acoustic unit discovery task is reported.

Abstract: We consider the task of unsupervised extraction of meaningful latent representations of speech by applying autoencoding neural networks to speech waveforms. The goal is to learn a representation able to capture high level semantic content from the signal, e.g. phoneme identities, while being invariant to confounding low level details in the signal such as the underlying pitch contour or background noise. Since the learned representation is tuned to contain only phonetic content, we resort to using a high capacity WaveNet decoder to infer information discarded by the encoder from previous samples. Moreover, the behavior of autoencoder models depends on the kind of constraint that is applied to the latent representation. We compare three variants: a simple dimensionality reduction bottleneck, a Gaussian Variational Autoencoder (VAE), and a discrete Vector Quantized VAE (VQ-VAE). We analyze the quality of learned representations in terms of speaker independence, the ability to predict phonetic content, and the ability to accurately reconstruct individual spectrogram frames. Moreover, for discrete encodings extracted using the VQ-VAE, we measure the ease of mapping them to phonemes. We introduce a regularization scheme that forces the representations to focus on the phonetic content of the utterance and report performance comparable with the top entries in the ZeroSpeech 2017 unsupervised acoustic unit discovery task.

Proceedings Article•

Lagging Inference Networks and Posterior Collapse in Variational Autoencoders

Junxian He¹, Daniel Spokoyny¹, Graham Neubig¹, Taylor Berg-Kirkpatrick²•Institutions (2)

Carnegie Mellon University¹, University of California, San Diego²

16 Jan 2019

TL;DR: This paper investigates posterior collapse from the perspective of training dynamics and proposes an extremely simple modification to VAE training to reduce inference lag: depending on the model's current mutual information between latent variable and observation, the inference network is optimized before performing each model update.

Abstract: The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique. By using a neural inference network to approximate the model's posterior on latent variables, VAEs efficiently parameterize a lower bound on marginal data likelihood that can be optimized directly via gradient methods. In practice, however, VAE training often results in a degenerate local optimum known as "posterior collapse" where the model learns to ignore the latent variable and the approximate posterior mimics the prior. In this paper, we investigate posterior collapse from the perspective of training dynamics. We find that during the initial stages of training the inference network fails to approximate the model's true posterior, which is a moving target. As a result, the model is encouraged to ignore the latent encoding and posterior collapse occurs. Based on this observation, we propose an extremely simple modification to VAE training to reduce inference lag: depending on the model's current mutual information between latent variable and observation, we aggressively optimize the inference network before performing each model update. Despite introducing neither new model components nor significant complexity over basic VAE, our approach is able to avoid the problem of collapse that has plagued a large amount of previous work. Empirically, our approach outperforms strong autoregressive baselines on text and image benchmarks in terms of held-out likelihood, and is competitive with more complex techniques for avoiding collapse while being substantially faster.

Proceedings Article•DOI•

Deep Spectral Clustering Using Dual Autoencoder Network

Xu Yang¹, Cheng Deng¹, Feng Zheng², Junchi Yan³, Wei Liu⁴•Institutions (4)

Xidian University¹, Southern University of Science and Technology², Shanghai Jiao Tong University³, Tencent⁴

30 Apr 2019

TL;DR: A joint learning framework for discriminative embedding and spectral clustering is proposed, which can significantly outperform state-of-the-art clustering approaches and be more robust to noise.

Abstract: Clustering methods have recently attracted ever-increasing attention in learning and vision. Deep clustering combines embedding and clustering to obtain an optimal embedding subspace for clustering, which can be more effective than conventional clustering methods. In this paper, we propose a joint learning framework for discriminative embedding and spectral clustering. We first devise a dual autoencoder network, which enforces the reconstruction constraint for the latent representations and their noisy versions, to embed the inputs into a latent space for clustering. As such, the learned latent representations can be more robust to noise. Then mutual information estimation is utilized to provide more discriminative information from the inputs. Furthermore, a deep spectral clustering method is applied to embed the latent representations into the eigenspace and subsequently cluster them, which can fully exploit the relationship between inputs to achieve optimal clustering results. Experimental results on benchmark datasets show that our method can significantly outperform state-of-the-art clustering approaches.

Journal Article•DOI•

Learning Compact and Discriminative Stacked Autoencoder for Hyperspectral Image Classification

Peicheng Zhou¹, Junwei Han¹, Gong Cheng¹, Baochang Zhang²•Institutions (2)

Northwestern Polytechnical University¹, Beihang University²

13 Feb 2019-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: The proposed CDSAE framework comprises two stages with different optimization objectives, which can learn discriminative low-dimensional feature mappings and train an effective classifier progressively, and imposes a local Fisher discriminant regularization on each hidden layer of a stacked autoencoder (SAE) to train a discriminative SAE (DSAE).

Abstract: As one of the fundamental research topics in remote sensing image analysis, hyperspectral image (HSI) classification has been extensively studied. However, how to discriminatively learn a low-dimensional feature space, in which the mapped features have small within-class scatter and large between-class separation, is still a challenging problem. To address this issue, this paper proposes an effective framework, named compact and discriminative stacked autoencoder (CDSAE), for HSI classification. The proposed CDSAE framework comprises two stages with different optimization objectives, which can learn discriminative low-dimensional feature mappings and train an effective classifier progressively. First, we impose a local Fisher discriminant regularization on each hidden layer of a stacked autoencoder (SAE) to train a discriminative SAE (DSAE) by minimizing reconstruction error. This stage can learn feature mappings in which the pixels from the same land-cover class are mapped as closely as possible and the pixels from different land-cover categories are separated by a large margin. Second, we learn an effective classifier and meanwhile update the DSAE with a local Fisher discriminant regularization embedded on top of the feature representations. Moreover, to learn a compact DSAE with as few hidden neurons as possible, we impose a diversity regularization on the hidden neurons of the DSAE to balance the feature dimensionality and the feature representation capability. The experimental results on three widely used HSI data sets and comprehensive comparisons with existing methods demonstrate that our proposed method is effective.
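
The two quantities this abstract wants to trade off, within-class scatter and between-class separation, can be made concrete with a small sketch. The function below computes the traces of the classical (global) Fisher scatter matrices; the function name and the data layout are hypothetical, and the local weighting used in the paper's regularizer is not reproduced here.

```python
def scatter_traces(features, labels):
    """Return (within-class scatter, between-class scatter) as scalar traces.

    features: list of equal-length lists of floats (one feature vector per sample)
    labels:   list of class labels, one per sample
    """
    dim = len(features[0])
    overall_mean = [sum(f[k] for f in features) / len(features) for k in range(dim)]

    # Group the feature vectors by class label.
    by_class = {}
    for f, y in zip(features, labels):
        by_class.setdefault(y, []).append(f)

    within = 0.0
    between = 0.0
    for members in by_class.values():
        class_mean = [sum(f[k] for f in members) / len(members) for k in range(dim)]
        # Squared distance of each sample to its own class mean.
        within += sum((f[k] - class_mean[k]) ** 2 for f in members for k in range(dim))
        # Squared distance of the class mean to the overall mean, weighted by class size.
        between += len(members) * sum(
            (class_mean[k] - overall_mean[k]) ** 2 for k in range(dim)
        )
    return within, between
```

A discriminative embedding in the sense of the abstract would drive the first value down and the second value up.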

Journal Article•DOI•

Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network

Wei Wang¹, Mengxue Zhao¹, Jigang Wang²•Institutions (2)

Beijing Jiaotong University¹, ZTE²

01 Aug 2019-Journal of Ambient Intelligence and Humanized Computing

TL;DR: This work reconstructs the high-dimensional features of Android applications (apps), employs multiple CNNs to detect Android malware, and proposes a hybrid model based on a deep autoencoder (DAE) and a convolutional neural network (CNN), which shows a powerful ability in feature extraction and malware detection.

Abstract: Android security incidents have occurred frequently in recent years. To improve the accuracy and efficiency of large-scale Android malware detection, in this work, we propose a hybrid model based on a deep autoencoder (DAE) and a convolutional neural network (CNN). First, to improve the accuracy of malware detection, we reconstruct the high-dimensional features of Android applications (apps) and employ multiple CNNs to detect Android malware. In the serial convolutional neural network architecture (CNN-S), we use ReLU, a non-linear function, as the activation function to increase sparseness and "dropout" to prevent over-fitting. The convolutional layer and pooling layer are combined with the fully connected layer to enhance feature extraction capability. Under these conditions, CNN-S shows a powerful ability in feature extraction and malware detection. Second, to reduce the training time, we use a deep autoencoder as a pre-training method for the CNN. With this combination, the deep autoencoder and CNN model (DAE-CNN) can learn more flexible patterns in a short time. We conduct experiments on 10,000 benign apps and 13,000 malicious apps. CNN-S demonstrates a significant improvement compared with traditional machine learning methods in Android malware detection. In detail, compared with SVM, the accuracy of the CNN-S model is improved by 5%, while the training time of the DAE-CNN model is reduced by 83% compared with the CNN-S model.

Posted Content•

Zero-Shot Voice Style Transfer with Only Autoencoder Loss.

Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson

14 May 2019

Journal Article•DOI•

Unsupervised Spatial–Spectral Feature Learning by 3D Convolutional Autoencoder for Hyperspectral Classification

Shaohui Mei¹, Jingyu Ji¹, Yunhao Geng¹, Zhi Zhang², Xu Li¹, Qian Du³•Institutions (3)

Northwestern Polytechnical University¹, Chinese Academy of Sciences², Mississippi State University³

22 Apr 2019-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: Experimental results on several benchmark hyperspectral data sets have demonstrated that the proposed 3D-CAE is very effective in extracting spatial–spectral features and outperforms not only traditional unsupervised feature extraction algorithms but also many supervised feature extraction algorithms in classification applications.

Abstract: Feature learning technologies using convolutional neural networks (CNNs) have shown superior performance over traditional hand-crafted feature extraction algorithms. However, a large number of labeled samples are generally required for a CNN to learn effective features under a classification task, and these are hard to obtain for hyperspectral remote sensing images. Therefore, in this paper, an unsupervised spatial–spectral feature learning strategy is proposed for hyperspectral images using a 3-dimensional (3D) convolutional autoencoder (3D-CAE). The proposed 3D-CAE consists of 3D or elementwise operations only, such as 3D convolution, 3D pooling, and 3D batch normalization, to maximally explore spatial–spectral structure information for feature extraction. A companion 3D convolutional decoder network is also designed to reconstruct the input patterns to the proposed 3D-CAE, by which all the parameters involved in the network can be trained without labeled training samples. As a result, effective features are learned in an unsupervised mode in which label information of pixels is not required. Experimental results on several benchmark hyperspectral data sets have demonstrated that our proposed 3D-CAE is very effective in extracting spatial–spectral features and outperforms not only traditional unsupervised feature extraction algorithms but also many supervised feature extraction algorithms in classification applications.

Journal Article•DOI•

Remaining useful life estimation using a bidirectional recurrent neural network based autoencoder scheme

Wennian Yu¹, Ii Yong Kim¹, Chris K. Mechefske¹•Institutions (1)

Queen's University¹

15 Aug 2019-Mechanical Systems and Signal Processing

TL;DR: A sensor-based data-driven scheme using a deep learning tool and the similarity-based curve matching technique to estimate the RUL of a system, which demonstrates the competitiveness of the proposed method used for RUL estimation of systems.

Journal Article•DOI•

Deep learning in bioinformatics: Introduction, application, and perspective in the big data era.

Yu Li¹, Chao Huang², Lizhong Ding, Zhongxiao Li¹, Yijie Pan², Xin Gao¹•Institutions (2)

King Abdullah University of Science and Technology¹, Chinese Academy of Sciences²

15 Aug 2019-Methods

TL;DR: This review provides both the exoteric introduction of deep learning, and concrete examples and implementations of its representative applications in bioinformatics, and introduces deep learning in an easy-to-understand fashion.

Journal Article•DOI•

A de novo molecular generation method using latent vector based generative adversarial network

Oleksii Prykhodko¹,², Simon Johansson¹,², Panagiotis-Christos Kotsias¹, Josep Arús-Pous¹,³, Esben Jannik Bjerrum¹, Ola Engkvist¹, Hongming Chen¹•Institutions (3)

AstraZeneca¹, Chalmers University of Technology², University of Bern³

03 Dec 2019-Journal of Cheminformatics

TL;DR: A new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design, is proposed; its generated compounds differ from those of a recurrent neural network-based generative model, indicating that the two methods can be used complementarily.

Abstract: Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design. We applied the method in two scenarios: one to generate random drug-like compounds and another to generate target-biased compounds. Our results show that the method works well in both cases. Sampled compounds from the trained model can largely occupy the same chemical space as the training set and also generate a substantial fraction of novel compounds. Moreover, the drug-likeness score of compounds sampled from LatentGAN is also similar to that of the training set. Lastly, generated compounds differ from those obtained with a Recurrent Neural Network-based generative model approach, indicating that both methods can be used complementarily.

Posted Content•

Variational Adversarial Active Learning

Samarth Sinha¹, Sayna Ebrahimi², Trevor Darrell²•Institutions (2)

University of Toronto¹, University of California, Berkeley²

31 Mar 2019-arXiv: Learning

TL;DR: A pool-based semi-supervised active learning algorithm that implicitly learns its sampling mechanism in an adversarial manner, learning an effective low-dimensional latent space in large-scale settings and providing a computationally efficient sampling method.

Abstract: Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Unlike conventional active learning algorithms, our approach is task agnostic, i.e., it does not depend on the performance of the task for which we are trying to acquire labeled data. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The mini-max game between the VAE and the adversarial network is played such that while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low-dimensional latent space in large-scale settings and provides for a computationally efficient sampling method. Our code is available at this https URL.

Proceedings Article•

DAG-GNN: DAG Structure Learning with Graph Neural Networks

Yue Yu¹, Jie Chen², Tian Gao², Mo Yu²•Institutions (2)

Lehigh University¹, IBM²

24 May 2019

TL;DR: A deep generative model is proposed, and a variant of the structural constraint is applied to learn the DAG; the method learns more accurate graphs for nonlinearly generated samples, and on benchmark data sets with discrete variables, the learned graphs are reasonably close to the global optima.

Abstract: Learning a faithful directed acyclic graph (DAG) from samples of a joint distribution is a challenging combinatorial problem, owing to the intractable search space superexponential in the number of graph nodes. A recent breakthrough formulates the problem as a continuous optimization with a structural constraint that ensures acyclicity (Zheng et al., 2018). The authors apply the approach to the linear structural equation model (SEM) and the least-squares loss function that are statistically well justified but nevertheless limited. Motivated by the widespread success of deep learning that is capable of capturing complex nonlinear mappings, in this work we propose a deep generative model and apply a variant of the structural constraint to learn the DAG. At the heart of the generative model is a variational autoencoder parameterized by a novel graph neural network architecture, which we coin DAG-GNN. In addition to the richer capacity, an advantage of the proposed model is that it naturally handles discrete variables as well as vector-valued ones. We demonstrate that on synthetic data sets, the proposed method learns more accurate graphs for nonlinearly generated samples; and on benchmark data sets with discrete variables, the learned graphs are reasonably close to the global optima. The code is available at this https URL.
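
The "structural constraint that ensures acyclicity" in this abstract is a smooth function of the weighted adjacency matrix W that equals zero exactly when the graph has no directed cycles. Zheng et al. (2018) use tr(exp(W∘W)) − d. The sketch below uses a polynomial form, tr[(I + αW∘W)^d] − d, which has the same zero set for α > 0; whether this matches the exact variant used in the paper is an assumption, since the abstract does not spell it out. The function names and the default α = 1/d are illustrative choices.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def acyclicity_penalty(weights, alpha=None):
    """Return tr[(I + alpha * W∘W)^d] - d for a d x d weighted adjacency matrix.

    The value is 0 for an acyclic graph and positive if any directed cycle exists.
    """
    d = len(weights)
    if alpha is None:
        alpha = 1.0 / d  # illustrative default, not taken from the source
    # M = I + alpha * (W ∘ W); squaring makes every edge weight nonnegative.
    m = [
        [(1.0 if i == j else 0.0) + alpha * weights[i][j] ** 2 for j in range(d)]
        for i in range(d)
    ]
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(d):
        power = matmul(power, m)
    return sum(power[i][i] for i in range(d)) - d
```

The trace counts weighted closed walks of length up to d, so any cycle contributes a strictly positive amount, which is what lets a continuous optimizer penalize it.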

Journal Article•DOI•

Deep-learning cardiac motion analysis for human survival prediction

Ghalib Bello¹, Timothy J W Dawes¹,², Jinming Duan¹, Carlo Biffi¹, Antonio de Marvao¹, Luke S. Howard³, J. Simon R. Gibbs²,³, Martin R. Wilkins¹, Stuart A. Cook, Daniel Rueckert¹, Declan P. O'Regan¹•Institutions (3)

Imperial College London¹, National Institutes of Health², Imperial College Healthcare³

11 Feb 2019-Nature Machine Intelligence

TL;DR: A fully convolutional neural network is used to create time-resolved three-dimensional dense segmentations of heart images that can efficiently predict human survival.

Abstract: Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using cardiac magnetic resonance imaging, to create time-resolved three-dimensional segmentations using a fully convolutional network trained on anatomical shape priors. This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. To handle right-censored survival outcomes, our network used a Cox partial likelihood loss function. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p = .0012) for our model C=0.75 (95% CI: 0.70 - 0.79) than the human benchmark of C=0.59 (95% CI: 0.53 - 0.65). This work demonstrates how a complex computer vision task using high-dimensional medical image data can efficiently predict human survival.
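
The Cox partial likelihood loss mentioned in this abstract handles right-censoring by letting only observed events contribute terms, while censored subjects still appear in the risk sets of earlier events. The following is a minimal sketch of the negative log partial likelihood with Breslow-style handling of tied times; the function name and argument layout are hypothetical, and in the paper the risk scores would come from the network rather than being passed in directly.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox log partial likelihood.

    risk_scores: predicted log-risk per subject (higher means higher hazard)
    times:       follow-up time per subject
    events:      1 if the event was observed, 0 if the subject was right-censored
    """
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored subjects contribute no term of their own
        # Risk set: everyone still under observation at time t_i.
        at_risk = [risk_scores[j] for j, t_j in enumerate(times) if t_j >= t_i]
        # Numerically stable log-sum-exp over the risk set.
        top = max(at_risk)
        log_sum = top + math.log(sum(math.exp(r - top) for r in at_risk))
        total += risk_scores[i] - log_sum
    return -total
```

The loss is small when subjects who experience the event earlier are assigned higher risk scores than those still at risk, which is the same ordering property that Harrell's C-index measures.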

Journal Article•DOI•

Linearly Recurrent Autoencoder Networks for Learning Dynamics

Samuel E. Otto¹, Clarence W. Rowley¹•Institutions (1)

Princeton University¹

28 Mar 2019-SIAM Journal on Applied Dynamical Systems

TL;DR: In this paper, a method for learning low-dimensional approximations of nonlinear dynamical systems, based on neural network approximation of the underlying Koopman operator, is described.

Abstract: This paper describes a method for learning low-dimensional approximations of nonlinear dynamical systems, based on neural network approximations of the underlying Koopman operator. Extended Dynamic...

Journal Article•DOI•

DAEN: Deep Autoencoder Networks for Hyperspectral Unmixing

Yuanchao Su¹, Jun Li², Antonio Plaza³, Andrea Marinoni, Paolo Gamba⁴, Somdatta Chakravortty⁵•Institutions (5)

Sun Yat-sen University¹, Hunan University², University of Extremadura³, University of Pavia⁴, Islamic Azad University⁵

28 Jan 2019-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: A new technique for unsupervised unmixing which is based on a deep autoencoder network (DAEN), which can unmix data sets with outliers and low signal-to-noise ratio and demonstrates very competitive performance.

Abstract: Spectral unmixing is a technique for remotely sensed image interpretation that expresses each (possibly mixed) pixel as a combination of pure spectral signatures (endmembers) and their fractional abundances. In this paper, we develop a new technique for unsupervised unmixing which is based on a deep autoencoder network (DAEN). Our newly developed DAEN consists of two parts. The first part of the network adopts stacked autoencoders (SAEs) to learn spectral signatures, so as to generate a good initialization for the unmixing process. In the second part of the network, a variational autoencoder (VAE) is employed to perform blind source separation, aimed at obtaining the endmember signatures and abundance fractions simultaneously. By taking advantage of the SAEs, the proposed approach is remarkably robust, as it can unmix data sets with outliers and low signal-to-noise ratio. Moreover, the multiple hidden layers of the VAE ensure the required constraints (nonnegativity and sum-to-one) when estimating the abundances. The effectiveness of the proposed method is evaluated using both synthetic and real hyperspectral data. When compared with other unmixing methods, the proposed approach demonstrates very competitive performance.
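
The mixing model behind this abstract writes each pixel as a weighted sum of endmember spectra, with abundances that are nonnegative and sum to one. The sketch below shows one common way to satisfy both constraints, a softmax over unconstrained scores. That choice is an assumption for illustration; the abstract does not say how the network enforces the constraints, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Map unconstrained scores to abundances that are positive and sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix(abundances, endmembers):
    """Linear mixing: pixel[b] = sum_i abundances[i] * endmembers[i][b].

    endmembers: one spectrum per endmember, each a list with one value per band
    """
    bands = len(endmembers[0])
    return [
        sum(a * spectrum[b] for a, spectrum in zip(abundances, endmembers))
        for b in range(bands)
    ]
```

For example, `mix(softmax([0.0, 0.0]), [[1.0, 0.0], [0.0, 1.0]])` describes a pixel made of equal parts of two endmembers.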
