
Showing papers on "Autoencoder published in 2020"


Journal ArticleDOI
TL;DR: This article provides a systematic survey of deep learning methods for remote sensing image scene classification, covering more than 160 papers, and discusses the main challenges of the task together with autoencoder-, CNN-, and GAN-based approaches.
Abstract: Remote sensing image scene classification, which aims at labeling remote sensing images with a set of semantic categories based on their contents, has broad applications in a range of fields. Propelled by the powerful feature learning capabilities of deep neural networks, remote sensing image scene classification driven by deep learning has drawn remarkable attention and achieved significant breakthroughs. However, to the best of our knowledge, a comprehensive review of recent achievements regarding deep learning for scene classification of remote sensing images is still lacking. Considering the rapid evolution of this field, this article provides a systematic survey of deep learning methods for remote sensing image scene classification by covering more than 160 papers. To be specific, we discuss the main challenges of remote sensing image scene classification and survey: first, autoencoder-based remote sensing image scene classification methods; second, convolutional neural network-based remote sensing image scene classification methods; and third, generative adversarial network-based remote sensing image scene classification methods. In addition, we introduce the benchmarks used for remote sensing image scene classification and summarize the performance of more than two dozen representative algorithms on three commonly used benchmark datasets. Finally, we discuss the promising opportunities for further research.

450 citations


Journal ArticleDOI
01 Feb 2020
TL;DR: Deep Packet can identify encrypted traffic and also distinguishes between VPN and non-VPN network traffic, and outperforms all of the proposed classification methods on the UNB ISCX VPN-nonVPN dataset.
Abstract: Network traffic classification has become more important with the rapid growth of the Internet and online applications. Numerous studies have been done on this topic which have led to many different approaches. Most of these approaches use predefined features extracted by an expert in order to classify network traffic. In contrast, in this study, we propose a deep learning-based approach which integrates both feature extraction and classification phases into one system. Our proposed scheme, called “Deep Packet,” can handle both traffic characterization in which the network traffic is categorized into major classes (e.g., FTP and P2P) and application identification in which identifying end-user applications (e.g., BitTorrent and Skype) is desired. Contrary to most of the current methods, Deep Packet can identify encrypted traffic and also distinguishes between VPN and non-VPN network traffic. The Deep Packet framework employs two deep neural network structures, namely stacked autoencoder (SAE) and convolutional neural network (CNN), in order to classify network traffic. Our experiments show that the best result is achieved when Deep Packet uses CNN as its classification model, where it achieves a recall of 0.98 in the application identification task and 0.94 in the traffic categorization task. To the best of our knowledge, Deep Packet outperforms all of the proposed classification methods on the UNB ISCX VPN-nonVPN dataset.
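
As a rough illustration of the end-to-end idea (raw packet bytes in, application class out), the following is a minimal 1-D CNN sketch in PyTorch; the packet length of 1500 bytes, the 15 classes, and the layer sizes are illustrative assumptions, not the authors' Deep Packet architecture.

```python
# Minimal sketch (not the authors' exact Deep Packet architecture): a 1-D CNN
# that maps a fixed-length vector of normalized packet bytes to an application
# class. Packet length (1500), class count (15) and layer sizes are assumptions.
import torch
import torch.nn as nn

class PacketCNN(nn.Module):
    def __init__(self, n_classes=15):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveMaxPool1d(64),                 # fixed-size feature map
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU(),
            nn.Dropout(0.2), nn.Linear(256, n_classes),
        )

    def forward(self, x):        # x: (batch, 1, packet_len), bytes scaled to [0, 1]
        return self.classifier(self.features(x))

model = PacketCNN()
logits = model(torch.rand(8, 1, 1500))   # dummy batch of 8 "packets"
print(logits.shape)                      # torch.Size([8, 15])
```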

417 citations


Posted Content
Arash Vahdat1, Jan Kautz1
TL;DR: NVAE is the first successful VAE applied to natural images as large as 256×256 pixels and achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, CelebA 64, and CelebA HQ datasets and it provides a strong baseline on FFHQ.
Abstract: Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable sampling and easy-to-access encoding networks. However, they are currently outperformed by other models such as normalizing flows and autoregressive models. While the majority of the research in VAEs is focused on the statistical challenges, we explore the orthogonal direction of carefully designing neural architectures for hierarchical VAEs. We propose Nouveau VAE (NVAE), a deep hierarchical VAE built for image generation using depth-wise separable convolutions and batch normalization. NVAE is equipped with a residual parameterization of Normal distributions and its training is stabilized by spectral regularization. We show that NVAE achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, CelebA 64, and CelebA HQ datasets and it provides a strong baseline on FFHQ. For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ. To the best of our knowledge, NVAE is the first successful VAE applied to natural images as large as 256×256 pixels. The source code is available at this https URL.
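
For context on the numbers quoted above, the snippet below shows the standard conversion from a per-image negative log-likelihood in nats to bits per dimension; the NLL value used is purely illustrative.

```python
# How "bits per dimension" figures like 2.98 -> 2.91 are computed: divide the
# negative log-likelihood in nats by ln(2) and by the number of data dimensions
# (3 * 32 * 32 for CIFAR-10). The NLL value below is purely illustrative.
import math

def bits_per_dim(nll_nats: float, num_dims: int) -> float:
    return nll_nats / (math.log(2.0) * num_dims)

num_dims = 3 * 32 * 32              # CIFAR-10 image dimensions
nll_nats = 6200.0                   # hypothetical per-image NLL in nats
print(round(bits_per_dim(nll_nats, num_dims), 2))   # ~2.91
```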

391 citations


Journal ArticleDOI
TL;DR: Results demonstrate the promising potential of the deep learning model in forecasting COVID-19 cases and highlight the superior performance of the VAE compared to the other algorithms.
Abstract: The novel coronavirus (COVID-19) has spread significantly over the world and poses new challenges to the research community. Although governments have imposed numerous containment and social distancing measures, the demand on healthcare systems has dramatically increased and the effective management of infected patients has become a challenging problem for hospitals. Thus, accurate short-term forecasting of the number of new contaminated and recovered cases is crucial for optimizing the available resources and arresting or slowing down the progression of such diseases. Recently, deep learning models demonstrated important improvements when handling time-series data in different applications. This paper presents a comparative study of five deep learning methods to forecast the number of new cases and recovered cases. Specifically, simple Recurrent Neural Network (RNN), Long short-term memory (LSTM), Bidirectional LSTM (BiLSTM), Gated recurrent units (GRUs) and Variational AutoEncoder (VAE) algorithms have been applied for global forecasting of COVID-19 cases based on a small volume of data. This study is based on daily confirmed and recovered cases collected from six countries, namely Italy, Spain, France, China, USA, and Australia. Results demonstrate the promising potential of the deep learning model in forecasting COVID-19 cases and highlight the superior performance of the VAE compared to the other algorithms.
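
A minimal sketch of one of the compared baselines, an LSTM one-step-ahead forecaster trained on sliding windows of a toy daily series; the window length, network size, and synthetic data are assumptions, not the paper's configuration.

```python
# Hedged sketch of an LSTM forecaster: predict the next day's value from a
# window of previous days. The synthetic series, window length and layer sizes
# are assumptions, not the paper's settings.
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, window, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # predict the next value

# Build (window -> next value) training pairs from a toy daily-cases series.
series = torch.cumsum(torch.rand(120), dim=0)          # synthetic, monotone "cases"
window = 7
X = torch.stack([series[i:i + window] for i in range(len(series) - window)]).unsqueeze(-1)
y = series[window:].unsqueeze(-1)

model, loss_fn = LSTMForecaster(), nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(float(loss))
```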

306 citations


Proceedings ArticleDOI
23 Aug 2020
TL;DR: A fast and stable method called UnSupervised Anomaly Detection for multivariate time series (USAD) based on adversarially trained autoencoders capable of learning in an unsupervised way is proposed.
Abstract: The automatic supervision of IT systems is a current challenge at Orange. Given the size and complexity reached by its IT operations, the number of sensors needed to obtain measurements over time, used to infer normal and abnormal behaviors, has increased dramatically making traditional expert-based supervision methods slow or prone to errors. In this paper, we propose a fast and stable method called UnSupervised Anomaly Detection for multivariate time series (USAD) based on adversarially trained autoencoders. Its autoencoder architecture makes it capable of learning in an unsupervised way. The use of adversarial training and its architecture allows it to isolate anomalies while providing fast training. We study the properties of our method through experiments on five public datasets, thus demonstrating its robustness, training speed and high anomaly detection performance. Through a feasibility study using Orange's proprietary data we have been able to validate Orange's requirements on scalability, stability, robustness, training speed and high performance.
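
A hedged sketch of the window-based reconstruction-error scoring that underlies this family of methods; USAD's adversarial training of a second decoder is intentionally omitted, and all sizes are illustrative.

```python
# Hedged sketch of window-based anomaly scoring with an autoencoder: flatten a
# sliding window of a multivariate series, reconstruct it, and use the
# reconstruction error as the anomaly score. USAD's adversarial training of a
# second decoder is intentionally omitted; all sizes here are illustrative.
import torch
import torch.nn as nn

def windows(x, w):                        # x: (time, features) -> (n, w * features)
    return torch.stack([x[i:i + w].reshape(-1) for i in range(len(x) - w + 1)])

series = torch.randn(500, 4)              # toy multivariate series (4 sensors)
W = windows(series, w=10)                 # (491, 40)

ae = nn.Sequential(nn.Linear(40, 16), nn.ReLU(), nn.Linear(16, 40))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(300):                      # train on (assumed) normal data only
    opt.zero_grad()
    loss = ((ae(W) - W) ** 2).mean()
    loss.backward()
    opt.step()

with torch.no_grad():
    scores = ((ae(W) - W) ** 2).mean(dim=1)        # per-window anomaly score
threshold = torch.quantile(scores, 0.99)           # flag the worst-reconstructed windows
print((scores > threshold).nonzero().squeeze(-1)[:10])
```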

283 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: Local Implicit Grid Representations (LIGR) as mentioned in this paper is a 3D shape representation designed for scalability and generality, which can be used to reconstruct 3D objects from partial or noisy data.
Abstract: Shape priors learned from data are commonly used to reconstruct 3D objects from partial or noisy data. Yet no such shape priors are available for indoor scenes, since typical 3D autoencoders cannot handle their scale, complexity, or diversity. In this paper, we introduce Local Implicit Grid Representations, a new 3D shape representation designed for scalability and generality. The motivating idea is that most 3D surfaces share geometric details at some scale -- i.e., at a scale smaller than an entire object and larger than a small patch. We train an autoencoder to learn an embedding of local crops of 3D shapes at that size. Then, we use the decoder as a component in a shape optimization that solves for a set of latent codes on a regular grid of overlapping crops such that an interpolation of the decoded local shapes matches a partial or noisy observation. We demonstrate the value of this proposed approach for 3D surface reconstruction from sparse point observations, showing significantly better results than alternative approaches.
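
A hedged sketch of the decoder-as-prior optimization described above: with a frozen implicit decoder, only a latent code is optimized so that decoded values match a sparse partial observation. The decoder here is an untrained stand-in, whereas the paper assumes a decoder trained on local crops of 3D shapes.

```python
# Hedged illustration: freeze a (pretend-)trained implicit decoder and optimize
# a latent code so that decoded values match sparse observed points. The
# decoder, point values and sizes are illustrative stand-ins.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(32 + 3, 64), nn.ReLU(), nn.Linear(64, 1))
for p in decoder.parameters():
    p.requires_grad_(False)               # decoder is frozen; only the code moves

obs_points = torch.randn(200, 3)          # sparse observed 3-D points
obs_values = torch.zeros(200, 1)          # e.g. signed distance 0 on the surface

code = torch.zeros(1, 32, requires_grad=True)
opt = torch.optim.Adam([code], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    inp = torch.cat([code.expand(len(obs_points), -1), obs_points], dim=1)
    loss = ((decoder(inp) - obs_values) ** 2).mean()
    loss.backward()
    opt.step()
print(float(loss))
```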

255 citations


Journal ArticleDOI
TL;DR: The asymmetric and unsupervised FC-SAE can extract optimal non-linear features from environmental factors successfully, outperforms some conventional machine learning methods, and is promising for LSP.
Abstract: The environmental factors of landslide susceptibility are generally uncorrelated or non-linearly correlated, resulting in the limited prediction performances of conventional machine learning methods for landslide susceptibility prediction (LSP). Deep learning methods can exploit low-level features and high-level representations of information from environmental factors. In this paper, a novel deep learning–based algorithm, the fully connected sparse autoencoder (FC-SAE), is proposed for LSP. The FC-SAE consists of four steps: raw feature dropout in input layers, a sparse feature encoder in hidden layers, sparse feature extraction in output layers, and classification and prediction. The Sinan County of Guizhou Province in China, with a total of 23,195 landslide grid cells (306 recorded landslides) and 23,195 randomly selected non-landslide grid cells, was used as the study case. The frequency ratio values of 27 environmental factors were taken as the input variables of the FC-SAE. All 46,390 landslide and non-landslide grid cells were randomly divided into a training dataset (70%) and a test dataset (30%). By analyzing real landslide/non-landslide data, the performances of the FC-SAE and two other conventional machine learning methods, support vector machine (SVM) and back-propagation neural network (BPNN), were compared. The results show that the prediction rate and total accuracy of the FC-SAE are 0.854 and 85.2%, which are higher than those of the SVM (0.827 and 81.56%) and the BPNN (0.819 and 80.86%), respectively. In conclusion, the asymmetric and unsupervised FC-SAE can extract optimal non-linear features from environmental factors successfully, outperforms some conventional machine learning methods, and is promising for LSP.
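
A minimal sketch of sparse-autoencoder feature extraction with a binary landslide/non-landslide head, trained jointly here for brevity (the paper first trains the autoencoder without labels); the 27 inputs match the abstract's factor count, while layer sizes and the sparsity weight are assumptions.

```python
# Hedged sketch of sparse-autoencoder feature extraction with a binary
# landslide / non-landslide head. Trained jointly here for brevity; 27 inputs
# match the abstract, layer sizes and sparsity weight are assumptions.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(27, 16), nn.ReLU())
decoder = nn.Linear(16, 27)
clf = nn.Linear(16, 1)                       # landslide vs. non-landslide logit

x = torch.rand(256, 27)                      # toy frequency-ratio inputs
y = torch.randint(0, 2, (256, 1)).float()    # toy labels

params = [*encoder.parameters(), *decoder.parameters(), *clf.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)
bce = nn.BCEWithLogitsLoss()
for _ in range(300):
    opt.zero_grad()
    h = encoder(x)
    loss = (((decoder(h) - x) ** 2).mean()   # reconstruction of the input layer
            + 1e-3 * h.abs().mean()          # L1 sparsity on the hidden code
            + bce(clf(h), y))                # supervised classification head
    loss.backward()
    opt.step()
print(float(loss))
```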

233 citations


Proceedings ArticleDOI
20 Apr 2020
TL;DR: Structural Deep Clustering Network (SDCN) as discussed by the authors integrates the structural information into deep clustering by designing a delivery operator to transfer the representations learned by autoencoder to the corresponding GCN layer, and a dual self-supervised mechanism to unify these two different deep neural architectures.
Abstract: Clustering is a fundamental task in data analysis. Recently, deep clustering, which derives inspiration primarily from deep learning approaches, achieves state-of-the-art performance and has attracted considerable attention. Current deep clustering methods usually boost the clustering results by means of the powerful representation ability of deep learning, e.g., autoencoder, suggesting that learning an effective representation for clustering is a crucial requirement. The strength of deep clustering methods is to extract the useful representations from the data itself, rather than the structure of data, which receives scarce attention in representation learning. Motivated by the great success of Graph Convolutional Network (GCN) in encoding the graph structure, we propose a Structural Deep Clustering Network (SDCN) to integrate the structural information into deep clustering. Specifically, we design a delivery operator to transfer the representations learned by autoencoder to the corresponding GCN layer, and a dual self-supervised mechanism to unify these two different deep neural architectures and guide the update of the whole model. In this way, the multiple structures of data, from low-order to high-order, are naturally combined with the multiple representations learned by autoencoder. Furthermore, we theoretically analyze the delivery operator, i.e., with the delivery operator, GCN improves the autoencoder-specific representation as a high-order graph regularization constraint and autoencoder helps alleviate the over-smoothing problem in GCN. Through comprehensive experiments, we demonstrate that our proposed model can consistently perform better than the state-of-the-art techniques.
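
A hedged sketch of the delivery-operator idea: mix the autoencoder's layer representation into the GCN's representation before the next propagation step. The mixing weight and sizes are assumptions, and SDCN's dual self-supervision is not reproduced.

```python
# Hedged sketch of the "delivery operator": at each depth, the input to the next
# GCN propagation is a mixture of the GCN representation and the autoencoder
# representation at the same depth. Mixing weight (0.5) and sizes are assumed.
import torch
import torch.nn as nn

n, d, h = 6, 8, 4
A = torch.eye(n) + (torch.rand(n, n) > 0.7).float()      # toy adjacency + self-loops
A = (A + A.T).clamp(max=1)
deg_inv_sqrt = A.sum(1).pow(-0.5)
A_hat = deg_inv_sqrt.unsqueeze(1) * A * deg_inv_sqrt.unsqueeze(0)   # sym. normalization

X = torch.rand(n, d)
ae_layer = nn.Linear(d, h)                 # one autoencoder encoder layer
gcn_weight = nn.Linear(d, h, bias=False)   # one GCN layer

H = torch.relu(ae_layer(X))                # autoencoder representation
Z = torch.relu(A_hat @ gcn_weight(X))      # plain GCN representation
eps = 0.5
Z_next_input = (1 - eps) * Z + eps * H     # delivery: inject AE features into the GCN
print(Z_next_input.shape)                  # torch.Size([6, 4]); fed to the next GCN layer
```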

230 citations


Proceedings ArticleDOI
14 Jun 2020
Abstract: Autoencoder networks are unsupervised approaches aiming at combining generative and representational properties by learning simultaneously an encoder-generator map. Although studied extensively, the issues of whether they have the same generative power as GANs, or learn disentangled representations, have not been fully addressed. We introduce an autoencoder that tackles these issues jointly, which we call Adversarial Latent Autoencoder (ALAE). It is a general architecture that can leverage recent improvements on GAN training procedures. We designed two autoencoders: one based on an MLP encoder, and another based on a StyleGAN generator, which we call StyleALAE. We verify the disentanglement properties of both architectures. We show that StyleALAE can not only generate 1024x1024 face images with quality comparable to StyleGAN, but at the same resolution can also produce face reconstructions and manipulations based on real images. This makes ALAE the first autoencoder able to compare with, and go beyond the capabilities of, a generator-only type of architecture.

223 citations


Journal ArticleDOI
TL;DR: In this paper, the adversarial training principle is applied to enforce the latent codes to match a prior Gaussian or uniform distribution, which can be used to learn the graph embedding effectively.
Abstract: Graph embedding aims to transform a graph into vectors to facilitate subsequent graph-analytics tasks like link prediction and graph clustering. Most approaches on graph embedding focus on preserving the graph structure or minimizing the reconstruction errors for graph data. They have mostly overlooked the embedding distribution of the latent codes, which unfortunately may lead to inferior representation in many cases. In this article, we present a novel adversarially regularized framework for graph embedding. By employing the graph convolutional network as an encoder, our framework embeds the topological information and node content into a vector representation, from which a graph decoder is further built to reconstruct the input graph. The adversarial training principle is applied to enforce our latent codes to match a prior Gaussian or uniform distribution. Based on this framework, we derive two variants of the adversarial models, the adversarially regularized graph autoencoder (ARGA) and its variational version, the adversarially regularized variational graph autoencoder (ARVGA), to learn the graph embedding effectively. We also exploit other potential variations of ARGA and ARVGA to get a deeper understanding of our designs. Experimental results that compared 12 algorithms for link prediction and 20 algorithms for graph clustering validate our solutions.
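
A minimal sketch of the adversarial regularization described above: an encoder produces node embeddings, an inner-product decoder reconstructs the adjacency matrix, and a discriminator pushes the embeddings toward a Gaussian prior. The single linear encoder stands in for the paper's GCN encoder.

```python
# Minimal sketch of adversarially regularized graph embedding. A single linear
# encoder stands in for the GCN encoder; sizes and loss weights are assumptions.
import torch
import torch.nn as nn

n, d, k = 10, 16, 4
A = (torch.rand(n, n) > 0.7).float()
A = ((A + A.T) > 0).float()                      # toy symmetric adjacency
X = torch.rand(n, d)

enc = nn.Linear(d, k)
disc = nn.Sequential(nn.Linear(k, 16), nn.ReLU(), nn.Linear(16, 1))
bce = nn.BCEWithLogitsLoss()
opt_e = torch.optim.Adam(enc.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

for _ in range(200):
    Z = enc(X)
    # Discriminator step: "real" = prior samples, "fake" = current embeddings.
    opt_d.zero_grad()
    d_loss = (bce(disc(torch.randn(n, k)), torch.ones(n, 1))
              + bce(disc(Z.detach()), torch.zeros(n, 1)))
    d_loss.backward()
    opt_d.step()
    # Encoder step: reconstruct the graph and fool the discriminator.
    opt_e.zero_grad()
    g_loss = bce(Z @ Z.T, A) + 0.1 * bce(disc(Z), torch.ones(n, 1))
    g_loss.backward()
    opt_e.step()
print(float(g_loss))
```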

Journal ArticleDOI
TL;DR: A potentially powerful new method of searching for new physics at the LHC, using autoencoders and unsupervised deep learning, which opens up the exciting possibility of training directly on actual data to discover new physics with no prior expectations or theory prejudice.
Abstract: We introduce a potentially powerful new method of searching for new physics at the LHC, using autoencoders and unsupervised deep learning. The key idea of the autoencoder is that it learns to map "normal" events back to themselves, but fails to reconstruct "anomalous" events that it has never encountered before. The reconstruction error can then be used as an anomaly threshold. We demonstrate the effectiveness of this idea using QCD jets as background and boosted top jets and R-parity violating (RPV) gluino jets as signal. We show that a deep autoencoder can significantly improve signal over background when trained on backgrounds only, or even directly on data which contain a small admixture of signal. Finally, we examine the correlation of the autoencoders with jet mass and show how the jet mass distribution can be stable against cuts in reconstruction loss. This may be important for estimating QCD backgrounds from data. As a test case, we show how one could plausibly discover 400 GeV RPV gluinos using an autoencoder combined with a bump hunt in jet mass. This opens up the exciting possibility of training directly on actual data to discover new physics with no prior expectations or theory prejudice.
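
A minimal sketch of the core recipe: train an autoencoder on background-like events only and flag events whose reconstruction error exceeds a threshold chosen from the background distribution. The 40-dimensional event features and the 99th-percentile cut are illustrative, not the paper's setup.

```python
# Minimal sketch: autoencoder trained on background-like events; events with a
# reconstruction error above a background-derived threshold are tagged as
# anomalous. Feature dimension and percentile cut are illustrative.
import torch
import torch.nn as nn

background = torch.randn(2000, 40)                 # stand-in for QCD-jet features
ae = nn.Sequential(nn.Linear(40, 8), nn.ReLU(), nn.Linear(8, 40))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(300):
    opt.zero_grad()
    loss = ((ae(background) - background) ** 2).mean()
    loss.backward()
    opt.step()

with torch.no_grad():
    bg_err = ((ae(background) - background) ** 2).mean(dim=1)
    cut = torch.quantile(bg_err, 0.99)             # anomaly threshold from background
    candidate = torch.randn(100, 40) * 3.0         # off-distribution, "signal-like" events
    cand_err = ((ae(candidate) - candidate) ** 2).mean(dim=1)
print(int((cand_err > cut).sum()), "of 100 events tagged as anomalous")
```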

Proceedings Article
30 Apr 2020
TL;DR: It is shown, in a rigorous empirical study, that the proposed regularized deterministic autoencoders are able to generate samples that are comparable to, or better than, those of VAEs and more powerful alternatives when applied to images as well as to structured data such as molecules.
Abstract: Variational Autoencoders (VAEs) provide a theoretically-backed and popular framework for deep generative models. However, learning a VAE from data poses still unanswered theoretical questions and considerable practical challenges. In this work, we propose an alternative framework for generative modeling that is simpler, easier to train, and deterministic, yet has many of the advantages of the VAE. We observe that sampling a stochastic encoder in a Gaussian VAE can be interpreted as simply injecting noise into the input of a deterministic decoder. We investigate how substituting this kind of stochasticity, with other explicit and implicit regularization schemes, can lead to an equally smooth and meaningful latent space without having to force it to conform to an arbitrarily chosen prior. To retrieve a generative mechanism to sample new data points, we introduce an ex-post density estimation step that can be readily applied to the proposed framework as well as existing VAEs, improving their sample quality. We show, in a rigorous empirical study, that the proposed regularized deterministic autoencoders are able to generate samples that are comparable to, or better than, those of VAEs and more powerful alternatives when applied to images as well as to structured data such as molecules.
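
A minimal sketch of the ex-post density estimation step: train a plain deterministic autoencoder, fit a small Gaussian mixture to its latent codes, and sample from the mixture to generate new data. The toy 2-D data and mixture size are illustrative.

```python
# Minimal sketch of ex-post density estimation for a deterministic autoencoder:
# fit a Gaussian mixture to the latent codes and decode samples from it. Toy
# data, layer sizes and the number of mixture components are illustrative.
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

data = torch.randn(1000, 2) @ torch.tensor([[2.0, 0.3], [0.0, 0.5]])   # toy dataset
enc = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
dec = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = ((dec(enc(data)) - data) ** 2).mean()
    loss.backward()
    opt.step()

with torch.no_grad():
    codes = enc(data).numpy()
gmm = GaussianMixture(n_components=5).fit(codes)        # ex-post density over latents
z_new, _ = gmm.sample(10)
with torch.no_grad():
    samples = dec(torch.tensor(z_new, dtype=torch.float32))
print(samples.shape)                                    # torch.Size([10, 2])
```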

Proceedings ArticleDOI
14 Jun 2020
TL;DR: Experimental results show that the proposed SelfDeblur can achieve notable quantitative gains as well as more visually plausible deblurring results in comparison to state-of-the-art blind deconvolution methods on benchmark datasets and real-world blurry images.
Abstract: Blind deconvolution is a classical yet challenging low-level vision problem with many real-world applications. Traditional maximum a posterior (MAP) based methods rely heavily on fixed and handcrafted priors that certainly are insufficient in characterizing clean images and blur kernels, and usually adopt specially designed alternating minimization to avoid trivial solution. In contrast, existing deep motion deblurring networks learn from massive training images the mapping to clean image or blur kernel, but are limited in handling various complex and large size blur kernels. To connect MAP and deep models, we in this paper present two generative networks for respectively modeling the deep priors of clean image and blur kernel, and propose an unconstrained neural optimization solution to blind deconvolution. In particular, we adopt an asymmetric Autoencoder with skip connections for generating latent clean image, and a fully-connected network (FCN) for generating blur kernel. Moreover, the SoftMax nonlinearity is applied to the output layer of FCN to meet the non-negative and equality constraints. The process of neural optimization can be explained as a kind of ''zero-shot" self-supervised learning of the generative networks, and thus our proposed method is dubbed SelfDeblur. Experimental results show that our SelfDeblur can achieve notable quantitative gains as well as more visually plausible deblurring results in comparison to state-of-the-art blind deconvolution methods on benchmark datasets and real-world blurry images. The source code is publicly available at https://github.com/csdwren/SelfDeblur

Journal ArticleDOI
TL;DR: A novel deep learning network is proposed for quality-relevant feature representation in this article, based on stacked quality-driven autoencoder (SQAE), which is validated on an industrial debutanizer column process.
Abstract: Deep learning is a recently developed feature representation technique for data with complicated structures, which has great potential for soft sensing of industrial processes. However, most deep networks mainly focus on hierarchical feature learning for the raw observed input data. For soft sensor applications, it is important to reduce irrelevant information and extract quality-relevant features from the raw input data for quality prediction. To deal with this problem, a novel deep learning network is proposed for quality-relevant feature representation in this article, which is based on stacked quality-driven autoencoder (SQAE). First, a quality-driven autoencoder (QAE) is designed by exploiting the quality data to guide feature extraction with the constraint that the potential features should largely reconstruct the input layer data and the quality data at the output layer. In this way, quality-relevant features can be captured by QAE. Then, by stacking multiple QAEs to construct the deep SQAE network, SQAE can gradually reduce irrelevant features and learn hierarchical quality-relevant features. Finally, the high-level quality-relevant features can be directly applied for soft sensing of the quality variables. The effectiveness and flexibility of the proposed deep learning model are validated on an industrial debutanizer column process.
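
A minimal sketch of a single quality-driven autoencoder layer, whose hidden code must reconstruct both the input variables and the quality variable; stacking several such layers into the full SQAE is omitted, and sizes and the loss weight are assumptions.

```python
# Minimal sketch of one quality-driven autoencoder layer: the hidden code is
# trained to reconstruct both the input data and the quality data. Stacking
# into SQAE is omitted; sizes, toy data and the loss weight are assumptions.
import torch
import torch.nn as nn

x = torch.rand(512, 20)                 # process measurements
y = x[:, :3].mean(dim=1, keepdim=True)  # toy quality variable derived from x

enc = nn.Sequential(nn.Linear(20, 8), nn.ReLU())
dec_x = nn.Linear(8, 20)                # reconstructs the input layer
dec_y = nn.Linear(8, 1)                 # reconstructs the quality data

opt = torch.optim.Adam([*enc.parameters(), *dec_x.parameters(), *dec_y.parameters()], lr=1e-3)
for _ in range(400):
    opt.zero_grad()
    h = enc(x)
    loss = ((dec_x(h) - x) ** 2).mean() + 1.0 * ((dec_y(h) - y) ** 2).mean()
    loss.backward()
    opt.step()

soft_sensor = nn.Linear(8, 1)           # quality prediction from the learned features
print(soft_sensor(enc(x)).shape)        # torch.Size([512, 1])
```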

Journal ArticleDOI
TL;DR: A novel 3D self-attention convolutional neural network for the LDCT denoising problem and a self-supervised learning scheme to train a domain-specific autoencoder as the perceptual loss function are proposed.
Abstract: Computed tomography (CT) is a widely used screening and diagnostic tool that allows clinicians to obtain a high-resolution, volumetric image of internal structures in a non-invasive manner. Increasingly, efforts have been made to improve the image quality of low-dose CT (LDCT) to reduce the cumulative radiation exposure of patients undergoing routine screening exams. The resurgence of deep learning has yielded a new approach for noise reduction by training a deep multi-layer convolutional neural network (CNN) to map the low-dose to normal-dose CT images. However, CNN-based methods heavily rely on convolutional kernels, which use fixed-size filters to process one local neighborhood within the receptive field at a time. As a result, they are not efficient at retrieving structural information across large regions. In this paper, we propose a novel 3D self-attention convolutional neural network for the LDCT denoising problem. Our 3D self-attention module leverages the 3D volume of CT images to capture a wide range of spatial information both within CT slices and between CT slices. With the help of the 3D self-attention module, CNNs are able to leverage pixels with stronger relationships regardless of their distance and achieve better denoising results. In addition, we propose a self-supervised learning scheme to train a domain-specific autoencoder as the perceptual loss function. We combine these two methods and demonstrate their effectiveness on both CNN-based neural networks and WGAN-based neural networks with comprehensive experiments. Tested on the AAPM-Mayo Clinic Low Dose CT Grand Challenge data set, our experiments demonstrate that the self-attention (SA) module and autoencoder (AE) perceptual loss function can efficiently enhance traditional CNNs and can achieve comparable or better results than the state-of-the-art methods.
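
A minimal sketch of an autoencoder-based perceptual loss: the denoised and reference images are compared in the feature space of a (separately trained) encoder rather than in pixel space. The encoder here is untrained and 2-D for brevity, whereas the paper uses a domain-specific autoencoder on 3-D CT volumes.

```python
# Minimal sketch of an autoencoder-based perceptual loss. The encoder below is
# an untrained 2-D stand-in for a pre-trained, domain-specific AE encoder.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
for p in encoder.parameters():
    p.requires_grad_(False)                              # perceptual loss is fixed

def perceptual_loss(denoised, reference):
    return ((encoder(denoised) - encoder(reference)) ** 2).mean()

denoised = torch.rand(2, 1, 64, 64, requires_grad=True)  # stand-in denoiser output
reference = torch.rand(2, 1, 64, 64)                     # normal-dose reference
loss = perceptual_loss(denoised, reference)
loss.backward()                                          # gradients flow to the denoiser
print(float(loss))
```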

Journal ArticleDOI
TL;DR: A taxonomy of deep learning models in intrusion detection is introduced, and desirable evaluation metrics, namely accuracy, F1-score, and training and inference time, are suggested across all four datasets.

Journal ArticleDOI
TL;DR: A novel recurrent BLS is proposed, with a sparse autoencoder used to extract the features from the input instead of the randomly initialized weights, motivated by the idea of “fine-tuning” in deep learning.
Abstract: The broad learning system (BLS) is an emerging approach for effective and efficient modeling of complex systems. The inputs are transferred and placed in the feature nodes, and then sent into the enhancement nodes for nonlinear transformation. The structure of a BLS can be extended in a wide sense. Incremental learning algorithms are designed for fast learning in broad expansion. Based on the typical BLSs, a novel recurrent BLS (RBLS) is proposed in this paper. The nodes in the enhancement units of the BLS are recurrently connected, for the purpose of capturing the dynamic characteristics of a time series. A sparse autoencoder is used to extract the features from the input instead of the randomly initialized weights. In this way, the RBLS retains the merit of fast computing and fits for processing sequential data. Motivated by the idea of “fine-tuning” in deep learning, the weights in the RBLS can be updated by conjugate gradient methods if the prediction errors are large. We exhibit the merits of our proposed model on several chaotic time series. Experimental results substantiate the effectiveness of the RBLS. For chaotic benchmark datasets, the RBLS achieves very small errors, and for the real-world dataset, the performance is satisfactory.

Journal ArticleDOI
14 Apr 2020
TL;DR: To the best of the knowledge, this is the first practical JSCC scheme that can fully exploit channel output feedback, demonstrating yet another setting in which modern machine learning techniques can enable the design of new and efficient communication methods that surpass the performance of traditional structured coding-based designs.
Abstract: We consider wireless transmission of images in the presence of channel output feedback. From a Shannon theoretic perspective feedback does not improve the asymptotic end-to-end performance, and separate source coding followed by capacity-achieving channel coding, which ignores the feedback signal, achieves the optimal performance. It is well known that separation is not optimal in the practical finite blocklength regime; however, there are no known practical joint source-channel coding (JSCC) schemes that can exploit the feedback signal and surpass the performance of separation-based schemes. Inspired by the recent success of deep learning methods for JSCC, we investigate how noiseless or noisy channel output feedback can be incorporated into the transmission system to improve the reconstruction quality at the receiver. We introduce an autoencoder-based JSCC scheme, which we call DeepJSCC- $f$ , that exploits the channel output feedback, and provides considerable improvements in terms of the end-to-end reconstruction quality for fixed-length transmission, or in terms of the average delay for variable-length transmission. To the best of our knowledge, this is the first practical JSCC scheme that can fully exploit channel output feedback, demonstrating yet another setting in which modern machine learning techniques can enable the design of new and efficient communication methods that surpass the performance of traditional structured coding-based designs.
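
A hedged sketch of autoencoder-based joint source-channel coding over an AWGN channel; the feedback path that defines DeepJSCC-f is omitted, and the image size, code length, and noise level are illustrative.

```python
# Hedged sketch of autoencoder-based joint source-channel coding: encoder maps
# an image to channel symbols, Gaussian noise models the channel, decoder
# reconstructs. The feedback mechanism of DeepJSCC-f is omitted; sizes assumed.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64))    # 64 channel symbols
dec = nn.Sequential(nn.Linear(64, 28 * 28), nn.Sigmoid())

imgs = torch.rand(32, 1, 28, 28)                             # toy image batch
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)
for _ in range(300):
    opt.zero_grad()
    z = enc(imgs)
    z = z / z.norm(dim=1, keepdim=True) * (64 ** 0.5)        # average-power constraint
    y = z + 0.1 * torch.randn_like(z)                        # AWGN channel
    recon = dec(y).view_as(imgs)
    loss = ((recon - imgs) ** 2).mean()
    loss.backward()
    opt.step()
print(float(loss))
```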

Journal ArticleDOI
TL;DR: The proposed algorithm, based on a deep-denoising autoencoder (DDAE), attenuates random noise in an effective manner and is compared with several benchmark algorithms.
Abstract: Attenuation of seismic random noise is considered an important processing step to enhance the signal-to-noise ratio of seismic data. A new approach is proposed to attenuate random noise based on a deep-denoising autoencoder (DDAE).
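
A minimal sketch of denoising-autoencoder training for random-noise attenuation: the network takes noisy traces as input and clean traces as the target. The synthetic traces and small 1-D architecture are illustrative, not the paper's DDAE configuration.

```python
# Hedged sketch of denoising-autoencoder training: noisy traces in, clean
# traces as the target. Synthetic data and 1-D architecture are illustrative.
import math
import torch
import torch.nn as nn

t = torch.linspace(0, 1, 256)
clean = torch.stack([torch.sin(2 * math.pi * f * t) for f in torch.linspace(5, 15, 64)])
noisy = clean + 0.3 * torch.randn_like(clean)           # traces with random noise

ddae = nn.Sequential(
    nn.Conv1d(1, 16, 9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 16, 9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 1, 9, padding=4),
)
opt = torch.optim.Adam(ddae.parameters(), lr=1e-3)
for _ in range(300):
    opt.zero_grad()
    loss = ((ddae(noisy.unsqueeze(1)) - clean.unsqueeze(1)) ** 2).mean()
    loss.backward()
    opt.step()

denoised = ddae(noisy.unsqueeze(1)).squeeze(1)          # apply to noisy traces
print(float(((denoised - clean) ** 2).mean()))
```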

Book ChapterDOI
23 Aug 2020
TL;DR: Convolutional Adversarial Variational autoencoder with Guided Attention (CAVGA) is proposed, which localizes the anomaly with a convolutional latent variable to preserve the spatial information.
Abstract: Anomaly localization is an important problem in computer vision which involves localizing anomalous regions within images with applications in industrial inspection, surveillance, and medical imaging. This task is challenging due to the small sample size and pixel coverage of the anomaly in real-world scenarios. Most prior works need to use anomalous training images to compute a class-specific threshold to localize anomalies. Without the need of anomalous training images, we propose Convolutional Adversarial Variational autoencoder with Guided Attention (CAVGA), which localizes the anomaly with a convolutional latent variable to preserve the spatial information. In the unsupervised setting, we propose an attention expansion loss where we encourage CAVGA to focus on all normal regions in the image. Furthermore, in the weakly-supervised setting we propose a complementary guided attention loss, where we encourage the attention map to focus on all normal regions while minimizing the attention map corresponding to anomalous regions in the image. CAVGA outperforms the state-of-the-art (SOTA) anomaly localization methods on MVTec Anomaly Detection (MVTAD), modified ShanghaiTech Campus (mSTC) and Large-scale Attention based Glaucoma (LAG) datasets in the unsupervised setting and when using only 2% anomalous images in the weakly-supervised setting. CAVGA also outperforms SOTA anomaly detection methods on the MNIST, CIFAR-10, Fashion-MNIST, MVTAD, mSTC and LAG datasets.

Posted Content
TL;DR: The Swapping Autoencoder is proposed, a deep model designed specifically for image manipulation, rather than random sampling, that can be used to manipulate real input images in various ways, including texture swapping, local and global editing, and latent code vector arithmetic.
Abstract: Deep generative models have become increasingly effective at producing realistic images from randomly sampled seeds, but using such models for controllable manipulation of existing images remains challenging. We propose the Swapping Autoencoder, a deep model designed specifically for image manipulation, rather than random sampling. The key idea is to encode an image with two independent components and enforce that any swapped combination maps to a realistic image. In particular, we encourage the components to represent structure and texture, by enforcing one component to encode co-occurrent patch statistics across different parts of an image. As our method is trained with an encoder, finding the latent codes for a new input image becomes trivial, rather than cumbersome. As a result, it can be used to manipulate real input images in various ways, including texture swapping, local and global editing, and latent code vector arithmetic. Experiments on multiple datasets show that our model produces better results and is substantially more efficient compared to recent generative models.

Journal ArticleDOI
TL;DR: The integrated model of the convolutional neural network (CNN) and recurrent autoencoder is proposed for anomaly detection and empirical results show that the proposed model has better performances on multiple classification metrics and achieves preferable effect on anomaly detection.
Abstract: Internet of Things (IoT) realizes the interconnection of heterogeneous devices by the technology of wireless and mobile communication. The data of target regions are collected by widely distributed sensing devices and transmitted to the processing center for aggregation and analysis as the basis of IoT. The quality of IoT services usually depends on the accuracy and integrity of data. However, due to the adverse environment or device defects, the collected data will be anomalous. Therefore, an effective method of anomaly detection is the crucial issue for guaranteeing service quality. Deep learning is one of the most prominent technologies of recent years, realizing automatic feature extraction from raw data. In this article, the integrated model of the convolutional neural network (CNN) and recurrent autoencoder is proposed for anomaly detection. A simple combination of CNN and autoencoder cannot improve classification performance, especially for time series. Therefore, we utilize the two-stage sliding window in data preprocessing to learn better representations. Based on the characteristics of the Yahoo Webscope S5 dataset, raw time series with anomalous points are extended to fixed-length sequences with normal or anomaly label via the first-stage sliding window. Then, each sequence is transformed into continuous time-dependent subsequences by another smaller sliding window. The preprocessing of the two-stage sliding window can be considered as low-level temporal feature extraction, and we empirically prove that the preprocessing of the two-stage sliding window will be useful for high-level feature extraction in the integrated model. After data preprocessing, spatial and temporal features are extracted in CNN and recurrent autoencoder for the classification in fully connected networks. Empirical results show that the proposed model has better performances on multiple classification metrics and achieves preferable effect on anomaly detection.
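
A minimal sketch of the two-stage sliding-window preprocessing described above: a first window cuts the raw series into fixed-length labeled sequences, and a second, smaller window splits each sequence into overlapping time-dependent subsequences. Window lengths and the toy anomaly positions are assumptions.

```python
# Hedged sketch of two-stage sliding-window preprocessing; window lengths,
# the synthetic series and anomaly positions are illustrative.
import numpy as np

series = np.random.rand(1000)                      # raw univariate time series
anomaly_idx = {120, 485, 870}                      # toy anomalous time steps

def first_stage(x, length=50, step=50):
    seqs, labels = [], []
    for start in range(0, len(x) - length + 1, step):
        seqs.append(x[start:start + length])
        labels.append(int(any(start <= i < start + length for i in anomaly_idx)))
    return np.stack(seqs), np.array(labels)

def second_stage(seqs, sub_len=10):
    # (n, length) -> (n, length - sub_len + 1, sub_len) overlapping subsequences
    return np.stack([[s[i:i + sub_len] for i in range(len(s) - sub_len + 1)] for s in seqs])

seqs, labels = first_stage(series)
subs = second_stage(seqs)
print(seqs.shape, labels.shape, subs.shape)        # (20, 50) (20,) (20, 41, 10)
```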

Journal ArticleDOI
TL;DR: A cervical cancer cell detection and classification system based on convolutional neural networks (CNNs) and an extreme learning machine (ELM)-based classifier that achieved 99.5% accuracy in the detection problem and 91.2% in the classification problem.

Journal ArticleDOI
TL;DR: This paper is the first SLR specifically on the deep learning based RS to summarize and analyze the existing studies based on the best quality research publications and indicated that autoencoder models are the most widely exploited deep learning architectures for RS followed by the Convolutional Neural Networks and the Recurrent Neural Networks.
Abstract: These days, many recommender systems (RS) are utilized for solving the information overload problem in areas such as e-commerce, entertainment, and social media. Although classical methods of RS have achieved remarkable successes in providing item recommendations, they still suffer from many issues such as cold start and data sparsity. With the recent achievements of deep learning in various applications such as Natural Language Processing (NLP) and image processing, more efforts have been made by the researchers to exploit deep learning methods for improving the performance of RS. However, despite the several research works on deep learning based RS, very few secondary studies were conducted in the field. Therefore, this study aims to provide a systematic literature review (SLR) of deep learning based RSs that can guide researchers and practitioners to better understand the new trends and challenges in the field. This paper is the first SLR specifically on the deep learning based RS to summarize and analyze the existing studies based on the best quality research publications. The paper particularly adopts an SLR approach based on the standard guidelines of the SLR designed by Kitchenham, which uses a selection method and provides detailed analysis of the research publications. Several publications were gathered and, after the inclusion/exclusion criteria and the quality assessment, the selected papers were finally used for the review. The results of the review indicated that autoencoder (AE) models are the most widely exploited deep learning architectures for RS, followed by the Convolutional Neural Networks (CNNs) and the Recurrent Neural Networks (RNNs) models. Also, the results showed that MovieLens is the most popularly used dataset for the deep learning-based RS evaluation, followed by the Amazon review datasets. Based on the results, movies and e-commerce are indicated as the most common domains for RS, and precision and Root Mean Squared Error are the most commonly used metrics for evaluating the performance of deep learning based RSs.

Journal ArticleDOI
TL;DR: Automatic Chemical Design is a framework for generating novel molecules with optimized properties that can be applied to solve the challenge of designing complex molecules with novel properties.
Abstract: Automatic Chemical Design is a framework for generating novel molecules with optimized properties. The original scheme, featuring Bayesian optimization over the latent space of a variational autoencoder, suffers from the pathology that it tends to produce invalid molecular structures. First, we demonstrate empirically that this pathology arises when the Bayesian optimization scheme queries latent space points far away from the data on which the variational autoencoder has been trained. Secondly, by reformulating the search procedure as a constrained Bayesian optimization problem, we show that the effects of this pathology can be mitigated, yielding marked improvements in the validity of the generated molecules. We posit that constrained Bayesian optimization is a good approach for solving this kind of training set mismatch in many generative tasks involving Bayesian optimization over the latent space of a variational autoencoder.

Proceedings ArticleDOI
14 Jun 2020
TL;DR: In this paper, an autoencoder is used to learn 3D deformable object categories from raw single-view images, without external supervision, using the fact that many object categories have, at least in principle, a symmetric structure.
Abstract: We propose a method to learn 3D deformable object categories from raw single-view images, without external supervision. The method is based on an autoencoder that factors each input image into depth, albedo, viewpoint and illumination. In order to disentangle these components without supervision, we use the fact that many object categories have, at least in principle, a symmetric structure. We show that reasoning about illumination allows us to exploit the underlying object symmetry even if the appearance is not symmetric due to shading. Furthermore, we model objects that are probably, but not certainly, symmetric by predicting a symmetry probability map, learned end-to-end with the other components of the model. Our experiments show that this method can recover very accurately the 3D shape of human faces, cat faces and cars from single-view images, without any supervision or a prior shape model. On benchmarks, we demonstrate superior accuracy compared to another method that uses supervision at the level of 2D image correspondences.

Posted Content
Bingyi Cao1, Andre Araujo1, Jack Sim1
TL;DR: This work unify global and local features into a single deep model, enabling accurate retrieval with efficient feature extraction, and introduces an autoencoder-based dimensionality reduction technique for local features, which is integrated into the model, improving training efficiency and matching performance.
Abstract: Image retrieval is the problem of searching an image database for items that are similar to a query image. To address this task, two main types of image representations have been studied: global and local image features. In this work, our key contribution is to unify global and local features into a single deep model, enabling accurate retrieval with efficient feature extraction. We refer to the new model as DELG, standing for DEep Local and Global features. We leverage lessons from recent feature learning work and propose a model that combines generalized mean pooling for global features and attentive selection for local features. The entire network can be learned end-to-end by carefully balancing the gradient flow between two heads -- requiring only image-level labels. We also introduce an autoencoder-based dimensionality reduction technique for local features, which is integrated into the model, improving training efficiency and matching performance. Comprehensive experiments show that our model achieves state-of-the-art image retrieval on the Revisited Oxford and Paris datasets, and state-of-the-art single-model instance-level recognition on the Google Landmarks dataset v2. Code and models are available at this https URL .
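
A minimal sketch of autoencoder-based dimensionality reduction for local descriptors: a small bottleneck autoencoder maps high-dimensional local features to compact codes for matching. The 1024-to-128 sizes are illustrative; in DELG this module is trained jointly with the full retrieval model.

```python
# Minimal sketch of autoencoder-based dimensionality reduction for local
# descriptors; the 1024 -> 128 sizes and standalone training are assumptions.
import torch
import torch.nn as nn

local_feats = torch.randn(5000, 1024)            # stand-in local descriptors
enc = nn.Linear(1024, 128)
dec = nn.Linear(128, 1024)
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = ((dec(enc(local_feats)) - local_feats) ** 2).mean()
    loss.backward()
    opt.step()

with torch.no_grad():
    compact = nn.functional.normalize(enc(local_feats), dim=1)   # 128-D codes for matching
print(compact.shape)                                             # torch.Size([5000, 128])
```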

Journal ArticleDOI
TL;DR: This work uses deep reinforcement learning to learn controllers that achieve goal-directed movements in data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs.
Abstract: A fundamental problem in computer animation is that of realizing purposeful and realistic human movement given a sufficiently-rich set of motion capture clips. We learn data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs. The latent variables of the learned autoencoder define the action space for the movement and thereby govern its evolution over time. Planning or control algorithms can then use this action space to generate desired motions. In particular, we use deep reinforcement learning to learn controllers that achieve goal-directed movements. We demonstrate the effectiveness of the approach on multiple tasks. We further evaluate system-design choices and describe the current limitations of Motion VAEs.