Posted Content (Open Access)

Deep Gaussian Processes with Importance-Weighted Variational Inference

TL;DR
This work proposes a novel importance-weighted objective that leverages analytic results and provides a mechanism to trade off computation for improved accuracy. The importance-weighted objective works well in practice and consistently outperforms classical variational inference, especially for deeper models.
Abstract
Deep Gaussian processes (DGPs) can model complex marginal densities as well as complex mappings. Non-Gaussian marginals are essential for modelling real-world data and can be generated from the DGP by incorporating uncorrelated variables into the model. Previous work on DGP models has introduced noise additively and used variational inference with a combination of sparse Gaussian processes and mean-field Gaussians for the approximate posterior. Additive noise attenuates the signal, and the Gaussian form of the variational distribution may lead to an inaccurate posterior. We instead incorporate noisy variables as latent covariates, and propose a novel importance-weighted objective, which leverages analytic results and provides a mechanism to trade off computation for improved accuracy. Our results demonstrate that the importance-weighted objective works well in practice and consistently outperforms classical variational inference, especially for deeper models.
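For context, the objective described in the abstract belongs to the family of importance-weighted variational bounds (in the style of Burda et al.); written generically for a single latent variable z and observation y, rather than in the paper's own notation, the K-sample bound is

\mathcal{L}_K = \mathbb{E}_{z_1, \dots, z_K \sim q(z)} \left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p(y, z_k)}{q(z_k)} \right],

which reduces to the standard evidence lower bound at K = 1 and tightens as K grows, so increasing K trades extra computation for a more accurate bound.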


Citations
Journal Article

When Gaussian Process Meets Big Data: A Review of Scalable GPs

TL;DR: In this article, a review of state-of-the-art scalable Gaussian process regression (GPR) models is presented, focusing on global and local approximations for subspace learning.
Journal Article

Avoiding pathologies in very deep networks

TL;DR: In this paper, the authors study the deep Gaussian process, a type of infinitely wide, deep neural network, and show that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases.
Posted Content

A Framework for Interdomain and Multioutput Gaussian Processes

TL;DR: This work presents a mathematical and software framework for scalable approximate inference in GPs, which combines interdomain approximations and multiple outputs, and provides a unified interface for many existing multioutput models, as well as more recent convolutional structures.
Posted Content

A Tutorial on Sparse Gaussian Processes and Variational Inference

TL;DR: This tutorial provides access to the basic material for readers without prior knowledge of either GPs or VI; pseudo-training examples are treated as optimization arguments of the approximate posterior and are identified jointly with the hyperparameters of the generative model.
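For orientation (standard material, not a quotation from the tutorial), the bound at the heart of sparse variational GP methods is

\mathcal{L} = \sum_{n=1}^{N} \mathbb{E}_{q(f_n)}\left[ \log p(y_n \mid f_n) \right] - \mathrm{KL}\left( q(\mathbf{u}) \,\|\, p(\mathbf{u}) \right),

where \mathbf{u} are the function values at the pseudo-training inputs (inducing points) and q(\mathbf{u}) = \mathcal{N}(\mathbf{m}, \mathbf{S}); the variational parameters, inducing inputs, and kernel hyperparameters are optimized jointly, as the TL;DR above notes.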
Proceedings Article

Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances

TL;DR: A Bayesian approach to learning from sequential data is developed using Gaussian processes (GPs) with so-called signature kernels as covariance functions, which makes sequences of different lengths comparable and draws on strong theoretical results from stochastic analysis.
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won first place in the ILSVRC 2015 classification task.
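As background (standard material, not quoted from the paper), a residual block learns a residual mapping that is added back to an identity shortcut,

\mathbf{y} = \mathcal{F}(\mathbf{x}, \{W_i\}) + \mathbf{x},

so the stacked layers only need to model the deviation from the identity, which is what eases the optimization of very deep networks.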
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
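For reference (the standard form of the update, with the paper's default hyperparameters \beta_1 = 0.9, \beta_2 = 0.999, \epsilon = 10^{-8}), Adam maintains exponential moving averages of the gradient and its elementwise square and applies a bias-corrected step:

m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2,
\hat{m}_t = m_t / (1 - \beta_1^t), \qquad \hat{v}_t = v_t / (1 - \beta_2^t),
\theta_t = \theta_{t-1} - \alpha \, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon).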
Proceedings Article

Auto-Encoding Variational Bayes

TL;DR: This work introduces a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case.
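The key device (standard material, not quoted from the paper) is the reparameterization trick: latent samples are written as z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon with \epsilon \sim \mathcal{N}(0, I), which makes a single-sample Monte Carlo estimate of the evidence lower bound,

\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\left[ \log p_\theta(x \mid z) \right] - \mathrm{KL}\left( q_\phi(z \mid x) \,\|\, p_\theta(z) \right),

differentiable in the variational parameters \phi and therefore amenable to stochastic gradient optimization.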
Proceedings Article

Understanding the difficulty of training deep feedforward neural networks

TL;DR: The objective is to understand why standard gradient descent from random initialization performs so poorly on deep neural networks, to better understand recent relative successes, and to help design better algorithms in the future.
Posted Content

Stochastic Backpropagation and Approximate Inference in Deep Generative Models

TL;DR: In this article, a generative model is paired with a recognition model that represents approximate posterior distributions and acts as a stochastic encoder of the data, allowing joint optimisation of the parameters of both the generative and recognition models.