
Showing papers on "Gaussian process published in 2017"


BookDOI
27 Jan 2017
TL;DR: In this book, the effects of spatial autocorrelation on statistical inference are investigated, with coverage ranging from random fields and point patterns to semivariogram estimation and kriging, including the use of non-Euclidean distances in geostatistics.
Abstract (book table of contents):
INTRODUCTION: The Need for Spatial Analysis; Types of Spatial Data; Autocorrelation: Concept and Elementary Measures; Autocorrelation Functions; The Effects of Autocorrelation on Statistical Inference; Chapter Problems.
SOME THEORY ON RANDOM FIELDS: Stochastic Processes and Samples of Size One; Stationarity, Isotropy, and Heterogeneity; Spatial Continuity and Differentiability; Random Fields in the Spatial Domain; Random Fields in the Frequency Domain; Chapter Problems.
MAPPED POINT PATTERNS: Random, Aggregated, and Regular Patterns; Binomial and Poisson Processes; Testing for Complete Spatial Randomness; Second-Order Properties of Point Patterns; The Inhomogeneous Poisson Process; Marked and Multivariate Point Patterns; Point Process Models; Chapter Problems.
SEMIVARIOGRAM AND COVARIANCE FUNCTION ANALYSIS AND ESTIMATION: Introduction; Semivariogram and Covariogram; Covariance and Semivariogram Models; Estimating the Semivariogram; Parametric Modeling; Nonparametric Estimation and Modeling; Estimation and Inference in the Frequency Domain; On the Use of Non-Euclidean Distances in Geostatistics; Supplement: Bessel Functions; Chapter Problems.
SPATIAL PREDICTION AND KRIGING: Optimal Prediction in Random Fields; Linear Prediction: Simple and Ordinary Kriging; Linear Prediction with a Spatially Varying Mean; Kriging in Practice; Estimating Covariance Parameters; Nonlinear Prediction; Change of Support; On the Popularity of the Multivariate Gaussian Distribution; Chapter Problems.
SPATIAL REGRESSION MODELS: Linear Models with Uncorrelated Errors; Linear Models with Correlated Errors; Generalized Linear Models; Bayesian Hierarchical Models; Chapter Problems.
SIMULATION OF RANDOM FIELDS: Unconditional Simulation of Gaussian Random Fields; Conditional Simulation of Gaussian Random Fields; Simulated Annealing; Simulating from Convolutions; Simulating Point Processes; Chapter Problems.
NON-STATIONARY COVARIANCE: Types of Non-Stationarity; Global Modeling Approaches; Local Stationarity.
SPATIO-TEMPORAL PROCESSES: A New Dimension; Separable Covariance Functions; Non-Separable Covariance Functions; The Spatio-Temporal Semivariogram; Spatio-Temporal Point Processes.

1,022 citations
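
The kriging machinery covered in this book reduces, in its simplest ordinary-kriging form, to solving one small linear system. Below is a minimal illustrative sketch in Python/NumPy (not from the book; the exponential covariance model and its parameters are assumed for the example): the prediction weights are constrained to sum to one via a Lagrange multiplier.

```python
# Minimal ordinary kriging sketch (illustrative; covariance model and parameters assumed).
import numpy as np

rng = np.random.default_rng(0)
s = rng.uniform(0, 10, size=(25, 2))             # observation locations
z = np.sin(s[:, 0]) + 0.1 * rng.normal(size=25)  # observed field values

def cov(h, sigma2=1.0, rho=2.0):
    """Exponential covariance C(h) = sigma^2 exp(-h / rho)."""
    return sigma2 * np.exp(-h / rho)

def ordinary_kriging(s, z, s0):
    n = len(z)
    d = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=-1)
    # Kriging system: [[C, 1], [1', 0]] [w; lam] = [c0; 1]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = cov(d)
    A[:n, n] = A[n, :n] = 1.0
    c0 = cov(np.linalg.norm(s - s0, axis=1))
    b = np.append(c0, 1.0)
    sol = np.linalg.solve(A, b)
    w, lam = sol[:n], sol[n]
    pred = w @ z
    var = cov(0.0) - w @ c0 - lam   # ordinary kriging variance
    return pred, var

pred, var = ordinary_kriging(s, z, np.array([5.0, 5.0]))
print(f"prediction {pred:.3f}, kriging variance {var:.3f}")
```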


Journal ArticleDOI
TL;DR: In this paper, a scalable Gaussian process method is presented in which the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise; it is applied to probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters.
Abstract: The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large data sets. Gaussian processes (GPs) are a popular class of models used for this purpose, but since the computational cost scales, in general, as the cube of the number of data points, their application has been limited to small data sets. In this paper, we present a novel method for GP modeling in one dimension where the computational requirements scale linearly with the size of the data set. We demonstrate the method by applying it to simulated and real astronomical time series data sets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically driven damped harmonic oscillators (providing a physical motivation for and interpretation of this choice), but we also demonstrate that it can be a useful effective model in some other cases. We present a mathematical description of the method and compare it to existing scalable GP methods. The method is fast and interpretable, with a range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.

611 citations
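
The covariance family at the heart of the method is easy to write down. The sketch below (NumPy, with a naive O(N^3) likelihood for clarity; the paper's contribution is evaluating this same likelihood in O(N)) builds one damped-oscillator term of the form k(tau) = exp(-c|tau|)[a cos(d tau) + b sin(d|tau|)] on irregularly spaced times. The parameter values here are placeholders chosen to keep the kernel positive definite.

```python
# Covariance from a mixture of complex exponentials (one celerite-style term).
# Naive O(N^3) likelihood shown for clarity; the paper evaluates the same
# quantity in O(N). Parameters are assumed, not from the paper.
import numpy as np

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 20, 200))    # irregularly spaced observation times
yerr = 0.1 * np.ones_like(t)

def k(tau, a=1.0, b=0.1, c=0.5, d=2.0):
    """Damped, oscillating covariance term (valid here since a*c >= b*d)."""
    tau = np.abs(tau)
    return np.exp(-c * tau) * (a * np.cos(d * tau) + b * np.sin(d * tau))

K = k(t[:, None] - t[None, :]) + np.diag(yerr**2)
y = rng.multivariate_normal(np.zeros_like(t), K)   # simulate data from the model

def log_likelihood(y, K):
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * (y @ alpha) - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

print("log likelihood:", log_likelihood(y, K))
```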


Journal ArticleDOI
TL;DR: Gaussian process priors are modified according to the particular form of the linear operators involved and are employed to infer parameters of the linear equations from scarce and possibly noisy observations, leading to model discovery from just a handful of noisy measurements.

437 citations
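
The mechanism is that a linear operator applied to a GP yields another GP with a transformed covariance: if u ~ GP(0, k), then Lu is a GP whose covariances are obtained by applying L to k. A minimal NumPy sketch for the simplest operator, L = d/dx with an RBF kernel (illustrative, not the paper's code): noisy observations of the derivative f = u' are used to infer the latent u.

```python
# GP inference under a linear operator: observe f = du/dx, infer u. Illustrative sketch.
import numpy as np

ell = 0.5  # RBF length scale (assumed)

def k_uu(x, y):            # k(x,y) = exp(-(x-y)^2 / (2 ell^2))
    r = x[:, None] - y[None, :]
    return np.exp(-r**2 / (2 * ell**2))

def k_uf(x, y):            # Cov(u(x), u'(y)) = d/dy k(x,y) = (x-y)/ell^2 * k
    r = x[:, None] - y[None, :]
    return (r / ell**2) * np.exp(-r**2 / (2 * ell**2))

def k_ff(x, y):            # Cov(u'(x), u'(y)) = (1/ell^2 - r^2/ell^4) * k
    r = x[:, None] - y[None, :]
    return (1 / ell**2 - r**2 / ell**4) * np.exp(-r**2 / (2 * ell**2))

# Noisy observations of the derivative f(x) = cos(x), so the latent u(x) = sin(x).
rng = np.random.default_rng(2)
xf = np.linspace(0, 6, 30)
f = np.cos(xf) + 0.05 * rng.normal(size=xf.size)

# Posterior mean of u at test points, conditioning only on derivative observations.
xs = np.linspace(0, 6, 100)
Kff = k_ff(xf, xf) + 0.05**2 * np.eye(xf.size)
mean_u = k_uf(xs, xf) @ np.linalg.solve(Kff, f)   # approximately recovers sin(x)
print(mean_u[:5])
```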


Journal Article
TL;DR: GPflow, as discussed by the authors, is a Gaussian process library that uses TensorFlow for its core computations and Python for its front end; its distinguishing feature is the use of variational inference as the primary approximation method.
Abstract: GPflow is a Gaussian process library that uses TensorFlow for its core computations and Python for its front end. The distinguishing features of GPflow are that it uses variational inference as the primary approximation method, provides concise code through the use of automatic differentiation, has been engineered with a particular emphasis on software testing, and is able to exploit GPU hardware.

381 citations
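
A minimal usage sketch, assuming the GPflow 2.x interface (the API has changed since the 2017 paper): exact GP regression with a squared-exponential kernel, hyperparameters optimized by L-BFGS via SciPy.

```python
# Minimal GPflow regression sketch (assumes GPflow 2.x; the 2017 release's API differs).
import numpy as np
import gpflow

rng = np.random.default_rng(3)
X = rng.uniform(0, 10, (50, 1))
Y = np.sin(X) + 0.1 * rng.normal(size=X.shape)

model = gpflow.models.GPR((X, Y), kernel=gpflow.kernels.SquaredExponential())
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)

Xnew = np.linspace(0, 10, 100)[:, None]
mean, var = model.predict_f(Xnew)   # posterior mean and variance (TensorFlow tensors)
print(mean.numpy()[:3].ravel(), var.numpy()[:3].ravel())
```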


Proceedings Article
01 May 2017
TL;DR: In this paper, a doubly stochastic variational inference algorithm for DGPs is proposed, which does not force independence between layers and can be used for both classification and regression.
Abstract: Deep Gaussian processes (DGPs) are multi-layer generalizations of GPs, but inference in these models has proved challenging. Existing approaches to inference in DGP models assume approximate posteriors that force independence between the layers, and do not work well in practice. We present a doubly stochastic variational inference algorithm, which does not force independence between layers. With our method of inference we demonstrate that a DGP model can be used effectively on data ranging in size from hundreds to a billion points. We provide strong empirical evidence that our inference scheme for DGPs works well in practice in both classification and regression.

274 citations
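
The "doubly stochastic" idea is simple to sketch: correlations between layers are kept by propagating actual samples, with each layer's output drawn from its variational Gaussian conditioned on the sampled input via the reparameterization trick. The toy NumPy sketch below uses made-up per-layer mean/variance functions in place of the real sparse-GP posteriors; it shows only the sampling scheme, not the inference.

```python
# Sketch of doubly stochastic sampling through DGP layers. The per-layer
# mean/variance functions are toy stand-ins for the sparse variational GP
# posteriors used in the paper; only the sampling scheme is shown.
import numpy as np

rng = np.random.default_rng(4)

def layer_posterior(f_in, layer_id):
    """Hypothetical predictive mean/variance given the *sampled* layer input."""
    mu = np.tanh(f_in + 0.1 * layer_id)     # placeholder mean function
    var = 0.05 + 0.05 * np.cos(f_in) ** 2   # placeholder variance function
    return mu, var

def sample_dgp(x, n_layers=3):
    f = x
    for layer_id in range(n_layers):
        mu, var = layer_posterior(f, layer_id)
        eps = rng.normal(size=f.shape)       # reparameterization trick:
        f = mu + np.sqrt(var) * eps          # layers stay correlated through f
    return f

x = np.linspace(-2, 2, 5)
samples = np.stack([sample_dgp(x) for _ in range(1000)])
print("predictive mean:", samples.mean(0).round(3))
```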


Proceedings Article
17 Jul 2017
TL;DR: In this article, the authors considered the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown.
Abstract: We consider the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown. We provide two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB (IGP-UCB) and GP-Thompson sampling (GP-TS), and derive corresponding regret bounds. Specifically, the bounds hold when the expected reward function belongs to the reproducing kernel Hilbert space (RKHS) that naturally corresponds to a Gaussian process kernel used as input by the algorithms. Along the way, we derive a new self-normalized concentration inequality for vector-valued martingales of arbitrary, possibly infinite, dimension. Finally, experimental evaluation and comparisons to existing algorithms on synthetic and real-world environments are carried out that highlight the favorable gains of the proposed strategies in many cases.

248 citations
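
Both algorithms follow the same loop: maintain a GP posterior over the reward function and pick the next arm by an index rule, mu + sqrt(beta) * sigma for (I)GP-UCB or a posterior draw for GP-TS. A minimal NumPy sketch of the UCB variant over a discretized arm set (the beta schedule here is a simple placeholder, not the paper's theoretically calibrated one):

```python
# Minimal GP-UCB loop on a discretized arm set. The beta_t schedule is a
# placeholder; the paper derives a calibrated choice for its regret bounds.
import numpy as np

rng = np.random.default_rng(5)
arms = np.linspace(0, 1, 200)
reward = lambda x: np.exp(-40 * (x - 0.3) ** 2) + 0.05 * rng.normal()

kern = lambda a, b: np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * 0.05**2))
noise = 0.05
X, y = [], []

for t in range(1, 31):
    if not X:
        x_next = rng.choice(arms)                 # first pull: explore at random
    else:
        Xa = np.array(X)
        Kinv = np.linalg.inv(kern(Xa, Xa) + noise**2 * np.eye(len(X)))
        ks = kern(arms, Xa)
        mu = ks @ Kinv @ np.array(y)
        var = 1.0 - np.einsum("ij,jk,ik->i", ks, Kinv, ks)
        beta = 2.0 * np.log(t + 1)                # placeholder exploration schedule
        x_next = arms[np.argmax(mu + np.sqrt(beta * np.clip(var, 0, None)))]
    X.append(x_next)
    y.append(reward(x_next))

print("best arm found:", X[int(np.argmax(y))])
```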


Journal ArticleDOI
TL;DR: A novel scheme to accurately predict atomic forces as vector quantities, rather than sets of scalar components, by Gaussian process (GP) regression is presented, based on matrix-valued kernel functions.
Abstract: We present a novel scheme to accurately predict atomic forces as vector quantities, rather than sets of scalar components, by Gaussian process (GP) regression. This is based on matrix-valued kernel functions, on which we impose the requirements that the predicted force rotates with the target configuration and is independent of any rotations applied to the configuration database entries. We show that such covariant GP kernels can be obtained by integration over the elements of the rotation group $\mathit{SO}(d)$ for the relevant dimensionality $d$. Remarkably, in specific cases the integration can be carried out analytically and yields a conservative force field that can be recast into a pair interaction form. Finally, we show that restricting the integration to a summation over the elements of a finite point group relevant to the target system is sufficient to recover an accurate GP. The accuracy of our kernels in predicting quantum-mechanical forces in real materials is investigated by tests on pure and defective Ni, Fe, and Si crystalline systems.

230 citations
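
The construction is concrete enough to sketch: starting from a scalar base kernel k_b, a matrix-valued covariant kernel can be formed as K(x, x') = sum over group elements R of R^T k_b(R x, x'), which guarantees K(Q x, x') = Q K(x, x') for every group rotation Q. A toy NumPy sketch in d = 2 using the finite cyclic group C4 in place of full SO(2) integration (illustrative; the paper's configuration descriptors and base kernels are more elaborate):

```python
# Covariant matrix-valued kernel via summation over a finite point group (C4 in 2D).
# Illustrative only: the paper uses richer configuration descriptors and base kernels.
import numpy as np

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

group = [rot(k * np.pi / 2) for k in range(4)]   # the C4 rotations

def k_base(x, y, ell=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * ell**2))

def K_covariant(x, y):
    """2x2 matrix kernel: K(x, y) = sum_R R^T k_base(R x, y)."""
    return sum(R.T * k_base(R @ x, y) for R in group)

x = np.array([1.0, 0.2])
y = np.array([0.5, -0.3])
R90 = group[1]
# Covariance check: rotating the first argument rotates the prediction,
# K(R x, y) = R K(x, y) for every group element R.
print(np.allclose(K_covariant(R90 @ x, y), R90 @ K_covariant(x, y)))
```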


Journal ArticleDOI
TL;DR: In this paper, a multi-resolution approximation (M-RA) of Gaussian processes observed at irregular locations in space is proposed, which can capture spatial structure from very fine to very large scales.
Abstract: Automated sensing instruments on satellites and aircraft have enabled the collection of massive amounts of high-resolution observations of spatial fields over large spatial regions. If these datasets can be efficiently exploited, they can provide new insights on a wide variety of issues. However, traditional spatial-statistical techniques such as kriging are not computationally feasible for big datasets. We propose a multi-resolution approximation (M-RA) of Gaussian processes observed at irregular locations in space. The M-RA process is specified as a linear combination of basis functions at multiple levels of spatial resolution, which can capture spatial structure from very fine to very large scales. The basis functions are automatically chosen to approximate a given covariance function, which can be nonstationary. All computations involving the M-RA, including parameter inference and prediction, are highly scalable for massive datasets. Crucially, the inference algorithms can also be parallelized.

219 citations


Journal ArticleDOI
TL;DR: In this paper, the design and analysis of simulation experiments are discussed via two types of metamodel (surrogate, emulator): low-order polynomial regression and kriging (Gaussian process) models.

213 citations


Journal ArticleDOI
TL;DR: A new structural reliability method based on the recently developed polynomial-chaos kriging (PC-kriging) approach coupled with an active learning algorithm known as adaptive kriging Monte Carlo simulation (AK-MCS) is developed.
Abstract: Structural reliability analysis aims at computing the probability of failure of systems whose performance may be assessed by using complex computational models (e.g., expensive-to-run finite-element models). A direct use of Monte Carlo simulation is not feasible in practice, unless a surrogate model (such as kriging, also known as Gaussian process modeling) is used. Such metamodels are often used in conjunction with adaptive experimental designs (i.e., design enrichment strategies), which allow one to iteratively increase the accuracy of the surrogate for the estimation of the failure probability while keeping the overall number of runs of the costly original model low. This paper develops a new structural reliability method based on the recently developed polynomial-chaos kriging (PC-kriging) approach coupled with an active learning algorithm known as adaptive kriging Monte Carlo simulation (AK-MCS). The problem is formulated in such a way as to enable the computation of small probabilities of failure.
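
The active-learning loop is easy to sketch with an ordinary kriging surrogate in place of PC-kriging (a simplification, using scikit-learn; the learning function U = |mu|/sigma and the U >= 2 stopping rule follow the standard AK-MCS recipe). Failure probability is estimated by Monte Carlo on the surrogate, and new runs of the "expensive" model are added where the surrogate's failure/safe classification is least certain.

```python
# AK-MCS-style active learning sketch with an ordinary kriging surrogate
# (scikit-learn GP in place of the paper's PC-kriging; toy limit-state function).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(6)
g = lambda X: 3.0 - X.sum(axis=1) / np.sqrt(2)   # toy limit state; failure when g <= 0

X_mc = rng.normal(size=(10000, 2))                # Monte Carlo population
idx = rng.choice(len(X_mc), 12, replace=False)    # small initial design
X_doe, y_doe = X_mc[idx], g(X_mc[idx])

for _ in range(50):
    gp = GaussianProcessRegressor(kernel=RBF(1.0), alpha=1e-8).fit(X_doe, y_doe)
    mu, sd = gp.predict(X_mc, return_std=True)
    U = np.abs(mu) / np.maximum(sd, 1e-12)        # learning function
    if U.min() >= 2.0:                            # standard AK-MCS stopping rule
        break
    best = np.argmin(U)                           # enrich where the sign of g is least sure
    X_doe = np.vstack([X_doe, X_mc[best]])
    y_doe = np.append(y_doe, g(X_mc[best:best + 1]))

pf = (mu <= 0).mean()
print(f"estimated failure probability: {pf:.4f} after {len(y_doe)} model runs")
```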

Journal ArticleDOI
TL;DR: In this paper, Gaussian Process Quantile Regression (GPQR) is proposed to handle the uncertainties in power load data in a principled, statistically formulated manner.

Journal ArticleDOI
TL;DR: In this article, the Gaussian Process Motion Planner (GPMP) is proposed to solve continuous-time motion planning problems as probabilistic inference on a factor graph, where GP representations of trajectories are combined with fast, structure-exploiting inference via numerical optimization.
Abstract: We introduce a novel formulation of motion planning, for continuous-time trajectories, as probabilistic inference. We first show how smooth continuous-time trajectories can be represented by a small number of states using sparse Gaussian process (GP) models. We next develop an efficient gradient-based optimization algorithm that exploits this sparsity and GP interpolation. We call this algorithm the Gaussian Process Motion Planner (GPMP). We then detail how motion planning problems can be formulated as probabilistic inference on a factor graph. This forms the basis for GPMP2, a very efficient algorithm that combines GP representations of trajectories with fast, structure-exploiting inference via numerical optimization. Finally, we extend GPMP2 to an incremental algorithm, iGPMP2, that can efficiently replan when conditions change. We benchmark our algorithms against several sampling-based and trajectory optimization-based motion planning algorithms on planning problems in multiple environments. Our evaluation reveals that GPMP2 is several times faster than previous algorithms while retaining robustness. We also benchmark iGPMP2 on replanning problems, and show that it can find successful solutions in a fraction of the time required by GPMP2 to replan from scratch.

Proceedings Article
10 Apr 2017
TL;DR: It is proved that the method estimates the sources for general smooth mixing nonlinearities, assuming the sources have sufficiently strong temporal dependencies, and these dependencies are in a certain way different from dependencies found in Gaussian processes.
Abstract: We develop a nonlinear generalization of independent component analysis (ICA) or blind source separation, based on temporal dependencies (e.g. autocorrelations). We introduce a nonlinear generative model where the independent sources are assumed to be temporally dependent, non-Gaussian, and stationary, and we observe arbitrarily nonlinear mixtures of them. We develop a method for estimating the model (i.e. separating the sources) based on logistic regression in a neural network which learns to discriminate between a short temporal window of the data vs. a temporal window of temporally permuted data. We prove that the method estimates the sources for general smooth mixing nonlinearities, assuming the sources have sufficiently strong temporal dependencies, and these dependencies are in a certain way different from dependencies found in Gaussian processes. For Gaussian (and similar) sources, the method estimates the nonlinear part of the mixing. We thus provide the first rigorous and general proof of identifiability of nonlinear ICA for temporally dependent sources, together with a practical method for its estimation.
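
The estimation method is operationally simple and can be sketched end to end: build short temporal windows of the observed mixtures, build a second set of fake windows whose points are drawn with the time structure destroyed, and train a nonlinear classifier to tell them apart; the classifier's learned internal features then estimate the sources. The sketch below uses scikit-learn's MLPClassifier as the discriminating network (a stand-in; the paper's architecture and identifiability theory are more specific).

```python
# Sketch of temporal-contrastive discrimination for nonlinear ICA:
# classify real windows vs. time-shuffled windows (MLP as a stand-in network).
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(7)
T, w = 5000, 10

# Temporally dependent non-Gaussian sources, then a smooth nonlinear mixture.
s = np.cumsum(rng.laplace(size=(T, 2)), axis=0)
s /= s.std(axis=0)
x = np.tanh(s @ rng.normal(size=(2, 2))) + 0.1 * s**2   # observed mixtures

windows = np.stack([x[i:i + w].ravel() for i in range(T - w)])
shuffled = np.stack([x[rng.choice(T, w)].ravel() for _ in range(T - w)])

X = np.vstack([windows, shuffled])
y = np.r_[np.ones(len(windows)), np.zeros(len(shuffled))]

clf = MLPClassifier(hidden_layer_sizes=(32, 2), max_iter=50).fit(X, y)
print("discrimination accuracy:", clf.score(X, y).round(3))
# In the paper, the hidden representation of such a discriminator
# (here the penultimate 2-unit layer) estimates the independent sources.
```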

Posted Content
TL;DR: In this article, the authors derive the exact equivalence between infinitely wide deep networks and Gaussian processes (GPs) and develop a computationally efficient pipeline to compute the covariance function for these GPs.
Abstract: It has long been known that a single-layer fully-connected neural network with an i.i.d. prior over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network width. This correspondence enables exact Bayesian inference for infinite width neural networks on regression tasks by means of evaluating the corresponding GP. Recently, kernel functions which mimic multi-layer random neural networks have been developed, but only outside of a Bayesian framework. As such, previous work has not identified that these kernels can be used as covariance functions for GPs and allow fully Bayesian prediction with a deep neural network. In this work, we derive the exact equivalence between infinitely wide deep networks and GPs. We further develop a computationally efficient pipeline to compute the covariance function for these GPs. We then use the resulting GPs to perform Bayesian inference for wide deep neural networks on MNIST and CIFAR-10. We observe that trained neural network accuracy approaches that of the corresponding GP with increasing layer width, and that the GP uncertainty is strongly correlated with trained network prediction error. We further find that test performance increases as finite-width trained networks are made wider and more similar to a GP, and thus that GP predictions typically outperform those of finite-width networks. Finally we connect the performance of these GPs to the recent theory of signal propagation in random neural networks.
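
The covariance computation underlying the deep-network GP is a layer-by-layer recursion, and for ReLU nonlinearities the required expectation has the closed "arc-cosine" form, giving the short sketch below (NumPy; the weight/bias variances and depth are assumed hyperparameters).

```python
# NNGP kernel recursion for an infinitely wide ReLU network (arc-cosine closed form).
# sigma_w2, sigma_b2, and depth are assumed hyperparameters.
import numpy as np

def nngp_kernel(X, depth=3, sigma_w2=1.6, sigma_b2=0.1):
    # Layer 0: kernel of the affine input layer.
    K = sigma_b2 + sigma_w2 * (X @ X.T) / X.shape[1]
    for _ in range(depth):
        diag = np.sqrt(np.diag(K))
        corr = np.clip(K / np.outer(diag, diag), -1.0, 1.0)
        theta = np.arccos(corr)
        # E[relu(u) relu(v)] for jointly Gaussian (u, v) with covariance K:
        E = (np.outer(diag, diag) / (2 * np.pi)) * (
            np.sin(theta) + (np.pi - theta) * np.cos(theta))
        K = sigma_b2 + sigma_w2 * E
    return K

rng = np.random.default_rng(8)
X = rng.normal(size=(5, 10))
print(np.round(nngp_kernel(X), 3))
```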

Journal Article
TL;DR: In this article, the authors combine the variational approach to sparse approximation and the spectral representation of Gaussian processes to obtain an approximation with the representational power and computational scalability of spectral representations.
Abstract: This work brings together two powerful concepts in Gaussian processes: the variational approach to sparse approximation and the spectral representation of Gaussian processes. This gives rise to an approximation that inherits the benefits of the variational approach but with the representational power and computational scalability of spectral representations. The work hinges on a key result that there exist spectral features related to a finite domain of the Gaussian process which exhibit almost-independent covariances. We derive these expressions for Matern kernels in one dimension, and generalize to more dimensions using kernels with specific structures. Under the assumption of additive Gaussian noise, our method requires only a single pass through the data set, making for very fast and accurate computation. We fit a model to 4 million training points in just a few minutes on a standard laptop. With non-conjugate likelihoods, our MCMC scheme reduces the cost of computation from O(NM2) (for a sparse Gaussian process) to O(NM) per iteration, where N is the number of data and M is the number of features.

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Experimental results on large scale region classification and fine-grained recognition tasks show that G2DeNet is superior to its counterparts, capable of achieving state-of-the-art performance.
Abstract: Recently, plugging trainable structural layers into deep convolutional neural networks (CNNs) as image representations has made promising progress. However, there has been little work on inserting parametric probability distributions, which can effectively model feature statistics, into deep CNNs in an end-to-end manner. This paper proposes a Global Gaussian Distribution embedding Network (G2DeNet) to take a step towards addressing this problem. The core of G2DeNet is a novel trainable layer of a global Gaussian as an image representation plugged into deep CNNs for end-to-end learning. The challenge is that the proposed layer involves Gaussian distributions whose space is not a linear space, which makes its forward and backward propagations be non-intuitive and non-trivial. To tackle this issue, we employ a Gaussian embedding strategy which respects the structures of both Riemannian manifold and smooth group of Gaussians. Based on this strategy, we construct the proposed global Gaussian embedding layer and decompose it into two sub-layers: the matrix partition sub-layer decoupling the mean vector and covariance matrix entangled in the embedding matrix, and the square-rooted, symmetric positive definite matrix sub-layer. In this way, we can derive the partial derivatives associated with the proposed structural layer and thus allow backpropagation of gradients. Experimental results on large scale region classification and fine-grained recognition tasks show that G2DeNet is superior to its counterparts, capable of achieving state-of-the-art performance.
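
The two sub-layers can be stated numerically: a Gaussian N(mu, Sigma) is embedded as the symmetric positive definite matrix [[Sigma + mu mu^T, mu], [mu^T, 1]], which is then square-rooted. A forward-pass sketch in NumPy (just the layer's computation on one image's feature matrix; the paper's contribution also includes backpropagation through this structure):

```python
# Forward pass of a global Gaussian embedding layer (sketch): features -> N(mu, Sigma)
# -> SPD embedding [[Sigma + mu mu^T, mu], [mu^T, 1]] -> matrix square root.
import numpy as np

def gaussian_embedding(F):
    """F: (N, d) matrix of local CNN features from one image (assumed input)."""
    mu = F.mean(axis=0)
    Sigma = np.cov(F, rowvar=False) + 1e-6 * np.eye(F.shape[1])  # jitter for stability
    d = mu.size
    E = np.empty((d + 1, d + 1))
    E[:d, :d] = Sigma + np.outer(mu, mu)
    E[:d, d] = E[d, :d] = mu
    E[d, d] = 1.0
    w, V = np.linalg.eigh(E)                   # E is symmetric positive definite
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T   # principal square root

rng = np.random.default_rng(13)
F = rng.normal(size=(196, 8))                  # e.g. a 14x14 spatial grid, 8 channels
print(gaussian_embedding(F).shape)             # (9, 9) image representation
```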

Journal ArticleDOI
TL;DR: In this paper, Bayesian calibration is proposed with a prior distribution on the bias that is orthogonal to the gradient of the computer model, mitigating the issue of a suboptimally broad posterior on the calibration parameter.
Abstract: Bayesian calibration is used to study computer models in the presence of both a calibration parameter and model bias. In the predominant methodology the parameter is left undefined, which results in a posterior for the parameter that is suboptimally broad; no generally accepted alternative has emerged to date. This article proposes a version of Bayesian calibration in which the prior distribution on the bias is orthogonal to the gradient of the computer model. Problems associated with Bayesian calibration are shown to be mitigated through analytic results in addition to examples. Supplementary materials for this article are available online.

Proceedings Article
01 Jan 2017
TL;DR: This work presents a novel algorithm that provides a rigorous mathematical treatment of the uncertainties arising from model discrepancies and noisy observations, and conducts an experimental evaluation that demonstrates that the method consistently outperforms other state-of-the-art techniques.
Abstract: We consider Bayesian methods for multi-information source optimization (MISO), in which we seek to optimize an expensive-to-evaluate black-box objective function while also accessing cheaper but biased and noisy approximations ("information sources"). We present a novel algorithm that outperforms the state of the art for this problem by using a Gaussian process covariance kernel better suited to MISO than those used by previous approaches, and an acquisition function based on a one-step optimality analysis supported by efficient parallelization. We also provide a novel technique to guarantee the asymptotic quality of the solution provided by this algorithm. Experimental evaluations demonstrate that this algorithm consistently finds designs of higher value at less cost than previous approaches.

Proceedings Article
06 Aug 2017
TL;DR: In this paper, the authors introduce a novel formulation of DGPs based on random feature expansions that are trained using stochastic variational inference, which significantly advances the state-of-the-art in inference for DGPs and enables accurate quantification of uncertainty.
Abstract: The composition of multiple Gaussian Processes as a Deep Gaussian Process (DGP) enables a deep probabilistic nonparametric approach to flexibly tackle complex machine learning problems with sound quantification of uncertainty. Existing inference approaches for DGP models have limited scalability and are notoriously cumbersome to construct. In this work we introduce a novel formulation of DGPs based on random feature expansions that we train using stochastic variational inference. This yields a practical learning framework which significantly advances the state-of-the-art in inference for DGPs, and enables accurate quantification of uncertainty. We extensively showcase the scalability and performance of our proposal on several datasets with up to 8 million observations, and various DGP architectures with up to 30 hidden layers.

Proceedings ArticleDOI
01 Jan 2017
TL;DR: It is shown how the marginal likelihood can be used to find an optimal weighting between convolutional and RBF kernels to further improve performance; this illustration of the usefulness of the marginal likelihood may help automate discovering architectures in larger models.
Abstract: We present a practical way of introducing convolutional structure into Gaussian processes, making them more suited to high-dimensional inputs like images. The main contribution of our work is the construction of an inter-domain inducing point approximation that is well-tailored to the convolutional kernel. This allows us to gain the generalisation benefit of a convolutional kernel, together with fast but accurate posterior inference. We investigate several variations of the convolutional kernel, and apply it to MNIST and CIFAR-10, where we obtain significant improvements over existing Gaussian process models. We also show how the marginal likelihood can be used to find an optimal weighting between convolutional and RBF kernels to further improve performance. This illustration of the usefulness of the marginal likelihood may help automate discovering architectures in larger models.
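
The convolutional kernel itself is an RBF base kernel compared across all patch pairs of two images and summed. A direct (slow, quadratic in the number of patches) NumPy sketch of that kernel, to make the construction concrete; the paper's contribution is the inter-domain inducing-patch approximation that makes this practical.

```python
# Convolutional GP kernel, written directly: sum an RBF base kernel over all
# pairs of patches from the two images. Slow O(P^2) form, for illustration only.
import numpy as np

def patches(img, p=3):
    H, W = img.shape
    return np.stack([img[i:i + p, j:j + p].ravel()
                     for i in range(H - p + 1) for j in range(W - p + 1)])

def k_conv(img1, img2, ell=1.0):
    P1, P2 = patches(img1), patches(img2)
    d2 = ((P1[:, None, :] - P2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ell**2)).sum()   # sum of base kernel over patch pairs

rng = np.random.default_rng(9)
a, b = rng.normal(size=(10, 10)), rng.normal(size=(10, 10))
print(k_conv(a, b), k_conv(a, a))
```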

Journal ArticleDOI
TL;DR: This paper develops a new pseudo-point approximation framework using Power Expectation Propagation (Power EP) that unifies a large number of pseudo-point approximations, and demonstrates that the new framework includes new pseudo-point approximation methods that outperform current approaches on regression and classification tasks.
Abstract: Gaussian processes (GPs) are flexible distributions over functions that enable high-level assumptions about unknown functions to be encoded in a parsimonious, flexible and general way. Although elegant, the application of GPs is limited by computational and analytical intractabilities that arise when data are sufficiently numerous or when employing non-Gaussian models. Consequently, a wealth of GP approximation schemes have been developed over the last 15 years to address these key limitations. Many of these schemes employ a small set of pseudo data points to summarise the actual data. In this paper, we develop a new pseudo-point approximation framework using Power Expectation Propagation (Power EP) that unifies a large number of these pseudo-point approximations. Unlike much of the previous venerable work in this area, the new framework is built on standard methods for approximate inference (variational free-energy, EP and Power EP methods) rather than employing approximations to the probabilistic generative model itself. In this way, all of the approximation is performed at 'inference time' rather than at 'modelling time', resolving awkward philosophical and empirical questions that trouble previous approaches. Crucially, we demonstrate that the new framework includes new pseudo-point approximation methods that outperform current approaches on regression and classification tasks.

Journal ArticleDOI
TL;DR: In this article, an ensemble approach based on stacked generalization is presented that allows multiple nonlinear algorithmic mean functions to be jointly embedded within the Gaussian process framework; the results show that the generalized ensemble approach markedly outperforms any individual method.
Abstract: Maps of infectious disease—charting spatial variations in the force of infection, degree of endemicity and the burden on human health—provide an essential evidence base to support planning towards global health targets. Contemporary disease mapping efforts have embraced statistical modelling approaches to properly acknowledge uncertainties in both the available measurements and their spatial interpolation. The most common such approach is Gaussian process regression, a mathematical framework composed of two components: a mean function harnessing the predictive power of multiple independent variables, and a covariance function yielding spatio-temporal shrinkage against residual variation from the mean. Though many techniques have been developed to improve the flexibility and fitting of the covariance function, models for the mean function have typically been restricted to simple linear terms. For infectious diseases, known to be driven by complex interactions between environmental and socio-economic factors, improved modelling of the mean function can greatly boost predictive power. Here, we present an ensemble approach based on stacked generalization that allows for multiple nonlinear algorithmic mean functions to be jointly embedded within the Gaussian process framework. We apply this method to mapping Plasmodium falciparum prevalence data in sub-Saharan Africa and show that the generalized ensemble approach markedly outperforms any individual method.
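
A minimal version of the stacking construction can be assembled from scikit-learn pieces (a simplification of the paper's geostatistical setup): fit several nonlinear learners, take their out-of-fold predictions as covariates, and let a GP regression combine them while modelling residual correlation.

```python
# Stacked-generalization GP sketch: out-of-fold predictions from nonlinear base
# learners become mean-function covariates of a GP (simplified, scikit-learn).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(10)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + 0.1 * rng.normal(size=300)

base = [GradientBoostingRegressor(), RandomForestRegressor(n_estimators=100)]
# Out-of-fold predictions avoid leaking each learner's training fit into the stack.
Z = np.column_stack([cross_val_predict(m, X, y, cv=5) for m in base])

# GP over [inputs, stacked predictions]: the base learners carry the mean signal,
# while the kernel mops up residual (e.g. spatial) correlation.
gp = GaussianProcessRegressor(kernel=RBF([1.0] * 4) + WhiteKernel(0.01))
gp.fit(np.column_stack([X, Z]), y)
print("stacked-GP R^2 on training data:",
      round(gp.score(np.column_stack([X, Z]), y), 3))
```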

Journal ArticleDOI
TL;DR: The variational latent Gaussian process (vLGP) is proposed, a practical and efficient inference method that combines a generative model with a history-dependent point process observation, together with a smoothness prior on the latent trajectories, to reveal hidden neural dynamics from large-scale neural recordings.
Abstract: When governed by underlying low-dimensional dynamics, the interdependence of simultaneously recorded populations of neurons can be explained by a small number of shared factors, or a low-dimensional trajectory. Recovering these latent trajectories, particularly from single-trial population recordings, may help us understand the dynamics that drive neural computation. However, due to the biophysical constraints and noise in the spike trains, inferring trajectories from data is a challenging statistical problem in general. Here, we propose a practical and efficient inference method, the variational latent Gaussian process (vLGP). The vLGP combines a generative model with a history-dependent point process observation, together with a smoothness prior on the latent trajectories. The vLGP improves on earlier methods for recovering latent trajectories, which assume either observation models inappropriate for point processes or linear dynamics. We compare and validate vLGP on both simulated data sets and population recordings.

Journal ArticleDOI
TL;DR: In this paper, it is shown that evaluating the Hessian matrix at the initial and final state minima beforehand and using it as input in the minimum energy path calculation improves stability and reduces the number of iterations needed for convergence.
Abstract: Minimum energy paths for transitions such as atomic and/or spin rearrangements in thermalized systems are the transition paths of largest statistical weight. Such paths are frequently calculated using the nudged elastic band method, where an initial path is iteratively shifted to the nearest minimum energy path. The computational effort can be large, especially when ab initio or electron density functional calculations are used to evaluate the energy and atomic forces. Here, we show how the number of such evaluations can be reduced by an order of magnitude using a Gaussian process regression approach where an approximate energy surface is generated and refined in each iteration. When the goal is to evaluate the transition rate within harmonic transition state theory, the evaluation of the Hessian matrix at the initial and final state minima can be carried out beforehand and used as input in the minimum energy path calculation, thereby improving stability and reducing the number of iterations needed for convergence. A Gaussian process model also provides an uncertainty estimate for the approximate energy surface, and this can be used to focus the calculations on the lesser-known part of the path, thereby reducing the number of needed energy and force evaluations to a half in the present calculations. The methodology is illustrated using the two-dimensional Müller-Brown potential surface and performance assessed on an established benchmark involving 13 rearrangement transitions of a heptamer island on a solid surface.
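
The ingredient that drives the savings is using the GP's uncertainty to decide where the true (expensive) energy must be evaluated next. A toy scikit-learn sketch on the Müller-Brown surface (standard published parameters, assumed here; no band optimization, just surrogate refinement at the most uncertain of a set of candidate points):

```python
# Uncertainty-guided refinement of a GP surrogate on the Muller-Brown surface.
# Toy sketch: evaluate the true potential wherever the GP is most uncertain.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

A = [-200, -100, -170, 15]
a = [-1, -1, -6.5, 0.7]; b = [0, 0, 11, 0.6]; c = [-10, -10, -6.5, 0.7]
x0 = [1, 0, -0.5, -1]; y0 = [0, 0.5, 1.5, 1]

def muller_brown(p):
    x, y = p[..., 0], p[..., 1]
    return sum(A[i] * np.exp(a[i] * (x - x0[i]) ** 2
                             + b[i] * (x - x0[i]) * (y - y0[i])
                             + c[i] * (y - y0[i]) ** 2) for i in range(4))

rng = np.random.default_rng(11)
cand = np.column_stack([rng.uniform(-1.5, 1.2, 400), rng.uniform(-0.2, 2.0, 400)])
X = cand[:5].copy(); y = muller_brown(X)         # a few initial "expensive" evaluations

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=RBF(0.3), alpha=1e-6,
                                  normalize_y=True).fit(X, y)
    _, sd = gp.predict(cand, return_std=True)
    nxt = cand[np.argmax(sd)]                    # refine where the surrogate is least sure
    X = np.vstack([X, nxt]); y = np.append(y, muller_brown(nxt))

print(f"surrogate built from {len(y)} energy evaluations")
```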

Journal ArticleDOI
TL;DR: In this article, the authors study the extremal dependence properties of Gaussian scale mixtures and unify and extend general results on their joint tail decay rates in both asymptotic dependence and independence cases.
Abstract: Gaussian scale mixtures are constructed as Gaussian processes with a random variance. They have non-Gaussian marginals and can exhibit asymptotic dependence unlike Gaussian processes, which are asymptotically independent except in the case of perfect dependence. In this paper, we study the extremal dependence properties of Gaussian scale mixtures and we unify and extend general results on their joint tail decay rates in both asymptotic dependence and independence cases. Motivated by the analysis of spatial extremes, we propose flexible yet parsimonious parametric copula models that smoothly interpolate from asymptotic dependence to independence and include the Gaussian dependence as a special case. We show how these new models can be fitted to high threshold exceedances using a censored likelihood approach, and we demonstrate that they provide valuable information about tail characteristics. In particular, by borrowing strength across locations, our parametric model-based approach can also be used to provide evidence for or against either asymptotic dependence class, hence complementing information given at an exploratory stage by the widely used nonparametric or parametric estimates of the χ and χ̄ coefficients. We demonstrate the capacity of our methodology by adequately capturing the extremal properties of wind speed data collected in the Pacific Northwest, US.
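
The construction and its tail behaviour can be checked by simulation in a few lines: draw a bivariate Gaussian vector Z, multiply by a random scale R, and estimate the tail-dependence coefficient chi(u) = P(F2(X2) > u | F1(X1) > u) empirically. With a heavy-tailed R the mixture is asymptotically dependent, unlike the Gaussian (illustrative NumPy sketch with made-up parameters):

```python
# Gaussian scale mixture X = R * Z: empirical tail dependence chi(u) for a
# heavy-tailed scale R vs. the plain Gaussian (illustrative, made-up parameters).
import numpy as np

rng = np.random.default_rng(12)
n, rho, u = 200_000, 0.7, 0.99
Z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)

def chi_hat(X, u):
    r = np.argsort(np.argsort(X, axis=0), axis=0) / (len(X) + 1)  # ranks -> uniforms
    hit1 = r[:, 0] > u
    return (hit1 & (r[:, 1] > u)).sum() / hit1.sum()

R = 1.0 / np.sqrt(rng.gamma(shape=2.0, scale=0.5, size=n))  # heavy-tailed scale
print("Gaussian:      chi(0.99) ~", round(chi_hat(Z, u), 3))              # -> 0 as u -> 1
print("scale mixture: chi(0.99) ~", round(chi_hat(R[:, None] * Z, u), 3))  # stays positive
```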

Proceedings Article
06 Aug 2017
TL;DR: In this paper, a scalable end-to-end classifier that uses streaming physiological and medication data to accurately predict the onset of sepsis, a life-threatening complication from infections that has high mortality and morbidity, is presented.
Abstract: We present a scalable end-to-end classifier that uses streaming physiological and medication data to accurately predict the onset of sepsis, a life-threatening complication from infections that has high mortality and morbidity. Our proposed framework models the multivariate trajectories of continuous-valued physiological time series using multitask Gaussian processes, seamlessly accounting for the high uncertainty, frequent missingness, and irregular sampling rates typically associated with real clinical data. The Gaussian process is directly connected to a black-box classifier that predicts whether a patient will become septic, chosen in our case to be a recurrent neural network to account for the extreme variability in the length of patient encounters. We show how to scale the computations associated with the Gaussian process in a manner so that the entire system can be discriminatively trained end-to-end using backpropagation. In a large cohort of heterogeneous inpatient encounters at our university health system we find that it outperforms several baselines at predicting sepsis, and yields 19.4% and 55.5% improved areas under the Receiver Operating Characteristic and Precision Recall curves as compared to the NEWS score currently used by our hospital.

Journal ArticleDOI
TL;DR: A dynamic Gaussian process regression surrogate model based on Monte Carlo simulation (DGPR-based MCS) is proposed for the reliability analysis of complex engineering structures, offering higher efficiency and precision than the traditional response surface method (RSM).

Proceedings Article
01 Dec 2017
TL;DR: This work proposes a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data and introduces the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves.
Abstract: A large body of recent work focuses on methods for extracting low-dimensional latent structure from multi-neuron spike train data. Most such methods employ either linear latent dynamics or linear mappings from latent space to log spike rates. Here we propose a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data. We introduce the Poisson Gaussian-Process Latent Variable Model (P-GPLVM), which consists of Poisson spiking observations and two underlying Gaussian processes-one governing a temporal latent variable and another governing a set of nonlinear tuning curves. The use of nonlinear tuning curves enables discovery of low-dimensional latent structure even when spike responses exhibit high linear dimensionality (e.g., as found in hippocampal place cell codes). To learn the model from data, we introduce the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves. We show that this method outperforms previous Laplace-approximation-based inference methods in both the speed of convergence and accuracy. We apply the model to spike trains recorded from hippocampal place cells and show that it compares favorably to a variety of previous methods for latent structure discovery, including variational auto-encoder (VAE) based methods that parametrize the nonlinear mapping from latent space to spike rates with a deep neural network.