
Showing papers on "Gaussian process published in 2015"


Posted Content
TL;DR: In this article, the authors explore the use of neural networks as an alternative to GPs to model distributions over functions, and show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically.
Abstract: Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.

524 citations
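A minimal sketch of the adaptive basis function regression idea summarized above: Bayesian linear regression on a fixed set of nonlinear features standing in for a trained network's last hidden layer. This is not the authors' implementation, and the random tanh feature map below is only an illustrative stand-in for the neural network; the point is the cost, which is O(n d^2) for n observations and d basis functions, i.e. linear in n rather than the O(n^3) of an exact GP.

import numpy as np

def bayesian_basis_regression(X, y, phi, alpha=1.0, noise=0.1):
    """Bayesian linear regression on basis functions phi(X).

    Cost is O(n d^2) for n observations and d basis functions,
    i.e. linear in n, unlike the O(n^3) cost of an exact GP.
    """
    Phi = phi(X)                                          # n x d design matrix
    d = Phi.shape[1]
    A = Phi.T @ Phi / noise**2 + alpha * np.eye(d)        # posterior precision of weights
    A_inv = np.linalg.inv(A)
    mean_w = A_inv @ Phi.T @ y / noise**2                 # posterior mean of weights

    def predict(Xs):
        Ps = phi(Xs)
        mu = Ps @ mean_w                                  # predictive mean
        var = np.sum(Ps @ A_inv * Ps, axis=1) + noise**2  # predictive variance
        return mu, var

    return predict

# Illustrative stand-in for a trained network's last hidden layer.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(1, 50)), rng.normal(size=50)
phi = lambda X: np.tanh(X @ W + b)

X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
predict = bayesian_basis_regression(X, y, phi)
mu, var = predict(np.array([[0.5]]))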


Proceedings Article
06 Jul 2015
TL;DR: This work shows that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically, which allows for a previously intractable degree of parallelism.
Abstract: Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.

503 citations


Proceedings Article
21 Feb 2015
TL;DR: This work shows how to scale Gaussian process classification within a variational inducing point framework, outperforming the state of the art on benchmark datasets; the variational formulation can be exploited to allow classification in problems with millions of data points.
Abstract: Gaussian process classification is a popular method with a number of appealing properties. We show how to scale the model within a variational inducing point framework, outperforming the state of the art on benchmark datasets. Importantly, the variational formulation can be exploited to allow classification in problems with millions of data points, as we demonstrate in experiments.

489 citations
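The scalability in this family of methods comes from summarizing the kernel matrix through a small set of inducing points. The sketch below shows only that mechanism, a Nyström-style low-rank approximation of a squared-exponential kernel in plain numpy; it is not the paper's variational bound or its treatment of the non-Gaussian classification likelihood, and all sizes and names are illustrative.

import numpy as np

def rbf(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(2000, 1))      # n training inputs
Z = np.linspace(0, 10, 20)[:, None]         # m << n inducing inputs

Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))     # m x m inducing covariance
Kuf = rbf(Z, X)                             # m x n cross-covariance
# Low-rank approximation K_ff ~ Kfu Kuu^{-1} Kuf: downstream algebra works
# with these m x n and m x m blocks, giving O(n m^2) cost instead of the
# O(n^3) cost of factorizing the full n x n kernel matrix.
L = np.linalg.cholesky(Kuu)
V = np.linalg.solve(L, Kuf)                 # m x n
Kff_approx_diag = (V ** 2).sum(axis=0)      # diagonal of the approximate kernel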


Journal ArticleDOI
TL;DR: The Gaussian approximation potentials (GAP) framework is described, along with a variety of descriptors, how to train the model on total energies and derivatives, and the simultaneous use of multiple models of different complexity.
Abstract: We present a swift walk-through of our recent work that uses machine learning to fit interatomic potentials based on quantum mechanical data. We describe our Gaussian approximation potentials (GAP) framework, discuss a variety of descriptors, how to train the model on total energies and derivatives, and the simultaneous use of multiple models of different complexity. We also show a small example using QUIP, the software sandbox implementation of GAP that is available for noncommercial use. © 2015 Wiley Periodicals, Inc.

470 citations


Posted Content
TL;DR: A new structured kernel interpolation (SKI) framework is introduced, which generalises and unifies inducing point methods for scalable Gaussian processes (GPs) and naturally enables Kronecker and Toeplitz algebra for substantial additional gains in scalability.
Abstract: We introduce a new structured kernel interpolation (SKI) framework, which generalises and unifies inducing point methods for scalable Gaussian processes (GPs). SKI methods produce kernel approximations for fast computations through kernel interpolation. The SKI framework clarifies how the quality of an inducing point approach depends on the number of inducing (aka interpolation) points, interpolation strategy, and GP covariance kernel. SKI also provides a mechanism to create new scalable kernel methods, through choosing different kernel interpolation strategies. Using SKI, with local cubic kernel interpolation, we introduce KISS-GP, which is 1) more scalable than inducing point alternatives, 2) naturally enables Kronecker and Toeplitz algebra for substantial additional gains in scalability, without requiring any grid data, and 3) can be used for fast and expressive kernel learning. KISS-GP costs O(n) time and storage for GP inference. We evaluate KISS-GP for kernel matrix approximation, kernel learning, and natural sound modelling.

358 citations
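A toy one-dimensional illustration of the structured kernel interpolation idea (not the authors' KISS-GP code): the kernel between arbitrary inputs is approximated by interpolating a kernel evaluated on a regular grid, K ~ W K_UU W^T, where W holds sparse interpolation weights. For simplicity the sketch uses linear rather than local cubic interpolation, and dense matrices where an efficient implementation would exploit sparsity and Toeplitz structure.

import numpy as np

def rbf(A, B, ls=1.0):
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ls**2)

def interp_weights(x, grid):
    """Sparse linear interpolation weights of points x onto a regular 1-D grid."""
    W = np.zeros((len(x), len(grid)))
    h = grid[1] - grid[0]
    idx = np.clip(((x - grid[0]) / h).astype(int), 0, len(grid) - 2)
    frac = (x - grid[idx]) / h
    W[np.arange(len(x)), idx] = 1.0 - frac
    W[np.arange(len(x)), idx + 1] = frac
    return W

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 500))
grid = np.linspace(0, 10, 100)              # inducing (interpolation) points on a grid

K_uu = rbf(grid, grid)                      # grid kernel; Toeplitz for regular grids
W = interp_weights(x, grid)                 # sparse n x m interpolation weights
K_ski = W @ K_uu @ W.T                      # SKI approximation of the n x n kernel
err = np.abs(K_ski - rbf(x, x)).max()       # approximation error on this toy problem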


Journal ArticleDOI
TL;DR: A family of local sequential design schemes is derived that dynamically defines the support of a Gaussian process predictor based on a local subset of the data, enabling a global predictor able to take advantage of modern multicore architectures.
Abstract: We provide a new approach to approximate emulation of large computer experiments. By focusing expressly on desirable properties of the predictive equations, we derive a family of local sequential design schemes that dynamically define the support of a Gaussian process predictor based on a local subset of the data. We further derive expressions for fast sequential updating of all needed quantities as the local designs are built up iteratively. Then we show how independent application of our local design strategy across the elements of a vast predictive grid facilitates a trivially parallel implementation. The end result is a global predictor able to take advantage of modern multicore architectures, providing a nonstationary modeling feature as a bonus. We demonstrate our method on two examples using designs with thousands of data points, and compare to the method of compactly supported covariances. Supplementary materials for this article are available online.

358 citations
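The local design idea can be caricatured with a plain nearest-neighbor variant, sketched below: for each prediction location, fit an independent GP to only its closest training points and predict from it. The paper's sequential design criterion for choosing each local subset and its fast updating formulas are not reproduced here; this only shows the "local subset per prediction point" skeleton, which is trivially parallel across prediction locations.

import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def local_gp_predict(Xs, X, y, n_local=50, noise=1e-4):
    """Predict at each row of Xs with a GP fit only to its n_local nearest neighbors."""
    preds = np.empty(len(Xs))
    for i, xs in enumerate(Xs):
        # Local subset of the data (plain nearest neighbors here; the paper
        # builds the subset sequentially via a design criterion).
        idx = np.argsort(((X - xs) ** 2).sum(axis=1))[:n_local]
        Xl, yl = X[idx], y[idx]
        K = rbf(Xl, Xl) + noise * np.eye(n_local)
        k = rbf(xs[None, :], Xl)[0]
        preds[i] = k @ np.linalg.solve(K, yl)   # GP predictive mean on the local subset
    return preds

rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, size=(5000, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])
Xs = rng.uniform(-2, 2, size=(10, 2))
mu = local_gp_predict(Xs, X, y)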


Journal ArticleDOI
TL;DR: In this article, a measure of stability for stationary processes based on their spectral properties is introduced, which provides insight into the effect of dependence on the accuracy of the regularized estimates.
Abstract: Many scientific and economic problems involve the analysis of high-dimensional time series datasets. However, theoretical studies in high-dimensional statistics to date rely primarily on the assumption of independent and identically distributed (i.i.d.) samples. In this work, we focus on stable Gaussian processes and investigate the theoretical properties of $\ell_{1}$-regularized estimates in two important statistical problems in the context of high-dimensional time series: (a) stochastic regression with serially correlated errors and (b) transition matrix estimation in vector autoregressive (VAR) models. We derive nonasymptotic upper bounds on the estimation errors of the regularized estimates and establish that consistent estimation under high-dimensional scaling is possible via $\ell_{1}$-regularization for a large class of stable processes under sparsity constraints. A key technical contribution of the work is to introduce a measure of stability for stationary processes using their spectral properties that provides insight into the effect of dependence on the accuracy of the regularized estimates. With this proposed stability measure, we establish some useful deviation bounds for dependent data, which can be used to study several important regularized estimates in a time series setting.

346 citations


Journal ArticleDOI
TL;DR: A multiresolution model is developed to predict two-dimensional spatial fields based on irregularly spaced observations; it gives a good approximation to standard covariance functions such as the Matérn and also has the flexibility to fit more complicated shapes.
Abstract: We develop a multiresolution model to predict two-dimensional spatial fields based on irregularly spaced observations. The radial basis functions at each level of resolution are constructed using a Wendland compactly supported correlation function with the nodes arranged on a rectangular grid. The grid at each finer level increases by a factor of two and the basis functions are scaled to have a constant overlap. The coefficients associated with the basis functions at each level of resolution are distributed according to a Gaussian Markov random field (GMRF) and take advantage of the fact that the basis is organized as a lattice. Several numerical examples and analytical results establish that this scheme gives a good approximation to standard covariance functions such as the Matérn and also has flexibility to fit more complicated shapes. The other important feature of this model is that it can be applied to statistical inference for large spatial datasets because key matrices in the computations are sparse.

331 citations


Proceedings ArticleDOI
07 Jun 2015
TL;DR: This work uses the Fisher Vector as a sentence representation by pooling the word2vec embedding of each word in the sentence, and achieves state-of-the-art image annotation and image search results using Fisher Vectors derived from a Hybrid Gaussian-Laplacian Mixture Model (HGLMM).
Abstract: In recent years, the problem of associating a sentence with an image has gained a lot of attention. This work continues to push the envelope and makes further progress in the performance of image annotation and image search by a sentence tasks. In this work, we are using the Fisher Vector as a sentence representation by pooling the word2vec embedding of each word in the sentence. The Fisher Vector is typically taken as the gradients of the log-likelihood of descriptors, with respect to the parameters of a Gaussian Mixture Model (GMM). In this work we present two other Mixture Models and derive their Expectation-Maximization and Fisher Vector expressions. The first is a Laplacian Mixture Model (LMM), which is based on the Laplacian distribution. The second Mixture Model presented is a Hybrid Gaussian-Laplacian Mixture Model (HGLMM) which is based on a weighted geometric mean of the Gaussian and Laplacian distribution. Finally, by using the new Fisher Vectors derived from HGLMMs to represent sentences, we achieve state-of-the-art results for both the image annotation and the image search by a sentence tasks on four benchmarks: Pascal1K, Flickr8K, Flickr30K, and COCO.

326 citations
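For context on the pooling step described above, a minimal sketch of a Fisher Vector with respect to a diagonal GMM is given below, using scikit-learn's GaussianMixture and keeping only the gradient with respect to the means, without the power and L2 normalizations typically applied. The random vectors stand in for word2vec embeddings, and this is the generic GMM Fisher Vector rather than the paper's LMM or HGLMM variants.

import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector_means(descriptors, gmm):
    """Fisher Vector of a descriptor set w.r.t. the means of a diagonal-covariance GMM."""
    gamma = gmm.predict_proba(descriptors)                  # n x K responsibilities
    n = len(descriptors)
    diff = (descriptors[:, None, :] - gmm.means_[None]) / np.sqrt(gmm.covariances_)[None]
    fv = (gamma[:, :, None] * diff).sum(axis=0)             # K x d gradient w.r.t. means
    fv /= n * np.sqrt(gmm.weights_)[:, None]
    return fv.ravel()                                       # pooled representation

rng = np.random.default_rng(3)
word_vectors = rng.normal(size=(10000, 50))                 # stand-in for word2vec embeddings
gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0).fit(word_vectors)

sentence = rng.normal(size=(12, 50))                        # embeddings of one sentence's words
fv = fisher_vector_means(sentence, gmm)                     # length K * d sentence vector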


Posted Content
TL;DR: In this paper, the authors introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods, and jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process.
Abstract: We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation. These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability. We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process. Inference and learning cost $O(n)$ for $n$ training points, and predictions cost $O(1)$ per test point. On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures.

288 citations


Posted Content
TL;DR: The robust Bayesian Committee Machine (rBCM) is introduced, a practical and scalable product-of-experts model for large-scale distributed GP regression that can be used on heterogeneous computing infrastructures, ranging from laptops to clusters.
Abstract: To scale Gaussian processes (GPs) to large data sets we introduce the robust Bayesian Committee Machine (rBCM), a practical and scalable product-of-experts model for large-scale distributed GP regression. Unlike state-of-the-art sparse GP approximations, the rBCM is conceptually simple and does not rely on inducing or variational parameters. The key idea is to recursively distribute computations to independent computational units and, subsequently, recombine them to form an overall result. Efficient closed-form inference allows for straightforward parallelisation and distributed computations with a small memory footprint. The rBCM is independent of the computational graph and can be used on heterogeneous computing infrastructures, ranging from laptops to clusters. With sufficient computing resources our distributed GP model can handle arbitrarily large data sets.
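The recombination step of this kind of product-of-experts model can be sketched directly from the abstract: each expert returns a Gaussian predictive mean and variance, and the experts are fused by precision weighting, with per-expert weights based on how much each expert's variance has shrunk relative to the prior. A minimal numpy version of that fusion rule (the experts themselves, e.g. GPs on random data subsets, are assumed to have been fit elsewhere) is:

import numpy as np

def rbcm_combine(means, variances, prior_var):
    """Precision-weighted fusion of per-expert GP predictions (rBCM-style).

    means, variances: arrays of shape (n_experts, n_test).
    prior_var: prior predictive variance of the GP.
    """
    beta = 0.5 * (np.log(prior_var) - np.log(variances))    # per-expert weights
    precision = (beta / variances).sum(axis=0) + (1.0 - beta.sum(axis=0)) / prior_var
    var = 1.0 / precision
    mean = var * (beta * means / variances).sum(axis=0)
    return mean, var

# Toy usage: three experts disagree; the confident ones dominate the fused prediction.
means = np.array([[1.0], [1.2], [0.2]])
variances = np.array([[0.05], [0.10], [0.90]])
mu, var = rbcm_combine(means, variances, prior_var=1.0)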

Journal ArticleDOI
TL;DR: This paper proposes a new model-based method for representing and searching nondominated solutions that alleviates the requirement on solution diversity; in principle, as many solutions as needed can be generated.
Abstract: To approximate the Pareto front, most existing multiobjective evolutionary algorithms store the nondominated solutions found so far in the population or in an external archive during the search. Such algorithms often require a high degree of diversity of the stored solutions and only a limited number of solutions can be achieved. By contrast, model-based algorithms can alleviate the requirement on solution diversity and in principle, as many solutions as needed can be generated. This paper proposes a new model-based method for representing and searching nondominated solutions. The main idea is to construct Gaussian process-based inverse models that map all found nondominated solutions from the objective space to the decision space. These inverse models are then used to create offspring by sampling the objective space. To facilitate inverse modeling, the multivariate inverse function is decomposed into a group of univariate functions, where the number of inverse models is reduced using a random grouping technique. Extensive empirical simulations demonstrate that the proposed algorithm exhibits robust search performance on a variety of medium to high dimensional multiobjective optimization test problems. Additional nondominated solutions are generated a posteriori using the constructed models to increase the density of solutions in the preferred regions at a low computational cost.

Journal ArticleDOI
TL;DR: A novel way to represent and make predictions about diffusion MRI data is described, based on a Gaussian process on one or several spheres similar to the Geostatistical method of “Kriging”.

Posted Content
TL;DR: A simple heuristic based on an estimate of the Lipschitz constant is investigated that captures the most important aspect of this interaction at negligible computational overhead and compares well, in running time, with much more elaborate alternatives.
Abstract: The popularity of Bayesian optimization methods for efficient exploration of parameter spaces has led to a series of papers applying Gaussian processes as surrogates in the optimization of functions. However, most proposed approaches only allow the exploration of the parameter space to occur sequentially. Often, it is desirable to simultaneously propose batches of parameter values to explore. This is particularly the case when large parallel processing facilities are available. These facilities could be computational or physical facets of the process being optimized. For example, in biological experiments many experimental setups allow several samples to be simultaneously processed. Batch methods, however, require modeling of the interaction between the evaluations in the batch, which can be expensive in complex scenarios. We investigate a simple heuristic based on an estimate of the Lipschitz constant that captures the most important aspect of this interaction (i.e. local repulsion) at negligible computational overhead. The resulting algorithm compares well, in running time, with much more elaborate alternatives. The approach assumes that the function of interest, $f$, is a Lipschitz continuous function. A wrap-loop around the acquisition function is used to collect batches of points of a certain size, minimizing the non-parallelizable computational effort. The speed-up of our method with respect to previous approaches is significant in a set of computationally expensive experiments.
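A stripped-down sketch of the local-repulsion heuristic, not the paper's exact penalizer (which also accounts for the posterior variance): after choosing a batch point, damp the acquisition function inside a ball around it whose radius follows from the estimated Lipschitz constant, then re-maximize for the next batch member. All names and the toy acquisition below are illustrative assumptions.

import numpy as np

def select_batch(acq, candidates, mu, best_y, lipschitz, batch_size):
    """Greedy batch selection with a simplified Lipschitz-based local penalty.

    acq:    acquisition values on a finite candidate set
    mu:     GP posterior mean at the candidates
    best_y: current best (largest) observed value
    """
    penalized = acq.astype(float).copy()
    batch = []
    for _ in range(batch_size):
        j = int(np.argmax(penalized))
        batch.append(candidates[j])
        # Under a Lipschitz bound L, the maximizer cannot lie closer than
        # (best_y - mu_j) / L to the chosen point; damp the acquisition there.
        radius = max((best_y - mu[j]) / lipschitz, 1e-6)
        dist = np.linalg.norm(candidates - candidates[j], axis=1)
        penalized = penalized * np.clip(dist / radius, 0.0, 1.0)
    return np.array(batch)

# Toy usage with made-up posterior quantities on a 1-D candidate grid.
cands = np.linspace(0.0, 1.0, 200)[:, None]
mu = np.sin(6.0 * cands[:, 0])
acq = mu + 1.96 * 0.1                        # e.g. an upper-confidence-bound surrogate
batch = select_batch(acq, cands, mu, best_y=mu.max(), lipschitz=6.0, batch_size=3)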

Journal ArticleDOI
TL;DR: A discriminative shared Gaussian process latent variable model (DS-GPLVM) for multiview and view-invariant classification of facial expressions from multiple views is proposed and validated.
Abstract: Images of facial expressions are often captured from various views as a result of either head movements or variable camera position. Existing methods for multiview and/or view-invariant facial expression recognition typically perform classification of the observed expression using either classifiers learned separately for each view or a single classifier learned for all views. However, these approaches ignore the fact that different views of a facial expression are just different manifestations of the same facial expression. By accounting for this redundancy, we can design more effective classifiers for the target task. To this end, we propose a discriminative shared Gaussian process latent variable model (DS-GPLVM) for multiview and view-invariant classification of facial expressions from multiple views. In this model, we first learn a discriminative manifold shared by multiple views of a facial expression. Subsequently, we perform facial expression classification in the expression manifold. Finally, classification of an observed facial expression is carried out either in the view-invariant manner (using only a single view of the expression) or in the multiview manner (using multiple views of the expression). The proposed model can also be used to perform fusion of different facial features in a principled manner. We validate the proposed DS-GPLVM on both posed and spontaneously displayed facial expressions from three publicly available datasets (MultiPIE, labeled face parts in the wild, and static facial expressions in the wild). We show that this model outperforms the state-of-the-art methods for multiview and view-invariant facial expression classification, and several state-of-the-art methods for multiview learning and feature fusion.

Journal ArticleDOI
TL;DR: This article reviews the design and analysis of simulation experiments and focuses on analysis via two types of metamodel, namely low-order polynomial regression and Kriging (Gaussian process) metamodels.
Abstract: This article reviews the design and analysis of simulation experiments. It focusses on analysis via either low-order polynomial regression or Kriging (also known as Gaussian process) metamodels. The type of metamodel determines the design of the experiment, which determines the input combinations of the simulation experiment. For example, a first-order polynomial metamodel requires a "resolution-III" design, whereas Kriging may use Latin hypercube sampling. Polynomials of first or second order require resolution III, IV, V, or "central composite" designs. Before applying either regression or Kriging, sequential bifurcation may be applied to screen a great many inputs. Optimization of the simulated system may use either a sequence of low-order polynomials known as response surface methodology (RSM) or Kriging models fitted through sequential designs including efficient global optimization (EGO). The review includes robust optimization, which accounts for uncertain simulation inputs.
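For instance, the "Latin hypercube design plus Kriging metamodel" workflow mentioned in the review can be sketched with standard libraries: scipy's quasi-Monte Carlo module for the design and scikit-learn's GP regressor as the Kriging metamodel. The simulator below is a stand-in analytic function, and the bounds and kernel choice are illustrative.

import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def simulator(x):
    """Stand-in for an expensive simulation model."""
    return np.sin(3 * x[:, 0]) + x[:, 1] ** 2

# Space-filling Latin hypercube design for the simulation inputs.
sampler = qmc.LatinHypercube(d=2, seed=0)
X = qmc.scale(sampler.random(n=40), l_bounds=[0, 0], u_bounds=[1, 1])
y = simulator(X)

# Kriging (Gaussian process) metamodel fitted to the design points.
metamodel = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
y_hat, y_std = metamodel.predict(np.array([[0.3, 0.7]]), return_std=True)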

Journal ArticleDOI
TL;DR: Computer simulation results show that the proposed system reliability analysis method can accurately give the system failure probability with a relatively small number of deterministic slope stability analyses.

Journal ArticleDOI
TL;DR: This paper proposes using Gaussian processes to track an extended object or group of objects that generates multiple measurements at each scan, yielding a model that describes both the shape and the kinematics of the object.
Abstract: In this paper, we propose using Gaussian processes to track an extended object or group of objects, that generates multiple measurements at each scan. The shape and the kinematics of the object are ...

Journal ArticleDOI
TL;DR: A multi-resolution approximation (M-RA) of Gaussian processes observed at irregular locations in space is proposed, which can capture spatial structure from very fine to very large scales.
Abstract: Automated sensing instruments on satellites and aircraft have enabled the collection of massive amounts of high-resolution observations of spatial fields over large spatial regions. If these datasets can be efficiently exploited, they can provide new insights on a wide variety of issues. However, traditional spatial-statistical techniques such as kriging are not computationally feasible for big datasets. We propose a multi-resolution approximation (M-RA) of Gaussian processes observed at irregular locations in space. The M-RA process is specified as a linear combination of basis functions at multiple levels of spatial resolution, which can capture spatial structure from very fine to very large scales. The basis functions are automatically chosen to approximate a given covariance function, which can be nonstationary. All computations involving the M-RA, including parameter inference and prediction, are highly scalable for massive datasets. Crucially, the inference algorithms can also be parallelized to take full advantage of large distributed-memory computing environments. In comparisons using simulated data and a large satellite dataset, the M-RA outperforms a related state-of-the-art method.

Proceedings ArticleDOI
15 Jul 2015
TL;DR: This work considers a stabilization task, linearizes the nonlinear GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller that provides robust stability and performance guarantees during learning.
Abstract: This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not considered online adaptation of the model and its uncertainty before. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.

Journal ArticleDOI
TL;DR: This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions.
Abstract: Most current model reference adaptive control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element are fixed a priori, often through expert judgment. An example of such an adaptive element is radial basis function networks (RBFNs), with RBF centers preallocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become noneffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semiglobal in nature. This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. The GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning.

Journal ArticleDOI
TL;DR: This work investigates MTGPs for physiological monitoring with synthetic data sets and two real-world problems from the field of patient monitoring and radiotherapy, and shows that the framework learned the correlation between physiological time series efficiently, outperforming the existing state of the art.
Abstract: Gaussian process (GP) models are a flexible means of performing nonparametric Bayesian regression. However, GP models in healthcare are often only used to model a single univariate output time series, denoted as single-task GPs (STGP). Due to an increasing prevalence of sensors in healthcare settings, there is an urgent need for robust multivariate time-series tools. Here, we propose a method using multitask GPs (MTGPs) which can model multiple correlated multivariate physiological time series simultaneously. The flexible MTGP framework can learn the correlation between multiple signals even though they might be sampled at different frequencies and have training sets available for different intervals. Furthermore, prior knowledge of any relationship between the time series such as delays and temporal behavior can be easily integrated. A novel normalization is proposed to allow interpretation of the various hyperparameters used in the MTGP. We investigate MTGPs for physiological monitoring with synthetic data sets and two real-world problems from the field of patient monitoring and radiotherapy. The results are compared with standard Gaussian processes and other existing methods in the respective biomedical application areas. In both cases, we show that our framework learned the correlation between physiological time series efficiently, outperforming the existing state of the art.
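One common way to build the kind of multitask covariance this abstract describes is the intrinsic coregionalization form: a task covariance matrix combined with a shared temporal kernel via a Kronecker product. The numpy sketch below draws correlated signals from such a prior; it is not the paper's exact model, normalization, or treatment of different sampling frequencies.

import numpy as np

def rbf(t1, t2, ls=1.0):
    return np.exp(-0.5 * (t1[:, None] - t2[None, :]) ** 2 / ls**2)

rng = np.random.default_rng(4)
t = np.linspace(0, 10, 50)                    # shared time grid
K_time = rbf(t, t)                            # temporal covariance
A = rng.normal(size=(3, 2))
K_tasks = A @ A.T + 0.1 * np.eye(3)           # inter-signal (task) covariance
K = np.kron(K_tasks, K_time)                  # joint covariance over tasks and times

# Joint samples: three correlated physiological-style signals from the MTGP prior.
L = np.linalg.cholesky(K + 1e-8 * np.eye(K.shape[0]))
samples = (L @ rng.normal(size=K.shape[0])).reshape(3, len(t))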

Journal ArticleDOI
TL;DR: It is demonstrated that the Gaussian process machine learner is able to discover a ramp that produces high quality BECs in 10 times fewer iterations than a previously used online optimization technique.
Abstract: Machine-designed control of complex devices or experiments can discover strategies superior to those developed via simplified models. We describe an online optimization algorithm based on Gaussian processes and apply it to optimization of the production of Bose-Einstein condensates (BEC). BEC is typically created with an exponential evaporation ramp that is approximately optimal for s-wave, ergodic dynamics with two-body interactions and no other loss rates, but likely sub-optimal for many real experiments. Machine learning using a Gaussian process, in contrast, develops a statistical model of the relationship between the parameters it controls and the quality of the BEC produced. This is an online process, and an active one, as the Gaussian process model updates on the basis of each subsequent experiment and proposes a new set of parameters as a result. We demonstrate that the Gaussian process machine learner is able to discover a ramp that produces high quality BECs in 10 times fewer iterations than a previously used online optimization technique. Furthermore, we show the internal model developed can be used to determine which parameters are essential in BEC creation and which are unimportant, providing insight into the optimization process.
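The online loop described here (fit a GP to all evaluations so far, propose the next setting by maximizing an acquisition function, run the experiment, repeat) can be sketched in a few lines of numpy. The toy objective stands in for the BEC experiment, and the upper-confidence-bound acquisition is an illustrative choice rather than the criterion used by the machine learner in the paper.

import numpy as np

def rbf(A, B, ls=0.3):
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ls**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a unit-amplitude RBF GP."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k = rbf(x_query, x_train)
    mu = k @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(k * np.linalg.solve(K, k.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

objective = lambda x: -(x - 0.7) ** 2        # stand-in for BEC quality vs. ramp parameter
grid = np.linspace(0, 1, 200)
rng = np.random.default_rng(5)
x_obs = rng.uniform(0, 1, 3)                 # a few initial random experiments
y_obs = objective(x_obs)

for _ in range(15):                          # online optimization loop
    mu, sd = gp_posterior(x_obs, y_obs, grid)
    x_next = grid[np.argmax(mu + 2.0 * sd)]  # upper-confidence-bound acquisition
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))  # "run the experiment"

best_setting = x_obs[np.argmax(y_obs)]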

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Compared with previous CRT-based face alignment methods that have shown state-of-the-art performance, cGPRT with shape-indexed DoG features performs best on the HELEN and 300-W datasets, which are among the most challenging datasets today.
Abstract: In this paper, we propose a face alignment method that uses cascade Gaussian process regression trees (cGPRT) constructed by combining Gaussian process regression trees (GPRT) in a cascade stage-wise manner. Here, GPRT is a Gaussian process with a kernel defined by a set of trees. The kernel measures the similarity between two inputs as the number of trees where the two inputs fall in the same leaves. Without increasing prediction time, the prediction of cGPRT can be performed in the same framework as the cascade regression trees (CRT) but with better generalization. Features for GPRT are designed using shape-indexed difference of Gaussian (DoG) filter responses sampled from local retinal patterns to increase stability and to attain robustness against geometric variances. Compared with the previous CRT-based face alignment methods that have shown state-of-the-art performance, cGPRT using shape-indexed DoG features performed best on the HELEN and 300-W datasets, which are among the most challenging datasets today.

Journal ArticleDOI
TL;DR: In this article, a new simulation method based on a first order approximation and series expansions is proposed to improve the accuracy and efficiency of the Rice/FORM method; the approximation maps the general stochastic process of the response into a Gaussian process, whose samples are then generated by the Expansion Optimal Linear Estimation if the response is stationary or by the Orthogonal Series Expansion if the response is non-stationary.
Abstract: Time-variant reliability is often evaluated by Rice's formula combined with the First Order Reliability Method (FORM). To improve the accuracy and efficiency of the Rice/FORM method, this work develops a new simulation method with the first order approximation and series expansions. The approximation maps the general stochastic process of the response into a Gaussian process, whose samples are then generated by the Expansion Optimal Linear Estimation if the response is stationary or by the Orthogonal Series Expansion if the response is non-stationary. As the computational cost largely comes from estimating the covariance of the response at expansion points, a cheaper surrogate model of the covariance is built and allows for significant reduction in computational cost. In addition to its superior accuracy and efficiency over the Rice/FORM method, the proposed method can also produce the failure rate and probability of failure with respect to time for a given period of time within only one reliability analysis.

Journal ArticleDOI
TL;DR: In this article, an approach based on peaks over thresholds is introduced that provides several new estimators for processes η in the max-domain of attraction of the frequently used Hüsler–Reiss model and its spatial extension, Brown–Resnick processes.
Abstract: Estimation of extreme value parameters from observations in the max-domain of attraction of a multivariate max-stable distribution commonly uses aggregated data such as block maxima. Multivariate peaks-over-threshold methods, in contrast, exploit additional information from the non-aggregated ‘large’ observations. We introduce an approach based on peaks over thresholds that provides several new estimators for processes η in the max-domain of attraction of the frequently used Hüsler–Reiss model and its spatial extension: Brown–Resnick processes. The method relies on increments η(·)−η(t0) conditional on η(t0) exceeding a high threshold, where t0 is a fixed location. When the marginals are standardized to the Gumbel distribution, these increments asymptotically form a Gaussian process, which results in computationally simple estimates of the Hüsler–Reiss parameter matrix and, in particular, enables parametric inference for Brown–Resnick processes based on (high dimensional) multivariate densities. This is a major advantage over composite likelihood methods that are commonly used in spatial extreme value statistics since they rely only on bivariate densities. A simulation study compares the performance of the new estimators with other commonly used methods. As an application, we fit a non-isotropic Brown–Resnick process to the extremes of 12-year data of daily wind speed measurements.

Posted Content
TL;DR: The Variational Gaussian Process (VGP) as discussed by the authors generates approximate posterior samples by generating latent inputs and warping them through random nonlinear mappings; the distribution over random mappings is learned during inference, enabling the transformed outputs to adapt to varying complexity.
Abstract: Variational inference is a powerful tool for approximate inference, and it has been recently applied for representation learning with deep generative models. We develop the variational Gaussian process (VGP), a Bayesian nonparametric variational family, which adapts its shape to match complex posterior distributions. The VGP generates approximate posterior samples by generating latent inputs and warping them through random non-linear mappings; the distribution over random mappings is learned during inference, enabling the transformed outputs to adapt to varying complexity. We prove a universal approximation theorem for the VGP, demonstrating its representative power for learning any model. For inference we present a variational objective inspired by auto-encoders and perform black box inference over a wide class of models. The VGP achieves new state-of-the-art results for unsupervised learning, inferring models such as the deep latent Gaussian model and the recently proposed DRAW.

Proceedings Article
07 Dec 2015
TL;DR: A Hybrid Monte-Carlo sampling scheme is presented which allows for a non-Gaussian approximation over the function values and covariance parameters simultaneously, with efficient computations based on inducing-point sparse GPs.
Abstract: Gaussian process (GP) models form a core part of probabilistic machine learning. Considerable research effort has been made into attacking three issues with GP models: how to compute efficiently when the number of data is large; how to approximate the posterior when the likelihood is not Gaussian and how to estimate covariance function parameter posteriors. This paper simultaneously addresses these, using a variational approximation to the posterior which is sparse in support of the function but otherwise free-form. The result is a Hybrid Monte-Carlo sampling scheme which allows for a non-Gaussian approximation over the function values and covariance parameters simultaneously, with efficient computations based on inducing-point sparse GPs. Code to replicate each experiment in this paper is available at github.com/sparseMCMC.

Journal ArticleDOI
Fan Li, Jiuping Xu
TL;DR: A novel integrated approach based on a mixture of Gaussian process (MGP) model and particle filtering (PF) is presented for lithium-ion battery SOH estimation under uncertain conditions, where the distribution of the degradation process is learnt from the inputs based on the available capacity monitoring data.

Journal ArticleDOI
TL;DR: The method is tested on several numerical examples and on an agronomy problem, showing that it provides an efficient trade-off between exploration and intensification.
Abstract: Optimization of expensive computer models with the help of Gaussian process emulators is now commonplace. However, when several (competing) objectives are considered, choosing an appropriate sampling strategy remains an open question. We present here a new algorithm based on stepwise uncertainty reduction principles. Optimization is seen as a sequential reduction of the volume of the excursion sets below the current best solutions (Pareto set), and our sampling strategy chooses the points that give the highest expected reduction. The method is tested on several numerical examples and on an agronomy problem, showing that it provides an efficient trade-off between exploration and intensification.