
Showing papers on "Gaussian process published in 2013"


Proceedings Article
11 Aug 2013
TL;DR: In this article, the authors introduce stochastic variational inference for Gaussian process models, which enables the application of Gaussian Process (GP) models to data sets containing millions of data points.
Abstract: We introduce stochastic variational inference for Gaussian process models. This enables the application of Gaussian process (GP) models to data sets containing millions of data points. We show how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the necessary manner to perform variational inference. Our approach is readily extended to models with non-Gaussian likelihoods and latent variable models based around Gaussian processes. We demonstrate the approach on a simple toy problem and two real world data sets.

898 citations
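The inducing-variable decomposition at the heart of the paper above can be illustrated with a small, self-contained sketch. The NumPy code below is a minimal sketch, not the authors' implementation: it computes the sparse GP predictive distribution from the collapsed, non-stochastic form of the inducing-point posterior, whereas the stochastic variational method keeps q(u) explicit so it can be updated from minibatches. The kernel, noise level, jitter and toy data are assumptions for the example.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def sparse_gp_predict(X, y, Z, Xs, noise=0.01, jitter=1e-8):
    """Predictive mean/variance of a sparse GP governed by M inducing inputs Z.

    Collapsed (non-stochastic) form of the inducing-variable posterior; the
    stochastic variational approach replaces the full-data solve below with
    minibatch updates of an explicit q(u)."""
    Kuu = rbf(Z, Z) + jitter * np.eye(len(Z))
    Kuf = rbf(Z, X)
    Kus = rbf(Z, Xs)
    Sigma = np.linalg.inv(Kuu + Kuf @ Kuf.T / noise)   # q(u) covariance factor
    mean = Kus.T @ Sigma @ Kuf @ y / noise
    var = (np.diag(rbf(Xs, Xs))
           - np.sum(Kus * np.linalg.solve(Kuu, Kus), axis=0)
           + np.sum(Kus * (Sigma @ Kus), axis=0))
    return mean, var

# Toy usage: 200 noisy observations summarized by 15 inducing points.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Z = np.linspace(-3, 3, 15)[:, None]
Xs = np.linspace(-3, 3, 50)[:, None]
mu, var = sparse_gp_predict(X, y, Z, Xs)
```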


Proceedings Article
29 Apr 2013
TL;DR: Deep Gaussian process (GP) models are introduced and model selection by the variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.
Abstract: In this paper we introduce deep Gaussian process (GP) models. Deep GPs are a deep belief network based on Gaussian process mappings. The data is modeled as the output of a multivariate GP. The inputs to that Gaussian process are then governed by another GP. A single layer model is equivalent to a standard GP or the GP latent variable model (GP-LVM). We perform inference in the model by approximate variational marginalization. This results in a strict lower bound on the marginal likelihood of the model which we use for model selection (number of layers and nodes per layer). Deep belief networks are typically applied to relatively large data sets using stochastic gradient descent for optimization. Our fully Bayesian treatment allows for the application of deep models even when data is scarce. Model selection by our variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.

743 citations
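The layered construction described above can be visualized by forward-sampling from a two-layer deep GP prior: draw functions from one GP at the observed inputs, then use their values as the inputs to a second GP. A minimal NumPy sketch under illustrative assumptions (RBF kernels, layer width, lengthscales); the paper performs approximate variational inference in this model rather than simple forward sampling.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def sample_gp_layer(X, n_outputs, lengthscale, rng, jitter=1e-8):
    """Draw n_outputs independent GP functions evaluated at the rows of X."""
    K = rbf(X, X, lengthscale) + jitter * np.eye(len(X))
    L = np.linalg.cholesky(K)
    return L @ rng.standard_normal((len(X), n_outputs))

rng = np.random.default_rng(1)
X = np.linspace(-2, 2, 100)[:, None]                              # observed inputs
H = sample_gp_layer(X, n_outputs=2, lengthscale=1.0, rng=rng)     # hidden layer
Y = sample_gp_layer(H, n_outputs=1, lengthscale=0.5, rng=rng)     # output layer
# Y is one draw from a two-layer deep GP prior: a GP warped by another GP.
```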


Proceedings ArticleDOI
01 Dec 2013
TL;DR: A visual saliency detection algorithm based on reconstruction errors is proposed; it applies the Bayes formula to integrate saliency measures derived from dense and sparse reconstruction errors and refines the result with an object-biased Gaussian model.
Abstract: In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models are constructed. For each image region, we first compute dense and sparse reconstruction errors. Second, the reconstruction errors are propagated based on the contexts obtained from K-means clustering. Third, pixel-level saliency is computed by an integration of multi-scale reconstruction errors and refined by an object-biased Gaussian model. We apply the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors. Experimental results show that the proposed algorithm performs favorably against seventeen state-of-the-art methods in terms of precision and recall. In addition, the proposed algorithm is demonstrated to be more effective in highlighting salient objects uniformly and robust to background noise.

725 citations


Journal ArticleDOI
TL;DR: An approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement.
Abstract: Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are ℝ^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.

553 citations
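One concrete member of the family of function-space MCMC methods described above is the preconditioned Crank-Nicolson (pCN) proposal, which preserves the Gaussian reference measure exactly, so the acceptance ratio involves only the data-misfit functional. A NumPy sketch on a finite grid; the misfit Phi, the grid, the prior kernel and the step size beta are placeholder choices for illustration.

```python
import numpy as np

def pcn_sample(Phi, L, n_steps=5000, beta=0.2, rng=None):
    """Preconditioned Crank-Nicolson MCMC for a target with density
    exp(-Phi(u)) with respect to a Gaussian reference N(0, C), C = L @ L.T.

    The proposal v = sqrt(1 - beta^2) * u + beta * xi, xi ~ N(0, C),
    preserves N(0, C), so the accept ratio depends only on the misfit Phi."""
    rng = rng or np.random.default_rng()
    u = L @ rng.standard_normal(L.shape[1])      # start from the prior
    samples = []
    for _ in range(n_steps):
        xi = L @ rng.standard_normal(L.shape[1])
        v = np.sqrt(1.0 - beta ** 2) * u + beta * xi
        if np.log(rng.uniform()) < Phi(u) - Phi(v):
            u = v
        samples.append(u.copy())
    return np.array(samples)

# Toy usage: squared-exponential prior on a grid, quadratic misfit at a few points.
x = np.linspace(0, 1, 50)
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2) + 1e-8 * np.eye(50)
L = np.linalg.cholesky(C)
obs_idx, obs = np.array([10, 25, 40]), np.array([0.5, -0.2, 0.3])
Phi = lambda u: 0.5 * np.sum((u[obs_idx] - obs) ** 2) / 0.05 ** 2
chain = pcn_sample(Phi, L)
```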


Journal ArticleDOI
TL;DR: This paper discusses how domain knowledge influences design of the Gaussian process models and provides case examples to highlight the approaches.
Abstract: In this paper, we offer a gentle introduction to Gaussian processes for time-series data analysis. The conceptual framework of Bayesian modelling for time-series data is discussed and the foundations of Bayesian non-parametric modelling are presented for Gaussian processes. We discuss how domain knowledge influences design of the Gaussian process models and provide case examples to highlight the approaches.

502 citations
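For readers who want the computational core behind such a tutorial, plain GP regression on a time series reduces to a few linear-algebra steps. A minimal NumPy sketch with a squared-exponential kernel; the kernel, its hyperparameters and the toy series are assumptions for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    return variance * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def gp_regression(t, y, t_star, noise=0.01):
    """Posterior mean and variance of a GP fit to a 1-D time series."""
    K = rbf(t, t) + noise * np.eye(len(t))
    Ks = rbf(t, t_star)
    Kss = rbf(t_star, t_star)
    alpha = np.linalg.solve(K, y)
    mean = Ks.T @ alpha
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

t = np.linspace(0, 10, 40)
y = np.sin(t) + 0.1 * np.random.default_rng(2).standard_normal(40)
t_star = np.linspace(0, 12, 200)        # includes a short extrapolation region
mu, var = gp_regression(t, y, t_star)
```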


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a statistical channel model which incorporates physical laws of acoustic propagation (frequency-dependent attenuation, bottom/surface reflections), as well as the effects of inevitable random local displacements.
Abstract: Underwater acoustic channel models provide a tool for predicting the performance of communication systems before deployment, and are thus essential for system design. In this paper, we offer a statistical channel model which incorporates physical laws of acoustic propagation (frequency-dependent attenuation, bottom/surface reflections), as well as the effects of inevitable random local displacements. Specifically, we focus on random displacements on two scales: those that involve distances on the order of a few wavelengths, to which we refer as small-scale effects, and those that involve many wavelengths, to which we refer as large-scale effects. Small-scale effects include scattering and motion-induced Doppler shifting, and are responsible for fast variations of the instantaneous channel response, while large-scale effects describe the location uncertainty and changing environmental conditions, and affect the locally averaged received power. We model each propagation path by a large-scale gain and micromultipath components that cumulatively result in a complex Gaussian distortion. Time- and frequency-correlation properties of the path coefficients are assessed analytically, leading to a computationally efficient model for numerical channel simulation. Random motion of the surface and transmitter/receiver displacements introduce additional variation whose temporal correlation is described by Bessel-type functions. The total energy, or the gain contained in the channel, averaged over small scale, is modeled as log-normally distributed. The models are validated using real data obtained from four experiments. Specifically, experimental data are used to assess the distribution and the autocorrelation functions of the large-scale transmission loss and the short-term path gains. While the former indicates a log-normal distribution with an exponentially decaying autocorrelation, the latter indicates a conditional Ricean distribution with Bessel-type autocorrelation.

436 citations


ReportDOI
TL;DR: It is demonstrated how the Gaussian approximations and the multiplier bootstrap can be used for modern high dimensional estimation, multiple hypothesis testing, and adaptive specification testing.
Abstract: We derive a Gaussian approximation result for the maximum of a sum of high-dimensional random vectors. Specifically, we establish conditions under which the distribution of the maximum is approximated by that of the maximum of a sum of the Gaussian random vectors with the same covariance matrices as the original vectors. This result applies when the dimension of random vectors ($p$) is large compared to the sample size ($n$); in fact, $p$ can be much larger than $n$, without restricting correlations of the coordinates of these vectors. We also show that the distribution of the maximum of a sum of the random vectors with unknown covariance matrices can be consistently estimated by the distribution of the maximum of a sum of the conditional Gaussian random vectors obtained by multiplying the original vectors with i.i.d. Gaussian multipliers. This is the Gaussian multiplier (or wild) bootstrap procedure. Here too, $p$ can be large or even much larger than $n$. These distributional approximations, either Gaussian or conditional Gaussian, yield a high-quality approximation to the distribution of the original maximum, often with approximation error decreasing polynomially in the sample size, and hence are of interest in many applications. We demonstrate how our Gaussian approximations and the multiplier bootstrap can be used for modern high-dimensional estimation, multiple hypothesis testing, and adaptive specification testing. All these results contain nonasymptotic bounds on approximation errors.

383 citations
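Operationally, the Gaussian multiplier (wild) bootstrap described above recomputes the max statistic after rescaling the centered observations by i.i.d. standard normal multipliers, conditional on the data. A small NumPy sketch under simplifying assumptions: a plain max-of-normalized-sums statistic and empirically centered coordinates.

```python
import numpy as np

def multiplier_bootstrap_quantile(X, alpha=0.05, n_boot=2000, rng=None):
    """Estimate the (1 - alpha) quantile of max_j n^{-1/2} sum_i X_ij via the
    Gaussian multiplier bootstrap: conditional on the data, replace X_ij by
    e_i * (X_ij - mean_j) with e_i ~ N(0, 1) and recompute the maximum."""
    rng = rng or np.random.default_rng()
    n, p = X.shape
    Xc = X - X.mean(axis=0)                   # center each coordinate
    stats = np.empty(n_boot)
    for b in range(n_boot):
        e = rng.standard_normal(n)
        stats[b] = np.max(e @ Xc) / np.sqrt(n)
    return np.quantile(stats, 1.0 - alpha)

# Toy usage with p much larger than n.
rng = np.random.default_rng(3)
X = rng.standard_normal((100, 5000))          # n = 100, p = 5000
crit = multiplier_bootstrap_quantile(X, rng=rng)
T = np.max(X.sum(axis=0)) / np.sqrt(100)      # observed max statistic
reject = T > crit
```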


Posted Content
TL;DR: Stochastic variational inference for Gaussian process models is introduced and it is shown how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the necessary manner to perform Variational inference.
Abstract: We introduce stochastic variational inference for Gaussian process models. This enables the application of Gaussian process (GP) models to data sets containing millions of data points. We show how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the necessary manner to perform variational inference. Our approach is readily extended to models with non-Gaussian likelihoods and latent variable models based around Gaussian processes. We demonstrate the approach on a simple toy problem and two real world data sets.

374 citations


Posted Content
TL;DR: In this paper, simple closed-form kernels are derived by modelling a spectral density with a Gaussian mixture, which can be used with Gaussian processes to discover patterns and enable extrapolation, and demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data.
Abstract: Gaussian processes are rich distributions over functions, which provide a Bayesian nonparametric approach to smoothing and interpolation. We introduce simple closed form kernels that can be used with Gaussian processes to discover patterns and enable extrapolation. These kernels are derived by modelling a spectral density -- the Fourier transform of a kernel -- with a Gaussian mixture. The proposed kernels support a broad class of stationary covariances, but Gaussian process inference remains simple and analytic. We demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data. We also show that we can reconstruct standard covariances within our framework.

356 citations
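In one input dimension the proposed kernel has a simple closed form: a Gaussian mixture over the spectral density maps, via Bochner's theorem, to a weighted sum of exponentiated-quadratic-times-cosine terms. A short sketch of that 1-D kernel; the weights, means and variances below are arbitrary example values.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, variances):
    """1-D spectral mixture kernel
    k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q),
    the inverse Fourier transform of a Gaussian mixture spectral density."""
    tau = np.asarray(tau, dtype=float)[..., None]
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    v = np.asarray(variances, dtype=float)
    terms = w * np.exp(-2.0 * np.pi ** 2 * tau ** 2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return terms.sum(axis=-1)

# Two illustrative components: a slow trend and a faster periodic pattern.
tau = np.linspace(0.0, 5.0, 500)
k = spectral_mixture_kernel(tau, weights=[1.0, 0.5],
                            means=[0.05, 1.0], variances=[0.01, 0.02])
```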


01 Jan 2013
TL;DR: In this paper, the authors describe an approach to modify a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement.
Abstract: Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are ℝ^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.

340 citations


Journal ArticleDOI
TL;DR: A comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided, and the three most intuitive criteria are found to be poor alternatives.
Abstract: Responses of many real-world problems can only be evaluated perturbed by noise. In order to make an efficient optimization of these problems possible, intelligent optimization strategies successfully coping with noisy evaluations are required. In this article, a comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided. In summary, ten methods for choosing the sequential samples are described using a unified formalism. They are compared on analytical benchmark problems, whereby the usual assumption of homoscedastic Gaussian noise made in the underlying models is met. Different problem configurations (noise level, maximum number of observations, initial number of observations) and setups (covariance functions, budget, initial sample size) are considered. It is found that the choices of the initial sample size and the covariance function are not critical. The choice of the method, however, can result in significant differences in the performance. In particular, the three most intuitive criteria are found to be poor alternatives. Although no criterion is found consistently more efficient than the others, two specialized methods appear more robust on average.
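One of the baseline quantities underlying the criteria reviewed above is the classical (noise-free) expected improvement; the noisy-optimization criteria compared in the article modify it in various ways. A hedged sketch of plain EI computed from a kriging predictor's mean and standard deviation:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """Classical expected improvement for minimization, given the kriging
    predictive mean mu and standard deviation sigma at candidate points and
    the best observed value f_best. The noisy-optimization criteria reviewed
    in the article modify this basic quantity in different ways."""
    mu, sigma = np.asarray(mu, float), np.maximum(np.asarray(sigma, float), 1e-12)
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
```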

Journal ArticleDOI
TL;DR: The approximate message passing (AMP) algorithm is extended to solve the complex-valued LASSO problem, yielding the complex approximate message passing algorithm (CAMP), and the state evolution framework recently introduced for the analysis of AMP is generalized to the complex setting.
Abstract: Recovering a sparse signal from an undersampled set of random linear measurements is the main problem of interest in compressed sensing. In this paper, we consider the case where both the signal and the measurements are complex-valued. We study the popular recovery method of l1-regularized least squares or LASSO. While several studies have shown that LASSO provides desirable solutions under certain conditions, the precise asymptotic performance of this algorithm in the complex setting is not yet known. In this paper, we extend the approximate message passing (AMP) algorithm to solve the complex-valued LASSO problem and obtain the complex approximate message passing algorithm (CAMP). We then generalize the state evolution framework recently introduced for the analysis of AMP to the complex setting. Using the state evolution, we derive accurate formulas for the phase transition and noise sensitivity of both LASSO and CAMP. Our theoretical results are concerned with the case of i.i.d. Gaussian sensing matrices. Simulations confirm that our results hold for a larger class of random matrices.
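The elementwise nonlinearity inside CAMP is the complex soft-thresholding operator, the proximal map of the complex l1 penalty: it shrinks the magnitude and preserves the phase. A short sketch; the threshold schedule used in the full CAMP iteration is omitted.

```python
import numpy as np

def complex_soft_threshold(x, theta):
    """Proximal operator of theta * ||.||_1 for complex vectors:
    shrink each entry's magnitude by theta, keep its phase, zero out small entries."""
    mag = np.abs(x)
    scale = np.maximum(mag - theta, 0.0) / np.maximum(mag, 1e-30)
    return scale * x

# Example: complex residuals shrunk towards zero.
x = np.array([3 + 4j, 0.1 - 0.2j, -1j])
print(complex_soft_threshold(x, theta=1.0))   # approximately [2.4+3.2j, 0, 0]
```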

Journal Article
TL;DR: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference, including various inference methods, sparse approximations and model assessment methods.
Abstract: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods.

Proceedings Article
16 Jun 2013
TL;DR: In this article, simple closed-form kernels are derived by modelling a spectral density with a Gaussian mixture, which can be used with Gaussian processes to discover patterns and enable extrapolation, and demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data.
Abstract: Gaussian processes are rich distributions over functions, which provide a Bayesian nonparametric approach to smoothing and interpolation. We introduce simple closed form kernels that can be used with Gaussian processes to discover patterns and enable extrapolation. These kernels are derived by modelling a spectral density - the Fourier transform of a kernel - with a Gaussian mixture. The proposed kernels support a broad class of stationary covariances, but Gaussian process inference remains simple and analytic. We demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data. We also show that it is possible to reconstruct several popular standard covariances within our framework.

Journal ArticleDOI
TL;DR: The ability of Gaussian processes to guide the search through protein sequence space by designing, constructing, and testing chimeric cytochrome P450s allowed us to engineer active P450 enzymes that are more thermostable than any previously made by chimeragenesis, rational design, or directed evolution.
Abstract: Knowing how protein sequence maps to function (the “fitness landscape”) is critical for understanding protein evolution as well as for engineering proteins with new and useful properties. We demonstrate that the protein fitness landscape can be inferred from experimental data, using Gaussian processes, a Bayesian learning technique. Gaussian process landscapes can model various protein sequence properties, including functional status, thermostability, enzyme activity, and ligand binding affinity. Trained on experimental data, these models achieve unrivaled quantitative accuracy. Furthermore, the explicit representation of model uncertainty allows for efficient searches through the vast space of possible sequences. We develop and test two protein sequence design algorithms motivated by Bayesian decision theory. The first one identifies small sets of sequences that are informative about the landscape; the second one identifies optimized sequences by iteratively improving the Gaussian process model in regions of the landscape that are predicted to be optimized. We demonstrate the ability of Gaussian processes to guide the search through protein sequence space by designing, constructing, and testing chimeric cytochrome P450s. These algorithms allowed us to engineer active P450 enzymes that are more thermostable than any previously made by chimeragenesis, rational design, or directed evolution.

Journal ArticleDOI
TL;DR: A Gaussian bare-bones DE (GBDE) and its modified version (MGBDE), both of which are almost parameter-free, are proposed; experiments indicate that the MGBDE performs significantly better than, or at least comparably to, several state-of-the-art DE variants and some existing bare-bones algorithms.
Abstract: Differential evolution (DE) is a well-known algorithm for global optimization over continuous search spaces. However, choosing the optimal control parameters is a challenging task because they are problem oriented. In order to minimize the effects of the control parameters, a Gaussian bare-bones DE (GBDE) and its modified version (MGBDE) are proposed which are almost parameter free. To verify the performance of our approaches, 30 benchmark functions and two real-world problems are utilized. Conducted experiments indicate that the MGBDE performs significantly better than, or at least comparable to, several state-of-the-art DE variants and some existing bare-bones algorithms.

Journal ArticleDOI
TL;DR: Methods for converting spatiotemporal Gaussian process regression problems into infinite-dimensional state-space models are presented; this makes the use of machine-learning models in signal processing computationally feasible and opens the possibility of combining machine-learning techniques with signal-processing methods.
Abstract: Gaussian process-based machine learning is a powerful Bayesian paradigm for nonparametric nonlinear regression and classification. In this article, we discuss connections of Gaussian process regression with Kalman filtering and present methods for converting spatiotemporal Gaussian process regression problems into infinite-dimensional state-space models. This formulation allows for use of computationally efficient infinite-dimensional Kalman filtering and smoothing methods, or more general Bayesian filtering and smoothing methods, which reduces the problematic cubic complexity of Gaussian process regression in the number of time steps into linear time complexity. The implication of this is that the use of machine-learning models in signal processing becomes computationally feasible, and it opens the possibility to combine machine-learning techniques with signal processing methods.
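The simplest instance of the GP-to-state-space conversion discussed above is the exponential (Matérn-1/2, Ornstein-Uhlenbeck) covariance, for which the GP is exactly a scalar linear-Gaussian state-space model, turning the cubic-cost GP regression into a linear-time Kalman filter. A NumPy sketch under that assumption; higher-order Matérn kernels need a small state vector rather than a scalar, and the hyperparameters below are illustrative.

```python
import numpy as np

def ou_kalman_filter(t, y, lengthscale=1.0, sig2=1.0, noise=0.1):
    """Kalman filter equivalent to GP regression with the exponential kernel
    k(tau) = sig2 * exp(-|tau| / lengthscale) (an Ornstein-Uhlenbeck process):
        x_k = a_k x_{k-1} + q_k,  a_k = exp(-dt_k / lengthscale),
        Var(q_k) = sig2 * (1 - a_k^2),   y_k = x_k + r_k,  r_k ~ N(0, noise).
    Runs in O(n) time; an RTS smoother pass would recover the full GP posterior
    (the filter below gives the causal, filtered estimate)."""
    m, P = 0.0, sig2                      # stationary prior at the first point
    means, variances = [], []
    t_prev = t[0]
    for tk, yk in zip(t, y):
        a = np.exp(-(tk - t_prev) / lengthscale)
        m, P = a * m, a * a * P + sig2 * (1.0 - a * a)   # predict
        S = P + noise
        gain = P / S                                     # Kalman gain
        m, P = m + gain * (yk - m), (1.0 - gain) * P     # update
        means.append(m)
        variances.append(P)
        t_prev = tk
    return np.array(means), np.array(variances)

t = np.linspace(0, 10, 1000)
y = np.sin(t) + 0.3 * np.random.default_rng(4).standard_normal(t.size)
mu, var = ou_kalman_filter(t, y)
```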

Journal ArticleDOI
TL;DR: The Maximum Likelihood (ML) and Cross Validation (CV) methods for estimating covariance hyper-parameters are compared, and it is shown that when the correlation function is misspecified, CV performs better than ML, while ML is optimal when the model is well-specified.
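The CV criterion in this comparison is cheap for GPs because leave-one-out predictions have a closed form in terms of the inverse kernel matrix, so no refitting is required. A sketch of the standard identity under a Gaussian observation model; the kernel matrix K and noise variance are user-supplied inputs.

```python
import numpy as np

def gp_loo_log_density(K, y, noise=0.1):
    """Leave-one-out log predictive density for GP regression, using the
    closed-form identities
        sigma_i^2 = 1 / [Kinv]_ii,   mu_i = y_i - [Kinv y]_i / [Kinv]_ii,
    where Kinv is the inverse of the noisy kernel matrix. This quantity is a
    common CV criterion for choosing covariance hyper-parameters."""
    Kn = K + noise * np.eye(len(y))
    Kinv = np.linalg.inv(Kn)
    diag = np.diag(Kinv)
    mu = y - (Kinv @ y) / diag
    var = 1.0 / diag
    return np.sum(-0.5 * np.log(2 * np.pi * var) - 0.5 * (y - mu) ** 2 / var)
```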

Posted Content
TL;DR: In this article, the authors derive a family of local sequential design schemes that dynamically define the support of a Gaussian process predictor based on a local subset of the data, and derive expressions for fast sequential updating of all needed quantities as the local designs are built up iteratively.
Abstract: We provide a new approach to approximate emulation of large computer experiments. By focusing expressly on desirable properties of the predictive equations, we derive a family of local sequential design schemes that dynamically define the support of a Gaussian process predictor based on a local subset of the data. We further derive expressions for fast sequential updating of all needed quantities as the local designs are built up iteratively. Then we show how independent application of our local design strategy across the elements of a vast predictive grid facilitates a trivially parallel implementation. The end result is a global predictor able to take advantage of modern multicore architectures, while at the same time allowing for a nonstationary modeling feature as a bonus. We demonstrate our method on two examples utilizing designs sized in the thousands, and tens of thousands of data points. Comparisons are made to the method of compactly supported covariances.
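A stripped-down version of the local-design idea replaces the paper's sequential criterion with a plain nearest-neighbor subset for each prediction location; predictions over the grid then remain embarrassingly parallel. A hedged NumPy sketch of that simplified variant; the kernel, neighborhood size and toy data are assumptions.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def local_gp_predict(X, y, x_star, n_local=50, noise=1e-6):
    """Predict at a single point using only its n_local nearest neighbors.
    The paper builds the local design greedily by a sequential criterion;
    nearest neighbors are used here as a simpler stand-in."""
    d = np.sum((X - x_star) ** 2, axis=1)
    idx = np.argsort(d)[:n_local]
    Xl, yl = X[idx], y[idx]
    K = rbf(Xl, Xl) + noise * np.eye(n_local)
    ks = rbf(Xl, x_star[None, :])[:, 0]
    return ks @ np.linalg.solve(K, yl)

# Independent predictions across a grid are trivially parallel.
rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, size=(20000, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])
grid = np.stack(np.meshgrid(np.linspace(-2, 2, 20),
                            np.linspace(-2, 2, 20)), -1).reshape(-1, 2)
preds = np.array([local_gp_predict(X, y, xs) for xs in grid])
```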

Journal ArticleDOI
TL;DR: In this article, the authors review characterizations of positive definite functions on spheres in terms of Gegenbauer expansions, and apply them to dimension walks, where monotonicity properties of the Geggenbauer coefficients guarantee positive definiteness in higher dimensions.
Abstract: Isotropic positive definite functions on spheres play important roles in spatial statistics, where they occur as the correlation functions of homogeneous random fields and star-shaped random particles. In approximation theory, strictly positive definite functions serve as radial basis functions for interpolating scattered data on spherical domains. We review characterizations of positive definite functions on spheres in terms of Gegenbauer expansions and apply them to dimension walks, where monotonicity properties of the Gegenbauer coefficients guarantee positive definiteness in higher dimensions. Subject to a natural support condition, isotropic positive definite functions on the Euclidean space $\mathbb{R}^{3}$, such as Askey’s and Wendland’s functions, allow for the direct substitution of the Euclidean distance by the great circle distance on a one-, two- or three-dimensional sphere, as opposed to the traditional approach, where the distances are transformed into each other. Completely monotone functions are positive definite on spheres of any dimension and provide rich parametric classes of such functions, including members of the powered exponential, Matérn, generalized Cauchy and Dagum families. The sine power family permits a continuous parameterization of the roughness of the sample paths of a Gaussian process. A collection of research problems provides challenges for future work in mathematical analysis, probability theory and spatial statistics.

Journal ArticleDOI
TL;DR: The novelty of this work is the recognition that the Gaussian process model defines a posterior probability measure on the function space of possible surrogates for the computer code and the derivation of an algorithmic procedure that allows us to sample it efficiently.

Proceedings Article
05 Dec 2013
TL;DR: The SI-BO algorithm is presented, which leverages recent low-rank matrix recovery techniques to learn the underlying subspace of the unknown function and applies Gaussian Process Upper Confidence sampling for optimization of the function.
Abstract: Many applications in machine learning require optimizing unknown functions defined over a high-dimensional space from noisy samples that are expensive to obtain. We address this notoriously hard challenge, under the assumptions that the function varies only along some low-dimensional subspace and is smooth (i.e., it has a low norm in a Reproducing Kernel Hilbert Space). In particular, we present the SI-BO algorithm, which leverages recent low-rank matrix recovery techniques to learn the underlying subspace of the unknown function and applies Gaussian Process Upper Confidence sampling for optimization of the function. We carefully calibrate the exploration-exploitation tradeoff by allocating the sampling budget to subspace estimation and function optimization, and obtain the first subexponential cumulative regret bounds and convergence rates for Bayesian optimization in high-dimensions under noisy observations. Numerical results demonstrate the effectiveness of our approach in difficult scenarios.

Journal ArticleDOI
TL;DR: The proposed Coupled Scaled Gaussian Process Regression model for head-pose normalization outperforms state-of-the-art regression-based approaches to head-pose normalization, 2D and 3D Point Distribution Models (PDMs), and Active Appearance Models (AAMs), especially in cases of unknown poses and imbalanced training data.
Abstract: We propose a method for head-pose invariant facial expression recognition that is based on a set of characteristic facial points. To achieve head-pose invariance, we propose the Coupled Scaled Gaussian Process Regression (CSGPR) model for head-pose normalization. In this model, we first learn independently the mappings between the facial points in each pair of (discrete) nonfrontal poses and the frontal pose, and then perform their coupling in order to capture dependences between them. During inference, the outputs of the coupled functions from different poses are combined using a gating function, devised based on the head-pose estimation for the query points. The proposed model outperforms state-of-the-art regression-based approaches to head-pose normalization, 2D and 3D Point Distribution Models (PDMs), and Active Appearance Models (AAMs), especially in cases of unknown poses and imbalanced training data. To the best of our knowledge, the proposed method is the first one that is able to deal with expressive faces in the range from $(-45^\circ)$ to $(+45^\circ)$ pan rotation and $(-30^\circ)$ to $(+30^\circ)$ tilt rotation, and with continuous changes in head pose, despite the fact that training was conducted on a small set of discrete poses. We evaluate the proposed method on synthetic and real images depicting acted and spontaneously displayed facial expressions.

Journal ArticleDOI
TL;DR: Nonseparable covariance structures for Gaussian process emulators are developed, based on the linear model of coregionalization and convolution methods; it is found that only emulators with nonseparable covariance structures have sufficient flexibility both to give good predictions and to represent joint uncertainty about the simulator outputs appropriately.
Abstract: The Gaussian process regression model is a popular type of “emulator” used as a fast surrogate for computationally expensive simulators (deterministic computer models). For simulators with multivariate output, common practice is to specify a separable covariance structure for the Gaussian process. Though computationally convenient, this can be too restrictive, leading to poor performance of the emulator, particularly when the different simulator outputs represent different physical quantities. Also, treating the simulator outputs as independent can lead to inappropriate representations of joint uncertainty. We develop nonseparable covariance structures for Gaussian process emulators, based on the linear model of coregionalization and convolution methods. Using two case studies, we compare the performance of these covariance structures both with standard separable covariance structures and with emulators that assume independence between the outputs. In each case study, we find that only emulators with nonseparable covariance structures have sufficient flexibility both to give good predictions and to represent joint uncertainty about the simulator outputs appropriately.
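The linear model of coregionalization mentioned above builds a nonseparable cross-covariance by summing Kronecker products of small positive semidefinite coregionalization matrices with different input kernels; a single term recovers the separable case. A hedged NumPy sketch for two outputs; the matrices B_j, kernels and lengthscales are illustrative.

```python
import numpy as np

def rbf(x, lengthscale):
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def lmc_covariance(x, Bs, lengthscales):
    """Linear model of coregionalization: K = sum_j kron(B_j, k_j(x, x)).
    With a single term this is a separable covariance; with several terms,
    the outputs mix kernels with different lengthscales (nonseparable)."""
    n, p = len(x), len(Bs[0])
    K = np.zeros((p * n, p * n))
    for B, ell in zip(Bs, lengthscales):
        K += np.kron(np.asarray(B, dtype=float), rbf(x, ell))
    return K

x = np.linspace(0, 1, 30)
B1 = [[1.0, 0.8], [0.8, 1.0]]       # both outputs share a smooth component
B2 = [[0.3, 0.0], [0.0, 1.5]]       # only output 2 gets a rough component
K = lmc_covariance(x, [B1, B2], lengthscales=[0.3, 0.05])
```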

ReportDOI
TL;DR: In this paper, the authors give explicit comparisons of expectations of smooth functions and distribution functions of maxima of Gaussian random vectors without any restriction on the covariance matrices, and establish an anti-concentration inequality for the maximum of a Gaussian vector, which derives a useful upper bound on the Lévy concentration function for the Gaussian maximum.
Abstract: Slepian and Sudakov–Fernique type inequalities, which compare expectations of maxima of Gaussian random vectors under certain restrictions on the covariance matrices, play an important role in probability theory, especially in empirical process and extreme value theories. Here we give explicit comparisons of expectations of smooth functions and distribution functions of maxima of Gaussian random vectors without any restriction on the covariance matrices. We also establish an anti-concentration inequality for the maximum of a Gaussian random vector, which derives a useful upper bound on the Lévy concentration function for the Gaussian maximum. The bound is dimension-free and applies to vectors with arbitrary covariance matrices. This anti-concentration inequality plays a crucial role in establishing bounds on the Kolmogorov distance between maxima of Gaussian random vectors. These results have immediate applications in mathematical statistics. As an example of application, we establish a conditional multiplier central limit theorem for maxima of sums of independent random vectors where the dimension of the vectors is possibly much larger than the sample size.

Journal ArticleDOI
TL;DR: The proposed spectral modeling method can significantly alleviate the over-smoothing effect and improve the naturalness of the conventional HMM-based speech synthesis system using mel-cepstra.
Abstract: This paper presents a new spectral modeling method for statistical parametric speech synthesis. In the conventional methods, high-level spectral parameters, such as mel-cepstra or line spectral pairs, are adopted as the features for hidden Markov model (HMM)-based parametric speech synthesis. Our proposed method described in this paper improves the conventional method in two ways. First, distributions of low-level, un-transformed spectral envelopes (extracted by the STRAIGHT vocoder) are used as the parameters for synthesis. Second, instead of using single Gaussian distribution, we adopt the graphical models with multiple hidden variables, including restricted Boltzmann machines (RBM) and deep belief networks (DBN), to represent the distribution of the low-level spectral envelopes at each HMM state. At the synthesis time, the spectral envelopes are predicted from the RBM-HMMs or the DBN-HMMs of the input sentence following the maximum output probability parameter generation criterion with the constraints of the dynamic features. A Gaussian approximation is applied to the marginal distribution of the visible stochastic variables in the RBM or DBN at each HMM state in order to achieve a closed-form solution to the parameter generation problem. Our experimental results show that both RBM-HMM and DBN-HMM are able to generate spectral envelope parameter sequences better than the conventional Gaussian-HMM with superior generalization capabilities and that DBN-HMM and RBM-HMM perform similarly due possibly to the use of Gaussian approximation. As a result, our proposed method can significantly alleviate the over-smoothing effect and improve the naturalness of the conventional HMM-based speech synthesis system using mel-cepstra.

Journal ArticleDOI
TL;DR: This article investigates the use of Gaussian process (GP) priors for one-class classification and shows the suitability of the methods in the area of attribute prediction, defect localization, bacteria recognition, and background subtraction.

Journal ArticleDOI
TL;DR: In this article, the asymptotic power of tests of sphericity against perturbations in a single unknown direction was studied, where both the dimensionality of the data and the number of observations go to infinity.
Abstract: This paper studies the asymptotic power of tests of sphericity against perturbations in a single unknown direction as both the dimensionality of the data and the number of observations go to infinity. We establish the convergence, under the null hypothesis and the alternative, of the log ratio of the joint densities of the sample covariance eigenvalues to a Gaussian process indexed by the norm of the perturbation. When the perturbation norm is larger than the phase transition threshold studied in Baik et al. (2005), the limiting process is degenerate and discrimination between the null and the alternative is asymptotically certain. When the norm is below the threshold, the process is non-degenerate, so that the joint eigenvalue densities under the null and alternative hypotheses are mutually contiguous. Using the asymptotic theory of statistical experiments, we obtain asymptotic power envelopes and derive the asymptotic power for various sphericity tests in the contiguity region. In particular, we show that the asymptotic power of the Tracy-Widom-type tests is trivial, whereas that of the eigenvalue-based likelihood ratio test is strictly larger than the size, and close to the power envelope.

Book ChapterDOI
TL;DR: The Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) is introduced, which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations, and theoretical upper bounds on the regret with batches of size K are proved for this procedure.
Abstract: In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to actively select the next estimation of f which is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel with batches of fixed size and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure which show an improvement of order $\sqrt{K}$ for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved have the property of being dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.
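The batch construction in GP-UCB-PE can be sketched compactly: the first point of a batch maximizes the usual UCB score, and the remaining K-1 points greedily maximize the posterior variance recomputed as if the earlier batch points had already been evaluated (GP variance depends only on the inputs, so their values can be fantasized). The NumPy sketch below works over a finite candidate set and omits the paper's restriction of exploration to the high-UCB region; beta and the kernel are placeholder choices.

```python
import numpy as np

def rbf(A, B, lengthscale=0.3):
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X, y, Xc, noise=1e-4):
    """Posterior mean and variance at candidates Xc given 1-D data (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xc)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = np.clip(1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0), 1e-12, None)
    return mu, var

def gp_ucb_pe_batch(X, y, Xc, K_batch=4, beta=4.0):
    """Select one batch: first point by UCB, the rest by pure exploration
    (maximum posterior variance given the already-selected batch points,
    whose observation values are dummies since variance ignores y)."""
    mu, var = gp_posterior(X, y, Xc)
    batch = [Xc[np.argmax(mu + np.sqrt(beta * var))]]
    for _ in range(K_batch - 1):
        Xf = np.concatenate([X, np.array(batch)])
        yf = np.concatenate([y, np.zeros(len(batch))])   # fantasized values
        _, var = gp_posterior(Xf, yf, Xc)
        batch.append(Xc[np.argmax(var)])
    return np.array(batch)

rng = np.random.default_rng(6)
X = rng.uniform(0, 1, 5)                      # points evaluated so far
y = np.sin(6 * X)                             # their observed objective values
Xc = np.linspace(0, 1, 200)                   # candidate grid
print(gp_ucb_pe_batch(X, y, Xc))
```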

Journal ArticleDOI
TL;DR: The role of the extremal t process as the maximum attractor for processes with finite-dimensional elliptical distributions is highlighted and all results naturally also hold within the multivariate domain.