
Showing papers on "Gaussian process published in 2014"


Book ChapterDOI
06 Sep 2014
TL;DR: Rather than formulating the probability of target appearance as exponentially related to the confidence of a classifier output, this paper directly analyzes it using Gaussian Processes Regression (GPR) and introduces a latent variable to assist the tracking decision.
Abstract: Modeling the target appearance is critical in many modern visual tracking algorithms. Many tracking-by-detection algorithms formulate the probability of target appearance as exponentially related to the confidence of a classifier output. By contrast, in this paper we directly analyze this probability using Gaussian Processes Regression (GPR), and introduce a latent variable to assist the tracking decision. Our observation model for regression is learnt in a semi-supervised fashion by using both labeled samples from previous frames and the unlabeled samples that are tracking candidates extracted from the current frame. We further divide the labeled samples into two categories: auxiliary samples collected from the very early frames and target samples from most recent frames. The auxiliary samples are dynamically re-weighted by the regression, and the final tracking result is determined by fusing decisions from two individual trackers, one derived from the auxiliary samples and the other from the target samples. All these ingredients together enable our tracker, denoted as TGPR, to alleviate the drifting issue from various aspects. The effectiveness of TGPR is clearly demonstrated by its excellent performances on three recently proposed public benchmarks, involving 161 sequences in total, in comparison with state-of-the-arts.

479 citations
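The tracker above scores candidate windows with a GP regression observation model. As a point of reference only (plain GPR with an RBF kernel and placeholder features and labels, not the authors' semi-supervised, re-weighted TGPR), a minimal sketch of scoring tracking candidates by posterior confidence might look like this:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gpr_posterior(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean/variance of a GP regressor at X_test."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_train, X_test)
    Kss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(np.diag(Kss) - (v ** 2).sum(0), 0.0, None)
    return mean, var

# Toy usage: labeled samples (features with +1 target / -1 background labels)
# and unlabeled candidate windows extracted from the current frame.
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(20, 5))
y_lab = np.sign(X_lab[:, 0])            # stand-in labels
X_cand = rng.normal(size=(10, 5))       # tracking candidates
score, _ = gpr_posterior(X_lab, y_lab, X_cand)
best = X_cand[np.argmax(score)]         # candidate with highest predicted confidence
```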


Journal ArticleDOI
TL;DR: The theory of Gaussian multiplicative chaos, introduced in Kahane's seminal 1985 work, is reviewed by the authors, along with its many applications ranging from finance to 2d-Liouville quantum gravity.
Abstract: In this article, we review the theory of Gaussian multiplicative chaos initially introduced by Kahane’s seminal work in 1985. Though this beautiful paper faded from memory until recently, it already contains ideas and results that are nowadays under active investigation, like the construction of the Liouville measure in $2d$-Liouville quantum gravity or thick points of the Gaussian Free Field. Also, we mention important extensions and generalizations of this theory that have emerged ever since and discuss a whole family of applications, ranging from finance, through the Kolmogorov-Obukhov model of turbulence to $2d$-Liouville quantum gravity. This review also includes new results like the convergence of discretized Liouville measures on isoradial graphs (thus including the triangle and square lattices) towards the continuous Liouville measures (in the subcritical and critical case) or multifractal analysis of the measures in all dimensions.

469 citations


Journal ArticleDOI
TL;DR: This work presents a method to first detect the directions of the strongest variability using evaluations of the gradient and subsequently exploit these directions to construct a response surface on a low-dimensional subspace---i.e., the active subspace---of the inputs.
Abstract: Many multivariate functions in engineering models vary primarily along a few directions in the space of input parameters. When these directions correspond to coordinate directions, one may apply global sensitivity measures to determine the most influential parameters. However, these methods perform poorly when the directions of variability are not aligned with the natural coordinates of the input space. We present a method to first detect the directions of the strongest variability using evaluations of the gradient and subsequently exploit these directions to construct a response surface on a low-dimensional subspace---i.e., the active subspace---of the inputs. We develop a theoretical framework with error bounds, and we link the theoretical quantities to the parameters of a kriging response surface on the active subspace. We apply the method to an elliptic PDE model with coefficients parameterized by 100 Gaussian random variables and compare it with a local sensitivity analysis method for dimension reduction.

435 citations
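As a rough illustration of the gradient-based construction described above (with a synthetic test function standing in for an engineering model and the subspace dimension chosen by hand, so everything below is an assumption for illustration), one might estimate an active subspace like this:

```python
import numpy as np

def active_subspace(grads, k):
    """Estimate a k-dimensional active subspace from gradient samples.

    grads: (M, m) array of gradient evaluations of f at M input samples.
    Returns the m x k basis W1 of the active subspace and the eigenvalues
    of the gradient outer-product matrix C = E[grad grad^T].
    """
    C = grads.T @ grads / grads.shape[0]
    eigvals, eigvecs = np.linalg.eigh(C)          # ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order[:k]], eigvals[order]

# Toy example: f(x) = sin(w.x) varies only along w = (1, 1, 0, ..., 0).
rng = np.random.default_rng(1)
m, M = 10, 200
X = rng.uniform(-1, 1, size=(M, m))
w = np.zeros(m); w[:2] = 1.0
grads = np.cos(X @ w)[:, None] * w[None, :]       # exact gradient of sin(w.x)
W1, lam = active_subspace(grads, k=1)
Y = X @ W1    # reduced coordinates on which a kriging response surface could be fit
```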


Journal ArticleDOI
TL;DR: A new framework is developed and used in GPEME, which carefully coordinates the surrogate modeling and the evolutionary search, so that the search can focus on a small promising area and is supported by the constructed surrogate model.
Abstract: Surrogate model assisted evolutionary algorithms (SAEAs) have recently attracted much attention due to the growing need for computationally expensive optimization in many real-world applications. Most current SAEAs, however, focus on small-scale problems. SAEAs for medium-scale problems (i.e., 20-50 decision variables) have not yet been well studied. In this paper, a Gaussian process surrogate model assisted evolutionary algorithm for medium-scale computationally expensive optimization problems (GPEME) is proposed and investigated. Its major components are a surrogate model-aware search mechanism for expensive optimization problems when a high-quality surrogate model is difficult to build and dimension reduction techniques for tackling the “curse of dimensionality.” A new framework is developed and used in GPEME, which carefully coordinates the surrogate modeling and the evolutionary search, so that the search can focus on a small promising area and is supported by the constructed surrogate model. Sammon mapping is introduced to transform the decision variables from tens of dimensions to a few dimensions, in order to take advantage of Gaussian process surrogate modeling in a low-dimensional space. Empirical studies on benchmark problems with 20, 30, and 50 variables and a real-world power amplifier design automation problem with 17 variables show the high efficiency and effectiveness of GPEME. Compared to three state-of-the-art SAEAs, better or similar solutions can be obtained with 12% to 50% exact function evaluations.

369 citations
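A generic lower-confidence-bound prescreening step, loosely in the spirit of surrogate-assisted search (this is not GPEME itself: `gp_predict` and `mutate` are user-supplied stand-ins, and the Sammon-mapping dimension reduction and model-management machinery of the paper are omitted), could be sketched as:

```python
import numpy as np

def lcb_prescreen(gp_predict, parents, n_offspring, mutate, w=2.0):
    """One surrogate-assisted iteration (sketch): generate offspring, score
    them with a GP surrogate's lower confidence bound mean - w*std (for
    minimisation), and return the single most promising candidate for exact
    (expensive) evaluation."""
    rng = np.random.default_rng()
    offspring = np.array([mutate(parents[rng.integers(len(parents))])
                          for _ in range(n_offspring)])
    mean, std = gp_predict(offspring)   # surrogate prediction: arrays of length n_offspring
    lcb = mean - w * std
    return offspring[np.argmin(lcb)]
```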


Journal ArticleDOI
TL;DR: The GP-BUCB algorithm is also applicable in the related case of a delay between initiation of an experiment and observation of its results, for which the same regret bounds hold.
Abstract: How can we take advantage of opportunities for experimental parallelization in exploration-exploitation tradeoffs? In many experimental scenarios, it is often desirable to execute experiments simultaneously or in batches, rather than only performing one at a time. Additionally, observations may be both noisy and expensive. We introduce Gaussian Process Batch Upper Confidence Bound (GP-BUCB), an upper confidence bound-based algorithm, which models the reward function as a sample from a Gaussian process and which can select batches of experiments to run in parallel. We prove a general regret bound for GP-BUCB, as well as the surprising result that for some common kernels, the asymptotic average regret can be made independent of the batch size. The GP-BUCB algorithm is also applicable in the related case of a delay between initiation of an experiment and observation of its results, for which the same regret bounds hold. We also introduce Gaussian Process Adaptive Upper Confidence Bound (GP-AUCB), a variant of GP-BUCB which can exploit parallelism in an adaptive manner. We evaluate GP-BUCB and GP-AUCB on several simulated and real data sets. These experiments show that GP-BUCB and GP-AUCB are competitive with state-of-the-art heuristics.

338 citations
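The batch-selection idea, picking points by UCB while "hallucinating" outcomes for pending experiments so that only the posterior variance shrinks within a batch, can be sketched as below. This is a simplified illustration with an RBF kernel and arbitrary constants, not the authors' implementation:

```python
import numpy as np

def rbf(A, B, ell=0.3):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_post(X, y, Xs, noise=1e-2):
    """GP posterior mean and variance at candidate points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks, Kss = rbf(X, Xs), rbf(Xs, Xs)
    Kinv_Ks = np.linalg.solve(K, Ks)
    mean = Kinv_Ks.T @ y
    var = np.diag(Kss) - np.einsum('ij,ij->j', Ks, Kinv_Ks)
    return mean, var

def select_batch(X_obs, y_obs, X_cand, batch_size, beta=2.0):
    """Pick a batch by UCB with hallucinated observations: the mean is frozen
    at the real-data posterior, but each selected point is fed back (with its
    own posterior mean as a fake outcome) so the variance shrinks and the
    batch spreads out."""
    mean0, _ = gp_post(X_obs, y_obs, X_cand)     # mean from real data only
    X_aug, y_aug = X_obs.copy(), y_obs.copy()
    batch = []
    for _ in range(batch_size):
        _, var = gp_post(X_aug, y_aug, X_cand)   # variance also uses hallucinations
        ucb = mean0 + np.sqrt(beta * np.maximum(var, 0.0))
        i = int(np.argmax(ucb))
        batch.append(i)
        X_aug = np.vstack([X_aug, X_cand[i]])
        y_aug = np.append(y_aug, mean0[i])       # hallucinated outcome = posterior mean
    return batch
```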


Journal ArticleDOI
TL;DR: This paper explicitly formulates the extension of the HGF's hierarchy to any number of levels, and discusses how various forms of uncertainty are accommodated by the minimization of variational free energy as encoded in the update equations.
Abstract: In its full sense, perception rests on an agent’s model of how its sensory input comes about and the inferences it draws based on this model. These inferences are necessarily uncertain. Here, we illustrate how the hierarchical Gaussian filter (HGF) offers a principled and generic way to deal with the several forms that uncertainty in perception takes. The HGF is a recent derivation of one-step update equations from Bayesian principles that rests on a hierarchical generative model of the environment and its (in)stability. It is computationally highly efficient, allows for online estimates of hidden states, and has found numerous applications to experimental data from human subjects. In this paper, we generalize previous descriptions of the HGF and its account of perceptual uncertainty. First, we explicitly formulate the extension of the HGF’s hierarchy to any number of levels; second, we discuss how various forms of uncertainty are accommodated by the minimization of variational free energy as encoded in the update equations; third, we combine the HGF with decision models and demonstrate the inversion of this combination; finally, we report a simulation study that compared four optimization methods for inverting the HGF/decision model combination at different noise levels. These four methods (Nelder-Mead simplex algorithm, Gaussian process-based global optimization, variational Bayes and Markov chain Monte Carlo sampling) all performed well even under considerable noise, with variational Bayes offering the best combination of efficiency and informativeness of inference. Our results demonstrate that the HGF provides a principled, flexible, and efficient - but at the same time intuitive - framework for the resolution of perceptual uncertainty in behaving agents.

294 citations


ReportDOI
TL;DR: An abstract approximation theorem applicable to a wide variety of problems, primarily in statistics, is proved; the bound in the main approximation theorem is non-asymptotic, and the theorem does not require uniform boundedness of the class of functions.
Abstract: This paper develops a new direct approach to approximating suprema of general empirical processes by a sequence of suprema of Gaussian processes, without taking the route of approximating whole empirical processes in the sup-norm. We prove an abstract approximation theorem applicable to a wide variety of statistical problems, such as construction of uniform confidence bands for functions. Notably, the bound in the main approximation theorem is nonasymptotic and the theorem allows for functions that index the empirical process to be unbounded and have entropy divergent with the sample size. The proof of the approximation theorem builds on a new coupling inequality for maxima of sums of random vectors, the proof of which depends on an effective use of Stein’s method for normal approximation, and some new empirical process techniques. We study applications of this approximation theorem to local and series empirical processes arising in nonparametric estimation via kernel and series methods, where the classes of functions change with the sample size and are non-Donsker. Importantly, our new technique is able to prove the Gaussian approximation for the supremum type statistics under weak regularity conditions, especially concerning the bandwidth and the number of series functions, in those examples.

257 citations


Book
16 Feb 2014
TL;DR: In this book, the author presents an overview of Gaussian processes and their applications to Banach space theory, covering Bernoulli processes, random Fourier series and trigonometric sums, and the fundamental conjectures.
Abstract: 0. Introduction.- 1. Philosophy and Overview of the Book.- 2. Gaussian Processes and the Generic Chaining.- 3. Random Fourier Series and Trigonometric Sums, I.- 4. Matching Theorems I.- 5. Bernoulli Processes.- 6. Trees and the Art of Lower Bounds.- 7. Random Fourier Series and Trigonometric Sums, II.- 8. Processes Related to Gaussian Processes.- 9. Theory and Practice of Empirical Processes.- 10. Partition Scheme for Families of Distances.- 11. Infinitely Divisible Processes.- 12. The Fundamental Conjectures.- 13. Convergence of Orthogonal Series; Majorizing Measures.- 14. Matching Theorems, II: Shor's Matching Theorem.- 15. The Ultimate Matching Theorem in Dimension ≥ 3.- 16. Applications to Banach Space Theory.- 17. Appendix: What this Book is Really About.- 18. Appendix: Continuity.- References.- Index.

216 citations


Journal ArticleDOI
TL;DR: An approximation in which observations are split into contiguous blocks, with independence assumed across blocks, often provides a much better approximation to the likelihood than a low-rank approximation requiring similar memory and calculations.
Abstract: Evaluating the likelihood function for Gaussian models when a spatial process is observed irregularly is problematic for larger datasets due to constraints of memory and calculation. If the covariance structure can be approximated by a diagonal matrix plus a low rank matrix, then both the memory and calculations needed to evaluate the likelihood function are greatly reduced. When neighboring observations are strongly correlated, much of the variation in the observations can be captured by low frequency components, so the low rank approach might be thought to work well in this setting. Through both theory and numerical results, where the diagonal matrix is assumed to be a multiple of the identity, this paper shows that the low rank approximation sometimes performs poorly in this setting. In particular, an approximation in which observations are split into contiguous blocks and independence across blocks is assumed often provides a much better approximation to the likelihood than a low rank approximation requiring similar memory and calculations. An example with satellite-based measurements of total column ozone shows that these results are relevant to real data and that the low rank models also can be highly statistically inefficient for spatial interpolation.

199 citations
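The block-independence approximation discussed above is simple to state in code: evaluate the exact Gaussian log-likelihood within contiguous blocks and sum across blocks as if they were independent. The sketch below uses an exponential covariance on synthetic 1-d locations; all specific choices (covariance, block size, jitter) are illustrative rather than taken from the paper:

```python
import numpy as np
from scipy.spatial.distance import cdist

def gauss_loglik(y, C):
    """Exact zero-mean Gaussian log-likelihood with covariance C."""
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * (y @ alpha) - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

def block_indep_loglik(y, locs, cov_fn, block_size):
    """Approximate log-likelihood assuming independence across contiguous
    blocks of observations (covariance within each block is exact)."""
    total = 0.0
    for start in range(0, len(y), block_size):
        idx = slice(start, start + block_size)
        yb = y[idx]
        Cb = cov_fn(locs[idx], locs[idx]) + 1e-8 * np.eye(len(yb))
        total += gauss_loglik(yb, Cb)
    return total

# Toy spatial example with an exponential covariance on irregular 1-d sites.
cov_fn = lambda a, b: np.exp(-cdist(a, b) / 0.2)
rng = np.random.default_rng(2)
locs = np.sort(rng.uniform(0, 1, size=(300, 1)), axis=0)
C = cov_fn(locs, locs) + 1e-8 * np.eye(len(locs))
y = np.linalg.cholesky(C) @ rng.normal(size=len(locs))
print(gauss_loglik(y, C), block_indep_loglik(y, locs, cov_fn, block_size=50))
```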


Posted Content
TL;DR: In this article, the authors develop a methodology for automatically learning a wide family of bijective transformations or warpings of the input space using the Beta cumulative distribution function and further extend the warping framework to multi-task Bayesian optimization so that multiple tasks can be warped into a jointly stationary space.
Abstract: Bayesian optimization has proven to be a highly effective methodology for the global optimization of unknown, expensive and multimodal functions. The ability to accurately model distributions over functions is critical to the effectiveness of Bayesian optimization. Although Gaussian processes provide a flexible prior over functions which can be queried efficiently, there are various classes of functions that remain difficult to model. One of the most frequently occurring of these is the class of non-stationary functions. The optimization of the hyperparameters of machine learning algorithms is a problem domain in which parameters are often manually transformed a priori, for example by optimizing in "log-space," to mitigate the effects of spatially-varying length scale. We develop a methodology for automatically learning a wide family of bijective transformations or warpings of the input space using the Beta cumulative distribution function. We further extend the warping framework to multi-task Bayesian optimization so that multiple tasks can be warped into a jointly stationary space. On a set of challenging benchmark optimization tasks, we observe that the inclusion of warping greatly improves on the state-of-the-art, producing better results faster and more reliably.

198 citations
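The key ingredient, warping each (rescaled) input dimension through a Beta CDF before applying a stationary kernel, is easy to sketch. The parameter values below are arbitrary; in the method they are treated as extra hyperparameters and learned, e.g. by marginal-likelihood optimization:

```python
import numpy as np
from scipy.stats import beta

def warp(X, a, b):
    """Warp each input dimension in [0, 1] through a Beta CDF with
    per-dimension parameters a[d], b[d]; a = b = 1 recovers the identity."""
    return np.column_stack([beta.cdf(X[:, d], a[d], b[d]) for d in range(X.shape[1])])

def rbf(A, B, ell=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

# A stationary kernel applied to warped inputs behaves non-stationarily
# in the original input space.
rng = np.random.default_rng(3)
X = rng.uniform(size=(50, 2))
a, b = np.array([0.5, 2.0]), np.array([0.5, 1.0])
K_warped = rbf(warp(X, a, b), warp(X, a, b))
```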


Proceedings Article
27 Jul 2014
TL;DR: The beginnings of an automatic statistician, focusing on regression problems, are presented: the system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural language text.
Abstract: This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural language text. Our approach treats unknown regression functions nonparametrically using Gaussian processes, which has two important consequences. First, Gaussian processes can model functions in terms of high-level properties (e.g. smoothness, trends, periodicity, changepoints). Taken together with the compositional structure of our language of models this allows us to automatically describe functions in simple terms. Second, the use of flexible nonparametric models and a rich language for composing them in an open-ended manner also results in state-of-the-art extrapolation performance evaluated over 13 real time series data sets from various domains.
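The compositional modelling language referred to above is built from sums and products of a small set of base kernels. A minimal illustration (the kernel forms and parameters are generic textbook choices, not the system's exact grammar or search procedure) is:

```python
import numpy as np

# Base kernels on 1-d inputs given as column vectors.
def SE(x, z, ell=1.0, s=1.0):
    return s**2 * np.exp(-0.5 * (x - z.T) ** 2 / ell**2)

def PER(x, z, period=1.0, ell=1.0, s=1.0):
    return s**2 * np.exp(-2 * np.sin(np.pi * np.abs(x - z.T) / period) ** 2 / ell**2)

def LIN(x, z, s=1.0):
    return s**2 * (x @ z.T)

# Compositional structure: sums model superposition, products modulate.
# E.g. LIN + SE * PER reads as "a linear trend plus a locally periodic component".
x = np.linspace(0, 5, 200)[:, None]
K = LIN(x, x) + SE(x, x, ell=1.5) * PER(x, x, period=1.0)
f = np.linalg.cholesky(K + 1e-8 * np.eye(len(x))) @ np.random.default_rng(4).normal(size=len(x))
```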

Journal ArticleDOI
TL;DR: Gaussian Process Regression (GPR), an effective kernel-based machine learning algorithm, is applied to probabilistic streamflow forecasting and indicates relatively strong persistence of streamflow predictability in the extended period, although the low-predictability basins tend to show more variations.

Journal ArticleDOI
TL;DR: This work presents a new method based on Dirichlet process Gaussian mixture models, which are used to estimate per-pixel background distributions, followed by probabilistic regularisation, and develops novel model learning algorithms for continuous update of the model in a principled fashion as the scene changes.
Abstract: Video analysis often begins with background subtraction. This problem is often approached in two steps-a background model followed by a regularisation scheme. A model of the background allows it to be distinguished on a per-pixel basis from the foreground, whilst the regularisation combines information from adjacent pixels. We present a new method based on Dirichlet process Gaussian mixture models, which are used to estimate per-pixel background distributions. It is followed by probabilistic regularisation. Using a non-parametric Bayesian method allows per-pixel mode counts to be automatically inferred, avoiding over-/under- fitting. We also develop novel model learning algorithms for continuous update of the model in a principled fashion as the scene changes. These key advantages enable us to outperform the state-of-the-art alternatives on four benchmarks.

Journal ArticleDOI
TL;DR: A Bayesian analysis of inverse Gaussian process models for degradation modeling and inference is presented, with a classic example demonstrating the applicability of the Bayesian method for degradation analysis with inverse Gaussian process models.

Journal ArticleDOI
TL;DR: In this paper, the scaled Brownian motion (SBM) model is shown to be weakly non-ergodic but does not exhibit a significant amplitude scatter of the time-averaged mean squared displacement.
Abstract: Anomalous diffusion is frequently described by scaled Brownian motion (SBM), a Gaussian process with a power-law time dependent diffusion coefficient. Its mean squared displacement is $\langle x^2(t)\rangle \simeq 2D(t)t$ with $D(t) \simeq t^{\alpha-1}$ for $0 < \alpha < 2$. SBM may provide a seemingly adequate description in the case of unbounded diffusion, for which its probability density function coincides with that of fractional Brownian motion. Here we show that free SBM is weakly non-ergodic but does not exhibit a significant amplitude scatter of the time averaged mean squared displacement. More severely, we demonstrate that under confinement, the dynamics encoded by SBM is fundamentally different from both fractional Brownian motion and continuous time random walks. SBM is highly non-stationary and cannot provide a physical description for particles in a thermalised stationary system. Our findings have direct impact on the modelling of single particle tracking experiments, in particular, under confinement inside cellular compartments or when optical tweezers tracking methods are used.
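A quick way to probe the behaviour described above is to simulate SBM and compare ensemble-averaged with time-averaged mean squared displacements. The sketch below uses the normalisation D(t) = αt^(α−1), so that the ensemble MSD is 2t^α; this is a convenient stand-in rather than the paper's exact constants:

```python
import numpy as np

def simulate_sbm(alpha, T=100.0, dt=0.01, n_traj=500, rng=None):
    """Simulate scaled Brownian motion: dx = sqrt(2*D(t)) dW with
    D(t) = alpha * t**(alpha - 1), so the ensemble MSD is <x^2(t)> = 2 t^alpha."""
    rng = rng or np.random.default_rng(5)
    t = np.arange(1, int(T / dt) + 1) * dt
    D = alpha * t ** (alpha - 1)
    steps = np.sqrt(2 * D * dt) * rng.normal(size=(n_traj, len(t)))
    return t, np.cumsum(steps, axis=1)

def time_averaged_msd(x, dt, lag):
    """Time-averaged MSD at a single lag, for each trajectory."""
    n = int(lag / dt)
    return ((x[:, n:] - x[:, :-n]) ** 2).mean(axis=1)

t, x = simulate_sbm(alpha=0.5)
ens_msd = (x ** 2).mean(axis=0)                    # ensemble average, ~ 2 t^alpha
ta_msd = time_averaged_msd(x, dt=0.01, lag=1.0)    # per-trajectory time averages to compare
```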

Journal ArticleDOI
24 Mar 2014-PLOS ONE
TL;DR: The quality of inference is comparable or superior to that achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis, for the prediction of residue-residue contacts in proteins and the identification of protein-protein interaction partners in bacterial signal transduction.
Abstract: In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partner in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code.
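The continuous-Gaussian relaxation makes the inference step essentially a single regularized matrix inversion. The toy sketch below scores site pairs by the Frobenius norm of blocks of the inverse covariance of a one-hot-encoded alignment; the encoding, ridge regularizer and scoring rule are common DCA-style choices and not the exact pipeline of the paper:

```python
import numpy as np

def gaussian_dca_scores(msa_onehot, L, q, lam=0.1):
    """Toy multivariate-Gaussian coupling scores.

    msa_onehot: (N, L*q) one-hot encoded alignment (N sequences, L sites,
    q amino-acid states); lam is a ridge/shrinkage weight.  The score for a
    site pair (i, j) is the Frobenius norm of the corresponding q x q block
    of the inverse covariance matrix."""
    X = msa_onehot - msa_onehot.mean(0)
    C = X.T @ X / len(X) + lam * np.eye(L * q)
    J = np.linalg.inv(C)                       # couplings in the Gaussian model
    scores = np.zeros((L, L))
    for i in range(L):
        for j in range(i + 1, L):
            block = J[i*q:(i+1)*q, j*q:(j+1)*q]
            scores[i, j] = scores[j, i] = np.linalg.norm(block)
    return scores
```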

Journal ArticleDOI
TL;DR: In this paper, the random hypersurface model (RHM) is introduced for estimating a shape approximation of an extended object in addition to its kinematic state, where the shape parameters and measurements are related via a measurement equation that serves as the basis for a Gaussian state estimator.
Abstract: The random hypersurface model (RHM) is introduced for estimating a shape approximation of an extended object in addition to its kinematic state. An RHM represents the spatial extent by means of randomly scaled versions of the shape boundary. In doing so, the shape parameters and the measurements are related via a measurement equation that serves as the basis for a Gaussian state estimator. Specific estimators are derived for elliptic and star-convex shapes.

Posted Content
TL;DR: In this paper, the Student-t process is proposed as an alternative to the Gaussian process as a nonparametric prior over functions, and closed form expressions for the marginal likelihood and predictive distribution of a Student-T process are derived by integrating away an inverse Wishart process prior over the covariance kernel.
Abstract: We investigate the Student-t process as an alternative to the Gaussian process as a nonparametric prior over functions. We derive closed form expressions for the marginal likelihood and predictive distribution of a Student-t process, by integrating away an inverse Wishart process prior over the covariance kernel of a Gaussian process model. We show surprising equivalences between different hierarchical Gaussian process models leading to Student-t processes, and derive a new sampling scheme for the inverse Wishart process, which helps elucidate these equivalences. Overall, we show that a Student-t process can retain the attractive properties of a Gaussian process -- a nonparametric representation, analytic marginal and predictive distributions, and easy model selection through covariance kernels -- but has enhanced flexibility, and predictive covariances that, unlike a Gaussian process, explicitly depend on the values of training observations. We verify empirically that a Student-t process is especially useful in situations where there are changes in covariance structure, or in applications like Bayesian optimization, where accurate predictive covariances are critical for good performance. These advantages come at no additional computational cost over Gaussian processes.
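For reference, the Student-t process predictive distribution keeps the GP mean but rescales the covariance by a factor that depends on the training targets. The sketch below follows the closed forms reported for TP regression; treat the exact constants as my reading of the paper rather than a verified implementation:

```python
import numpy as np

def tp_predict(K, Ks, Kss, y, nu):
    """Student-t process predictive mean, covariance and degrees of freedom.

    K: n x n train kernel, Ks: n x m train/test kernel, Kss: m x m test kernel,
    nu > 2: degrees of freedom.  The mean matches the GP, but the covariance is
    rescaled by a data-dependent factor, so it reacts to the observed y."""
    Kinv_y = np.linalg.solve(K, y)
    mean = Ks.T @ Kinv_y
    beta = float(y @ Kinv_y)
    n = len(y)
    cov_gp = Kss - Ks.T @ np.linalg.solve(K, Ks)
    cov = (nu + beta - 2.0) / (nu + n - 2.0) * cov_gp
    return mean, cov, nu + n
```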

Journal ArticleDOI
TL;DR: The generalized likelihood ratio test (GLRT), Rao test, and Wald test, as well as their two-step variations, are derived in homogeneous environments, and three types of spectral norm tests (SNTs) are introduced.
Abstract: In this two-part paper, we consider the problem of adaptive multidimensional/multichannel signal detection in Gaussian noise with unknown covariance matrix. The test data (primary data) is assumed as a collection of sample vectors, arranged as the columns of a rectangular data array. The rows and columns of the signal matrix are both assumed to lie in known subspaces, but with unknown coordinates. Due to this feature of the signal structure, we name this kind of signal as the double subspace signal. Part I of this paper focuses on the adaptive detection in homogeneous environments, while Part II deals with the adaptive detection in partially homogeneous environments. Precisely, in this part, we derive the generalized likelihood ratio test (GLRT), Rao test, Wald test, as well as their two-step variations, in homogeneous environments. Three types of spectral norm tests (SNTs) are also introduced. All these detectors are shown to possess the constant false alarm rate (CFAR) property. Moreover, we discuss the differences between them and show how they work. Another contribution is that we investigate various special cases of these detectors. Remarkably, some of them are well-known existing detectors, while some others are still new. At the stage of performance evaluation, conducted by Monte Carlo simulations, both matched and mismatched signals are dealt with. For each case, more than one scenario is considered.

Posted Content
TL;DR: This work identifies four desirable properties that are important for scalability, expressiveness and robustness, when learning and inferring with a combination of multiple models and shows that gPoE of Gaussian processes has these qualities, while no other existing combination schemes satisfy all of them at the same time.
Abstract: In this work, we propose a generalized product of experts (gPoE) framework for combining the predictions of multiple probabilistic models. We identify four desirable properties that are important for scalability, expressiveness and robustness, when learning and inferring with a combination of multiple models. Through analysis and experiments, we show that gPoE of Gaussian processes (GP) have these qualities, while no other existing combination schemes satisfy all of them at the same time. The resulting GP-gPoE is highly scalable as individual GP experts can be independently learned in parallel; very expressive as the way experts are combined depends on the input rather than fixed; the combined prediction is still a valid probabilistic model with natural interpretation; and finally robust to unreliable predictions from individual experts.
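The combination rule itself is a weighted product of Gaussians: weighted precisions add, and the mean is the precision-weighted average. A minimal sketch, with uniform default weights as one simple choice (the paper discusses more informative weightings), is:

```python
import numpy as np

def gpoe_combine(means, variances, weights=None):
    """Generalized product-of-experts combination of per-expert Gaussian
    predictions at the same test points.

    means, variances: (M, T) arrays for M experts at T test points.
    weights: (M, T) non-negative weights; defaults to uniform 1/M, which
    keeps the combined variance from collapsing merely because there are
    many experts."""
    M = means.shape[0]
    if weights is None:
        weights = np.full_like(means, 1.0 / M)
    prec = (weights / variances).sum(axis=0)            # combined precision
    mean = (weights * means / variances).sum(axis=0) / prec
    return mean, 1.0 / prec
```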

Proceedings Article
08 Dec 2014
TL;DR: It is shown that a distinct combination of expressive kernels, a fully non-parametric representation, and scalable inference that exploits existing model structure is critical for large scale multidimensional pattern extrapolation.
Abstract: The ability to automatically discover patterns and perform extrapolation is an essential quality of intelligent systems. Kernel methods, such as Gaussian processes, have great potential for pattern extrapolation, since the kernel flexibly and interpretably controls the generalisation properties of these methods. However, automatically extrapolating large scale multidimensional patterns is in general difficult, and developing Gaussian process models for this purpose involves several challenges. A vast majority of kernels, and kernel learning methods, currently only succeed in smoothing and interpolation. This difficulty is compounded by the fact that Gaussian processes are typically only tractable for small datasets, and scaling an expressive kernel learning approach poses different challenges than scaling a standard Gaussian process model. One faces additional computational constraints, and the need to retain significant model structure for expressing the rich information available in a large dataset. In this paper, we propose a Gaussian process approach for large scale multidimensional pattern extrapolation. We recover sophisticated out of class kernels, perform texture extrapolation, inpainting, and video extrapolation, and long range forecasting of land surface temperatures, all on large multidimensional datasets, including a problem with 383,400 training points. The proposed method significantly outperforms alternative scalable and flexible Gaussian process methods, in speed and accuracy. Moreover, we show that a distinct combination of expressive kernels, a fully non-parametric representation, and scalable inference which exploits existing model structure, are critical for large scale multidimensional pattern extrapolation.

Journal ArticleDOI
TL;DR: In this paper, dimensionality reduction targeting the preservation of multimodal structures is proposed to counter the parameter-space issue, where locality-preserving nonnegative matrix factorization, as well as local Fisher's discriminant analysis, is deployed as preprocessing to reduce the dimensionality of data for the Gaussian-mixture-model classifier.
Abstract: The Gaussian mixture model is a well-known classification tool that captures non-Gaussian statistics of multivariate data. However, the impractically large size of the resulting parameter space has hindered widespread adoption of Gaussian mixture models for hyperspectral imagery. To counter this parameter-space issue, dimensionality reduction targeting the preservation of multimodal structures is proposed. Specifically, locality-preserving nonnegative matrix factorization, as well as local Fisher's discriminant analysis, is deployed as preprocessing to reduce the dimensionality of data for the Gaussian-mixture-model classifier, while preserving multimodal structures within the data. In addition, the pixel-wise classification results from the Gaussian mixture model are combined with spatial-context information resulting from a Markov random field. Experimental results demonstrate that the proposed classification system significantly outperforms other approaches even under limited training data.

Proceedings Article
08 Dec 2014
TL;DR: A novel re-parametrisation of variational inference for sparse GP regression and latent variable models is introduced that allows for an efficient distributed algorithm, and the results show that GPs perform better than many common models often used for big data.
Abstract: Gaussian processes (GPs) are a powerful tool for probabilistic inference over functions. They have been applied to both regression and non-linear dimensionality reduction, and offer desirable properties such as uncertainty estimates, robustness to over-fitting, and principled ways for tuning hyper-parameters. However the scalability of these models to big datasets remains an active topic of research. We introduce a novel re-parametrisation of variational inference for sparse GP regression and latent variable models that allows for an efficient distributed algorithm. This is done by exploiting the decoupling of the data given the inducing points to re-formulate the evidence lower bound in a Map-Reduce setting. We show that the inference scales well with data and computational resources, while preserving a balanced distribution of the load among the nodes. We further demonstrate the utility in scaling Gaussian processes to big data. We show that GP performance improves with increasing amounts of data in regression (on flight data with 2 million records) and latent variable modelling (on MNIST). The results show that GPs perform better than many common models often used for big data.

Journal ArticleDOI
TL;DR: Two approaches for on-line Gaussian process regression with low computational and memory demands are proposed; one assumes known hyperparameters and performs regression on a set of basis vectors that stores mean and covariance estimates of the latent function.

Proceedings ArticleDOI
12 Jul 2014
TL;DR: This paper revisits batch state estimation through the lens of Gaussian process (GP) regression, and shows that this class of prior results in an inverse kernel matrix that is exactly sparse (block-tridiagonal) and that this can be exploited to carry out GP regression (and interpolation) very efficiently.
Abstract: In this paper, we revisit batch state estimation through the lens of Gaussian process (GP) regression. We consider continuous-discrete estimation problems wherein a trajectory is viewed as a one-dimensional GP, with time as the independent variable. Our continuous-time prior can be defined by any linear, time-varying stochastic differential equation driven by white noise; this allows the possibility of smoothing our trajectory estimates using a variety of vehicle dynamics models (e.g., ‘constant-velocity’). We show that this class of prior results in an inverse kernel matrix (i.e., covariance matrix between all pairs of measurement times) that is exactly sparse (block-tridiagonal) and that this can be exploited to carry out GP regression (and interpolation) very efficiently. Though the prior is continuous, we consider measurements to occur at discrete times. When the measurement model is also linear, this GP approach is equivalent to classical, discrete-time smoothing (at the measurement times). When the measurement model is nonlinear, we iterate over the whole trajectory (as is common in vision and robotics) to maximize accuracy. We test the approach experimentally on a simultaneous trajectory estimation and mapping problem using a mobile robot dataset.
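The sparsity the authors exploit comes from the Markov structure of SDE-driven priors. A scalar toy illustration is a Wiener-process prior, for which the kernel is min(t_i, t_j) and the precision is exactly tridiagonal; the paper's vector-valued priors (e.g. constant velocity) give the block-tridiagonal analogue:

```python
import numpy as np

# Wiener-process prior on a handful of measurement times: the covariance is
# K(t_i, t_j) = min(t_i, t_j), and because the process is Markov, its inverse
# (the precision matrix) is tridiagonal, which is what makes the batch
# GP-regression view of smoothing efficient.
t = np.linspace(0.1, 5.0, 8)
K = np.minimum.outer(t, t)
P = np.linalg.inv(K)
print(np.round(P, 3))   # only the tridiagonal entries are (numerically) non-zero
```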

Posted Content
TL;DR: This work presents a procedure for efficient variational Bayesian learning of nonlinear state-space models based on sparse Gaussian processes and offers the possibility to straightforwardly trade off model capacity and computational cost whilst avoiding overfitting.
Abstract: State-space models have been successfully used for more than fifty years in different areas of science and engineering. We present a procedure for efficient variational Bayesian learning of nonlinear state-space models based on sparse Gaussian processes. The result of learning is a tractable posterior over nonlinear dynamical systems. In comparison to conventional parametric models, we offer the possibility to straightforwardly trade off model capacity and computational cost whilst avoiding overfitting. Our main algorithm uses a hybrid inference approach combining variational Bayes and sequential Monte Carlo. We also present stochastic variational inference and online learning approaches for fast learning with long time series.

Posted Content
TL;DR: The Manifold GP proposed by the authors jointly learns a transformation of the data into a feature space and a GP regression from the feature space to the observed space; it is a full GP and allows learning data representations that are useful for the overall regression task.
Abstract: Off-the-shelf Gaussian Process (GP) covariance functions encode smoothness assumptions on the structure of the function to be modeled. To model complex and non-differentiable functions, these smoothness assumptions are often too restrictive. One way to alleviate this limitation is to find a different representation of the data by introducing a feature space. This feature space is often learned in an unsupervised way, which might lead to data representations that are not useful for the overall regression task. In this paper, we propose Manifold Gaussian Processes, a novel supervised method that jointly learns a transformation of the data into a feature space and a GP regression from the feature space to observed space. The Manifold GP is a full GP and allows to learn data representations, which are useful for the overall regression task. As a proof-of-concept, we evaluate our approach on complex non-smooth functions where standard GPs perform poorly, such as step functions and robotics tasks with contacts.
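The construction amounts to composing a deterministic feature map with a standard covariance function and learning both jointly. A sketch of the resulting kernel, using a single tanh layer as a stand-in feature map (the paper's architecture and joint training loop are omitted), is:

```python
import numpy as np

def manifold_gp_kernel(X, Z, W, ell=1.0):
    """Kernel of a manifold-GP-style model (sketch): push inputs through a
    deterministic feature map (here one tanh layer with weights W, which would
    be learned jointly with the GP hyperparameters) and apply an RBF kernel
    in the feature space."""
    phi = lambda A: np.tanh(A @ W)
    FX, FZ = phi(X), phi(Z)
    d2 = ((FX[:, None, :] - FZ[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

# Toy usage with random inputs and random (untrained) feature weights.
X = np.random.default_rng(7).normal(size=(30, 4))
W = np.random.default_rng(8).normal(size=(4, 3))
K = manifold_gp_kernel(X, X, W)
```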

Journal ArticleDOI
TL;DR: A nonparametric Bayesian dynamic model is proposed, which reduces dimensionality in characterizing the binary matrix through a lower-dimensional latent space representation, with the latent coordinates evolving in continuous time via Gaussian processes, to obtain a flexible and computationally tractable formulation.
Abstract: Symmetric binary matrices representing relations are collected in many areas. Our focus is on dynamically evolving binary relational matrices, with interest being on inference on the relationship structure and prediction. We propose a nonparametric Bayesian dynamic model, which reduces dimensionality in characterizing the binary matrix through a lower-dimensional latent space representation, with the latent coordinates evolving in continuous time via Gaussian processes. By using a logistic mapping function from the link probability matrix space to the latent relational space, we obtain a flexible and computationally tractable formulation. Employing Polya-gamma data augmentation, an efficient Gibbs sampler is developed for posterior computation, with the dimension of the latent space automatically inferred. We provide theoretical results on flexibility of the model, and illustrate its performance via simulation experiments. We also consider an application to co-movements in world financial markets.

ReportDOI
TL;DR: In this paper, an anti-concentration property of the supremum of a Gaussian process is derived from an inequality for separable Gaussian processes, leading to a generalized SBR condition.
Abstract: Modern construction of uniform confidence bands typically relies on the classical Smirnov-Bickel-Rosenblatt (SBR) condition; see, for example, Giné and Nickl (2010). This condition requires the existence of a limit distribution of an extreme value type for the supremum of a studentized empirical process (equivalently, for the supremum of a Gaussian process with the same covariance function as that of the studentized empirical process). The principal contribution of this paper is to remove the need for this classical condition. We show that a considerably weaker sufficient condition is derived from an anti-concentration property of the supremum of the approximating Gaussian process, and we derive an inequality leading to such a property for separable Gaussian processes. We refer to the new condition as a generalized SBR condition. Our new result shows that the supremum does not concentrate too fast around any value. We then apply this result to derive a Gaussian multiplier bootstrap procedure for constructing honest confidence bands for nonparametric density estimators (this result can be applied in other nonparametric problems as well). An essential advantage of our approach is that it applies generically even in those cases where the limit distribution of the supremum of the studentized empirical process does not exist (or is unknown). This is of particular importance in problems where resolution levels or other tuning parameters have been chosen in a data-driven fashion, which is needed for adaptive constructions of the confidence bands. Furthermore, our approach is asymptotically honest at a polynomial rate: the error in coverage level converges to zero at a fast, polynomial speed (with respect to the sample size). In sharp contrast, the approach based on extreme value theory is asymptotically honest only at a logarithmic rate: the error converges to zero at a slow, logarithmic speed. Finally, of independent interest is our introduction of a new, practical version of Lepski's method, which computes the optimal, non-conservative resolution levels via a Gaussian multiplier bootstrap method.
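The Gaussian multiplier bootstrap at the heart of the construction is straightforward to sketch for a kernel density estimator on a fixed grid. The sketch ignores bias correction, the data-driven choice of resolution level, and the paper's adaptive Lepski step; the bandwidth, grid and bootstrap size below are arbitrary:

```python
import numpy as np

def multiplier_bootstrap_band(X, grid, h, n_boot=1000, level=0.95, rng=None):
    """Gaussian multiplier bootstrap for a uniform confidence band around a
    kernel density estimate (a sketch of the general recipe only)."""
    rng = rng or np.random.default_rng(0)
    n = len(X)
    # Kernel evaluations g_i(x) = K_h(x - X_i) on the grid, shape (n, m).
    G = np.exp(-0.5 * ((grid[None, :] - X[:, None]) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    fhat = G.mean(axis=0)
    sigma = G.std(axis=0, ddof=1)
    sups = np.empty(n_boot)
    for b in range(n_boot):
        xi = rng.normal(size=n)                       # Gaussian multipliers
        sups[b] = np.max(np.abs((xi[:, None] * (G - fhat)).sum(0)) / (np.sqrt(n) * sigma))
    c = np.quantile(sups, level)                      # bootstrap critical value
    halfwidth = c * sigma / np.sqrt(n)
    return fhat - halfwidth, fhat + halfwidth

X = np.random.default_rng(1).normal(size=500)
lo, hi = multiplier_bootstrap_band(X, grid=np.linspace(-3, 3, 101), h=0.3)
```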

Journal ArticleDOI
TL;DR: In this paper, a convex programming approach is used to disentangle signal and corruption, and conditions for exact signal recovery from structured corruption and stable signal recovery with added unstructured noise are provided.
Abstract: We study the problem of corrupted sensing, a generalization of compressed sensing in which one aims to recover a signal from a collection of corrupted or unreliable measurements. While an arbitrary signal cannot be recovered in the face of arbitrary corruption, tractable recovery is possible when both signal and corruption are suitably structured. We quantify the relationship between signal recovery and two geometric measures of structure, the Gaussian complexity of a tangent cone, and the Gaussian distance to a subdifferential. We take a convex programming approach to disentangling signal and corruption, analyzing both penalized programs that trade off between signal and corruption complexity, and constrained programs that bound the complexity of signal or corruption when prior information is available. In each case, we provide conditions for exact signal recovery from structured corruption and stable signal recovery from structured corruption with added unstructured noise. Our simulations demonstrate close agreement between our theoretical recovery bounds and the sharp phase transitions observed in practice. In addition, we provide new interpretable bounds for the Gaussian complexity of sparse vectors, block-sparse vectors, and low-rank matrices, which lead to sharper guarantees of recovery when combined with our results and those in the literature.
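A penalized convex program of the kind analyzed above can be prototyped directly. The sketch below assumes a sparse signal and sparse corruption and uses cvxpy, with the penalty weight and noise bound set to arbitrary illustrative values rather than the paper's tuned choices:

```python
import numpy as np
import cvxpy as cp

# Sketch of a penalized corrupted-sensing program:
#   minimize ||x||_1 + lam * ||v||_1  subject to  ||y - Phi x - v||_2 <= eps
rng = np.random.default_rng(6)
m, n, k, s = 150, 300, 10, 5
Phi = rng.normal(size=(m, n)) / np.sqrt(m)            # Gaussian sensing matrix
x_true = np.zeros(n); x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
v_true = np.zeros(m); v_true[rng.choice(m, s, replace=False)] = rng.normal(size=s)
y = Phi @ x_true + v_true + 0.01 * rng.normal(size=m)  # corrupted, noisy measurements

x, v = cp.Variable(n), cp.Variable(m)
lam, eps = 1.0, 0.02 * np.sqrt(m)
prob = cp.Problem(cp.Minimize(cp.norm1(x) + lam * cp.norm1(v)),
                  [cp.norm(y - Phi @ x - v, 2) <= eps])
prob.solve()
```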