Open Access Dissertation
Gaussian processes: iterative sparse approximations
TLDR
This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior of Gaussian processes, and combines the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations over each input, thus approximating a batch solution.
Abstract
In recent years there has been an increased interest in applying non-parametric methods to real-world problems. Significant research has been devoted to Gaussian processes (GPs) due to their increased flexibility when compared with parametric models. These methods use Bayesian learning, which generally leads to analytically intractable posteriors. This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior. In the first step we adapt Bayesian online learning to GPs: the final approximation to the posterior is the result of propagating the first and second moments of intermediate posteriors obtained by combining a new example with the previous approximation. The propagation of functional forms is solved by showing the existence of a parametrisation of the posterior moments that uses combinations of the kernel function at the training points, transforming the Bayesian online learning of functions into a parametric formulation. The drawback is the prohibitive quadratic scaling of the number of parameters with the size of the data, making the method inapplicable to large datasets. The second step solves the problem of the exploding parameter size and makes GPs applicable to arbitrarily large datasets. The approximation is based on a measure of distance between two GPs, the KL-divergence between GPs. This second approximation replaces the posterior with a constrained GP in which only a small subset of the whole training dataset is used to represent the GP. This subset is called the Basis Vector (BV) set, and the resulting GP is a sparse approximation to the true posterior. As this sparsity is based on the KL-minimisation, it is probabilistic and independent of the way the posterior approximation from the first step is obtained. We combine the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations over each input and thus approximates a batch solution. The resulting sparse learning algorithm is generic: for different problems we only change the likelihood. The algorithm is applied to a variety of problems and we examine its performance on more classical regression and classification tasks as well as on data assimilation and a simple density estimation problem.
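As a rough illustration of the two steps described in the abstract, the sketch below runs online GP regression with the posterior moments parametrised through kernel combinations at a small Basis Vector set. It is a minimal sketch, not the thesis's exact algorithm: it assumes a squared-exponential kernel and a Gaussian likelihood, and it replaces the KL-divergence-based scoring and removal of Basis Vectors with a simpler geometric "novelty" test; the names `OnlineSparseGP`, `rbf`, and the tolerance `tol` are illustrative.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel matrix between point sets a (n,d) and b (m,d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

class OnlineSparseGP:
    """Online GP regression with a small Basis Vector (BV) set.

    Posterior moments are parametrised through the kernel at the BV points:
        mean(x)    = sum_i alpha_i k(x, x_i)
        cov(x, x') = k(x, x') + sum_ij C_ij k(x, x_i) k(x_j, x')
    New inputs that are almost spanned by the BV set are projected onto it
    instead of being added, which keeps the number of parameters bounded.
    """

    def __init__(self, noise_var=0.1, ell=1.0, tol=1e-3):
        self.noise_var, self.ell, self.tol = noise_var, ell, tol
        self.bv = None                      # BV inputs, shape (m, d)
        self.alpha = np.zeros(0)            # mean coefficients
        self.C = np.zeros((0, 0))           # covariance coefficients
        self.Kinv = np.zeros((0, 0))        # inverse Gram matrix of the BV set

    def predict(self, x):
        """Posterior mean and variance at test inputs x, shape (n, d)."""
        x = np.atleast_2d(x)
        if self.bv is None:
            return np.zeros(len(x)), np.ones(len(x))
        k = rbf(self.bv, x, self.ell)
        mean = k.T @ self.alpha
        var = 1.0 + np.einsum('ij,il,lj->j', k, self.C, k)
        return mean, var

    def update(self, x, y):
        """Process one example (x, y); x has shape (d,) or (1, d)."""
        x = np.atleast_2d(x)
        if self.bv is None:
            self.bv = np.zeros((0, x.shape[1]))
        k = rbf(self.bv, x, self.ell)[:, 0]
        kss = 1.0                                   # k(x, x) for the SE kernel
        mean = k @ self.alpha
        var = kss + k @ self.C @ k
        # Gaussian likelihood: first and second derivatives of the log-evidence.
        q = (y - mean) / (var + self.noise_var)
        r = -1.0 / (var + self.noise_var)
        gamma = kss - k @ self.Kinv @ k             # 'novelty' of the new input
        if gamma < self.tol and len(self.bv) > 0:
            # Projected (sparse) update: the parameter size stays fixed.
            s = self.C @ k + self.Kinv @ k
            self.alpha = self.alpha + q * s
            self.C = self.C + r * np.outer(s, s)
        else:
            # Full update: the new input joins the BV set.
            s = np.append(self.C @ k, 1.0)
            self.alpha = np.append(self.alpha, 0.0) + q * s
            self.C = np.pad(self.C, ((0, 1), (0, 1))) + r * np.outer(s, s)
            self.bv = np.vstack([self.bv, x])
            gram = rbf(self.bv, self.bv, self.ell)
            self.Kinv = np.linalg.inv(gram + 1e-8 * np.eye(len(self.bv)))

# Example: fit a noisy sine with far fewer Basis Vectors than data points.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
gp = OnlineSparseGP(noise_var=0.01, ell=1.0, tol=1e-2)
for xi, yi in zip(X, y):
    gp.update(xi, yi)
print(len(gp.bv), "basis vectors for", len(X), "examples")
```

The projected update is what keeps the parameter count tied to the size of the BV set rather than growing quadratically with the data, which is the role played by the thesis's second (KL-based) step.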
Citations
Book
Semi-Supervised Learning
TL;DR: Semi-supervised learning (SSL) as discussed by the authors is the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no labeled data are given).
Proceedings Article
Sparse Gaussian Processes using Pseudo-inputs
Edward Snelson, Zoubin Ghahramani
TL;DR: It is shown that this new Gaussian process (GP) regression model can match full GP performance with a small number M of pseudo-inputs, i.e. very sparse solutions, and that it significantly outperforms other approaches in this regime.
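For orientation, here is a minimal sketch of a generic pseudo-input (inducing-point, FITC-style) predictive computation of the kind the cited paper builds on. It is not the paper's method: the paper's central idea of optimising the pseudo-input locations jointly with the hyperparameters by gradient ascent on the marginal likelihood is omitted, the pseudo-inputs `Z` are simply fixed to a grid, and `rbf`, `fitc_predict`, and all parameter values are illustrative.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel matrix between point sets a (n,d) and b (m,d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

def fitc_predict(X, y, Z, Xs, noise_var=0.1, ell=1.0, jitter=1e-8):
    """FITC-style predictive mean/variance with M pseudo-inputs Z.

    Cost is O(N M^2) rather than the O(N^3) of a full GP.
    """
    m = len(Z)
    Kmm = rbf(Z, Z, ell) + jitter * np.eye(m)
    Kmn = rbf(Z, X, ell)
    Kmm_inv = np.linalg.inv(Kmm)
    # Diagonal correction: exact prior variance minus the low-rank approximation.
    q_diag = np.einsum('ij,jk,ki->i', Kmn.T, Kmm_inv, Kmn)
    lam = 1.0 - q_diag + noise_var                  # SE kernel has k(x, x) = 1
    Sigma = Kmm + (Kmn / lam) @ Kmn.T
    Sigma_inv = np.linalg.inv(Sigma)
    Ksm = rbf(Xs, Z, ell)
    mean = Ksm @ Sigma_inv @ (Kmn @ (y / lam))
    var = 1.0 - np.einsum('ij,jk,ik->i', Ksm, Kmm_inv - Sigma_inv, Ksm)
    return mean, var

# Example: 20 fixed pseudo-inputs summarise 1000 noisy observations of a sine.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(1000)
Z = np.linspace(-3, 3, 20)[:, None]                 # pseudo-inputs (not optimised here)
Xs = np.linspace(-3, 3, 5)[:, None]
mean, var = fitc_predict(X, y, Z, Xs, noise_var=0.01)
print(np.round(mean, 2))
```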
Journal Article
Gaussian predictive process models for large spatial data sets
TL;DR: This work proposes a computational template that accommodates non-stationary, non-Gaussian, possibly multivariate, and possibly spatiotemporal processes in the context of large spatial data sets.
Journal Article
Probabilistic Non-linear Principal Component Analysis with Gaussian Process Latent Variable Models
TL;DR: This paper presents a novel probabilistic interpretation of principal component analysis (PCA) based on a Gaussian process latent variable model (GP-LVM), and relates it to popular spectral techniques such as kernel PCA and multidimensional scaling.
Proceedings Article
Fast Forward Selection to Speed Up Sparse Gaussian Process Regression
TL;DR: This paper presents a method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection. The heuristic leads to a sufficiently stable approximation of the log marginal likelihood of the training data, which can be optimised to adjust a large number of hyperparameters automatically.
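The following toy sketch shows the kind of greedy forward selection being approximated: at each step, the candidate whose inclusion most increases an approximate log marginal likelihood joins the active set. It is deliberately naive, rescoring every candidate with a full multivariate-normal evaluation, whereas the cited paper's contribution is a heuristic that makes this selection fast; the subset-of-regressors scoring and the names `sor_log_marginal` and `greedy_forward_selection` are assumptions of this illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel matrix between point sets a (n,d) and b (m,d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

def sor_log_marginal(X, y, active, noise_var, ell=1.0, jitter=1e-8):
    """Log marginal likelihood of the subset-of-regressors approximation
    y ~ N(0, K_nI K_II^{-1} K_In + noise_var * I) for the active index set I."""
    KnI = rbf(X, X[active], ell)
    KII = rbf(X[active], X[active], ell) + jitter * np.eye(len(active))
    cov = KnI @ np.linalg.solve(KII, KnI.T) + noise_var * np.eye(len(X))
    return multivariate_normal.logpdf(y, mean=np.zeros(len(X)), cov=cov)

def greedy_forward_selection(X, y, n_active, noise_var=0.1, ell=1.0):
    """Grow the active set by adding, at each step, the point that most
    increases the approximate log marginal likelihood (naive rescoring)."""
    active, remaining = [], list(range(len(X)))
    for _ in range(n_active):
        scores = [sor_log_marginal(X, y, active + [j], noise_var, ell)
                  for j in remaining]
        best = remaining.pop(int(np.argmax(scores)))
        active.append(best)
    return active

# Example: pick 5 active points out of 80 noisy observations of a sine.
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(80, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(80)
print(greedy_forward_selection(X, y, n_active=5, noise_var=0.01))
```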
References
Book
Elements of information theory
Thomas M. Cover, Joy A. Thomas
TL;DR: The authors examine the role of entropy, inequalities, and randomness in the design and construction of codes.
Book
The Nature of Statistical Learning Theory
TL;DR: Contents cover the setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; and what is important in learning theory.
Book
Neural Networks: A Comprehensive Foundation
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.
Book
Table of Integrals, Series, and Products
TL;DR: Contents include combinations involving trigonometric and hyperbolic functions and powers; indefinite integrals of special functions; definite integrals of special functions; associated Legendre functions; special functions; hypergeometric functions; vector field theory; algebraic inequalities; integral inequalities; matrices and related results; determinants; norms; ordinary differential equations; Fourier, Laplace, and Mellin transforms; and the z-transform.