Book Chapter

Regression with Gaussian processes

01 Oct 1997 · pp. 378-382
TL;DR: This paper investigates the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis to be carried out exactly using matrix operations.
Abstract: The Bayesian analysis of neural networks is difficult because the prior over functions has a complex form, leading to implementations that either make approximations or use Monte Carlo integration techniques. In this paper I investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis to be carried out exactly using matrix operations. The method has been tested on two challenging problems and has produced excellent results.
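
The "matrix operations" the abstract refers to are the standard Gaussian process posterior equations: with kernel matrix K over the training inputs, the predictive mean is k*ᵀ(K + σ²I)⁻¹y and the predictive variance is k** − k*ᵀ(K + σ²I)⁻¹k*. Below is a minimal numpy sketch of that exact computation; the squared-exponential kernel and the toy data are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of exact GP regression via matrix operations.
# Kernel choice and data are illustrative only.
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential covariance between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X, y, X_star, noise_var=1e-2):
    """Exact posterior mean and variance at test inputs X_star."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))   # train covariance
    K_s = rbf_kernel(X, X_star)                          # train/test covariance
    K_ss = rbf_kernel(X_star, X_star)                    # test covariance
    L = np.linalg.cholesky(K)                            # stable "inversion"
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # (K + noise)^-1 y
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - (v ** 2).sum(0)
    return mean, var

X = np.linspace(0, 5, 20)[:, None]
y = np.sin(X).ravel() + 0.1 * np.random.randn(20)
mu, var = gp_predict(X, y, np.linspace(0, 5, 100)[:, None])
```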


Citations
01 Jan 2007
TL;DR: An attempt has been made to review the existing theory, methods, recent developments and scope of Support Vector Regression.
Abstract: Instead of minimizing the observed training error, Support Vector Regression (SVR) attempts to minimize a bound on the generalization error so as to achieve generalized performance. The idea of SVR is based on the computation of a linear regression function in a high-dimensional feature space where the input data are mapped via a nonlinear function. SVR has been applied in various fields: time series and financial (noisy and risky) prediction, approximation of complex engineering analyses, convex quadratic programming and choices of loss functions, etc. In this paper, an attempt has been made to review the existing theory, methods, recent developments and scope of SVR.
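
As a concrete illustration of the idea described above (regression in a kernel-induced feature space with an epsilon-insensitive loss), here is a minimal scikit-learn sketch; the RBF kernel, the toy data, and the parameter values are illustrative assumptions, not choices made by the review.

```python
# Minimal epsilon-SVR sketch with scikit-learn; kernel and parameter
# values are illustrative, not prescribed by the review.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)

# epsilon defines the tube within which errors are not penalized;
# C trades off flatness of the function against tolerated deviations.
model = SVR(kernel="rbf", C=1.0, epsilon=0.1)
model.fit(X, y)
y_hat = model.predict(X)
```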

1,467 citations


Cites methods from "Regression with Gaussian processes"

  • ...A new smoothing strategy has been proposed for solving epsilon-SVR tolerating a small error in fitting a given data set linearly or nonlinearly [63]....


Dissertation
01 Jan 1997
TL;DR: It is shown that a Bayesian approach to learning in multi-layer perceptron neural networks achieves better performance than the commonly used early stopping procedure, even for reasonably short amounts of computation time.
Abstract: This thesis develops two Bayesian learning methods relying on Gaussian processes and a rigorous statistical approach for evaluating such methods. In these experimental designs, the sources of uncertainty in the estimated generalisation performance due to variation in both training and test sets are accounted for. The framework allows for estimation of generalisation performance as well as statistical tests of significance for pairwise comparisons. Two experimental designs are recommended and supported by the DELVE software environment. Two new non-parametric Bayesian learning methods relying on Gaussian process priors over functions are developed. These priors are controlled by hyperparameters which set the characteristic length scale for each input dimension. In the simplest method, these parameters are fit from the data using optimization. In the second, fully Bayesian method, a Markov chain Monte Carlo technique is used to integrate over the hyperparameters. One advantage of these Gaussian process methods is that the priors and hyperparameters of the trained models are easy to interpret. The Gaussian process methods are benchmarked against several other methods, on regression tasks using both real data and data generated from realistic simulations. The experiments show that small datasets are unsuitable for benchmarking purposes because the uncertainties in performance measurements are large. A second set of experiments provides strong evidence that the bagging procedure is advantageous for the Multivariate Adaptive Regression Splines (MARS) method. The simulated datasets have controlled characteristics which make them useful for understanding the relationship between properties of the dataset and the performance of different methods. The dependency of the performance on available computation time is also investigated. It is shown that a Bayesian approach to learning in multi-layer perceptron neural networks achieves better performance than the commonly used early stopping procedure, even for reasonably short amounts of computation time. The Gaussian process methods are shown to consistently outperform the more conventional methods.
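
The per-dimension length-scale hyperparameters this abstract describes are what is now commonly called automatic relevance determination (ARD). The sketch below shows such a kernel together with the negative log marginal likelihood the optimization-based variant would minimize; the names, values, and exact parameterisation are illustrative assumptions, not the thesis's own code.

```python
# Sketch of an ARD squared-exponential kernel and the negative log
# marginal likelihood used to fit its per-dimension length scales.
# Illustrative only.
import numpy as np

def ard_kernel(A, B, length_scales, signal_var=1.0):
    """Squared-exponential kernel with one length scale per input dim."""
    d2 = (((A[:, None, :] - B[None, :, :]) / length_scales) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2)

def neg_log_marginal_likelihood(log_ls, X, y, noise_var=1e-2):
    """Objective minimized to fit the length scales (up to a constant)."""
    K = ard_kernel(X, X, np.exp(log_ls)) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum()

# e.g. scipy.optimize.minimize(neg_log_marginal_likelihood, np.zeros(d),
#                              args=(X, y)) would fit the log length scales;
# the fully Bayesian variant instead samples them with MCMC.
```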

467 citations


Cites methods from "Regression with Gaussian processes"

  • ...This chapter presents a new method for regression which was inspired by Neal’s work [Neal 1996] on priors for infinite networks and pursued in [Williams 1996; Williams and Rasmussen 1996]....


Journal Article
TL;DR: The variational methods of Jaakkola and Jordan are applied to Gaussian processes to produce an efficient Bayesian binary classifier.
Abstract: Gaussian processes are a promising nonlinear regression tool, but it is not straightforward to solve classification problems with them. In the paper the variational methods of Jaakkola and Jordan (2000) are applied to Gaussian processes to produce an efficient Bayesian binary classifier.
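
The variational method referred to is Jaakkola and Jordan's quadratic lower bound on the logistic sigmoid, which makes the classification likelihood conjugate to the GP prior so that the Gaussian machinery of regression can be reused. Stated from memory as a sketch:

```latex
% Jaakkola-Jordan lower bound on the logistic sigmoid
% \sigma(z) = 1/(1 + e^{-z}), with variational parameter \xi.
% The exponent is quadratic in z, hence conjugate to a Gaussian prior.
\sigma(z) \;\ge\; \sigma(\xi)\,
  \exp\!\left( \frac{z - \xi}{2} \;-\; \lambda(\xi)\,\bigl(z^{2} - \xi^{2}\bigr) \right),
\qquad
\lambda(\xi) \;=\; \frac{1}{2\xi}\left( \sigma(\xi) - \frac{1}{2} \right).
```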

236 citations


Cites background or methods from "Regression with Gaussian processes"

  • ...In the alternative Gaussian process approach (Williams 1995; Williams and Rasmussen 1996), we model a(x) directly using a Gaussian process....


  • ...Gaussian processes are a promising non-linear interpolation tool (Williams 1995; Williams and Rasmussen 1996), but it is not straightforward to solve classification problems with them....


Book
22 Nov 2010
TL;DR: PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available, is introduced, and principled algorithms for robust filtering and smoothing in GP dynamic systems are proposed.
Abstract: This book examines Gaussian processes in both model-based reinforcement learning (RL) and inference in nonlinear dynamic systems. First, we introduce PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available. PILCO takes model uncertainties consistently into account during long-term planning to reduce model bias. Second, we propose principled algorithms for robust filtering and smoothing in GP dynamic systems.
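
The planning idea summarized above, carrying model uncertainty through a long horizon rather than collapsing it to a point prediction, can be illustrated with a toy sketch. PILCO itself propagates uncertainty analytically via moment matching; the sketch below substitutes a crude Monte Carlo approximation and a hand-written stand-in for the learned GP dynamics model, so every name and number here is an illustrative assumption.

```python
# Toy illustration of uncertainty-aware long-term planning. PILCO does
# this with analytic moment matching; here we use Monte Carlo rollouts
# through a stand-in for a learned GP dynamics model.
import numpy as np

def gp_dynamics(state, action, rng):
    """Stand-in for a learned GP model: returns a *sampled* next state,
    so each call reflects model uncertainty, not just process noise."""
    mean = state + 0.1 * action - 0.05 * state
    model_std = 0.02 * (1.0 + abs(state))   # uncertainty grows off-data
    return mean + model_std * rng.normal()

def rollout_cost(policy_gain, horizon=50, n_samples=200, seed=0):
    """Average cost over sampled trajectories; uncertainty is carried
    through every step of the rollout."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        s = 1.0                              # fixed start state
        for _ in range(horizon):
            s = gp_dynamics(s, -policy_gain * s, rng)
            total += s ** 2                  # quadratic cost around 0
    return total / n_samples

# A planner would minimize rollout_cost over the policy parameters.
```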

200 citations


Cites background from "Regression with Gaussian processes"

  • ...Williams (1995), Williams and Rasmussen (1996), MacKay (1998), and Rasmussen (1996) introduced GPs into the machine learning community....


Journal Article
TL;DR: This article reviews the main theoretical Gaussian process (GP) developments in biogeophysical parameter retrieval: algorithms that respect signal and noise characteristics, extract knowledge via automatic relevance kernels, provide associated uncertainty intervals, transport GP models in space and time, uncover causal relations between variables, and encode physically meaningful prior knowledge via radiative transfer model (RTM) emulation.
Abstract: Gaussian processes (GPs) have experienced tremendous success in biogeophysical parameter retrieval in the last few years. GPs constitute a solid Bayesian framework to consistently formulate many function approximation problems. This article reviews the main theoretical GP developments in the field, considering new algorithms that respect signal and noise characteristics, extract knowledge via automatic relevance kernels to yield feature rankings automatically, provide associated uncertainty intervals, transport GP models in space and time, uncover causal relations between variables, and encode physically meaningful prior knowledge via radiative transfer model (RTM) emulation. The important issue of computational efficiency is also addressed. These developments are illustrated in the field of geosciences and remote sensing at local and global scales through a set of illustrative examples. In particular, important problems for land, ocean, and atmosphere monitoring are considered, from accurately estimating oceanic chlorophyll content and pigments to retrieving vegetation properties from multi- and hyperspectral sensors as well as estimating atmospheric parameters (e.g., temperature, moisture, and ozone) from infrared sounders.

185 citations

References
Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations

Book
01 Mar 1990
TL;DR: In this book, a theory and practice for the estimation of functions from noisy data on functionals is developed; convergence properties, data-based smoothing parameter selection, confidence intervals, and numerical methods appropriate to a number of problems within this framework are established.
Abstract: This book serves well as an introduction to the more theoretical aspects of the use of spline models. It develops a theory and practice for the estimation of functions from noisy data on functionals. The simplest example is the estimation of a smooth curve, given noisy observations on a finite number of its values. Convergence properties, data-based smoothing parameter selection, confidence intervals, and numerical methods appropriate to a number of problems within this framework are established. Methods for including side conditions and other prior information in solving ill-posed inverse problems are provided. Data which involve samples of random variables with Gaussian, Poisson, binomial, and other distributions are treated in a unified optimization context. Experimental design questions, i.e., which functionals should be observed, are studied in a general context. Extensions to distributed parameter system identification problems are made by considering implicitly defined functionals.
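
As a concrete instance of the simplest example above, estimating a smooth curve from noisy observations of its values, here is a minimal scipy sketch; the smoothing-parameter choice shown is a rough heuristic, not the book's generalized cross-validation machinery.

```python
# Minimal smoothing-spline sketch: estimate a smooth curve from noisy
# samples of its values. The smoothing level s is a rough heuristic
# (n * noise_variance); the book's data-based GCV selection is more
# principled than this.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x) + 0.2 * rng.normal(size=50)

# s controls the smoothing/fidelity trade-off.
spline = UnivariateSpline(x, y, s=len(x) * 0.2**2)
y_smooth = spline(np.linspace(0, 2 * np.pi, 200))
```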

6,120 citations

Journal ArticleDOI
TL;DR: Cressie's Statistics for Spatial Data (1991) presents a comprehensive treatment of the statistical analysis of spatial data.
Abstract: Statistics for Spatial Data. By N. Cressie. ISBN 0 471 84336 9. Wiley, Chichester, 1991. 900 pp. £71.00.

5,555 citations

Book
01 Jan 1995
TL;DR: Bayesian Learning for Neural Networks shows that Bayesian methods allow complex neural network models to be used without fear of the "overfitting" that can occur with traditional neural network learning methods.
Abstract: From the Publisher: Artificial "neural networks" are now widely used as flexible models for regression and classification applications, but questions remain regarding what these models mean, and how they can safely be used when training data is limited. Bayesian Learning for Neural Networks shows that Bayesian methods allow complex neural network models to be used without fear of the "overfitting" that can occur with traditional neural network learning methods. Insight into the nature of these complex Bayesian models is provided by a theoretical investigation of the priors over functions that underlie them. Use of these models in practice is made possible using Markov chain Monte Carlo techniques. Both the theoretical and computational aspects of this work are of wider statistical interest, as they contribute to a better understanding of how Bayesian methods can be applied to complex problems. Presupposing only basic knowledge of probability and statistics, this book should be of interest to many researchers in statistics, engineering, and artificial intelligence. Software for Unix systems that implements the methods described is freely available over the Internet.

3,846 citations


"Regression with Gaussian processes" refers background in this paper

  • ...3.1 The robot arm problem: I consider a version of MacKay's robot arm problem introduced by Neal (1995)....


Journal Article
01 Sep 1990
TL;DR: Regularization networks are mathematically related to radial basis functions, which are mainly used for strict interpolation tasks; two extensions of the regularization approach are presented, along with the approach's connections to splines, regularization, Bayes formulation, and clustering.
Abstract: The problem of the approximation of nonlinear mappings (especially continuous mappings) is considered. Regularization theory and a theoretical framework for approximation (based on regularization techniques) that leads to a class of three-layer networks called regularization networks are discussed. Regularization networks are mathematically related to the radial basis functions mainly used for strict interpolation tasks. Learning as approximation and learning as hypersurface reconstruction are discussed. Two extensions of the regularization approach are presented, along with the approach's connections to splines, regularization, Bayes formulation, and clustering. The theory of regularization networks is generalized to a formulation that includes task-dependent clustering and dimensionality reduction. Applications of regularization networks are discussed.
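
A minimal sketch of the regularization-network idea as it specializes to radial basis functions: fit the weights of Gaussian bumps centered on the training inputs by solving a regularized linear system. The centers, width, and ridge regularizer here are illustrative choices, not the paper's exact formulation.

```python
# Minimal regularization-network / RBF sketch: approximate a mapping as
# a weighted sum of Gaussian basis functions centered on the training
# inputs, with a ridge penalty playing the role of the regularizer.
import numpy as np

def gaussian_design(X, centers, width=0.5):
    """Design matrix of Gaussian radial basis functions."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / width**2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(60, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=60)

Phi = gaussian_design(X, X)                 # one basis function per sample
lam = 1e-3                                  # regularization strength
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(X)), Phi.T @ y)
f = gaussian_design(X, X) @ w               # fitted values at training inputs
```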

3,595 citations