
Showing papers on "Gaussian process" published in 2003


Book ChapterDOI
TL;DR: In this paper, the authors give a basic introduction to Gaussian Process regression models and present the simple equations for incorporating training data and examine how to learn the hyperparameters using the marginal likelihood.
Abstract: We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present the simple equations for incorporating training data and examine how to learn the hyperparameters using the marginal likelihood. We explain the practical advantages of Gaussian process models and end with conclusions and a look at the current trends in GP work.

6,295 citations
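
The equations this tutorial presents fit in a few lines of numpy; a minimal sketch assuming a squared-exponential kernel and synthetic 1-D data (all names and hyperparameter values here are illustrative, not the tutorial's):

```python
import numpy as np

def rbf(X1, X2, ell=1.0, sf=1.0):
    """Squared-exponential covariance k(x, x') = sf^2 exp(-|x - x'|^2 / (2 ell^2))."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (20, 1))                    # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)
Xs = np.linspace(-3, 3, 100)[:, None]              # test inputs
sn = 0.1                                           # noise std (hyperparameter)

K = rbf(X, X) + sn**2 * np.eye(len(X))
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
Ks = rbf(X, Xs)
mean = Ks.T @ alpha                                # predictive mean
v = np.linalg.solve(L, Ks)
var = rbf(Xs, Xs).diagonal() - (v**2).sum(0)       # predictive variance

# Log marginal likelihood: the objective one maximizes to learn ell, sf, sn.
lml = -0.5 * y @ alpha - np.log(L.diagonal()).sum() - 0.5 * len(X) * np.log(2 * np.pi)
```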


Book ChapterDOI
01 Mar 2003
TL;DR: In this paper, the authors investigate the use of data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities, in a decision theoretic setting and prove general risk bounds in terms of these complexities.
Abstract: We investigate the use of certain data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities. In a decision theoretic setting, we prove general risk bounds in terms of these complexities. We consider function classes that can be expressed as combinations of functions from basis classes and show how the Rademacher and Gaussian complexities of such a function class can be bounded in terms of the complexity of the basis classes. We give examples of the application of these techniques in finding data-dependent risk bounds for decision trees, neural networks and support vector machines.

2,535 citations
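
As a concrete illustration of the data-dependent quantity involved, a Monte Carlo estimate of the empirical Rademacher complexity of a finite function class (the finite-class setup and all names are hypothetical simplifications for illustration):

```python
import numpy as np

def empirical_rademacher(F, n_draws=2000, rng=None):
    """Monte Carlo estimate of R_hat = E_sigma[ sup_f (1/n) sum_i sigma_i f(x_i) ].

    F: (m, n) array whose rows are the m functions in the class evaluated
       at the n sample points.
    """
    rng = np.random.default_rng(rng)
    n = F.shape[1]
    sups = []
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)   # i.i.d. Rademacher signs
        sups.append(np.max(F @ sigma) / n)        # sup over the (finite) class
    return float(np.mean(sups))
```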


Proceedings Article
09 Dec 2003
TL;DR: A new underlying probabilistic model for principal component analysis (PCA) is introduced that shows that if the prior's covariance function constrains the mappings to be linear the model is equivalent to PCA, and is extended by considering less restrictive covariance functions which allow non-linear mappings.
Abstract: In this paper we introduce a new underlying probabilistic model for principal component analysis (PCA). Our formulation interprets PCA as a particular Gaussian process prior on a mapping from a latent space to the observed data-space. We show that if the prior's covariance function constrains the mappings to be linear the model is equivalent to PCA; we then extend the model by considering less restrictive covariance functions which allow non-linear mappings. This more general Gaussian process latent variable model (GPLVM) is then evaluated as an approach to the visualisation of high dimensional data for three different data-sets. Additionally our non-linear algorithm can be further kernelised leading to 'twin kernel PCA' in which a mapping between feature spaces occurs.

843 citations
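
The core of the GPLVM is a GP marginal likelihood viewed as a function of the latent positions; a hedged sketch of that objective (gradient-based optimisation over X_latent is omitted, and the RBF kernel and all names are illustrative):

```python
import numpy as np

def gplvm_neg_log_lik(X_latent, Y, sf=1.0, ell=1.0, noise=0.1):
    """Negative GPLVM log-likelihood (up to a constant): each of the d columns
    of Y (n x d) is an independent GP over the latent positions X_latent (n x q).
    With a linear kernel this objective is minimized by PCA, as the paper shows;
    the RBF kernel below gives the non-linear variant."""
    n, d = Y.shape
    sq = ((X_latent[:, None, :] - X_latent[None, :, :]) ** 2).sum(-1)
    K = sf**2 * np.exp(-0.5 * sq / ell**2) + noise**2 * np.eye(n)
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * (d * logdet + np.trace(Y.T @ np.linalg.solve(K, Y)))
```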


Journal ArticleDOI
TL;DR: The concepts of cumulant and bispectrum are embedded into the ARMA model to accommodate both Gaussian and non-Gaussian processes, and by incorporating a Gaussianity verification procedure the forecast model is identified more appropriately.
Abstract: In this paper, a short-term load forecast using an autoregressive moving average (ARMA) model with non-Gaussian process considerations is proposed. In the proposed method, the concepts of cumulant and bispectrum are embedded into the ARMA model to accommodate both Gaussian and non-Gaussian processes. By incorporating a Gaussianity verification procedure, the forecast model is identified more appropriately, so the performance of the ARMA model is better ensured and the load forecast accuracy improves significantly. The proposed method has been applied to a practical system, and the results are compared with other published techniques.

597 citations
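
A rough sketch of the two ingredients, a third-order-cumulant Gaussianity check and an ARMA fit (statsmodels here; the simulated series, the ARMA(2,1) order, and all values are illustrative, not the paper's):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def third_order_cumulants(x, max_lag=4):
    """Sample third-order cumulants c3(i, j); values near zero at all lags
    are consistent with a Gaussian process (vanishing bispectrum)."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    c3 = np.zeros((max_lag + 1, max_lag + 1))
    for i in range(max_lag + 1):
        for j in range(max_lag + 1):
            m = n - max(i, j)
            c3[i, j] = np.mean(x[:m] * x[i:i + m] * x[j:j + m])
    return c3

rng = np.random.default_rng(0)
load = 100 + 0.1 * np.cumsum(rng.standard_normal(500))  # stand-in load series
print(third_order_cumulants(load))                      # crude Gaussianity check
fit = ARIMA(load, order=(2, 0, 1)).fit()                # ARMA(2,1) via ARIMA(2,0,1)
print(fit.forecast(steps=24))                           # 24-step-ahead forecast
```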


Journal ArticleDOI
TL;DR: In this article, the authors focus on the local case and show how such modeling can be formalized in the context of Gaussian responses providing attractive interpretation in terms of both random effects and explaining residuals.
Abstract: In many applications, the objective is to build regression models to explain a response variable over a region of interest under the assumption that the responses are spatially correlated. In nearly all of this work, the regression coefficients are assumed to be constant over the region. However, in some applications, coefficients are expected to vary at the local or subregional level. Here we focus on the local case. Although parametric modeling of the spatial surface for the coefficient is possible, here we argue that it is more natural and flexible to view the surface as a realization from a spatial process. We show how such modeling can be formalized in the context of Gaussian responses, providing attractive interpretations in terms of both random effects and explaining residuals. We also offer extensions to generalized linear models and to the spatio-temporal setting. We illustrate both static and dynamic modeling with a dataset that attempts to explain the (log) selling price of single-family houses.

572 citations


Proceedings ArticleDOI
13 Oct 2003
TL;DR: An improved fast Gauss transform is developed to efficiently estimate sums of Gaussians in higher dimensions, where a new multivariate expansion scheme and an adaptive space subdivision technique dramatically improve the performance.
Abstract: Evaluating sums of multivariate Gaussians is a common computational task in computer vision and pattern recognition, including in the general and powerful kernel density estimation technique. The quadratic computational complexity of the summation is a significant barrier to the scalability of this algorithm to practical applications. The fast Gauss transform (FGT) has successfully accelerated the kernel density estimation to linear running time for low-dimensional problems. Unfortunately, the cost of a direct extension of the FGT to higher-dimensional problems grows exponentially with dimension, making it impractical for dimensions above 3. We develop an improved fast Gauss transform to efficiently estimate sums of Gaussians in higher dimensions, where a new multivariate expansion scheme and an adaptive space subdivision technique dramatically improve the performance. The improved FGT has been applied to the mean shift algorithm achieving linear computational complexity. Experimental results demonstrate the efficiency and effectiveness of our algorithm.

492 citations
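
For reference, the quadratic-cost computation being accelerated: a direct evaluation of the discrete Gauss transform (a sketch only; the improved FGT replaces this with a multivariate expansion plus adaptive space subdivision):

```python
import numpy as np

def gauss_transform_direct(sources, targets, weights, h):
    """Direct O(N*M) evaluation of G(y_j) = sum_i w_i exp(-||y_j - x_i||^2 / h^2),
    the summation underlying kernel density estimation and mean shift."""
    d2 = ((targets[:, None, :] - sources[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / h**2) @ weights
```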


Proceedings Article
01 Jan 2003
TL;DR: A method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection, which leads to a sufficiently stable approximation of the log marginal likelihood of the training data, which can be optimised to adjust a large number of hyperparameters automatically.
Abstract: We present a method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection. Our method is essentially as fast as an equivalent one which selects the "support" patterns at random, yet it can outperform random selection on hard curve fitting tasks. More importantly, it leads to a sufficiently stable approximation of the log marginal likelihood of the training data, which can be optimised to adjust a large number of hyperparameters automatically. We demonstrate the model selection capabilities of the algorithm in a range of experiments. In line with the development of our method, we present a simple view on sparse approximations for GP models and their underlying assumptions and show relations to other methods.

487 citations
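
A hedged sketch of greedy forward selection for a sparse GP; the score below (predictive variance at the candidates) is a simple stand-in for the paper's fast information-gain heuristic, and all names are illustrative:

```python
import numpy as np

def rbf(A, B, ell=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def greedy_active_set(X, kern, k, sn=0.1):
    """Grow an active set of k 'support' inputs, at each step adding the
    training point with the largest predictive variance under the current
    active-set model (an illustrative criterion, not the paper's)."""
    active = [0]                                   # arbitrary seed point
    for _ in range(k - 1):
        Ka = kern(X[active], X[active]) + sn**2 * np.eye(len(active))
        Ks = kern(X[active], X)                    # (|active|, n)
        v = np.linalg.solve(np.linalg.cholesky(Ka), Ks)
        score = kern(X, X).diagonal() - (v**2).sum(0)
        score[active] = -np.inf                    # never re-select a point
        active.append(int(np.argmax(score)))
    return active

active = greedy_active_set(np.random.randn(200, 1), rbf, k=10)
```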


Journal ArticleDOI
TL;DR: The use of Gaussian particle filters and Gaussian sum particle filters is extended to dynamic state space (DSS) models with non-Gaussian noise, and problems involving heavy-tailed densities can be conveniently addressed.
Abstract: We use the Gaussian particle filter to build several types of Gaussian sum particle filters. These filters approximate the filtering and predictive distributions by weighted Gaussian mixtures and are basically banks of Gaussian particle filters. Then, we extend the use of Gaussian particle filters and Gaussian sum particle filters to dynamic state space (DSS) models with non-Gaussian noise. With non-Gaussian noise approximated by Gaussian mixtures, the non-Gaussian noise models are approximated by banks of Gaussian noise models, and Gaussian mixture filters are developed using algorithms developed for Gaussian noise DSS models. As a result, problems involving heavy-tailed densities can be conveniently addressed. Simulations are presented to exhibit the application of the framework developed herein, and the performance of the algorithms is examined.

484 citations


Journal ArticleDOI
TL;DR: The computation of the exact distribution of a maximally selected rank statistic is discussed and a new lower bound of the distribution is derived, based on an extension of an algorithm for the exact distribution of a linear rank statistic.

466 citations


Journal ArticleDOI
TL;DR: Inspired by neurophysiology experiments in which neural spiking activity is induced by an implicit (latent) stimulus, an algorithm to estimate a state-space model observed through point process measurements is developed.
Abstract: A widely used signal processing paradigm is the state-space model. The state-space model is defined by two equations: an observation equation that describes how the hidden state or latent process is observed and a state equation that defines the evolution of the process through time. Inspired by neurophysiology experiments in which neural spiking activity is induced by an implicit (latent) stimulus, we develop an algorithm to estimate a state-space model observed through point process measurements. We represent the latent process modulating the neural spiking activity as a Gaussian autoregressive model driven by an external stimulus. Given the latent process, neural spiking activity is characterized as a general point process defined by its conditional intensity function. We develop an approximate expectation-maximization (EM) algorithm to estimate the unobservable state-space process, its parameters, and the parameters of the point process. The EM algorithm combines a point process recursive nonlinear filter algorithm, the fixed interval smoothing algorithm, and the state-space covariance algorithm to compute the complete data log likelihood efficiently. We use a Kolmogorov-Smirnov test based on the time-rescaling theorem to evaluate agreement between the model and point process data. We illustrate the model with two simulated data examples: an ensemble of Poisson neurons driven by a common stimulus and a single neuron whose conditional intensity function is approximated as a local Bernoulli process.

407 citations
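
The generative side of the model is easy to sketch: an AR(1) latent process modulating a conditional intensity, observed as a local Bernoulli approximation of the point process (all constants below are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
T, dt = 2000, 0.001            # number of bins, bin width in seconds
rho, sigma = 0.99, 0.1         # AR(1) latent dynamics (illustrative values)
mu, beta = np.log(20.0), 1.0   # baseline log-rate and latent gain

x = np.zeros(T)                # latent (state) process
for t in range(1, T):
    x[t] = rho * x[t - 1] + sigma * rng.standard_normal()

lam = np.exp(mu + beta * x)          # conditional intensity, spikes/s
spikes = rng.random(T) < lam * dt    # local Bernoulli approximation of spiking
```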


Journal ArticleDOI
TL;DR: An adaptation of a new expectation-maximization based competitive mixture decomposition algorithm is introduced and it is shown that it efficiently and reliably performs mixture decompositions of t-distributions.

Proceedings Article
09 Dec 2003
TL;DR: This work generalises the Gaussian process (GP) framework for regression by learning a nonlinear transformation of the GP outputs, which allows for non-Gaussian processes and non-Gaussian noise.
Abstract: We generalise the Gaussian process (GP) framework for regression by learning a nonlinear transformation of the GP outputs. This allows for non-Gaussian processes and non-Gaussian noise. The learning algorithm chooses a nonlinear transformation such that transformed data is well-modelled by a GP. This can be seen as including a preprocessing transformation as an integral part of the probabilistic modelling problem, rather than as an ad-hoc step. We demonstrate on several real regression problems that learning the transformation can lead to significantly better performance than using a regular GP, or a GP with a fixed transformation.
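
One monotonic choice of transformation used in this line of work is a sum of tanh steps; a sketch of the warping and its Jacobian, whose values here are placeholders rather than learned parameters:

```python
import numpy as np

def tanh_warp(y, a, b, c):
    """Monotonic warping z(y) = y + sum_i a_i tanh(b_i (y + c_i)), a_i, b_i >= 0.
    The warped targets z are modelled by a standard GP, and the Jacobian dz/dy
    enters the (now non-Gaussian) likelihood of the original observations."""
    z = y + sum(ai * np.tanh(bi * (y + ci)) for ai, bi, ci in zip(a, b, c))
    dz = 1 + sum(ai * bi / np.cosh(bi * (y + ci)) ** 2
                 for ai, bi, ci in zip(a, b, c))
    return z, dz

z, dz = tanh_warp(np.linspace(-2, 2, 5), a=[1.0], b=[2.0], c=[0.0])
```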

Proceedings Article
09 Dec 2003
TL;DR: In experiments, the nonstationary GP regression model performs well when the input space is two or three dimensions, outperforming a neural network model and Bayesian free-knot spline models, and competitive with a Bayesian neural network, but is outperformed in one dimension by a state-of-the-art Bayesian free-knot spline model.
Abstract: We introduce a class of nonstationary covariance functions for Gaussian process (GP) regression. Nonstationary covariance functions allow the model to adapt to functions whose smoothness varies with the inputs. The class includes a nonstationary version of the Matérn stationary covariance, in which the differentiability of the regression function is controlled by a parameter, freeing one from fixing the differentiability in advance. In experiments, the nonstationary GP regression model performs well when the input space is two or three dimensions, outperforming a neural network model and Bayesian free-knot spline models, and is competitive with a Bayesian neural network, but is outperformed in one dimension by a state-of-the-art Bayesian free-knot spline model. The model readily generalizes to non-Gaussian data. Use of computational methods for speeding GP fitting may allow for implementation of the method on larger datasets.
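
A simple member of the nonstationary family, not the paper's nonstationary Matérn but the one-dimensional squared-exponential analogue with an input-dependent lengthscale (Gibbs' construction), shown as a hedged sketch:

```python
import numpy as np

def gibbs_kernel(x1, x2, ell):
    """Nonstationary squared-exponential covariance in 1-D with lengthscale
    function ell(x) > 0; reduces to the stationary RBF when ell is constant."""
    l1, l2 = ell(x1)[:, None], ell(x2)[None, :]
    pre = np.sqrt(2.0 * l1 * l2 / (l1**2 + l2**2))
    return pre * np.exp(-((x1[:, None] - x2[None, :]) ** 2) / (l1**2 + l2**2))

x = np.linspace(0, 1, 50)
K = gibbs_kernel(x, x, ell=lambda t: 0.05 + 0.3 * t)  # smoother as x grows
```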

Journal ArticleDOI
TL;DR: In this paper, a Bayesian model is proposed to address the anisotropy problem, where the correlation function of the spatial process is defined by reference to a latent space, denoted by D, where stationarity and isotropy hold.
Abstract: In geostatistics it is common practice to assume that the underlying spatial process is stationary and isotropic, i.e. the spatial distribution is unchanged when the origin of the index set is translated and under rotation about the origin. However, in environmental problems, such assumptions are not realistic since local influences in the correlation structure of the spatial process may be found in the data. The paper proposes a Bayesian model to address the anisotropy problem. Following Sampson and Guttorp, we define the correlation function of the spatial process by reference to a latent space, denoted by D, where stationarity and isotropy hold. The space where the gauged monitoring sites lie is denoted by G. We adopt a Bayesian approach in which the mapping between G and D is represented by an unknown function d(·). A Gaussian process prior distribution is defined for d(·). Unlike the Sampson–Guttorp approach, the mapping of both gauged and ungauged sites is handled in a single framework, and predictive inferences take explicit account of uncertainty in the mapping. Markov chain Monte Carlo methods are used to obtain samples from the posterior distributions. Two examples are discussed: a simulated data set and the solar radiation data set that also was analysed by Sampson and Guttorp.

Journal ArticleDOI
TL;DR: By applying the PAC-Bayesian theorem of McAllester (1999a), this paper proves distribution-free generalisation error bounds for a wide range of approximate Bayesian GP classification techniques, giving a strong learning-theoretical justification for the use of these techniques.
Abstract: Approximate Bayesian Gaussian process (GP) classification techniques are powerful non-parametric learning methods, similar in appearance and performance to support vector machines. Based on simple probabilistic models, they render interpretable results and can be embedded in Bayesian frameworks for model selection, feature selection, etc. In this paper, by applying the PAC-Bayesian theorem of McAllester (1999a), we prove distribution-free generalisation error bounds for a wide range of approximate Bayesian GP classification techniques. We also provide a new and much simplified proof for this powerful theorem, making use of the concept of convex duality which is a backbone of many machine learning techniques. We instantiate and test our bounds for two particular GPC techniques, including a recent sparse method which circumvents the unfavourable scaling of standard GP algorithms. As is shown in experiments on a real-world task, the bounds can be very tight for moderate training sample sizes. To the best of our knowledge, these results provide the tightest known distribution-free error bounds for approximate Bayesian GPC methods, giving a strong learning-theoretical justification for the use of these techniques.
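
For orientation, one standard form of the PAC-Bayesian theorem from this literature, stated here in hedged form (the paper gives the exact conditions and a simplified proof): for a prior $P$ and posterior $Q$ over classifiers, an i.i.d. sample of size $n$, and confidence $1-\delta$,

```latex
\operatorname{kl}\!\left(\widehat{R}_Q \,\middle\|\, R_Q\right)
  \;\le\; \frac{\operatorname{KL}(Q \,\|\, P) + \ln\frac{n+1}{\delta}}{n},
```

where $\widehat{R}_Q$ and $R_Q$ are the empirical and true Gibbs risks and $\operatorname{kl}(\cdot\,\|\,\cdot)$ is the binary relative entropy. The bound tightens as the posterior stays close to the prior, which is why faithful prior modelling makes the guarantee sharp.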

Proceedings Article
09 Dec 2003
TL;DR: It is speculated that the intrinsic ability of GP models to characterise distributions of functions would allow the method to capture entire distributions over future values instead of merely their expectation, which has traditionally been the focus of much of reinforcement learning.
Abstract: We exploit some useful properties of Gaussian process (GP) regression models for reinforcement learning in continuous state spaces and discrete time. We demonstrate how the GP model allows evaluation of the value function in closed form. The resulting policy iteration algorithm is demonstrated on a simple problem with a two dimensional state space. Further, we speculate that the intrinsic ability of GP models to characterise distributions of functions would allow the method to capture entire distributions over future values instead of merely their expectation, which has traditionally been the focus of much of reinforcement learning.

BookDOI
01 Jan 2003
TL;DR: Nonlinear Classification, Approximation Theory and Signal Processing, Modeling of Complex Objects, and Splines, Gaussian Processes and Support Vector Machines.
Abstract: Nonlinear Classification * Approximation Theory and Signal Processing * Modeling of Complex Objects * Splines, Gaussian Processes and Support Vector Machines * Case Studies * Theory * Machine Learning and Optimization * Future Directions

Proceedings Article
21 Aug 2003
TL;DR: A novel Bayesian approach to the problem of value function estimation in continuous state spaces by imposing a Gaussian prior over value functions and assuming a Gaussian noise model is presented.
Abstract: We present a novel Bayesian approach to the problem of value function estimation in continuous state spaces. We define a probabilistic generative model for the value function by imposing a Gaussian prior over value functions and assuming a Gaussian noise model. Due to the Gaussian nature of the random processes involved, the posterior distribution of the value function is also Gaussian and is therefore described entirely by its mean and covariance. We derive exact expressions for the posterior process moments, and utilizing an efficient sequential sparsification method, we describe an on-line algorithm for learning them. We demonstrate the operation of the algorithm on a 2-dimensional continuous spatial navigation domain.

Journal ArticleDOI
TL;DR: Expressions for multivariate Rayleigh and exponential probability density functions (PDFs) generated from correlated Gaussian random variables can serve as a useful tool in the performance analysis of digital modulation over correlated Rayleigh-fading channels using diversity combining.
Abstract: In this paper, expressions for multivariate Rayleigh and exponential probability density functions (PDFs) generated from correlated Gaussian random variables are presented. We first obtain a general integral form of the PDFs, and then study the case when the complex Gaussian generating vector is circular. We consider two specific circular cases: the exchangeable case when the variates are evenly correlated, and the exponentially correlated case. Expressions for the multivariate PDF in these cases are obtained in integral form as well as in the form of a series of products of univariate PDFs. We also derive a general expression for the multivariate exponential characteristic function (CF) in terms of determinants. In the exchangeable and exponentially correlated cases, CF expressions are obtained in the form of a series of products of univariate gamma CFs. The CF of the sum of exponential variates in these cases is obtained in closed form. Finally, the bivariate case is presented mentioning its main features. While the integral forms of the multivariate PDFs provide a general analytical framework, the series and determinant expressions for the exponential CFs and the series expressions for the PDFs can serve as a useful tool in the performance analysis of digital modulation over correlated Rayleigh-fading channels using diversity combining.
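
The generating mechanism itself is compact; a sketch that draws correlated Rayleigh envelopes and exponential powers from a circular complex Gaussian vector (the exponentially correlated matrix below, with rho = 0.7, is an illustrative example):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 200_000
rho = 0.7
R = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))  # exp. correlated
L = np.linalg.cholesky(R)

# Circular complex Gaussian vectors with covariance R (unit power per branch).
zc = (rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))) / np.sqrt(2)
z = zc @ L.T

rayleigh = np.abs(z)        # correlated Rayleigh envelopes
expo = np.abs(z) ** 2       # correlated exponential powers
print(np.corrcoef(expo.T))  # empirical power correlations approach R**2 elementwise
```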

Dissertation
01 Jul 2003
TL;DR: The tractability and usefulness of simple greedy forward selection with information-theoretic criteria previously used in active learning are demonstrated, and generic schemes for automatic model selection with many (hyper)parameters are developed.
Abstract: Non-parametric models and techniques enjoy a growing popularity in the field of machine learning, and among these Bayesian inference for Gaussian process (GP) models has recently received significant attention. We feel that GP priors should be part of the standard toolbox for constructing models relevant to machine learning in the same way as parametric linear models are, and the results in this thesis help to remove some obstacles on the way towards this goal. In the first main chapter, we provide a distribution-free finite sample bound on the difference between generalisation and empirical (training) error for GP classification methods. While the general theorem (the PAC-Bayesian bound) is not new, we give a much simplified and somewhat generalised derivation and point out the underlying core technique (convex duality) explicitly. Furthermore, the application to GP models is novel (to our knowledge). A central feature of this bound is that its quality depends crucially on task knowledge being encoded faithfully in the model and prior distributions, so there is a mutual benefit between a sharp theoretical guarantee and empirically well-established statistical practices. Extensive simulations on real-world classification tasks indicate an impressive tightness of the bound, in spite of the fact that many previous bounds for related kernel machines fail to give non-trivial guarantees in this practically relevant regime. In the second main chapter, sparse approximations are developed to address the problem of the unfavourable scaling of most GP techniques with large training sets. Due to its high importance in practice, this problem has received a lot of attention recently. We demonstrate the tractability and usefulness of simple greedy forward selection with information-theoretic criteria previously used in active learning (or sequential design) and develop generic schemes for automatic model selection with many (hyper)parameters. We suggest two new generic schemes and evaluate some of their variants on large real-world classification and regression tasks. These schemes and their underlying principles (which are clearly stated and analysed) can be applied to obtain sparse approximations for a wide regime of GP models far beyond the special cases we studied here.

16 Sep 2003
TL;DR: This work proposes to use a Gaussian Process model of the (log of the) posterior for most of the computations required by HMC, allowing Bayesian treatment of models with posteriors that are computationally demanding, such as models involving computer simulation.
Abstract: Hybrid Monte Carlo (HMC) is often the method of choice for computing Bayesian integrals that are not analytically tractable. However the success of this method may require a very large number of evaluations of the (un-normalized) posterior and its partial derivatives. In situations where the posterior is computationally costly to evaluate, this may lead to an unacceptable computational load for HMC. I propose to use a Gaussian Process model of the (log of the) posterior for most of the computations required by HMC. Within this scheme only occasional evaluation of the actual posterior is required to guarantee that the samples generated have exactly the desired distribution, even if the GP model is somewhat inaccurate. The method is demonstrated on a 10 dimensional problem, where 200 evaluations suffice for the generation of 100 roughly independent points from the posterior. Thus, the proposed scheme allows Bayesian treatment of models with posteriors that are computationally demanding, such as models involving computer simulation.

Journal ArticleDOI
TL;DR: The proposed Gauss-Markov framework provides a mechanism for capturing the slow and random drift in the fixed-pattern noise as the operational conditions of the sensor vary in time.
Abstract: A novel statistical approach is undertaken for the adaptive estimation of the gain and bias nonuniformity in infrared focal-plane array sensors from scene data. The gain and the bias of each detector are regarded as random state variables modeled by a discrete-time Gauss–Markov process. The proposed Gauss–Markov framework provides a mechanism for capturing the slow and random drift in the fixed-pattern noise as the operational conditions of the sensor vary in time. With a temporal stochastic model for each detector’s gain and bias at hand, a Kalman filter is derived that uses scene data, comprising the detector’s readout values sampled over a short period of time, to optimally update the detector’s gain and bias estimates as these parameters drift. The proposed technique relies on a certain spatiotemporal diversity condition in the data, which is satisfied when all detectors see approximately the same range of temperatures within the periods between successive estimation epochs. The performance of the proposed technique is thoroughly studied, and its utility in mitigating fixed-pattern noise is demonstrated with both real infrared and simulated imagery.
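
A toy per-detector version conveys the structure of the filter: a two-state (gain, bias) Kalman recursion under a random-drift model. The noise levels and the known-scene-temperature assumption below are illustrative, not the paper's:

```python
import numpy as np

def kalman_gain_bias(readouts, scene_temps, q=1e-5, r=1e-2):
    """Estimate one detector's (gain, bias) from readouts y_t = gain*T_t + bias + v_t,
    with the state drifting as a Gauss-Markov random walk (toy sketch)."""
    s = np.array([1.0, 0.0])        # initial gain and bias estimates
    P = np.eye(2)                   # state covariance
    Q = q * np.eye(2)               # drift covariance (time update)
    for y, T in zip(readouts, scene_temps):
        P = P + Q                   # predict: parameters drift slowly
        H = np.array([T, 1.0])      # observation row for this frame
        S = H @ P @ H + r           # innovation variance
        K = P @ H / S               # Kalman gain
        s = s + K * (y - H @ s)     # correct with this frame's readout
        P = P - np.outer(K, H @ P)
    return s, P
```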

Proceedings ArticleDOI
06 Apr 2003
TL;DR: A novel recursive Bayesian estimation algorithm that combines an importance sampling based measurement update step with a bank of sigma-point Kalman filters for the time-update and proposal distribution generation is presented.
Abstract: For sequential probabilistic inference in nonlinear non-Gaussian systems, approximate solutions must be used. We present a novel recursive Bayesian estimation algorithm that combines an importance sampling based measurement update step with a bank of sigma-point Kalman filters for the time-update and proposal distribution generation. The posterior state density is represented by a Gaussian mixture model that is recovered from the weighted particle set of the measurement update step by means of a weighted EM algorithm. This step replaces the resampling stage needed by most particle filters and mitigates the "sample depletion" problem. We show that this new approach has an improved estimation performance and reduced computational complexity compared to other related algorithms.

Proceedings Article
09 Dec 2003
TL;DR: The goal is to estimate a mobile user's position, based on measurements of the signal strengths received from network base stations, by building Gaussian process models for the distribution of signal strengths, as obtained in a series of calibration measurements.
Abstract: In this article, we present a novel approach to solving the localization problem in cellular networks. The goal is to estimate a mobile user's position, based on measurements of the signal strengths received from network base stations. Our solution works by building Gaussian process models for the distribution of signal strengths, as obtained in a series of calibration measurements. In the localization stage, the user's position can be estimated by maximizing the likelihood of received signal strengths with respect to the position. We investigate the accuracy of the proposed approach on data obtained within a large indoor cellular network.
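
The localization stage reduces to a likelihood search over candidate positions; a hedged sketch against an assumed per-base-station interface (each model is a pair of callables returning a predictive mean and variance at a 2-D position, standing in for a fitted GP):

```python
import numpy as np

def localize(gp_models, observed_ss, grid):
    """Maximum-likelihood position estimate. gp_models holds one
    (mean_fn, var_fn) pair per base station -- a hypothetical interface --
    and observed_ss the measured signal strengths."""
    best, best_ll = None, -np.inf
    for pos in grid:
        ll = 0.0
        for (mean_fn, var_fn), s in zip(gp_models, observed_ss):
            mu, var = mean_fn(pos), var_fn(pos)
            ll += -0.5 * ((s - mu) ** 2 / var + np.log(2 * np.pi * var))
        if ll > best_ll:
            best, best_ll = pos, ll
    return best
```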

DOI
01 Jan 2003
TL;DR: It is shown that the Gaussian random fields and harmonic energy minimizing function framework for semi-supervised learning can be viewed in terms of Gaussian processes, with covariance matrices derived from the graph Laplacian; hyperparameter learning is then derived via evidence maximization.
Abstract: "We show that the Gaussian random fields and harmonic energy minimizing function framework for semi-supervised learning can be viewed in terms of Gaussian processes, with covariance matrices derived from the graph Laplacian. We derive hyperparameter learning with evidence maximization, and give an empirical study of various ways to parameterize the graph weights."

Journal ArticleDOI
TL;DR: A low complexity quantization scheme using transform coding and bit allocation techniques which allows for easy mapping from observation to quantized value is developed for both fixed rate and variable rate systems.
Abstract: A computationally efficient, high quality, vector quantization scheme based on a parametric probability density function (PDF) is proposed. In this scheme, the observations are modeled as i.i.d. realizations of a multivariate Gaussian mixture density. The mixture model parameters are efficiently estimated using the expectation maximization (EM) algorithm. A low complexity quantization scheme using transform coding and bit allocation techniques which allows for easy mapping from observation to quantized value is developed for both fixed rate and variable rate systems. An attractive feature of this method is that source encoding using the resultant codebook involves very few searches and its computational complexity is minimal and independent of the rate of the system. Furthermore, the proposed scheme is bit scalable and can switch seamlessly between a memoryless quantizer and a quantizer with memory. The usefulness of the approach is demonstrated for speech coding where Gaussian mixture models are used to model speech line spectral frequencies. The performance of the memoryless quantizer is 1-3 bits better than conventional quantization schemes.
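
A hedged sketch of the pipeline: fit the mixture by EM (scikit-learn here, which the paper does not prescribe), then quantize an observation in the eigenbasis of its most likely component (the transform coding step); per-dimension bit allocation is omitted:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 10))            # stand-in for LSF training vectors
gmm = GaussianMixture(n_components=8, covariance_type="full").fit(X)

def encode(x):
    """Return (component index, integer coefficients): decorrelate x with the
    chosen component's eigenbasis, normalize by the eigenvalue spread, round."""
    k = int(gmm.predict(x[None])[0])
    w, V = np.linalg.eigh(gmm.covariances_[k])
    t = V.T @ (x - gmm.means_[k])              # transform coding step
    return k, np.round(t / np.sqrt(w))         # unit-variance scalar quantizers

k, q = encode(X[0])
```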

Proceedings ArticleDOI
18 Jun 2003
TL;DR: A generative model based approach to man-made structure detection in 2D (two-dimensional) natural images by using a causal multiscale random field as a prior model on the class labels on the image sites, capturing the local dependencies in the data using a multiscale feature vector.
Abstract: This paper presents a generative model based approach to man-made structure detection in 2D (two-dimensional) natural images. The proposed approach uses a causal multiscale random field suggested by Bouman and Shapiro (1994) as a prior model on the class labels on the image sites. However, instead of assuming the conditional independence of the observed data, we propose to capture the local dependencies in the data using a multiscale feature vector. The distribution of the multiscale feature vectors is modeled as a mixture of Gaussians. A set of robust multiscale features is presented that captures the general statistical properties of man-made structures at multiple scales without relying on explicit edge detection. The proposed approach was validated on real-world images from the Corel data set, and a performance comparison with other techniques is presented.

Journal ArticleDOI
TL;DR: In this paper, the detailed distributional properties of integrated non-Gaussian Ornstein-Uhlenbeck (intOU) processes are studied and the tail behavior of the intOU process is analyzed.
Abstract: In this paper, we study the detailed distributional properties of integrated non-Gaussian Ornstein-Uhlenbeck (intOU) processes. Both exact and approximate results are given. We emphasize the study of the tail behaviour of the intOU process. Our results have many potential applications in financial economics, as OU processes are used as models of instantaneous variance in stochastic volatility (SV) models. In this case, an intOU process can be regarded as a model of integrated variance. Hence, the tail behaviour of the intOU process will determine the tail behaviour of returns generated by SV models.
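
A quick Euler-type simulation conveys the objects involved: a non-Gaussian OU process driven by compound-Poisson (exponential) jumps, and its integral, the "integrated variance" (all rates below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
T, dt = 10.0, 1e-3
n = int(T / dt)
lam = 1.0                        # OU mean-reversion rate
jump_rate, jump_mean = 5.0, 0.2  # compound-Poisson driver (assumed values)

x = np.zeros(n)                  # non-Gaussian OU process (stays nonnegative)
for t in range(1, n):
    jump = rng.exponential(jump_mean) if rng.random() < jump_rate * dt else 0.0
    x[t] = x[t - 1] - lam * x[t - 1] * dt + jump

int_x = np.cumsum(x) * dt        # integrated OU process: the intOU object studied
```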

Journal ArticleDOI
TL;DR: In this article, the authors proposed a rich class of covariance functions developed through the so-called linear coregionalization model for multivariate spatial observations, which can be used to reparameterize a multivariate spatial model using suitable univariate conditional spatial processes, facilitating the computation.
Abstract: Spatial data collection increasingly turns to vector valued measurements at spatial locations. An example is the observation of pollutant measurements. Typically, several different pollutants are observed at the same sampled location, referred to as a monitoring station or gauged site. Usually, interest lies in the modeling of the joint process for the levels of the different pollutants and in the prediction of pollutant levels at ungauged sites. In this case, it is important to take into account not only the spatial correlation but also the correlation among the different variables at each gauged site. Since, conceptually, there is a potentially observable measurement vector at every location in the study region, a multivariate spatial process becomes a natural modeling choice. In using a Gaussian process, the main challenge is the specification of a valid and flexible cross-covariance function. This paper proposes a rich class of covariance functions developed through the so-called linear coregionalization model [see, e.g., Wackernagel, 1998] for multivariate spatial observations. Following the ideas in the work of, for example, Royle and Berliner [1999], we can reparameterize a multivariate spatial model using suitable univariate conditional spatial processes, facilitating the computation. We provide explicit details, including the computation of the range associated with the different component processes. As an example, we fit our model to a particular day average of CO, NO, and NO2 for a set of monitoring stations in California, USA.
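
The coregionalization construction itself is short; a sketch assembling a valid cross-covariance for p variables at n one-dimensional sites from r latent exponential-correlation processes (the loadings, ranges, and the three-pollutant labeling are placeholders):

```python
import numpy as np

def lmc_covariance(coords, A, ranges):
    """Linear model of coregionalization: C = sum_k rho_k(h) * a_k a_k^T,
    assembled over the r columns a_k of the loading matrix A (p x r).
    Returns the (n*p, n*p) covariance with site-major block ordering."""
    h = np.abs(coords[:, None] - coords[None, :])  # 1-D site distances for brevity
    n, p = len(coords), A.shape[0]
    C = np.zeros((n * p, n * p))
    for k, phi in enumerate(ranges):
        Rk = np.exp(-h / phi)                      # exponential correlation, range phi
        C += np.kron(Rk, np.outer(A[:, k], A[:, k]))
    return C

C = lmc_covariance(np.linspace(0, 1, 30),
                   A=np.array([[1.0, 0.3], [0.5, 0.8], [0.2, 0.6]]),  # e.g. CO/NO/NO2
                   ranges=[0.2, 0.6])
```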

Proceedings ArticleDOI
06 Apr 2003
Abstract: The object of Bayesian modelling is the predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models such as the Gaussian process and the relevance vector machine. We derive novel analytic expressions for the predictive mean and variance for Gaussian kernel shapes under the assumption of a Gaussian input distribution in the static case, and of a recursive Gaussian predictive density in iterative forecasting. The capability of the method is demonstrated for forecasting of time series and compared with approximate methods.
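
The paper derives these two predictive moments analytically for Gaussian kernels; as a hedged reference point, the same moments can be approximated by Monte Carlo for any GP. The gp_mean/gp_var callables below are assumed interfaces, not the paper's:

```python
import numpy as np

def predict_uncertain_input(gp_mean, gp_var, mu_x, Sigma_x, n_mc=2000, rng=None):
    """Predictive mean/variance of f(x) when the input is uncertain,
    x ~ N(mu_x, Sigma_x), approximated by sampling and the law of total
    variance: Var = E[var(x)] + Var[mean(x)]."""
    rng = np.random.default_rng(rng)
    xs = rng.multivariate_normal(mu_x, Sigma_x, size=n_mc)
    m = np.array([gp_mean(x) for x in xs])
    v = np.array([gp_var(x) for x in xs])
    return m.mean(), v.mean() + m.var()
```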