
Showing papers on "Gaussian process published in 2013"


Proceedings Article
11 Aug 2013
TL;DR: In this article, the authors introduce stochastic variational inference for Gaussian process models, which enables the application of Gaussian Process (GP) models to data sets containing millions of data points.
Abstract: We introduce stochastic variational inference for Gaussian process models. This enables the application of Gaussian process (GP) models to data sets containing millions of data points. We show how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the necessary manner to perform variational inference. Our approach is readily extended to models with non-Gaussian likelihoods and latent variable models based around Gaussian processes. We demonstrate the approach on a simple toy problem and two real world data sets.

898 citations
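The inducing-variable decomposition at the heart of the paper above can be illustrated with a small, self-contained sketch. The NumPy code below is a minimal sketch, not the authors' implementation: it computes the sparse GP predictive distribution from the collapsed, non-stochastic form of the inducing-point posterior, whereas the stochastic variational method keeps q(u) explicit so it can be updated from minibatches. The kernel, noise level, jitter and toy data are assumptions for the example.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def sparse_gp_predict(X, y, Z, Xs, noise=0.01, jitter=1e-8):
    """Predictive mean/variance of a sparse GP governed by M inducing inputs Z.

    Collapsed (non-stochastic) form of the inducing-variable posterior; the
    stochastic variational approach replaces the full-data solve below with
    minibatch updates of an explicit q(u)."""
    Kuu = rbf(Z, Z) + jitter * np.eye(len(Z))
    Kuf = rbf(Z, X)
    Kus = rbf(Z, Xs)
    Sigma = np.linalg.inv(Kuu + Kuf @ Kuf.T / noise)   # q(u) covariance factor
    mean = Kus.T @ Sigma @ Kuf @ y / noise
    var = (np.diag(rbf(Xs, Xs))
           - np.sum(Kus * np.linalg.solve(Kuu, Kus), axis=0)
           + np.sum(Kus * (Sigma @ Kus), axis=0))
    return mean, var

# Toy usage: 200 noisy observations summarized by 15 inducing points.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Z = np.linspace(-3, 3, 15)[:, None]
Xs = np.linspace(-3, 3, 50)[:, None]
mu, var = sparse_gp_predict(X, y, Z, Xs)
```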


Proceedings Article
29 Apr 2013
TL;DR: Deep Gaussian process (GP) models are introduced and model selection by the variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.
Abstract: In this paper we introduce deep Gaussian process (GP) models. Deep GPs are a deep belief network based on Gaussian process mappings. The data is modeled as the output of a multivariate GP. The inputs to that Gaussian process are then governed by another GP. A single layer model is equivalent to a standard GP or the GP latent variable model (GP-LVM). We perform inference in the model by approximate variational marginalization. This results in a strict lower bound on the marginal likelihood of the model which we use for model selection (number of layers and nodes per layer). Deep belief networks are typically applied to relatively large data sets using stochastic gradient descent for optimization. Our fully Bayesian treatment allows for the application of deep models even when data is scarce. Model selection by our variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.

743 citations
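The layered construction described above can be visualized by forward-sampling from a two-layer deep GP prior: draw functions from one GP at the observed inputs, then use their values as the inputs to a second GP. A minimal NumPy sketch under illustrative assumptions (RBF kernels, layer width, lengthscales); the paper performs approximate variational inference in this model rather than simple forward sampling.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def sample_gp_layer(X, n_outputs, lengthscale, rng, jitter=1e-8):
    """Draw n_outputs independent GP functions evaluated at the rows of X."""
    K = rbf(X, X, lengthscale) + jitter * np.eye(len(X))
    L = np.linalg.cholesky(K)
    return L @ rng.standard_normal((len(X), n_outputs))

rng = np.random.default_rng(1)
X = np.linspace(-2, 2, 100)[:, None]                              # observed inputs
H = sample_gp_layer(X, n_outputs=2, lengthscale=1.0, rng=rng)     # hidden layer
Y = sample_gp_layer(H, n_outputs=1, lengthscale=0.5, rng=rng)     # output layer
# Y is one draw from a two-layer deep GP prior: a GP warped by another GP.
```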


Proceedings ArticleDOI
01 Dec 2013
TL;DR: A visual saliency detection algorithm based on reconstruction errors is proposed; it applies the Bayes formula to integrate saliency measures derived from dense and sparse reconstruction errors and refines the result with an object-biased Gaussian model.
Abstract: In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models are constructed. For each image region, we first compute dense and sparse reconstruction errors. Second, the reconstruction errors are propagated based on the contexts obtained from K-means clustering. Third, pixel-level saliency is computed by an integration of multi-scale reconstruction errors and refined by an object-biased Gaussian model. We apply the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors. Experimental results show that the proposed algorithm performs favorably against seventeen state-of-the-art methods in terms of precision and recall. In addition, the proposed algorithm is demonstrated to be more effective in highlighting salient objects uniformly and robust to background noise.

725 citations


Journal ArticleDOI
TL;DR: An approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement.
Abstract: Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are ℝ^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.

553 citations
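One concrete member of the family of function-space MCMC methods described above is the preconditioned Crank-Nicolson (pCN) proposal, which preserves the Gaussian reference measure exactly, so the acceptance ratio involves only the data-misfit functional. A NumPy sketch on a finite grid; the misfit Phi, the grid, the prior kernel and the step size beta are placeholder choices for illustration.

```python
import numpy as np

def pcn_sample(Phi, L, n_steps=5000, beta=0.2, rng=None):
    """Preconditioned Crank-Nicolson MCMC for a target with density
    exp(-Phi(u)) with respect to a Gaussian reference N(0, C), C = L @ L.T.

    The proposal v = sqrt(1 - beta^2) * u + beta * xi, xi ~ N(0, C),
    preserves N(0, C), so the accept ratio depends only on the misfit Phi."""
    rng = rng or np.random.default_rng()
    u = L @ rng.standard_normal(L.shape[1])      # start from the prior
    samples = []
    for _ in range(n_steps):
        xi = L @ rng.standard_normal(L.shape[1])
        v = np.sqrt(1.0 - beta ** 2) * u + beta * xi
        if np.log(rng.uniform()) < Phi(u) - Phi(v):
            u = v
        samples.append(u.copy())
    return np.array(samples)

# Toy usage: squared-exponential prior on a grid, quadratic misfit at a few points.
x = np.linspace(0, 1, 50)
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2) + 1e-8 * np.eye(50)
L = np.linalg.cholesky(C)
obs_idx, obs = np.array([10, 25, 40]), np.array([0.5, -0.2, 0.3])
Phi = lambda u: 0.5 * np.sum((u[obs_idx] - obs) ** 2) / 0.05 ** 2
chain = pcn_sample(Phi, L)
```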


Journal ArticleDOI
TL;DR: This paper discusses how domain knowledge influences design of the Gaussian process models and provides case examples to highlight the approaches.
Abstract: In this paper, we offer a gentle introduction to Gaussian processes for time-series data analysis. The conceptual framework of Bayesian modelling for time-series data is discussed and the foundations of Bayesian non-parametric modelling are presented for Gaussian processes. We discuss how domain knowledge influences design of the Gaussian process models and provide case examples to highlight the approaches.

502 citations
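For readers who want the computational core behind such a tutorial, plain GP regression on a time series reduces to a few linear-algebra steps. A minimal NumPy sketch with a squared-exponential kernel; the kernel, its hyperparameters and the toy series are assumptions for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    return variance * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def gp_regression(t, y, t_star, noise=0.01):
    """Posterior mean and variance of a GP fit to a 1-D time series."""
    K = rbf(t, t) + noise * np.eye(len(t))
    Ks = rbf(t, t_star)
    Kss = rbf(t_star, t_star)
    alpha = np.linalg.solve(K, y)
    mean = Ks.T @ alpha
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

t = np.linspace(0, 10, 40)
y = np.sin(t) + 0.1 * np.random.default_rng(2).standard_normal(40)
t_star = np.linspace(0, 12, 200)        # includes a short extrapolation region
mu, var = gp_regression(t, y, t_star)
```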


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a statistical channel model which incorporates physical laws of acoustic propagation (frequency-dependent attenuation, bottom/surface reflections), as well as the effects of inevitable random local displacements.
Abstract: Underwater acoustic channel models provide a tool for predicting the performance of communication systems before deployment, and are thus essential for system design. In this paper, we offer a statistical channel model which incorporates physical laws of acoustic propagation (frequency-dependent attenuation, bottom/surface reflections), as well as the effects of inevitable random local displacements. Specifically, we focus on random displacements on two scales: those that involve distances on the order of a few wavelengths, to which we refer as small-scale effects, and those that involve many wavelengths, to which we refer as large-scale effects. Small-scale effects include scattering and motion-induced Doppler shifting, and are responsible for fast variations of the instantaneous channel response, while large-scale effects describe the location uncertainty and changing environmental conditions, and affect the locally averaged received power. We model each propagation path by a large-scale gain and micromultipath components that cumulatively result in a complex Gaussian distortion. Time- and frequency-correlation properties of the path coefficients are assessed analytically, leading to a computationally efficient model for numerical channel simulation. Random motion of the surface and transmitter/receiver displacements introduce additional variation whose temporal correlation is described by Bessel-type functions. The total energy, or the gain contained in the channel, averaged over small scale, is modeled as log-normally distributed. The models are validated using real data obtained from four experiments. Specifically, experimental data are used to assess the distribution and the autocorrelation functions of the large-scale transmission loss and the short-term path gains. While the former indicates a log-normal distribution with an exponentially decaying autocorrelation, the latter indicates a conditional Ricean distribution with Bessel-type autocorrelation.

436 citations


ReportDOI
TL;DR: It is demonstrated how the Gaussian approximations and the multiplier bootstrap can be used for modern high dimensional estimation, multiple hypothesis testing, and adaptive specification testing.
Abstract: We derive a Gaussian approximation result for the maximum of a sum of high-dimensional random vectors. Specifically, we establish conditions under which the distribution of the maximum is approximated by that of the maximum of a sum of the Gaussian random vectors with the same covariance matrices as the original vectors. This result applies when the dimension of random vectors ($p$) is large compared to the sample size ($n$); in fact, $p$ can be much larger than $n$, without restricting correlations of the coordinates of these vectors. We also show that the distribution of the maximum of a sum of the random vectors with unknown covariance matrices can be consistently estimated by the distribution of the maximum of a sum of the conditional Gaussian random vectors obtained by multiplying the original vectors with i.i.d. Gaussian multipliers. This is the Gaussian multiplier (or wild) bootstrap procedure. Here too, $p$ can be large or even much larger than $n$. These distributional approximations, either Gaussian or conditional Gaussian, yield a high-quality approximation to the distribution of the original maximum, often with approximation error decreasing polynomially in the sample size, and hence are of interest in many applications. We demonstrate how our Gaussian approximations and the multiplier bootstrap can be used for modern high-dimensional estimation, multiple hypothesis testing, and adaptive specification testing. All these results contain nonasymptotic bounds on approximation errors.

383 citations
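Operationally, the Gaussian multiplier (wild) bootstrap described above recomputes the max statistic after rescaling the centered observations by i.i.d. standard normal multipliers, conditional on the data. A small NumPy sketch under simplifying assumptions: a plain max-of-normalized-sums statistic and empirically centered coordinates.

```python
import numpy as np

def multiplier_bootstrap_quantile(X, alpha=0.05, n_boot=2000, rng=None):
    """Estimate the (1 - alpha) quantile of max_j n^{-1/2} sum_i X_ij via the
    Gaussian multiplier bootstrap: conditional on the data, replace X_ij by
    e_i * (X_ij - mean_j) with e_i ~ N(0, 1) and recompute the maximum."""
    rng = rng or np.random.default_rng()
    n, p = X.shape
    Xc = X - X.mean(axis=0)                   # center each coordinate
    stats = np.empty(n_boot)
    for b in range(n_boot):
        e = rng.standard_normal(n)
        stats[b] = np.max(e @ Xc) / np.sqrt(n)
    return np.quantile(stats, 1.0 - alpha)

# Toy usage with p much larger than n.
rng = np.random.default_rng(3)
X = rng.standard_normal((100, 5000))          # n = 100, p = 5000
crit = multiplier_bootstrap_quantile(X, rng=rng)
T = np.max(X.sum(axis=0)) / np.sqrt(100)      # observed max statistic
reject = T > crit
```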


Posted Content
TL;DR: Stochastic variational inference for Gaussian process models is introduced and it is shown how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the necessary manner to perform Variational inference.
Abstract: We introduce stochastic variational inference for Gaussian process models. This enables the application of Gaussian process (GP) models to data sets containing millions of data points. We show how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the necessary manner to perform variational inference. Our approach is readily extended to models with non-Gaussian likelihoods and latent variable models based around Gaussian processes. We demonstrate the approach on a simple toy problem and two real world data sets.

374 citations


Posted Content
TL;DR: In this paper, simple closed-form kernels are derived by modelling a spectral density with a Gaussian mixture, which can be used with Gaussian processes to discover patterns and enable extrapolation, and demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data.
Abstract: Gaussian processes are rich distributions over functions, which provide a Bayesian nonparametric approach to smoothing and interpolation. We introduce simple closed form kernels that can be used with Gaussian processes to discover patterns and enable extrapolation. These kernels are derived by modelling a spectral density -- the Fourier transform of a kernel -- with a Gaussian mixture. The proposed kernels support a broad class of stationary covariances, but Gaussian process inference remains simple and analytic. We demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data. We also show that we can reconstruct standard covariances within our framework.

356 citations
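In one input dimension the proposed kernel has a simple closed form: a Gaussian mixture over the spectral density maps, via Bochner's theorem, to a weighted sum of exponentiated-quadratic-times-cosine terms. A short sketch of that 1-D kernel; the weights, means and variances below are arbitrary example values.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, variances):
    """1-D spectral mixture kernel
    k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q),
    the inverse Fourier transform of a Gaussian mixture spectral density."""
    tau = np.asarray(tau, dtype=float)[..., None]
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    v = np.asarray(variances, dtype=float)
    terms = w * np.exp(-2.0 * np.pi ** 2 * tau ** 2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return terms.sum(axis=-1)

# Two illustrative components: a slow trend and a faster periodic pattern.
tau = np.linspace(0.0, 5.0, 500)
k = spectral_mixture_kernel(tau, weights=[1.0, 0.5],
                            means=[0.05, 1.0], variances=[0.01, 0.02])
```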


01 Jan 2013
TL;DR: In this paper, the authors describe an approach to modify a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement.
Abstract: Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are ℝ^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.

340 citations


Journal ArticleDOI
TL;DR: A comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided, and the three most intuitive criteria are found to be poor alternatives.
Abstract: Responses of many real-world problems can only be evaluated perturbed by noise. In order to make an efficient optimization of these problems possible, intelligent optimization strategies successfully coping with noisy evaluations are required. In this article, a comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided. In summary, ten methods for choosing the sequential samples are described using a unified formalism. They are compared on analytical benchmark problems, whereby the usual assumption of homoscedastic Gaussian noise made in the underlying models is met. Different problem configurations (noise level, maximum number of observations, initial number of observations) and setups (covariance functions, budget, initial sample size) are considered. It is found that the choices of the initial sample size and the covariance function are not critical. The choice of the method, however, can result in significant differences in the performance. In particular, the three most intuitive criteria are found to be poor alternatives. Although no criterion is found consistently more efficient than the others, two specialized methods appear more robust on average.
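One of the baseline quantities underlying the criteria reviewed above is the classical (noise-free) expected improvement; the noisy-optimization criteria compared in the article modify it in various ways. A hedged sketch of plain EI computed from a kriging predictor's mean and standard deviation:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """Classical expected improvement for minimization, given the kriging
    predictive mean mu and standard deviation sigma at candidate points and
    the best observed value f_best. The noisy-optimization criteria reviewed
    in the article modify this basic quantity in different ways."""
    mu, sigma = np.asarray(mu, float), np.maximum(np.asarray(sigma, float), 1e-12)
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
```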

Journal ArticleDOI
TL;DR: The approximate message passing (AMP) algorithm is extended to solve the complex-valued LASSO problem, yielding the complex approximate message passing algorithm (CAMP), and the state evolution framework recently introduced for the analysis of AMP is generalized to the complex setting.
Abstract: Recovering a sparse signal from an undersampled set of random linear measurements is the main problem of interest in compressed sensing. In this paper, we consider the case where both the signal and the measurements are complex-valued. We study the popular recovery method of l1-regularized least squares or LASSO. While several studies have shown that LASSO provides desirable solutions under certain conditions, the precise asymptotic performance of this algorithm in the complex setting is not yet known. In this paper, we extend the approximate message passing (AMP) algorithm to solve the complex-valued LASSO problem and obtain the complex approximate message passing algorithm (CAMP). We then generalize the state evolution framework recently introduced for the analysis of AMP to the complex setting. Using the state evolution, we derive accurate formulas for the phase transition and noise sensitivity of both LASSO and CAMP. Our theoretical results are concerned with the case of i.i.d. Gaussian sensing matrices. Simulations confirm that our results hold for a larger class of random matrices.
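The elementwise nonlinearity inside CAMP is the complex soft-thresholding operator, the proximal map of the complex l1 penalty: it shrinks the magnitude and preserves the phase. A short sketch; the threshold schedule used in the full CAMP iteration is omitted.

```python
import numpy as np

def complex_soft_threshold(x, theta):
    """Proximal operator of theta * ||.||_1 for complex vectors:
    shrink each entry's magnitude by theta, keep its phase, zero out small entries."""
    mag = np.abs(x)
    scale = np.maximum(mag - theta, 0.0) / np.maximum(mag, 1e-30)
    return scale * x

# Example: complex residuals shrunk towards zero.
x = np.array([3 + 4j, 0.1 - 0.2j, -1j])
print(complex_soft_threshold(x, theta=1.0))   # approximately [2.4+3.2j, 0, 0]
```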

Journal Article
TL;DR: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference, including various inference methods, sparse approximations and model assessment methods.
Abstract: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods.

Proceedings Article
16 Jun 2013
TL;DR: In this article, simple closed-form kernels are derived by modelling a spectral density with a Gaussian mixture, which can be used with Gaussian processes to discover patterns and enable extrapolation, and demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data.
Abstract: Gaussian processes are rich distributions over functions, which provide a Bayesian nonparametric approach to smoothing and interpolation. We introduce simple closed form kernels that can be used with Gaussian processes to discover patterns and enable extrapolation. These kernels are derived by modelling a spectral density - the Fourier transform of a kernel - with a Gaussian mixture. The proposed kernels support a broad class of stationary covariances, but Gaussian process inference remains simple and analytic. We demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data. We also show that it is possible to reconstruct several popular standard covariances within our framework.

Journal ArticleDOI
TL;DR: The ability of Gaussian processes to guide the search through protein sequence space by designing, constructing, and testing chimeric cytochrome P450s allowed us to engineer active P450 enzymes that are more thermostable than any previously made by chimeragenesis, rational design, or directed evolution.
Abstract: Knowing how protein sequence maps to function (the “fitness landscape”) is critical for understanding protein evolution as well as for engineering proteins with new and useful properties. We demonstrate that the protein fitness landscape can be inferred from experimental data, using Gaussian processes, a Bayesian learning technique. Gaussian process landscapes can model various protein sequence properties, including functional status, thermostability, enzyme activity, and ligand binding affinity. Trained on experimental data, these models achieve unrivaled quantitative accuracy. Furthermore, the explicit representation of model uncertainty allows for efficient searches through the vast space of possible sequences. We develop and test two protein sequence design algorithms motivated by Bayesian decision theory. The first one identifies small sets of sequences that are informative about the landscape; the second one identifies optimized sequences by iteratively improving the Gaussian process model in regions of the landscape that are predicted to be optimized. We demonstrate the ability of Gaussian processes to guide the search through protein sequence space by designing, constructing, and testing chimeric cytochrome P450s. These algorithms allowed us to engineer active P450 enzymes that are more thermostable than any previously made by chimeragenesis, rational design, or directed evolution.

Journal ArticleDOI
TL;DR: A Gaussian bare-bones DE (GBDE) and its modified version (MGBDE), both of which are almost parameter-free, are proposed; experiments indicate that the MGBDE performs significantly better than, or at least comparably to, several state-of-the-art DE variants and some existing bare-bones algorithms.
Abstract: Differential evolution (DE) is a well-known algorithm for global optimization over continuous search spaces. However, choosing the optimal control parameters is a challenging task because they are problem oriented. In order to minimize the effects of the control parameters, a Gaussian bare-bones DE (GBDE) and its modified version (MGBDE) are proposed which are almost parameter free. To verify the performance of our approaches, 30 benchmark functions and two real-world problems are utilized. Conducted experiments indicate that the MGBDE performs significantly better than, or at least comparable to, several state-of-the-art DE variants and some existing bare-bones algorithms.

Journal ArticleDOI
TL;DR: Methods for converting spatiotemporal Gaussian process regression problems into infinite-dimensional state-space models are presented; this makes the use of machine-learning models in signal processing computationally feasible and opens the possibility of combining machine-learning techniques with signal-processing methods.
Abstract: Gaussian process-based machine learning is a powerful Bayesian paradigm for nonparametric nonlinear regression and classification. In this article, we discuss connections of Gaussian process regression with Kalman filtering and present methods for converting spatiotemporal Gaussian process regression problems into infinite-dimensional state-space models. This formulation allows for use of computationally efficient infinite-dimensional Kalman filtering and smoothing methods, or more general Bayesian filtering and smoothing methods, which reduces the problematic cubic complexity of Gaussian process regression in the number of time steps into linear time complexity. The implication of this is that the use of machine-learning models in signal processing becomes computationally feasible, and it opens the possibility to combine machine-learning techniques with signal processing methods.
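The simplest instance of the GP-to-state-space conversion discussed above is the exponential (Matérn-1/2, Ornstein-Uhlenbeck) covariance, for which the GP is exactly a scalar linear-Gaussian state-space model, turning the cubic-cost GP regression into a linear-time Kalman filter. A NumPy sketch under that assumption; higher-order Matérn kernels need a small state vector rather than a scalar, and the hyperparameters below are illustrative.

```python
import numpy as np

def ou_kalman_filter(t, y, lengthscale=1.0, sig2=1.0, noise=0.1):
    """Kalman filter equivalent to GP regression with the exponential kernel
    k(tau) = sig2 * exp(-|tau| / lengthscale) (an Ornstein-Uhlenbeck process):
        x_k = a_k x_{k-1} + q_k,  a_k = exp(-dt_k / lengthscale),
        Var(q_k) = sig2 * (1 - a_k^2),   y_k = x_k + r_k,  r_k ~ N(0, noise).
    Runs in O(n) time; an RTS smoother pass would recover the full GP posterior
    (the filter below gives the causal, filtered estimate)."""
    m, P = 0.0, sig2                      # stationary prior at the first point
    means, variances = [], []
    t_prev = t[0]
    for tk, yk in zip(t, y):
        a = np.exp(-(tk - t_prev) / lengthscale)
        m, P = a * m, a * a * P + sig2 * (1.0 - a * a)   # predict
        S = P + noise
        gain = P / S                                     # Kalman gain
        m, P = m + gain * (yk - m), (1.0 - gain) * P     # update
        means.append(m)
        variances.append(P)
        t_prev = tk
    return np.array(means), np.array(variances)

t = np.linspace(0, 10, 1000)
y = np.sin(t) + 0.3 * np.random.default_rng(4).standard_normal(t.size)
mu, var = ou_kalman_filter(t, y)
```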

Journal ArticleDOI
TL;DR: The Maximum Likelihood (ML) and Cross Validation (CV) methods for estimating covariance hyper-parameters are compared, and it is shown that when the correlation function is misspecified, CV performs better than ML, while ML is optimal when the model is well-specified.
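The CV criterion in this comparison is cheap for GPs because leave-one-out predictions have a closed form in terms of the inverse kernel matrix, so no refitting is required. A sketch of the standard identity under a Gaussian observation model; the kernel matrix K and noise variance are user-supplied inputs.

```python
import numpy as np

def gp_loo_log_density(K, y, noise=0.1):
    """Leave-one-out log predictive density for GP regression, using the
    closed-form identities
        sigma_i^2 = 1 / [Kinv]_ii,   mu_i = y_i - [Kinv y]_i / [Kinv]_ii,
    where Kinv is the inverse of the noisy kernel matrix. This quantity is a
    common CV criterion for choosing covariance hyper-parameters."""
    Kn = K + noise * np.eye(len(y))
    Kinv = np.linalg.inv(Kn)
    diag = np.diag(Kinv)
    mu = y - (Kinv @ y) / diag
    var = 1.0 / diag
    return np.sum(-0.5 * np.log(2 * np.pi * var) - 0.5 * (y - mu) ** 2 / var)
```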

Posted Content
TL;DR: In this article, the authors derive a family of local sequential design schemes that dynamically define the support of a Gaussian process predictor based on a local subset of the data, and derive expressions for fast sequential updating of all needed quantities as the local designs are built up iteratively.
Abstract: We provide a new approach to approximate emulation of large computer experiments. By focusing expressly on desirable properties of the predictive equations, we derive a family of local sequential design schemes that dynamically define the support of a Gaussian process predictor based on a local subset of the data. We further derive expressions for fast sequential updating of all needed quantities as the local designs are built up iteratively. Then we show how independent application of our local design strategy across the elements of a vast predictive grid facilitates a trivially parallel implementation. The end result is a global predictor able to take advantage of modern multicore architectures, while at the same time allowing for a nonstationary modeling feature as a bonus. We demonstrate our method on two examples utilizing designs sized in the thousands, and tens of thousands of data points. Comparisons are made to the method of compactly supported covariances.
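A stripped-down version of the local-design idea replaces the paper's sequential criterion with a plain nearest-neighbor subset for each prediction location; predictions over the grid then remain embarrassingly parallel. A hedged NumPy sketch of that simplified variant; the kernel, neighborhood size and toy data are assumptions.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def local_gp_predict(X, y, x_star, n_local=50, noise=1e-6):
    """Predict at a single point using only its n_local nearest neighbors.
    The paper builds the local design greedily by a sequential criterion;
    nearest neighbors are used here as a simpler stand-in."""
    d = np.sum((X - x_star) ** 2, axis=1)
    idx = np.argsort(d)[:n_local]
    Xl, yl = X[idx], y[idx]
    K = rbf(Xl, Xl) + noise * np.eye(n_local)
    ks = rbf(Xl, x_star[None, :])[:, 0]
    return ks @ np.linalg.solve(K, yl)

# Independent predictions across a grid are trivially parallel.
rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, size=(20000, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])
grid = np.stack(np.meshgrid(np.linspace(-2, 2, 20),
                            np.linspace(-2, 2, 20)), -1).reshape(-1, 2)
preds = np.array([local_gp_predict(X, y, xs) for xs in grid])
```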

Journal ArticleDOI
TL;DR: In this article, the authors review characterizations of positive definite functions on spheres in terms of Gegenbauer expansions, and apply them to dimension walks, where monotonicity properties of the Geggenbauer coefficients guarantee positive definiteness in higher dimensions.
Abstract: Isotropic positive definite functions on spheres play important roles in spatial statistics, where they occur as the correlation functions of homogeneous random fields and star-shaped random particles. In approximation theory, strictly positive definite functions serve as radial basis functions for interpolating scattered data on spherical domains. We review characterizations of positive definite functions on spheres in terms of Gegenbauer expansions and apply them to dimension walks, where monotonicity properties of the Gegenbauer coefficients guarantee positive definiteness in higher dimensions. Subject to a natural support condition, isotropic positive definite functions on the Euclidean space $\mathbb{R}^{3}$, such as Askey’s and Wendland’s functions, allow for the direct substitution of the Euclidean distance by the great circle distance on a one-, two- or three-dimensional sphere, as opposed to the traditional approach, where the distances are transformed into each other. Completely monotone functions are positive definite on spheres of any dimension and provide rich parametric classes of such functions, including members of the powered exponential, Matérn, generalized Cauchy and Dagum families. The sine power family permits a continuous parameterization of the roughness of the sample paths of a Gaussian process. A collection of research problems provides challenges for future work in mathematical analysis, probability theory and spatial statistics.

Journal ArticleDOI
TL;DR: The novelty of this work is the recognition that the Gaussian process model defines a posterior probability measure on the function space of possible surrogates for the computer code and the derivation of an algorithmic procedure that allows us to sample it efficiently.

Proceedings Article
05 Dec 2013
TL;DR: The SI-BO algorithm is presented, which leverages recent low-rank matrix recovery techniques to learn the underlying subspace of the unknown function and applies Gaussian Process Upper Confidence sampling for optimization of the function.
Abstract: Many applications in machine learning require optimizing unknown functions defined over a high-dimensional space from noisy samples that are expensive to obtain. We address this notoriously hard challenge, under the assumptions that the function varies only along some low-dimensional subspace and is smooth (i.e., it has a low norm in a Reproducing Kernel Hilbert Space). In particular, we present the SI-BO algorithm, which leverages recent low-rank matrix recovery techniques to learn the underlying subspace of the unknown function and applies Gaussian Process Upper Confidence sampling for optimization of the function. We carefully calibrate the exploration-exploitation tradeoff by allocating the sampling budget to subspace estimation and function optimization, and obtain the first subexponential cumulative regret bounds and convergence rates for Bayesian optimization in high-dimensions under noisy observations. Numerical results demonstrate the effectiveness of our approach in difficult scenarios.

Journal ArticleDOI
TL;DR: The proposed Coupled Scaled Gaussian Process Regression model for head-pose normalization outperforms state-of-the-art regression-based approaches to head-pose normalization, 2D and 3D Point Distribution Models (PDMs), and Active Appearance Models (AAMs), especially in cases of unknown poses and imbalanced training data.
Abstract: We propose a method for head-pose invariant facial expression recognition that is based on a set of characteristic facial points. To achieve head-pose invariance, we propose the Coupled Scaled Gaussian Process Regression (CSGPR) model for head-pose normalization. In this model, we first learn independently the mappings between the facial points in each pair of (discrete) nonfrontal poses and the frontal pose, and then perform their coupling in order to capture dependences between them. During inference, the outputs of the coupled functions from different poses are combined using a gating function, devised based on the head-pose estimation for the query points. The proposed model outperforms state-of-the-art regression-based approaches to head-pose normalization, 2D and 3D Point Distribution Models (PDMs), and Active Appearance Models (AAMs), especially in cases of unknown poses and imbalanced training data. To the best of our knowledge, the proposed method is the first one that is able to deal with expressive faces in the range from $(-45^\circ)$ to $(+45^\circ)$ pan rotation and $(-30^\circ)$ to $(+30^\circ)$ tilt rotation, and with continuous changes in head pose, despite the fact that training was conducted on a small set of discrete poses. We evaluate the proposed method on synthetic and real images depicting acted and spontaneously displayed facial expressions.

Journal ArticleDOI
TL;DR: Nonseparable covariance structures for Gaussian process emulators are developed, based on the linear model of coregionalization and convolution methods; it is found that only emulators with nonseparable covariance structures have sufficient flexibility both to give good predictions and to represent joint uncertainty about the simulator outputs appropriately.
Abstract: The Gaussian process regression model is a popular type of “emulator” used as a fast surrogate for computationally expensive simulators (deterministic computer models). For simulators with multivariate output, common practice is to specify a separable covariance structure for the Gaussian process. Though computationally convenient, this can be too restrictive, leading to poor performance of the emulator, particularly when the different simulator outputs represent different physical quantities. Also, treating the simulator outputs as independent can lead to inappropriate representations of joint uncertainty. We develop nonseparable covariance structures for Gaussian process emulators, based on the linear model of coregionalization and convolution methods. Using two case studies, we compare the performance of these covariance structures both with standard separable covariance structures and with emulators that assume independence between the outputs. In each case study, we find that only emulators with nonseparable covariance structures have sufficient flexibility both to give good predictions and to represent joint uncertainty about the simulator outputs appropriately.
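The linear model of coregionalization mentioned above builds a nonseparable cross-covariance by summing Kronecker products of small positive semidefinite coregionalization matrices with different input kernels; a single term recovers the separable case. A hedged NumPy sketch for two outputs; the matrices B_j, kernels and lengthscales are illustrative.

```python
import numpy as np

def rbf(x, lengthscale):
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def lmc_covariance(x, Bs, lengthscales):
    """Linear model of coregionalization: K = sum_j kron(B_j, k_j(x, x)).
    With a single term this is a separable covariance; with several terms,
    the outputs mix kernels with different lengthscales (nonseparable)."""
    n, p = len(x), len(Bs[0])
    K = np.zeros((p * n, p * n))
    for B, ell in zip(Bs, lengthscales):
        K += np.kron(np.asarray(B, dtype=float), rbf(x, ell))
    return K

x = np.linspace(0, 1, 30)
B1 = [[1.0, 0.8], [0.8, 1.0]]       # both outputs share a smooth component
B2 = [[0.3, 0.0], [0.0, 1.5]]       # only output 2 gets a rough component
K = lmc_covariance(x, [B1, B2], lengthscales=[0.3, 0.05])
```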

ReportDOI
TL;DR: In this paper, the authors give explicit comparisons of expectations of smooth functions and distribution functions of maxima of Gaussian random vectors without any restriction on the covariance matrices, and establish an anti-concentration inequality for the maximum of a Gaussian vector, which derives a useful upper bound on the Lévy concentration function for the Gaussian maximum.
Abstract: Slepian and Sudakov–Fernique type inequalities, which compare expectations of maxima of Gaussian random vectors under certain restrictions on the covariance matrices, play an important role in probability theory, especially in empirical process and extreme value theories. Here we give explicit comparisons of expectations of smooth functions and distribution functions of maxima of Gaussian random vectors without any restriction on the covariance matrices. We also establish an anti-concentration inequality for the maximum of a Gaussian random vector, which derives a useful upper bound on the Lévy concentration function for the Gaussian maximum. The bound is dimension-free and applies to vectors with arbitrary covariance matrices. This anti-concentration inequality plays a crucial role in establishing bounds on the Kolmogorov distance between maxima of Gaussian random vectors. These results have immediate applications in mathematical statistics. As an example of application, we establish a conditional multiplier central limit theorem for maxima of sums of independent random vectors where the dimension of the vectors is possibly much larger than the sample size.

Journal ArticleDOI
TL;DR: The proposed spectral modeling method can significantly alleviate the over-smoothing effect and improve the naturalness of the conventional HMM-based speech synthesis system using mel-cepstra.
Abstract: This paper presents a new spectral modeling method for statistical parametric speech synthesis. In the conventional methods, high-level spectral parameters, such as mel-cepstra or line spectral pairs, are adopted as the features for hidden Markov model (HMM)-based parametric speech synthesis. Our proposed method described in this paper improves the conventional method in two ways. First, distributions of low-level, un-transformed spectral envelopes (extracted by the STRAIGHT vocoder) are used as the parameters for synthesis. Second, instead of using single Gaussian distribution, we adopt the graphical models with multiple hidden variables, including restricted Boltzmann machines (RBM) and deep belief networks (DBN), to represent the distribution of the low-level spectral envelopes at each HMM state. At the synthesis time, the spectral envelopes are predicted from the RBM-HMMs or the DBN-HMMs of the input sentence following the maximum output probability parameter generation criterion with the constraints of the dynamic features. A Gaussian approximation is applied to the marginal distribution of the visible stochastic variables in the RBM or DBN at each HMM state in order to achieve a closed-form solution to the parameter generation problem. Our experimental results show that both RBM-HMM and DBN-HMM are able to generate spectral envelope parameter sequences better than the conventional Gaussian-HMM with superior generalization capabilities and that DBN-HMM and RBM-HMM perform similarly due possibly to the use of Gaussian approximation. As a result, our proposed method can significantly alleviate the over-smoothing effect and improve the naturalness of the conventional HMM-based speech synthesis system using mel-cepstra.

Journal ArticleDOI
TL;DR: This article investigates the use of Gaussian process (GP) priors for one-class classification and shows the suitability of the methods in the area of attribute prediction, defect localization, bacteria recognition, and background subtraction.

Journal ArticleDOI
TL;DR: In this article, the asymptotic power of tests of sphericity against perturbations in a single unknown direction was studied, where both the dimensionality of the data and the number of observations go to infinity.
Abstract: This paper studies the asymptotic power of tests of sphericity against perturbations in a single unknown direction as both the dimensionality of the data and the number of observations go to infinity. We establish the convergence, under the null hypothesis and the alternative, of the log ratio of the joint densities of the sample covariance eigenvalues to a Gaussian process indexed by the norm of the perturbation. When the perturbation norm is larger than the phase transition threshold studied in Baik et al. (2005), the limiting process is degenerate and discrimination between the null and the alternative is asymptotically certain. When the norm is below the threshold, the process is non-degenerate, so that the joint eigenvalue densities under the null and alternative hypotheses are mutually contiguous. Using the asymptotic theory of statistical experiments, we obtain asymptotic power envelopes and derive the asymptotic power for various sphericity tests in the contiguity region. In particular, we show that the asymptotic power of the Tracy-Widom-type tests is trivial, whereas that of the eigenvalue-based likelihood ratio test is strictly larger than the size, and close to the power envelope.

Book ChapterDOI
TL;DR: The Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) is introduced, which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations, and theoretical upper bounds on the regret with batches of size K are proved for this procedure.
Abstract: In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to actively select the next estimation of f which is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel with batches of fixed size and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure which show an improvement of order $\sqrt{K}$ for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved have the property of being dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.
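The batch construction in GP-UCB-PE can be sketched compactly: the first point of a batch maximizes the usual UCB score, and the remaining K-1 points greedily maximize the posterior variance recomputed as if the earlier batch points had already been evaluated (GP variance depends only on the inputs, so their values can be fantasized). The NumPy sketch below works over a finite candidate set and omits the paper's restriction of exploration to the high-UCB region; beta and the kernel are placeholder choices.

```python
import numpy as np

def rbf(A, B, lengthscale=0.3):
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X, y, Xc, noise=1e-4):
    """Posterior mean and variance at candidates Xc given 1-D data (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xc)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = np.clip(1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0), 1e-12, None)
    return mu, var

def gp_ucb_pe_batch(X, y, Xc, K_batch=4, beta=4.0):
    """Select one batch: first point by UCB, the rest by pure exploration
    (maximum posterior variance given the already-selected batch points,
    whose observation values are dummies since variance ignores y)."""
    mu, var = gp_posterior(X, y, Xc)
    batch = [Xc[np.argmax(mu + np.sqrt(beta * var))]]
    for _ in range(K_batch - 1):
        Xf = np.concatenate([X, np.array(batch)])
        yf = np.concatenate([y, np.zeros(len(batch))])   # fantasized values
        _, var = gp_posterior(Xf, yf, Xc)
        batch.append(Xc[np.argmax(var)])
    return np.array(batch)

rng = np.random.default_rng(6)
X = rng.uniform(0, 1, 5)                      # points evaluated so far
y = np.sin(6 * X)                             # their observed objective values
Xc = np.linspace(0, 1, 200)                   # candidate grid
print(gp_ucb_pe_batch(X, y, Xc))
```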

Journal ArticleDOI
TL;DR: The role of the extremal t process as the maximum attractor for processes with finite-dimensional elliptical distributions is highlighted and all results naturally also hold within the multivariate domain.