
Showing papers by "Robert B. Gramacy published in 2009"


Journal ArticleDOI
TL;DR: This work proposes an approach that automatically explores the space while simultaneously fitting the response surface, using predictive uncertainty to guide subsequent experimental runs, and develops an adaptive sequential design framework to cope with an asynchronous, random, agent-based supercomputing environment.
Abstract: Computer experiments often are performed to allow modeling of a response surface of a physical experiment that can be too costly or difficult to run except by using a simulator. Running the experiment over a dense grid can be prohibitively expensive, yet running over a sparse design chosen in advance can result in insufficient information in parts of the space, particularly when the surface calls for a nonstationary model. We propose an approach that automatically explores the space while simultaneously fitting the response surface, using predictive uncertainty to guide subsequent experimental runs. We use the newly developed Bayesian treed Gaussian process as the surrogate model; a fully Bayesian approach allows explicit measures of uncertainty. We develop an adaptive sequential design framework to cope with an asynchronous, random, agent-based supercomputing environment by using a hybrid approach that melds optimal strategies from the statistics literature with flexible strategies from the active learning literature.
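The exploration loop the abstract describes, run the simulator next wherever predictive uncertainty is highest, can be sketched with an ordinary (untreed) Gaussian process surrogate. The squared-exponential kernel, lengthscale, and toy simulator below are illustrative assumptions, not the paper's Bayesian treed GP:

```python
import numpy as np

# Variance-driven sequential design with a plain GP surrogate (a sketch;
# the paper uses a fully Bayesian treed GP with richer uncertainty measures).

def rbf_kernel(a, b, lengthscale=0.3):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_predict(x_train, y_train, x_test, noise=1e-6):
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_test, x_train)
    Kss = rbf_kernel(x_test, x_test)
    Kinv = np.linalg.inv(K)
    mean = Ks @ Kinv @ y_train
    var = np.diag(Kss - Ks @ Kinv @ Ks.T)
    return mean, np.maximum(var, 0.0)

f = lambda x: np.sin(5 * x)          # stand-in for the expensive simulator
x_train = np.array([0.1, 0.5, 0.9])  # sparse initial design on [0, 1]
y_train = f(x_train)
grid = np.linspace(0, 1, 101)        # candidate run locations

# Adaptive loop: run the simulator where predictive variance is largest.
for _ in range(3):
    _, var = gp_predict(x_train, y_train, grid)
    x_new = grid[np.argmax(var)]
    x_train = np.append(x_train, x_new)
    y_train = np.append(y_train, f(x_new))
```

Each iteration refits the surrogate and spends the next run where the model is least certain, which is the essence of the hybrid design strategy.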

192 citations


Posted Content
TL;DR: This work develops a simulation-based method for the online updating of Gaussian process regression and classification models and exploits sequential Monte Carlo to produce a fast sequential design algorithm for these models relative to the established MCMC alternative.
Abstract: We develop a simulation-based method for the online updating of Gaussian process regression and classification models. Our method exploits sequential Monte Carlo to produce a fast sequential design algorithm for these models relative to the established MCMC alternative. The latter is less ideal for sequential design since it must be restarted and iterated to convergence with the inclusion of each new design point. We illustrate some attractive ensemble aspects of our SMC approach, and show how active learning heuristics may be implemented via particles to optimize a noisy function or to explore classification boundaries online.
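The advantage of sequential Monte Carlo here is that each new design point triggers only a reweighting of an existing particle ensemble, rather than an MCMC restart. A minimal sketch, with particles over a single GP lengthscale and a toy data stream (both assumptions, far simpler than the paper's full SMC scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

def gp_loglik_of_new(x_old, y_old, x_new, y_new, ell, noise=1e-4):
    # Log predictive density of (x_new, y_new) under a zero-mean GP
    # with squared-exponential kernel and lengthscale ell.
    k = lambda a, b: np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)
    K = k(x_old, x_old) + noise * np.eye(len(x_old))
    ks = k(np.array([x_new]), x_old)[0]
    Kinv = np.linalg.inv(K)
    mu = ks @ Kinv @ y_old
    var = 1.0 + noise - ks @ Kinv @ ks
    return -0.5 * (np.log(2 * np.pi * var) + (y_new - mu) ** 2 / var)

# Particles: candidate lengthscales; log-weights updated one datum at a time.
P = 200
particles = rng.uniform(0.05, 1.0, size=P)
logw = np.zeros(P)

f = lambda x: np.sin(5 * x)
xs, ys = [0.0], [f(0.0)]
for x_new in [0.2, 0.4, 0.6, 0.8]:           # streaming design points
    y_new = f(x_new)
    logw += np.array([gp_loglik_of_new(np.array(xs), np.array(ys),
                                       x_new, y_new, ell)
                      for ell in particles])
    xs.append(x_new); ys.append(y_new)
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / np.sum(w ** 2) < P / 2:         # resample when ESS drops
        idx = rng.choice(P, size=P, p=w)
        particles, logw = particles[idx], np.zeros(P)

w = np.exp(logw - logw.max()); w /= w.sum()
ell_hat = float(np.sum(w * particles))       # posterior-mean lengthscale
```

The ensemble of particles is what active learning heuristics would then query, e.g. averaging predictive variance over particles to pick the next point.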

99 citations


Journal ArticleDOI
05 Jun 2009 - PLOS ONE
TL;DR: A flexible statistical framework for generating optimal epidemiological interventions that are designed to minimize the total expected cost of an emerging epidemic while simultaneously propagating uncertainty regarding the underlying disease model parameters through to the decision process is described.
Abstract: Background Epidemiological interventions aim to control the spread of infectious disease through various mechanisms, each carrying a different associated cost. Methodology We describe a flexible statistical framework for generating optimal epidemiological interventions that are designed to minimize the total expected cost of an emerging epidemic while simultaneously propagating uncertainty regarding the underlying disease model parameters through to the decision process. The strategies produced through this framework are adaptive: vaccination schedules are iteratively adjusted to reflect the anticipated trajectory of the epidemic given the current population state and updated parameter estimates. Conclusions Using simulation studies based on a classic influenza outbreak, we demonstrate the advantages of adaptive interventions over non-adaptive ones, in terms of cost and resource efficiency, and robustness to model misspecification.
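The core of the framework, choose the intervention minimizing expected cost while averaging over parameter uncertainty, can be sketched with a discrete-time SIR model. The transmission-rate draws, unit costs, and candidate vaccination fractions below are all hypothetical stand-ins for the paper's posterior and cost structure:

```python
import numpy as np

rng = np.random.default_rng(1)

def sir_total_infected(beta, gamma, vacc_frac, n=1000, i0=5, steps=200):
    # Discrete-time SIR; a fraction of susceptibles is vaccinated at the outset.
    s = (n - i0) * (1 - vacc_frac)
    i, r = float(i0), 0.0
    total = i
    for _ in range(steps):
        new_inf = beta * s * i / n
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        total += new_inf
    return total

# Parameter uncertainty: draws standing in for a posterior over beta.
beta_draws = np.clip(rng.normal(0.35, 0.05, size=50), 0.01, None)
gamma = 0.1
c_vacc, c_case = 1.0, 10.0   # hypothetical unit costs per dose / per case

candidates = [0.0, 0.2, 0.4, 0.6, 0.8]
exp_cost = []
for v in candidates:
    cases = np.mean([sir_total_infected(b, gamma, v) for b in beta_draws])
    exp_cost.append(c_vacc * v * 1000 + c_case * cases)
best = candidates[int(np.argmin(exp_cost))]
```

In the adaptive version, this optimization would be repeated at each decision epoch with the current population state and refreshed parameter draws, which is what makes the schedule iteratively adjustable.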

60 citations


Journal ArticleDOI
TL;DR: The R package LogConcDEAD (Log-concave density estimation in arbitrary dimensions) is introduced; its main function computes the nonparametric maximum likelihood estimator of a log-concave density.
Abstract: In this article we introduce the R package LogConcDEAD (Log-concave density estimation in arbitrary dimensions). Its main function is to compute the nonparametric maximum likelihood estimator of a log-concave density. Functions for plotting, sampling from the density estimate and evaluating the density estimate are provided. All of the functions available in the package are illustrated using simple, reproducible examples with simulated data.
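The defining constraint the estimator enforces, concavity of the log-density, is easy to check numerically. A small illustration (in Python, since the NPMLE computation itself lives in the R package): a Gaussian is log-concave, a heavy-tailed Cauchy is not.

```python
import numpy as np

# Concavity of log f on a grid <=> all discrete second differences <= 0.
grid = np.linspace(-3, 3, 61)

# Standard normal: log f(x) = -x^2/2 - log(2*pi)/2, a concave quadratic.
logf_norm = -0.5 * grid ** 2 - 0.5 * np.log(2 * np.pi)
sd_norm = logf_norm[:-2] - 2 * logf_norm[1:-1] + logf_norm[2:]
is_log_concave = bool(np.all(sd_norm <= 1e-12))

# Standard Cauchy: log f(x) = -log(pi) - log(1 + x^2), convex in the tails.
logf_cauchy = -np.log(np.pi * (1 + grid ** 2))
sd_cauchy = logf_cauchy[:-2] - 2 * logf_cauchy[1:-1] + logf_cauchy[2:]
cauchy_log_concave = bool(np.all(sd_cauchy <= 1e-12))
```

The log-concave class thus includes the normal but excludes heavy-tailed densities, which is exactly the shape restriction LogConcDEAD's maximum likelihood estimator imposes in place of a bandwidth choice.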

50 citations


Proceedings Article
15 Apr 2009
TL;DR: This paper motivate and present families of Markov chain Monte Carlo (MCMC) proposals that exploit the particular structure of mixtures of copulas, and an application in financial forecasting with missing data illustrates the usefulness of the methodology.
Abstract: Applications of copula models have been increasing in number in recent years. This class of models provides a modular parameterization of joint distributions: the specification of the marginal distributions is parameterized separately from the dependence structure of the joint, a convenient way of encoding a model for domains such as finance. Some recent advances on how to specify copulas for arbitrary dimensions have been proposed, by means of mixtures of decomposable graphical models. This paper introduces a Bayesian approach for dealing with mixtures of copulas which, due to the lack of prior conjugacy, raise computational challenges. We motivate and present families of Markov chain Monte Carlo (MCMC) proposals that exploit the particular structure of mixtures of copulas. Different algorithms are evaluated according to their mixing properties, and an application in financial forecasting with missing data illustrates the usefulness of the methodology.
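The "modular parameterization" the abstract emphasizes, dependence structure specified separately from marginals, is easiest to see with the simplest case, a bivariate Gaussian copula. The correlation value and marginal choices below are illustrative, and this sketch involves none of the paper's mixture or MCMC machinery:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)

# Dependence structure: a Gaussian copula with correlation rho.
rho = 0.7
L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
z = rng.standard_normal((5000, 2)) @ L.T

# Map through the standard normal CDF: uniform marginals, dependence kept.
phi = np.vectorize(lambda t: 0.5 * (1 + erf(t / sqrt(2))))
u = phi(z)

# Marginals are then specified separately: here Exp(1) and Uniform(0, 1).
x1 = -np.log(1 - u[:, 0])   # inverse CDF of Exp(1)
x2 = u[:, 1]
```

Swapping in different inverse CDFs changes the marginals without touching the dependence structure; that separation is what makes copulas convenient for domains such as finance.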

16 citations


Posted Content
TL;DR: A sequential tree model whose state changes in time with the accumulation of new data is created, and particle learning algorithms that allow for the efficient online posterior filtering of tree states are provided.
Abstract: Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in on-line application settings. We create a sequential tree model whose state changes in time with the accumulation of new data, and provide particle learning algorithms that allow for the efficient on-line posterior filtering of tree-states. A major advantage of tree regression is that it allows for the use of very simple models within each partition. The model also facilitates a natural division of labor in our sequential particle-based inference: tree dynamics are defined through a few potential changes that are local to each newly arrived observation, while global uncertainty is captured by the ensemble of particles. We consider both constant and linear mean functions at the tree leaves, along with multinomial leaves for classification problems, and propose default prior specifications that allow for prediction to be integrated over all model parameters conditional on a given tree. Inference is illustrated in some standard nonparametric regression examples, as well as in the setting of sequential experiment design, including both active learning and optimization applications, and in on-line classification. We detail implementation guidelines and problem specific methodology for each of these motivating applications. Throughout, it is demonstrated that our practical approach is able to provide better results compared to commonly used methods at a fraction of the cost.
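The division of labor described above, local state changes per particle with global uncertainty carried by the ensemble, can be caricatured with particles over a single split point of a one-split regression tree with constant leaf means. This toy (step-function target, Gaussian weighting, jittered resampling) is an assumption-laden stand-in for the paper's full tree-state particles:

```python
import numpy as np

rng = np.random.default_rng(3)

P = 200
splits = rng.uniform(0, 1, size=P)   # each particle: one candidate split
logw = np.zeros(P)

def leaf_pred(xs, ys, split, x):
    # Constant-mean prediction from the leaf containing x.
    xs, ys = np.asarray(xs), np.asarray(ys)
    mask = xs < split if x < split else xs >= split
    return ys[mask].mean() if mask.any() else 0.0

truth = lambda x: 0.0 if x < 0.4 else 1.0        # step function to learn
xs, ys = [], []
for x in rng.uniform(0, 1, size=60):             # data arrive one at a time
    y = truth(x) + 0.1 * rng.standard_normal()
    if xs:
        preds = np.array([leaf_pred(xs, ys, s, x) for s in splits])
        logw += -0.5 * (y - preds) ** 2 / 0.1 ** 2   # predictive weighting
    xs.append(x); ys.append(y)
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / np.sum(w ** 2) < P / 2:             # resample on low ESS
        splits = splits[rng.choice(P, size=P, p=w)]
        splits = np.clip(splits + 0.01 * rng.standard_normal(P), 0.0, 1.0)
        logw = np.zeros(P)

w = np.exp(logw - logw.max()); w /= w.sum()
split_hat = float(np.sum(w * splits))            # ensemble estimate of the split
```

Each new observation changes only predictions local to its leaf, while the spread of surviving split points tracks posterior uncertainty about the partition, mirroring the local-move, global-ensemble structure of the paper.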

5 citations


Posted Content
TL;DR: This work forms a Bayesian model averaging scheme to combine these two models and describes a Monte Carlo method for sampling from the full posterior distribution with joint proposals for the tree topology and the GP parameters corresponding to latent variables at the leaves.
Abstract: Recognizing the successes of treed Gaussian process (TGP) models as an interpretable and thrifty model for nonparametric regression, we seek to extend the model to classification. Both treed models and Gaussian processes (GPs) have, separately, enjoyed great success in application to classification problems. An example of the former is Bayesian CART. In the latter, real-valued GP output may be utilized for classification via latent variables, which provide classification rules by means of a softmax function. We formulate a Bayesian model averaging scheme to combine these two models and describe a Monte Carlo method for sampling from the full posterior distribution with joint proposals for the tree topology and the GP parameters corresponding to latent variables at the leaves. We concentrate on efficient sampling of the latent variables, which is important to obtain good mixing in the expanded parameter space. The tree structure is particularly helpful for this task and also for developing an efficient scheme for handling categorical predictors, which commonly arise in classification problems. Our proposed classification TGP (CTGP) methodology is illustrated on a collection of synthetic and real data sets. We assess performance relative to existing methods and thereby show how CTGP is highly flexible, offers tractable inference, produces rules that are easy to interpret, and performs well out of sample.
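The latent-variable classification rule mentioned above reduces, at a given input, to pushing one latent function value per class through a softmax. A minimal sketch with made-up latent values standing in for real GP output:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(z - z.max())
    return e / e.sum()

latent = np.array([0.2, 1.5, -0.3])   # hypothetical latent GP outputs at x
probs = softmax(latent)                # class probabilities, summing to 1
label = int(np.argmax(probs))          # rule: predict the most probable class
```

In CTGP these latent values come from independent GPs at each tree leaf, so the softmax yields different classification rules in different regions of the input space.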

4 citations


Posted Content
TL;DR: In this paper, a fully Bayesian hierarchical formulation of shrinkage regression for portfolio balancing is presented, allowing for heavy-tailed errors, relaxing the historical missingness assumption, and accounting for estimation risk.
Abstract: Portfolio balancing requires estimates of covariance between asset returns. Returns data have histories which greatly vary in length, since assets begin public trading at different times. This can lead to a huge amount of missing data--too much for the conventional imputation-based approach. Fortunately, a well-known factorization of the MVN likelihood under the prevailing historical missingness pattern leads to a simple algorithm of OLS regressions that is much more reliable. When there are more assets than returns, however, OLS becomes unstable. Gramacy et al. (2008) showed how classical shrinkage regression may be used instead, thus extending the state of the art to much bigger asset collections, with further accuracy and interpretation advantages. In this paper, we detail a fully Bayesian hierarchical formulation that extends the framework further by allowing for heavy-tailed errors, relaxing the historical missingness assumption, and accounting for estimation risk. We illustrate how this approach compares favorably to the classical one using synthetic data and an investment exercise with real returns. An accompanying R package is on CRAN.
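The OLS factorization the abstract refers to can be sketched for two assets with monotone histories: fit the long-history asset marginally, regress the short-history asset on it over the overlap, then reassemble the joint covariance. The sample sizes and true covariance below are illustrative, and this is the classical (unshrunk, non-Bayesian) baseline, not the paper's hierarchical model:

```python
import numpy as np

rng = np.random.default_rng(4)

# Monotone "historical" missingness: asset 1 has 300 returns, asset 2 only
# the most recent 120. The MVN likelihood factors as p(y1) p(y2 | y1), so
# the joint covariance falls out of one marginal fit plus one OLS regression.
n1, n2 = 300, 120
true_cov = np.array([[0.04, 0.018], [0.018, 0.03]])
y = rng.multivariate_normal([0.0, 0.0], true_cov, size=n1)
y1 = y[:, 0]          # full history
y2 = y[-n2:, 1]       # short history, overlapping the last n2 rows of y1

# Step 1: marginal moments of the long-history asset.
mu1, s11 = y1.mean(), y1.var()

# Step 2: OLS of the short-history asset on the long one, over the overlap.
X = np.column_stack([np.ones(n2), y1[-n2:]])
a, b = np.linalg.lstsq(X, y2, rcond=None)[0]
s_e = (y2 - X @ np.array([a, b])).var()   # residual variance

# Step 3: reassemble the implied joint mean and covariance.
mu2 = a + b * mu1
cov = np.array([[s11, b * s11],
                [b * s11, b * b * s11 + s_e]])
```

With many assets the same recursion runs asset by asset in order of history length; when regressors outnumber overlap observations, the OLS step is where shrinkage (or the paper's Bayesian hierarchy) takes over.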

3 citations

