
Showing papers by "Robert B. Gramacy published in 2016"


Journal ArticleDOI
TL;DR: This work discusses an implementation of local approximate Gaussian process models, in the laGP package for R, that offers a particular sparse-matrix remedy uniquely positioned to leverage modern parallel computing architectures.
Abstract: Gaussian process (GP) regression models make for powerful predictors in out of sample exercises, but cubic runtimes for dense matrix decompositions severely limit the size of data (training and testing) on which they can be deployed. That means that in computer experiment, spatial/geo-physical, and machine learning contexts, GPs no longer enjoy privileged status as data sets continue to balloon in size. We discuss an implementation of local approximate Gaussian process models, in the laGP package for R, that offers a particular sparse-matrix remedy uniquely positioned to leverage modern parallel computing architectures. The laGP approach can be seen as an update on the spatial statistical method of local kriging neighborhoods. We briefly review the method, and provide extensive illustrations of the features in the package through worked-code examples. The appendix covers custom building options for symmetric multi-processor and graphical processing units, and built-in wrapper routines that automate distribution over a simple network of workstations.
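laGP itself is an R package, so the snippet below is only a language-neutral Python/NumPy sketch of the core idea, kriging on a nearest-neighbor sub-design, with hypothetical names (`lagp_predict`), a fixed Gaussian kernel, and a fixed lengthscale rather than anything resembling the package's actual interface:

```python
# Sketch of local approximate GP prediction: for each prediction site, fit a
# small GP to only its n nearest design points, turning one O(N^3) solve
# into a cheap O(n^3) solve per site.
import numpy as np

def gauss_kernel(A, B, lengthscale=0.1):
    """Isotropic Gaussian correlation between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / lengthscale)

def lagp_predict(X, y, xstar, n=30, nugget=1e-6):
    """Predictive (kriging) mean at xstar from a GP fit to its n nearest neighbors."""
    idx = np.argsort(((X - xstar) ** 2).sum(1))[:n]     # local sub-design
    Xl, yl = X[idx], y[idx]
    K = gauss_kernel(Xl, Xl) + nugget * np.eye(n)       # small n x n system
    kvec = gauss_kernel(xstar[None, :], Xl).ravel()
    return kvec @ np.linalg.solve(K, yl)

rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 2))                         # large design
y = np.sin(5 * X[:, 0]) * np.cos(5 * X[:, 1])           # deterministic response
pred = lagp_predict(X, y, np.array([0.5, 0.5]))
```

With N = 2000 this replaces one 2000-by-2000 decomposition with a single 30-by-30 solve per prediction site; laGP additionally builds the sub-design sequentially by a variance-based criterion rather than by plain nearest neighbors.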

161 citations


Journal ArticleDOI
TL;DR: In this article, a combination of response surface modeling, expected improvement, and the augmented Lagrangian numerical optimization framework is proposed to solve the problem of constrained black-box optimization.
Abstract: Constrained blackbox optimization is a difficult problem, with most approaches coming from the mathematical programming literature. The statistical literature is sparse, especially in addressing problems with nontrivial constraints. This situation is unfortunate because statistical methods have many attractive properties: global scope, handling noisy objectives, sensitivity analysis, and so forth. To narrow that gap, we propose a combination of response surface modeling, expected improvement, and the augmented Lagrangian numerical optimization framework. This hybrid approach allows the statistical model to think globally and the augmented Lagrangian to act locally. We focus on problems where the constraints are the primary bottleneck, requiring expensive simulation to evaluate and substantial modeling effort to map out. In that context, our hybridization presents a simple yet effective solution that allows existing objective-oriented statistical approaches, like those based on Gaussian process surrogates ...
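As a sketch of the outer loop such a hybrid sits inside (toy problem and constants assumed; a brute-force grid minimization stands in for the Gaussian-process/expected-improvement inner solver the paper uses):

```python
# Augmented Lagrangian (AL) outer loop on a toy problem: minimize x1 + x2 on
# the unit square subject to c(x) = 1.5 - x1 - x2 <= 0 (i.e. x1 + x2 >= 1.5).
import numpy as np

def f(x):   # objective
    return x[..., 0] + x[..., 1]

def c(x):   # inequality constraint, feasible when c(x) <= 0
    return 1.5 - x[..., 0] - x[..., 1]

g = np.linspace(0, 1, 201)
G = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)

lam, rho = 0.0, 1.0
for _ in range(20):
    # AL objective: f + lam * c + (1 / (2 rho)) * max(0, c)^2
    al = f(G) + lam * c(G) + (1.0 / (2 * rho)) * np.maximum(0.0, c(G)) ** 2
    xbest = G[np.argmin(al)]              # "inner solver" = exhaustive grid
    lam = max(0.0, lam + c(xbest) / rho)  # multiplier update (acts locally)
    rho = rho / 2                         # tighten the penalty
xsol = xbest
```

On this toy problem the loop settles on the constraint boundary x1 + x2 = 1.5 with multiplier lam = 1; in the paper's hybrid, each grid scan is replaced by a surrogate-guided global search, so the statistical model thinks globally while the AL updates act locally.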

85 citations


Journal ArticleDOI
TL;DR: In this paper, a unified view of likelihood based Gaussian process regression for simulation experiments exhibiting input-dependent noise is presented, where multiple applications of a well-known Woodbury identity facilitate inference for all parameters under the likelihood, bypassing the typical full-data sized calculations.
Abstract: We present a unified view of likelihood based Gaussian process regression for simulation experiments exhibiting input-dependent noise. Replication plays an important role in that context; however, previous methods leveraging replicates have either ignored the computational savings that come from such design, or have short-cut full likelihood-based inference to remain tractable. Starting with homoskedastic processes, we show how multiple applications of a well-known Woodbury identity facilitate inference for all parameters under the likelihood (without approximation), bypassing the typical full-data sized calculations. We then borrow a latent-variable idea from machine learning to address heteroskedasticity, adapting it to work within the same thrifty inferential framework, thereby simultaneously leveraging the computational and statistical efficiency of designs with replication. The result is an inferential scheme that can be characterized as a single objective function, complete with closed form derivatives, for rapid library-based optimization. Illustrations are provided, including real-world simulation experiments from manufacturing and the management of epidemics.
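The computational point can be seen in a small Python check (a sketch under assumed notation, not the paper's code): for a homoskedastic GP, the kriging mean computed from all N replicated observations coincides with the one computed from the n unique sites using replicate-averaged responses and noise scaled by the replicate counts, which is the Woodbury-style saving the abstract refers to:

```python
# Replication identity for GP prediction: the full N-sized kriging mean
# equals the n-unique-sites version with averaged responses ybar and
# per-site noise s2 / a_i, where a_i counts replicates at site i.
import numpy as np

def kern(A, B, ls=0.5):
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-d2 / ls)

rng = np.random.default_rng(1)
xu = np.array([0.1, 0.4, 0.7, 0.9])        # unique design locations (n = 4)
a = np.array([3, 1, 5, 2])                 # replicate counts (N = 11)
xfull = np.repeat(xu, a)                   # full replicated design
yfull = np.sin(2 * np.pi * xfull) + 0.1 * rng.standard_normal(xfull.size)
s2 = 0.01                                  # noise variance

xs = np.array([0.55])
# Full-data kriging mean: O(N^3)
Kf = kern(xfull, xfull) + s2 * np.eye(xfull.size)
mu_full = kern(xs, xfull) @ np.linalg.solve(Kf, yfull)

# Unique-design version: O(n^3), using sufficient statistics of the replicates
ybar = np.array([yfull[xfull == u].mean() for u in xu])
Ku = kern(xu, xu) + s2 * np.diag(1.0 / a)
mu_uniq = kern(xs, xu) @ np.linalg.solve(Ku, ybar)
```

The two means agree to machine precision: replication enters the likelihood and predictor only through averages and counts, so all heavy linear algebra happens at the n unique sites.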

61 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that the exhaustive, discrete nature of an important search subroutine involved in building local designs may be overly conservative, and that searching the space radially, continuously along rays emanating from the predictive location of interest, is a far thriftier alternative.
Abstract: Recent implementations of local approximate Gaussian process models have pushed computational boundaries for nonlinear, nonparametric prediction problems, particularly when deployed as emulators for computer experiments. Their flavor of spatially independent computation accommodates massive parallelization, meaning that they can handle designs two or more orders of magnitude larger than previously. However, accomplishing that feat can still require massive computational horsepower. Here we aim to ease that burden. We study how predictive variance is reduced as local designs are built up for prediction. We then observe how the exhaustive and discrete nature of an important search subroutine involved in building such local designs may be overly conservative. Rather, we suggest that searching the space radially, that is, continuously along rays emanating from the predictive location of interest, is a far thriftier alternative. Our empirical work demonstrates that ray-based search yields predictors with accur...
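To make the contrast concrete, here is a toy Python sketch (all names and constants assumed) of scoring the next local design point by reduction in kriging variance at the prediction site, once by exhaustive scan over a discrete candidate grid and once along a single ray, where the criterion becomes a one-dimensional search:

```python
# Discrete exhaustive search vs. continuous ray-based search for the next
# local design point, using drop in kriging variance at xstar as the score.
import numpy as np

def kern(A, B, ls=0.1):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / ls)

rng = np.random.default_rng(4)
D = rng.uniform(size=(25, 2))               # current local design
xstar = np.array([[0.5, 0.5]])
Kinv = np.linalg.inv(kern(D, D) + 1e-6 * np.eye(len(D)))

def var_reduction(xnew):
    """Drop in kriging variance at xstar from adding xnew to the design."""
    xnew = np.atleast_2d(xnew)
    kxs = kern(xstar, xnew)[0, 0]
    kDx = kern(D, xnew)[:, 0]
    ksD = kern(xstar, D)[0]
    num = (kxs - ksD @ Kinv @ kDx) ** 2
    den = 1.0 - kDx @ Kinv @ kDx + 1e-12
    return num / den

# (a) exhaustive scan over a discrete candidate grid
g = np.linspace(0, 1, 50)
cands = np.stack(np.meshgrid(g, g, indexing="ij"), -1).reshape(-1, 2)
best_disc = cands[np.argmax([var_reduction(z) for z in cands])]

# (b) continuous search along one ray xstar + t * u (coarse 1-d grid here)
u = np.array([1.0, 0.0])
ts = np.linspace(1e-3, 0.5, 200)
best_ray = xstar[0] + ts[np.argmax([var_reduction(xstar[0] + t * u) for t in ts])] * u
```

A real implementation would optimize over t with a 1-d solver and consider several rays; the point is that the criterion is smooth in t, so continuous search along rays replaces an exhaustive sweep over thousands of candidates.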

45 citations



Posted Content
TL;DR: In this article, an alternative slack variable augmented Lagrangian (ALBO) is proposed to evaluate the expected improvement (EI) with library routines, and the slack variables furthermore facilitate equality as well as inequality constraints, and mixtures thereof.
Abstract: An augmented Lagrangian (AL) can convert a constrained optimization problem into a sequence of simpler (e.g., unconstrained) problems, which are then usually solved with local solvers. Recently, surrogate-based Bayesian optimization (BO) sub-solvers have been successfully deployed in the AL framework for a more global search in the presence of inequality constraints; however, a drawback was that expected improvement (EI) evaluations relied on Monte Carlo. Here we introduce an alternative slack variable AL, and show that in this formulation the EI may be evaluated with library routines. The slack variables furthermore facilitate equality as well as inequality constraints, and mixtures thereof. We show how our new slack "ALBO" compares favorably to the original. Its superiority over conventional alternatives is reinforced on several mixed constraint examples.
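The reformulation being described can be written down compactly; the following is a sketch in standard notation (symbols assumed, not taken verbatim from the paper). Each inequality constraint becomes an equality via a nonnegative slack variable, and the AL is built on the equalities:

```latex
% inequality -> equality via slack variables s_j >= 0:
%   c_j(x) \le 0 \iff c_j(x) + s_j = 0, \quad s_j \ge 0,
% giving the slack augmented Lagrangian
L_A(x, s; \lambda, \rho) = f(x) + \lambda^\top \left( c(x) + s \right)
  + \frac{1}{2\rho} \left\lVert c(x) + s \right\rVert^2,
  \qquad s \ge 0 .
```

Because L_A is a convex quadratic in each s_j, the inner minimization over the slacks has the closed form s_j = max(0, -rho * lambda_j - c_j(x)); the abstract's claim that EI may then be evaluated with library routines builds on this analytic structure.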

25 citations


Proceedings Article
05 Dec 2016
TL;DR: In this article, an alternative slack variable augmented Lagrangian (ALBO) is proposed to evaluate the expected improvement (EI) with library routines, and the slack variables furthermore facilitate equality as well as inequality constraints, and mixtures thereof.
Abstract: An augmented Lagrangian (AL) can convert a constrained optimization problem into a sequence of simpler (e.g., unconstrained) problems, which are then usually solved with local solvers. Recently, surrogate-based Bayesian optimization (BO) sub-solvers have been successfully deployed in the AL framework for a more global search in the presence of inequality constraints; however, a drawback was that expected improvement (EI) evaluations relied on Monte Carlo. Here we introduce an alternative slack variable AL, and show that in this formulation the EI may be evaluated with library routines. The slack variables furthermore facilitate equality as well as inequality constraints, and mixtures thereof. We show how our new slack "ALBO" compares favorably to the original. Its superiority over conventional alternatives is reinforced on several mixed constraint examples.

14 citations


Journal Article
TL;DR: In this article, the authors provide supplementary material to "Speeding Up Neighborhood Search in Local Gaussian Process Prediction".
Abstract: Supplementary material to "Speeding Up Neighborhood Search in Local Gaussian Process Prediction"

12 citations


Posted ContentDOI
TL;DR: This chapter describes and illustrates a simple algorithm for recovering partial player effects underlying hockey's plus-minus, based on a regularized logistic regression model that predicts which team scored a given goal as a function of who was on the ice, which teams were playing, and details of the game situation.
Abstract: A hockey player's plus-minus measures the difference between goals scored by and against that player's team while the player was on the ice. This measures only a marginal effect, failing to account for the influence of the others he is playing with and against. A better approach would be to jointly model the effects of all players, and any other confounding information, in order to infer a partial effect for this individual: his influence on the box score regardless of who else is on the ice. This chapter describes and illustrates a simple algorithm for recovering such partial effects. There are two main ingredients. First, we provide a logistic regression model that can predict which team has scored a given goal as a function of who was on the ice, what teams were playing, and details of the game situation (e.g. full-strength or power-play). Since the resulting model is so high dimensional that standard maximum likelihood estimation techniques fail, our second ingredient is a scheme for regularized estimation. This adds a penalty to the objective that favors parsimonious models and stabilizes estimation. Such techniques have proven useful in fields from genetics to finance over the past two decades, and have demonstrated an impressive ability to gracefully handle large and highly imbalanced data sets. The latest software packages accompanying this new methodology -- which exploit parallel computing environments, sparse matrices, and other features of modern data structures -- are widely available and make it straightforward for interested analysts to explore their own models of player contribution.
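A miniature of that setup can be sketched in Python. Everything here, the design encoding, the ridge penalty, and the plain gradient-ascent fit, is an illustrative stand-in for the regularization schemes and sparse-matrix software the chapter actually discusses:

```python
# Toy partial plus-minus model: each goal is a row, the response records
# which team scored, and each player's column is +1 (on ice, home team),
# -1 (on ice, away team), or 0 (on the bench).  A ridge (L2) penalty
# stands in for the chapter's regularization so gradient ascent suffices.
import numpy as np

rng = np.random.default_rng(2)
n_goals, n_players = 500, 40
X = rng.choice([-1.0, 0.0, 1.0], size=(n_goals, n_players), p=[0.25, 0.5, 0.25])
beta_true = rng.normal(0, 0.5, n_players)          # latent player effects
p = 1 / (1 + np.exp(-(X @ beta_true)))
y = rng.binomial(1, p)                             # 1 = home team scored

def fit_ridge_logistic(X, y, lam=1.0, lr=0.1, iters=2000):
    """Penalized MLE: maximize log-likelihood - (lam / 2) * ||beta||^2."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1 / (1 + np.exp(-(X @ beta)))
        grad = X.T @ (y - mu) - lam * beta         # penalized score function
        beta += lr * grad / len(y)
    return beta

beta_hat = fit_ridge_logistic(X, y)
```

The fitted coefficient for a player is his partial effect after adjusting for everyone else on the ice; the penalty is what keeps estimation stable when, as in the chapter's real data, the design is enormous, sparse, and highly imbalanced.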

6 citations


Journal ArticleDOI
TL;DR: In this article, the authors employ foreign exchange market risk factors as fundamentals, and Bayesian treed Gaussian process (BTGP) models to handle nonlinear, time-varying relationships between these fundamentals and exchange rates.
Abstract: To improve short-horizon exchange rate forecasts, we employ foreign exchange market risk factors as fundamentals, and Bayesian treed Gaussian process (BTGP) models to handle non-linear, time-varying relationships between these fundamentals and exchange rates. Forecasts from the BTGP model conditional on the carry and dollar factors dominate random walk forecasts on accuracy and economic criteria in the Meese-Rogoff setting. Superior market timing ability for large moves, more than directional accuracy, drives the BTGP’s success. We explain how, through a model averaging Monte Carlo scheme, the BTGP is able to simultaneously exploit smoothness and rough breaks in between-variable dynamics. Either feature in isolation is unable to consistently outperform benchmarks throughout the full span of time in our forecasting exercises. Trading strategies based on ex ante BTGP forecasts deliver the highest out-of-sample risk-adjusted returns for the median currency, as well as for both predictable, traded risk factors.

4 citations


Journal ArticleDOI
TL;DR: In this paper, two computationally efficient neighborhood search limiting techniques are proposed, a maximum distance method and a feature approximation method, which can save substantial computation while retaining the emulation accuracy.
Abstract: Gaussian process models are commonly used as emulators for computer experiments. However, developing a Gaussian process emulator can be computationally prohibitive when the number of experimental samples is even moderately large. Local Gaussian process approximation (Gramacy and Apley, 2015) was proposed as an accurate and computationally feasible emulation alternative. However, constructing local sub-designs specific to predictions at a particular location of interest remains a substantial computational bottleneck to the technique. In this paper, two computationally efficient neighborhood search limiting techniques are proposed, a maximum distance method and a feature approximation method. Two examples demonstrate that the proposed methods indeed save substantial computation while retaining emulation accuracy.
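A distance-based pruning rule of the flavor described can be sketched in a few lines of Python (radius and sizes assumed for illustration): candidates farther than a maximum distance from the prediction site are discarded before the local sub-design search begins:

```python
# Maximum-distance style pruning: keep only design points within radius r of
# the prediction site as candidates for the local sub-design.
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(size=(10000, 2))           # full design
xstar = np.array([0.5, 0.5])               # prediction site
n = 50                                     # local sub-design size

d2 = ((X - xstar) ** 2).sum(1)
r = 0.1                                    # pruning radius (assumed)
cand = np.flatnonzero(d2 <= r ** 2)        # pruned candidate set

# The n nearest neighbors among the pruned set match the global ones,
# provided r was chosen generously enough to contain them.
local = cand[np.argsort(d2[cand])[:n]]
global_nn = np.argsort(d2)[:n]
```

Here the pruned candidate set is a few hundred points out of 10,000 yet still contains all 50 nearest neighbors; choosing the radius (or a feature-based approximation, the paper's second method) trades computational savings against the risk of clipping useful candidates.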

Journal Article
TL;DR: In this article, the authors provide supplementary material to "Modeling an Augmented Lagrangian for Blackbox Constrained Optimization".
Abstract: Supplementary material to "Modeling an Augmented Lagrangian for Blackbox Constrained Optimization"