
HAL Id: hal-00319385 (https://hal.archives-ouvertes.fr/hal-00319385v2), submitted on 22 Jun 2010.

Adaptive Designs of Experiments for Accurate
Approximation of a Target Region
Victor Picheny
Ecole Centrale Paris
victor.picheny@ecp.fr
David Ginsbourger
University of Bern
david.ginsbourger@stat.unibe.ch
Olivier Roustant
Ecole des Mines de St Etienne
roustant@emse.fr
Raphael T. Haftka
University of Florida
haftka@ufl.edu
Nam-Ho Kim
University of Florida
nkim@ufl.edu
This paper addresses the issue of designing experiments for
a metamodel that needs to be accurate for a certain level
of the response value. Such a situation is common in con-
strained optimization and reliability analysis. Here, we pro-
pose an adaptive strategy to build designs of experiments that
is based on an explicit trade-off between reduction of global
uncertainty and exploration of regions of interest. A mod-
ified version of the classical integrated mean square error
criterion is used that weights the prediction variance with
the expected proximity to the target level of response. The
method is illustrated by two simple examples. It is shown
that a substantial reduction of error can be achieved in the
target regions, with reasonable loss of global accuracy. The
method is finally applied to a reliability analysis problem;
it is found that the adaptive designs significantly outperform
classical space-filling designs.
1 Introduction
Over the past decades, metamodeling techniques have been
widely recognized as an efficient way to address the prediction
and optimization of expensive-to-compute numerical
simulators or black-box functions [1, 2]. A metamodel
(or surrogate model) is an approximation to system response
constructed from its value at a limited number of selected
input values, the design of experiments (DoE). In many en-
gineering problems, the total number of function evaluations
is drastically limited by computational cost; hence, it is of
crucial interest to develop methods for efficiently selecting
the experiments.
In this paper, we focus on a particular application where
metamodels are used in a way that their accuracy is crucial
for certain level-sets. This situation is common in two popu-
lar frameworks:
- In constrained optimization, the constraint function often
relies on expensive calculations. For instance, a typical
structural optimization formulation is to minimize a
weight such that the maximum stress, computed by fi-
nite element analysis, does not exceed a certain value.
When using a metamodel to approximate the constraint,
it is of utmost importance that the approximation error
is minimal on the boundary that separates the feasible
designs from infeasible ones. Substantial errors for val-
ues far from the boundary, on the other hand, are not
detrimental.
- In reliability analysis, a metamodel is often used to propagate
the uncertainty of random input variables to the
performance function of a system [3, 4]. In particular,
the probability of failure of the system can be computed
using sampling techniques (i.e. Monte-Carlo Simula-
tions, MCS), by counting the number of responses that
are above a certain threshold. The contour line of the re-
sponse equal to the threshold must be known accurately
to discriminate between samples.
The objective of the present work is to provide a methodology
to construct a design of experiments such that the
metamodel accurately approximates the vicinity of a bound-
ary in design space defined by a target value of the func-
tion of interest. Mourelatos et al. [5] used a combination of
global and local metamodels to first detect the critical regions
and then obtain a locally accurate approximation. Ranjan et
al. [6] proposed a modified version of the famous EGO al-
gorithm (Efficient Global Optimization, [7]) to sequentially
explore the domain along a contour line. Tu et al. [8] used
a modified D-optimal strategy for boundary-focused
polynomial regression. Vazquez and Bect [9] proposed an
iterative strategy for accurate computation of a probability of
failure based on Kriging. In this paper, we present an alternative
criterion to sequentially choose the experiments, based
on an explicit trade-off between the exploration of the target
region (in the vicinity of the contour line) and reduction
of the global uncertainty (prediction variance) in the meta-
model.
The paper is organized as follows: in Section 2, the Krig-
ing model and the framework of design of experiments are
described. In Section 3, the proposed criterion for selecting
experiments is presented, followed by its associated sequential
strategy to derive designs of experiments in Section 4.
Results are presented for two analytical examples in Section
5. Finally, the criterion is applied to a probability of failure
estimation problem.
2 Kriging Metamodel and Design of Experiments
Let us first introduce some notation. We denote by $y$ the
response of a numerical simulator or function that is to be
studied:

$$y : D \subset \mathbb{R}^d \to \mathbb{R}, \qquad x \mapsto y(x) \qquad (1)$$

where $x = \{x_1, \ldots, x_d\}^T$ is a vector of input variables,
and $D$ is the design space. In order to build a metamodel, the
response $y$ is observed at $n$ distinct locations $X$:

$$X = [x_1, \ldots, x_n], \qquad Y = [y(x_1), \ldots, y(x_n)]^T = y(X) \qquad (2)$$
In Eqn. 2, choosing X is called the design of experi-
ments (DoE), and Y is the vector of observations. Since the
response y is expensive to evaluate, we approximate it by a
simple model M, called the metamodel or surrogate model,
based on assumptions on the nature of y and on its observa-
tions Y at the points of the DoE. In this paper, we present a
particular metamodel, Universal Kriging (UK), and we dis-
cuss some important issues about the choice of the design of
experiments.
2.1 Universal Kriging Model
The main hypothesis behind the Kriging model is to assume
that the true function $y$ is one realization of a Gaussian
stochastic process $Y$, $y(x) = Y(x, \omega)$, where $\omega$ belongs to the
underlying probability space $\Omega$. In the following we use the
notation $Y(x)$ for the process and $Y(x, \omega)$ for one realization.
For Universal Kriging [10], $Y$ is typically of the form:

$$Y(x) = \sum_{j=1}^{p} \beta_j f_j(x) + Z(x) \qquad (3)$$

where the $f_j$ are linearly independent known functions, and $Z$ is
a Gaussian process [11] with zero mean and stationary covariance
kernel $k$ with known correlation structure and parameters.
Under such hypotheses, the best linear unbiased predictor
(BLUP) of $Y(x)$ (for any $x$ in $D$), knowing the observations
$Y$, is given by the following equation [10, 11]:

$$m_K(x) = f(x)^T \hat{\beta} + c(x)^T C^{-1} \left( Y - F \hat{\beta} \right) \qquad (4)$$

where $f(x) = [f_1(x), \ldots, f_p(x)]^T$ is the $p \times 1$ vector of basis
functions, $\hat{\beta} = [\hat{\beta}_1, \ldots, \hat{\beta}_p]^T$ is the $p \times 1$ vector of estimates of
$\beta$, $c(x) = [k(x, x_1), \ldots, k(x, x_n)]^T$ is the $n \times 1$ vector of covariances,
$C = [k(x_i, x_j)]_{1 \le i,j \le n}$ is the $n \times n$ covariance matrix, and
$F = [f(x_1), \ldots, f(x_n)]^T$ is the $n \times p$ experimental matrix. In Eqn.
4, $\hat{\beta}$ is the vector of generalized least squares estimates of $\beta$:

$$\hat{\beta} = \left( F^T C^{-1} F \right)^{-1} F^T C^{-1} Y \qquad (5)$$
In addition, the Universal Kriging model provides an estimate
of the accuracy of the mean predictor, the Kriging prediction
variance:

$$s_K^2(x) = k(x, x) - c(x)^T C^{-1} c(x) + \left( f(x)^T - c(x)^T C^{-1} F \right) \left( F^T C^{-1} F \right)^{-1} \left( f(x)^T - c(x)^T C^{-1} F \right)^T \qquad (6)$$

where $\sigma^2$ is the process variance. Note that if the prediction
variance is written in terms of correlations (instead of
covariances here), Eqn. 6 can be factorized by $\sigma^2$. For details
of the derivations, see for instance [10, 11]. It is important
to notice here that the Kriging variance in Eqn. 6, assuming
that the covariance parameters are known, does not depend
on the observations $Y$, but only on the Kriging model and on
the design of experiments.
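As an illustration of Eqns. 4-6, the Universal Kriging mean and variance can be sketched in a few lines of linear algebra. This is not the authors' implementation: the kernel, the basis functions, and the plain matrix inversion below are illustrative choices (in practice a Cholesky factorization of C would be preferred for stability).

```python
import numpy as np

def uk_predict(x_new, X, Y, kernel, basis):
    """Universal Kriging mean (Eqn. 4) and prediction variance (Eqn. 6).

    X : (n, d) design points, Y : (n,) observations,
    kernel(a, b) : covariance k(a, b) between two points,
    basis(x)    : (p,) vector f(x) of trend functions.
    """
    n = X.shape[0]
    C = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
    F = np.array([basis(X[i]) for i in range(n)])           # n x p experimental matrix
    c = np.array([kernel(x_new, X[i]) for i in range(n)])   # n x 1 covariance vector
    f = basis(x_new)
    Ci = np.linalg.inv(C)                                   # illustrative; prefer Cholesky
    A = F.T @ Ci @ F
    # Generalized least-squares estimate of beta (Eqn. 5)
    beta = np.linalg.solve(A, F.T @ Ci @ Y)
    # Kriging mean (Eqn. 4): trend plus correction toward the data
    m = f @ beta + c @ Ci @ (Y - F @ beta)
    # Kriging variance (Eqn. 6)
    u = f - F.T @ Ci @ c
    s2 = kernel(x_new, x_new) - c @ Ci @ c + u @ np.linalg.solve(A, u)
    return m, s2
```

On a toy one-dimensional design, this predictor reproduces the properties stated below in Eqns. 9-10: it interpolates the observations, and its variance vanishes at the design points.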
We denote by $M(x)$ the Gaussian process conditional on the
observations $Y$:

$$M := (M(x))_{x \in D} = (Y(x) \,|\, Y(X) = Y)_{x \in D} = (Y(x) \,|\, \text{obs})_{x \in D} \qquad (7)$$

The Kriging model provides the marginal distribution of $M$
at a prediction point $x$:

$$M(x) \sim \mathcal{N}\left( m_K(x), s_K^2(x) \right) \qquad (8)$$
The Kriging mean $m_K$ interpolates the function $y(x)$ at the
design of experiments points:

$$m_K(x_i) = y(x_i), \quad 1 \le i \le n \qquad (9)$$

The Kriging variance is null at the observation points $x_i$,
and greater than zero elsewhere:

$$s_K^2(x_i) = 0, \quad 1 \le i \le n \qquad \text{and} \qquad s_K^2(x) > 0, \quad x \ne x_i \qquad (10)$$
Besides, the Kriging variance increases as the covariance
between $Y(x)$ and the $Y(x_i)$ ($1 \le i \le n$) decreases.
Some parameters of the covariance kernel are often unknown
and must be estimated based on the observations, using max-
imum likelihood, cross-validation or variogram techniques
for instance (see [10,11]). However, in the Kriging model
they are considered as known. To account for additional
variability due to the parameter estimation, one may use
Bayesian Kriging models (see [12, 13]), which will not be
detailed here. With such models, Eqn. 8 does not hold in
general. However, the methodology proposed here also ap-
plies to Bayesian Kriging, with the appropriate modifications
of the calculations shown in Section 3.

2.2 Design of experiments
Choosing the set of experiments (sampling points) X
plays a critical role in the accuracy of the metamodel and the
subsequent use of the metamodel for prediction. DoEs are
often based on geometric considerations, such as Latin Hy-
percube sampling (LHS) [14], or Full-factorial designs [15].
In this section, we introduce two important notions: model-
oriented and adaptive designs.
2.2.1 Model-oriented designs
Model-oriented designs aim at maximizing the quality
of statistical inference of a given metamodel. In linear
regression [16, 17], A- and D-optimal designs minimize the
uncertainty in the coefficients, when uncertainty is due to
noisy observations. Formally, the A- and D-optimality cri-
teria are, respectively, the trace and determinant of Fisher’s
information matrix.
These criteria are particularly relevant in regression since
minimizing the uncertainty in the coefficients also minimizes
the uncertainty in the prediction (Kiefer, [16]). For Kriging,
uncertainties in covariance parameters and prediction are not
simply related. Instead, a natural alternative is to take ad-
vantage of the prediction variance associated with the meta-
model, assuming that the covariance structure and param-
eters are accurately estimated. The prediction variance al-
lows us to build measures that reflect the overall accuracy of
Kriging. Two different criteria are available: the integrated
mean square error (IMSE) and maximum mean square error
(MMSE) [18]:
$$\text{IMSE} = \int_D \text{MSE}(x)\, d\mu(x) \qquad (11)$$

$$\text{MMSE} = \max_{x \in D} \left[ \text{MSE}(x) \right] \qquad (12)$$

where $\mu$ is a positive measure on $D$ and

$$\text{MSE}(x) = E\left[ \left( m_K(x) - M(x) \right)^2 \right] = s_K^2(x) \qquad (13)$$
Note that the above criteria are often called I-criterion
and G-criterion, respectively, in the regression framework.
The IMSE is a measure of the average accuracy of the meta-
model, while the MMSE measures the risk of large error in
prediction.
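On a discretized design space, both criteria of Eqns. 11-12 can be approximated directly from the prediction variance. The sketch below is not from the paper: `pred_var` stands for any function returning $s_K^2(x)$, and $\mu$ is taken as the uniform measure on a unit-volume domain, so the integral reduces to an average over the grid.

```python
import numpy as np

def imse_mmse(pred_var, grid):
    """Grid approximations of IMSE (Eqn. 11) and MMSE (Eqn. 12),
    using MSE(x) = s_K^2(x) (Eqn. 13) and a uniform measure mu
    on a domain of volume 1 (integral = average over the grid)."""
    mse = np.array([pred_var(x) for x in grid])
    return mse.mean(), mse.max()
```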
Optimal designs are model-dependent, in the sense that the
optimality criterion is determined by the choice of the meta-
model. In regression, A- and D-criteria depend on the choice
of the basis functions, while in Kriging, the prediction vari-
ance s
2
K
depends on the linear trend, the covariance structure,
and parameter values. However, one may notice that, assum-
ing that the trend and covariance structures are known, none
of the criteria depends on the response values at the design
points.
2.2.2 Adaptive designs
The previous DoE strategies choose all the points of the
design before computing any observation. It is also possible
to build the DoE sequentially, by choosing a new point as a
function of the other points and their corresponding response
values. Such an approach has received considerable attention
from the engineering and mathematical statistics communities,
for its advantages of flexibility and adaptability over
other methods [19,20].
Typically, the new point maximizes some criterion.
For instance, a sequential DoE can be built by making
at each step a new observation where the prediction variance
is maximal. Sacks et al. [18] use this strategy as a heuristic
to build IMSE-optimal designs for Kriging. The advantage
of sequential strategy here is twofold. Firstly, it is computa-
tionally efficient because it transforms an optimization prob-
lem of dimension n × d (for the IMSE minimization) into
k optimizations of dimension d. Secondly, it allows us to
reevaluate the covariance parameters after each observation.
In the same fashion, Williams et al. [21], Currin et al. [22],
and Santner [2] use a Bayesian approach to derive sequential
IMSE designs. Osio and Amon [23] proposed a multistage
approach that first enhances space-filling in order to accurately
estimate the Kriging covariance parameters, and then refines
the DoE by reducing the model uncertainty. Some reviews
of adaptive sampling in engineering design can be found in
Jin et al. [24].
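The variance-maximization heuristic of Sacks et al. [18] mentioned above can be written as a generic loop. The sketch below is a simplified stand-in, not the paper's code: `pred_var(X, x)` may be any surrogate's variance estimate, and the test uses the distance to the nearest design point, a crude proxy consistent with the earlier remark that the Kriging variance grows away from the DoE.

```python
import numpy as np

def sequential_doe(y, candidates, pred_var, n_init, n_total):
    """Sequential design heuristic: at each step, observe y at the
    candidate point where the (surrogate) prediction variance is maximal."""
    X = list(candidates[:n_init])
    Y = [y(x) for x in X]
    for _ in range(n_total - n_init):
        x_new = max(candidates, key=lambda x: pred_var(X, x))
        X.append(x_new)
        Y.append(y(x_new))  # covariance parameters could be re-estimated here
    return np.array(X), np.array(Y)
```

Each step is a single d-dimensional search (here over a candidate set), which is the computational advantage of the sequential strategy discussed above.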
In general, a particular advantage of sequential strategies
over other DoEs is that they can integrate the information
given by the first k observation values to choose the (k+1)-th
training point, for instance by reevaluating the Kriging
covariance parameters. It is also possible to define response-
dependent criteria, naturally leading to surrogate-based
optimization. One of the most famous adaptive strategies is the
EGO algorithm of Jones et al. [7], used to derive sequential
designs for the optimization of deterministic simulation
models, by choosing at each step the point that maximizes the
expected improvement, a functional that represents a com-
promise between exploration of unknown regions and local
search. Jones [25] also proposes maximum probability of
improvement as an alternative criterion.
In this paper, the objective is not optimization, but to accu-
rately fit a function when it is close to a given threshold. It is
then obvious that the DoE needs to be built according to the
observation values, hence sequentially. Shan and Wang [26]
proposed a rough set based approach to identify sub-regions
of the design space that are expected to have performance
values equal to a given level. Ranjan et al. [6] proposed a
sequential DoE method for contour estimation, which con-
sists of a modified version of the EGO algorithm. The func-
tional minimized at each step is a trade-off between uncer-
tainty and proximity to the actual contour. Tu et al. [8] used
a weighted D-optimal strategy for polynomial regression, the
acceptable sampling region at each step being limited by ap-
proximate bounds around the target contour. Oakley [27]
uses Kriging and sequential strategies for uncertainty propa-
gation and estimation of percentiles of the output of com-
puter codes. Vazquez and Bect [9] proposed an iterative
strategy for probability of failure estimation by minimizing
the classification error when using Kriging. All these papers
aim at constructing DoEs for accurate approximation of sub-

Fig. 1. One-dimensional illustration of the target region. Here, T =
1 and ε = 0.2. The target region consists of two distinct intervals.
regions of the design space. Our work proposes an alternative
criterion which focuses on the integral of the prediction
variance (rather than a pointwise criterion).
3 Weighted IMSE Criterion
In this section, we present a variation of the IMSE crite-
rion, adapted to the problem of fitting a function accurately
for a certain level-set. The controlling idea of this work is
that the surrogate does not need to be globally accurate, but
only in some critical regions, namely the vicinity of the
target boundary.
3.1 Target region defined by an indicator function
The IMSE criterion is convenient because it sums up
the uncertainty associated with the Kriging model over the
entire domain $D$. However, we are interested in predicting
$Y$ accurately in the vicinity of a level-set $y^{-1}(T) = \{x \in D : y(x) = T\}$
($T$ a constant). Such a criterion is then not suitable, since it
weights all points in $D$ according to their Kriging variance,
which does not depend on the observations $Y$: it therefore
favors zones not on the basis of their $y$ values, but only on
the basis of their position with respect to the DoE.
We propose to change the integration domain from $D$ to a
neighborhood of $y^{-1}(T)$ in order to learn $y$ accurately near
the contour line. We define a region of interest $X_{T,\varepsilon}$ (parameterized
by $\varepsilon > 0$) as the subset of $D$ whose image is within
the bounds $T - \varepsilon$ and $T + \varepsilon$:

$$X_{T,\varepsilon} = y^{-1}([T-\varepsilon, T+\varepsilon]) = \{x \in D \mid y(x) \in [T-\varepsilon, T+\varepsilon]\} \qquad (14)$$
Figure 1 illustrates a one-dimensional function with the
region of interest being at T = 1 and ε = 0.2. Note that the
target region consists of two distinct intervals.
With the region of interest, the targeted IMSE criterion
is defined as follows:

$$\text{imse}_T = \int_{X_{T,\varepsilon}} s_K^2(x)\, dx = \int_D s_K^2(x)\, \mathbf{1}_{[T-\varepsilon,T+\varepsilon]}[y(x)]\, dx \qquad (15)$$

where $\mathbf{1}_{[T-\varepsilon,T+\varepsilon]}[y(x)]$ is the indicator function, equal to 1
when $y(x) \in [T-\varepsilon, T+\varepsilon]$ and 0 elsewhere.
Finding a design that minimizes $\text{imse}_T$ would make the
metamodel accurate in the subset $X_{T,\varepsilon}$, which is exactly what
we want. Weighting the IMSE criterion over a region of interest
is classical, and proposed for instance in [15], pp. 433-434.
However, the notable difference here is that this region is
unknown a priori.
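For a known function, the region of interest of Eqn. 14 is straightforward to evaluate on a grid. The toy function below is hypothetical (not the one of Fig. 1), chosen so that, as in Fig. 1, the target region for T = 1 and ε = 0.2 splits into two distinct intervals.

```python
import numpy as np

def target_region(y, grid, T, eps):
    """Grid approximation of X_{T,eps} (Eqn. 14): the grid points
    whose response lies within [T - eps, T + eps]."""
    vals = np.array([y(x) for x in grid])
    return grid[np.abs(vals - T) <= eps]
```

Of course, in the setting of the paper $y$ is unknown, which is precisely why the criterion must rely on the Kriging model rather than on $y$ itself.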
Now, we can adapt the criterion to the context of Kriging
modeling, where $y$ is a realization of a random process $Y$
(see Section 2.1). Thus, $\text{imse}_T$ is defined with respect to the
event $\omega$:

$$\text{imse}_T = \int_D s_K^2(x)\, \mathbf{1}_{[T-\varepsilon,T+\varepsilon]}[Y(x,\omega)]\, dx = I(\omega) \qquad (16)$$
To come back to a deterministic criterion, we consider
the expectation of $I(\omega)$, conditional on the observations:

$$\text{IMSE}_T = E\left[ I(\omega) \,\middle|\, \text{obs} \right] = E\left[ \int_D s_K^2(x)\, \mathbf{1}_{[T-\varepsilon,T+\varepsilon]}[Y(x,\omega)]\, dx \,\middle|\, \text{obs} \right] \qquad (17)$$
Since the quantity inside the integral is positive, we can
commute the expectation and the integral:

$$\begin{aligned}
\text{IMSE}_T &= \int_D s_K^2(x)\, E\left[ \mathbf{1}_{[T-\varepsilon,T+\varepsilon]}[Y(x,\omega)] \,\middle|\, \text{obs} \right] dx \\
&= \int_D s_K^2(x)\, E\left[ \mathbf{1}_{[T-\varepsilon,T+\varepsilon]}[M(x)] \right] dx \\
&= \int_D s_K^2(x)\, W(x)\, dx \qquad (18)
\end{aligned}$$

According to Eqn. 18, the reduced criterion is the average
of the prediction variance weighted by the function $W(x)$.
Besides, $W(x)$ is simply the probability that the response is
inside the interval $[T-\varepsilon, T+\varepsilon]$:

$$W(x) = E\left[ \mathbf{1}_{[T-\varepsilon,T+\varepsilon]}[M(x)] \right] = P\left( M(x) \in [T-\varepsilon, T+\varepsilon] \right) \qquad (19)$$
Using Eqn. 8, we obtain a simple analytical form for
$W(x)$:

$$W(x) = \int_{T-\varepsilon}^{T+\varepsilon} g_{\mathcal{N}(m_K(x),\, s_K^2(x))}(u)\, du \qquad (20)$$

where $g_{\mathcal{N}(m,\sigma^2)}$ denotes the density of the normal distribution $\mathcal{N}(m, \sigma^2)$.
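Since the Gaussian integral of Eqn. 20 reduces to a difference of two standard normal CDF values, $W(x)$ costs essentially nothing to evaluate. A minimal sketch, with `m_k` and `s_k` standing for the Kriging mean and standard deviation $\sqrt{s_K^2(x)}$ at a point:

```python
import math

def weight(m_k, s_k, T, eps):
    """W(x) of Eqn. 20: probability that M(x) ~ N(m_k, s_k^2)
    falls in [T - eps, T + eps], via the standard normal CDF."""
    if s_k == 0.0:
        # At a design point the variance is null (Eqn. 10): the response
        # is known, and W reduces to the indicator of the target interval.
        return 1.0 if abs(m_k - T) <= eps else 0.0
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return Phi((T + eps - m_k) / s_k) - Phi((T - eps - m_k) / s_k)
```

The weighted criterion $\text{IMSE}_T$ of Eqn. 18 can then be approximated by averaging $s_K^2(x)\, W(x)$ over integration points.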

References (partial list)
- Rasmussen, C. E., and Williams, C. K. I. Gaussian Processes for Machine Learning. MIT Press.
- Myers, R. H., and Montgomery, D. C. Response Surface Methodology: Process and Product Optimization Using Designed Experiments. Wiley.
- Cressie, N. Statistics for Spatial Data. Wiley.
- McKay, M. D., Beckman, R. J., and Conover, W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics.
- Jones, D. R., Schonlau, M., and Welch, W. J. Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization.