

Stat Comput (2015) 25:975–988
DOI 10.1007/s11222-014-9466-0
Minimax optimal designs via particle swarm optimization methods
Ray-Bing Chen · Shin-Perng Chang · Weichung Wang ·
Heng-Chih Tung · Weng Kee Wong
Received: 27 June 2013 / Accepted: 24 March 2014 / Published online: 12 April 2014
© Springer Science+Business Media New York 2014
Abstract  Particle swarm optimization (PSO) techniques are widely used in applied fields to solve challenging optimization problems but they do not seem to have made an impact in mainstream statistical applications hitherto. PSO methods are popular because they are easy to implement and use, and seem increasingly capable of solving complicated problems without requiring any assumption on the objective function to be optimized. We modify PSO techniques to find minimax optimal designs, which have been notoriously challenging to find to date even for linear models, and show that the PSO methods can readily generate a variety of minimax optimal designs in a novel and interesting way, including adapting the algorithm to generate standardized maximin optimal designs.
Keywords  Continuous optimal design · Equivalence theorem · Fisher information matrix · Standardized maximin optimality criterion · Regression model
R.-B. Chen · H.-C. Tung
Department of Statistics, National Cheng-Kung University, Tainan 70101, Taiwan

S.-P. Chang
Department of Digital Fashion Design, Toko University, Puzih, Chiayi 61363, Taiwan

W. Wang (✉)
Institute of Applied Mathematical Sciences, National Taiwan University, Taipei 10617, Taiwan
e-mail: wwang@ntu.edu.tw

W. K. Wong
Department of Biostatistics, Fielding School of Public Health, UCLA, Los Angeles, CA 90095-1772, USA
1 Introduction
Particle swarm optimization (PSO) is a population-based stochastic optimization method inspired by the social behavior of bird flocking or fish schooling and proposed by Eberhart and Kennedy (1995). In the last decade or so, PSO has singularly generated considerable interest in optimization circles, as evidenced by its ever-increasing applications in various disciplines. The importance and popularity of PSO can also be seen in the existence of many websites which provide PSO tutorials and PSO codes, and track PSO development and applications in different fields. Some exemplary websites on PSO are http://www.swarmintelligence.org/index.php, http://www.particleswarm.info/ and http://www.cis.syr.edu/~mohan/pso/. Currently, there are at least 3 journals which have a focus theme on swarm intelligence and applications, with a few more having an emphasis on the more general class of nature-inspired metaheuristic algorithms, of which PSO is a member. Nature-inspired metaheuristic algorithms have been rising in popularity in the optimization literature in the last 2 decades and in the last decade have dominated the optimization world compared with traditional mathematical optimization tools (Whitacre 2011a, b). Of particular note is Yang (2010), who saw a need to publish a second edition of his book on nature-inspired metaheuristic algorithms published less than 2 years earlier. This shows just how dynamic and rapidly expanding the field is. Clerc (2006) seems to be the first book devoted entirely to PSO and an updated overview of PSO methodology is available in Poli et al. (2007).

Interestingly, PSO has yet to make an impact in the statistical literature. We believe PSO methodology can be potentially useful in solving many statistical problems because the ideas behind PSO are very simple and general, yet require minimal or no assumptions on the function to be optimized.
Our aim is to show that PSO methodology is effective in finding many types of optimal designs, including minimax optimal designs, which are notoriously difficult to find and study. This is because the design criterion is non-differentiable and there is no effective algorithm for finding such designs to date, even for linear models. Specifically, we demonstrate that PSO can readily generate different types of minimax optimal designs for linear and nonlinear models which agree with the few published results in the literature.

PSO is a stochastic, iterative procedure for optimizing a function. The key advantages of this approach are that PSO is fast and flexible, there are few tuning parameters required of the algorithm, and PSO codes can easily be written generically to find optimal designs for a regression model. For more complicated problems, such as minimax design problems, the code has to be modified appropriately. Generally, only the optimality criterion and the information matrix in the codes have to be changed to find an optimal design for another problem. We discuss this further in the exemplary pseudo MATLAB codes which we provide in Sect. 4 to generate the optimal designs.
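To fix ideas, the core PSO iteration is only a few lines. The sketch below is a minimal global-best PSO for minimizing a function over a box, written in Python for illustration; the function name, encoding and parameter values are ours, not the authors' MATLAB implementation.

import numpy as np

def pso_minimize(f, lower, upper, n_particles=128, n_iter=100,
                 inertia=0.9, c1=2.0, c2=2.0, seed=0):
    # Textbook global-best PSO: each particle remembers its own best
    # position, and the flock shares a single global best position.
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    x = rng.uniform(lower, upper, size=(n_particles, lower.size))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((2, *x.shape))
        v = inertia * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lower, upper)   # keep particles inside the box
        val = np.array([f(p) for p in x])
        better = val < pbest_val
        pbest[better], pbest_val[better] = x[better], val[better]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

A design with k support points can be encoded as a 2k-vector (points followed by weights, with the weights normalized inside the objective), so the same loop searches over designs once f is set to a design criterion.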
In the next section, we provide the background. In Sect. 3, we demonstrate that PSO methodology can efficiently generate different types of minimax optimal designs for linear and nonlinear models. In Sect. 4, we provide computational and implementation details for our proposed PSO-based procedure. Section 5 shows that PSO methodology can be modified to find standardized maximin optimal designs; as illustrative examples, we construct such designs for enzyme kinetic models. Section 6 closes with a discussion.
2 Background
We focus on continuous designs which are treated as probability measures on a given design space X. This approach was proposed by Kiefer, and his voluminous work in this area is now documented in a single collection (Kiefer 1985). If a continuous design takes proportion p_i of the total observations at x_i ∈ X, i = 1, 2, ..., k, we denote it by ξ with p_1 + p_2 + ··· + p_k = 1. Given a fixed sample size N, we implement ξ by taking roughly N p_i observations at x_i, i = 1, 2, ..., k, subject to N p_1 + N p_2 + ··· + N p_k = N. As Kiefer had shown, one can round each of the N p_i's to the nearest integer so that they sum to N without losing too much efficiency if the sample size is large. The proportion p_i is sometimes called the weight of the design at x_i. Continuous designs are practical to work with and have many other advantages widely documented in design monographs, such as Fedorov (1972), Silvey (1980), Pázman (1986), Atkinson et al. (2007) and Kiefer (1985).
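As a small illustration of this rounding step (our sketch, not code from the paper), one can round the N p_i down and hand the leftover observations to the largest fractional parts, so the counts sum exactly to N:

import numpy as np

def round_design(weights, N):
    # Apportion N observations to the support points of a continuous design.
    target = np.asarray(weights, float) * N
    counts = np.floor(target).astype(int)
    leftover = N - counts.sum()
    # give the remaining observations to the largest fractional parts
    for i in np.argsort(target - counts)[::-1][:leftover]:
        counts[i] += 1
    return counts

# round_design([0.6925, 0.3075], 10) -> array([7, 3])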
Our setup assumes we have a statistical model defined on a given compact design region X. The mean of the univariate response is modeled by a known function g(x), apart from the values of the vector of parameters θ. We assume errors are normally and independently distributed, all with zero means and possibly unequal variances. The mean function g(x) can be a linear or nonlinear function of θ and the set of independent variables x. Following convention, the value of the design ξ is measured by its Fisher information matrix, defined to be the negative of the expectation of the matrix of second derivatives of the log-likelihood function. For example, consider the popular Michaelis–Menten model in the biological sciences given by

$$ y = g(x) + \varepsilon = \frac{ax}{b + x} + \varepsilon, \quad x > 0, $$

where a > 0 denotes the maximal response possible and b > 0 is the value of x for which there is a half-maximal response. In practice, the design space is truncated to X = [0, c], where c is a sufficiently large user-selected constant. If θ⊤ = (a, b) and the errors ε are normally and independently distributed with means 0 and constant variance, the Fisher information matrix for a given design ξ is

$$ I(\theta, \xi) = \int \frac{\partial g(x)}{\partial \theta} \frac{\partial g(x)}{\partial \theta^{\top}} \, \xi(dx) = \int \left( \frac{ax}{b + x} \right)^{2} \begin{pmatrix} \frac{1}{a^{2}} & -\frac{1}{a(b+x)} \\ -\frac{1}{a(b+x)} & \frac{1}{(b+x)^{2}} \end{pmatrix} \xi(dx). $$

For nonlinear models, such as the Michaelis–Menten model, the information matrix depends on the model parameters. For linear models, the information matrix does not depend on the model parameters and we denote it simply by I(ξ).
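As a numerical companion to the formula above, the following sketch (ours) evaluates the Michaelis–Menten information matrix for a discrete design by summing weighted outer products of the gradient ∂g/∂θ = (x/(b+x), −ax/(b+x)²)⊤:

import numpy as np

def mm_info(theta, x, w):
    # I(theta, xi) = sum_i w_i * grad(x_i) grad(x_i)^T for the
    # Michaelis-Menten mean function g(x) = a*x / (b + x).
    a, b = theta
    x = np.asarray(x, float)
    w = np.asarray(w, float)
    grad = np.column_stack([x / (b + x), -a * x / (b + x) ** 2])
    return (w[:, None, None] * grad[:, :, None] * grad[:, None, :]).sum(axis=0)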
Following convention, the optimality criterion is formulated as a convex function of the design, and the optimal design is found by minimizing the criterion over all designs on the design space X. This means that for nonlinear models, the design criterion that we want to optimize contains unknown parameters. For example, to estimate parameters accurately, we minimize log |I(θ, ξ)⁻¹| over all designs ξ on X (D-optimality). As such, a nominal value or best guess for θ is needed before the function can be optimized. The resulting D-optimal design depends on the nominal value and so it is called locally D-optimal. More generally, locally optimal designs require nominal values for the model parameters before optimal designs can be found. In addition, when the criterion is a convex function in ξ, a standard directional derivative argument can be applied to produce an equivalence theorem which checks whether a given design is optimal among all designs on X. Details are available in the above-cited design monographs.
Minimax optimal designs arise naturally when we wish to have protection against the worst-case scenario. For example, if the vector of model parameters is θ and Θ is a user-selected set of plausible values for θ, one may want to implement a minimax optimal design ξ* defined by

$$ \xi^{*} = \arg\min_{\xi} \max_{\theta \in \Theta} \log |I^{-1}(\theta, \xi)|, \qquad (1) $$

where the minimization is over all designs on X. The optimal design provides some global protection against the worst-case scenario by minimizing the maximal inefficiencies of the parameter estimates. Clearly, when Θ is a singleton set, the minimax optimal design is the same as the locally optimal design.
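In code, criterion (1) is a straightforward composition of the two optimization levels. Below is a sketch (ours) of the outer objective, with the inner maximization approximated on a finite grid of nominal values rather than by the inner PSO run that our Nested PSO uses:

import numpy as np
# reuses mm_info and pso_minimize from the earlier sketches

def minimax_objective(z, theta_grid, k=2):
    # A particle z holds k support points followed by k unnormalized weights.
    x, w = z[:k], z[k:]
    w = w / w.sum()
    worst = -np.inf
    for theta in theta_grid:
        sign, logdet = np.linalg.slogdet(mm_info(theta, x, w))
        val = np.inf if sign <= 0 else -logdet   # log|I^{-1}| = -log|I|
        worst = max(worst, val)
    return worst

# e.g. pso_minimize(lambda z: minimax_objective(z, grid),
#                   lower=[0, 0, 0.01, 0.01], upper=[200, 200, 1, 1])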
A common application of the minimax design criterion is in a dose response study where the goal is to find an extrapolation optimal design which provides the best inference on the mean responses over a known interval Z outside the dose interval X. If we have a heteroscedastic linear model with mean function g(x), and λ(x) is the assumed reciprocal variance of the response at dose x, then the variance of the fitted response at the point z is proportional to

$$ v(z) = g^{\top}(z) \, I(\xi)^{-1} g(z), $$

where

$$ I(\xi) = \int \lambda(x) \, g(x) g^{\top}(x) \, \xi(dx). $$
The best design for inference at the point z is the one that minimizes v(z) among all designs ξ on X. However, if we know there are several dose levels of interest and they are all in some pre-determined compact set Z, one may seek a design to minimize the maximal variance of the fitted responses on Z. Such a design criterion is also convex and one can use the following equivalence theorem: ξ* is minimax optimal for extrapolation on Z if and only if there exists a probability measure μ* on A(ξ*) such that for all x in X,

$$ c(x, \mu^{*}) = \int_{A(\xi^{*})} \lambda(x) \, r(x, u) \, \mu^{*}(du) - v(u) \le 0, $$

with equality at the support points of ξ*. Here, A(ξ) = {u ∈ Z | v(u) = max_{z∈Z} v(z)} and r(x, u) = (g⊤(x) I(ξ)⁻¹ g(u))². If X is one- or two-dimensional, one may visually inspect the plot of c(x, μ*) versus values of x ∈ X to confirm the optimality of ξ*. In what is to follow, we display such plots to verify the optimality of a design without reporting the measure μ*. A formal proof of this equivalence theorem can be found in Berger et al. (2000), and further details on minimax optimal design problems are available in Wong (1992) and Wong and Cook (1993), with further examples in King and Wong (1998, 2000). Extensions to nonlinear models are straightforward if one assumes the mean response can be adequately approximated by a linear model via a first-order Taylor series expansion.
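To make the criterion concrete, the inner step of this extrapolation problem, evaluating v(z) on a grid over Z and taking its maximum, might look as follows (our sketch for a generic gradient g and efficiency function λ):

import numpy as np

def max_extrapolation_variance(g, lam, x, w, Z):
    # I(xi) = sum_i w_i * lam(x_i) * g(x_i) g(x_i)^T, then return
    # max over the grid Z of v(z) = g(z)^T I(xi)^{-1} g(z).
    G = np.column_stack([g(xi) for xi in x])        # p x k matrix of gradients
    scale = np.asarray(w, float) * np.array([lam(xi) for xi in x])
    I_inv = np.linalg.inv((G * scale) @ G.T)
    return max(float(g(z) @ I_inv @ g(z)) for z in Z)

# e.g. quadratic model: g = lambda t: np.array([1.0, t, t * t])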
There are three points worth noting: (i) when Z is a singleton set, the probability measure μ* is necessarily degenerate at Z and the resulting equivalence theorem reduces to one for checking whether a design is c-optimal, see Fedorov (1972) or Silvey (1980); (ii) equivalence theorems for minimax optimality criteria all have a form similar to the one shown above and they are more complicated because we need to work with the subgradient μ*. A reference for subgradients is the full chapter called "The subgradient method" in Shor (1985). Finding the subgradient requires another set of optimization procedures which is usually trickier to handle, and this in part explains why minimax optimal designs are much harder to find than optimal designs under a differentiable criterion; and (iii) under the setup here, the convex design criterion allows us to derive a lower bound on the efficiency of any design (Pázman 1986). This implies that one can always assess how good a design is by providing its efficiency lower bound (without knowing the optimal design).
3 PSO-generated minimax optimal designs
Minimax optimal designs are notoriously difficult to find and we know of no algorithm to date which is guaranteed to find such optimal designs. Even for linear polynomial models with a few factors, recent papers acknowledge the difficulty of finding minimax optimal designs; see Rodriguez et al. (2010) and Johnson et al. (2011), who considered finding a G-optimal design to minimize the maximal variance of the fitted response across the design space. Optimal minimax designs for nonlinear models can be challenging even when there are just two parameters in the model; earlier attempts to solve such minimax problems had to impose constraints to simplify the optimization problem. For example, Sitter (1992) found minimax D-optimal designs for the two-parameter logistic model among designs which allocated equal numbers of observations at equally spaced points placed symmetrically about the location parameter. Similarly, Noubiap and Seidel (2000) found minimax optimal designs numerically among symmetric and balanced designs after noting that "by restricting the set of regarded designs in a suitable way, the minimax problem becomes numerically tractable in principle; nevertheless it is still a two-level problem requiring nested global optimization." In the same paper, on p. 152, the authors remark that "Unfortunately, the minimax procedure is, in general, numerically intractable".

We are therefore naturally interested in investigating whether the PSO methodology provides an effective way to find minimax optimal designs. Our examples in this section are confined to the scattered few minimax optimal designs reported in the literature, either numerically or analytically. The hope is that all optimal designs found by PSO agree with results in the literature, which would then suggest that the algorithm should also work well for problems whose minimax optimal designs are unknown. Of course, we can also confirm the optimality of the design found by PSO using an equivalence theorem.
Example 3 below is one such instance.

Table 1  Selected locally E-optimal designs for the Michaelis–Menten model found by PSO and from theory, when the design space is [0, x̃] = [0, 200]. The table shows the two support points with their weights in parentheses.

  a    b     ξ_PSO                            E-optimal design (theory)
  100  150   46.520 (0.6925), 200 (0.3075)    46.510 (0.6927), 200 (0.3073)
  100  100   38.152 (0.6770), 200 (0.3230)    38.150 (0.6769), 200 (0.3231)
  100  50    24.783 (0.6171), 200 (0.3829)    24.780 (0.6171), 200 (0.3829)
  100  10    6.516 (0.2600), 200 (0.7400)     6.515 (0.2600), 200 (0.7400)
  100  1     0.701 (0.0222), 200 (0.9778)     0.701 (0.0220), 200 (0.9778)
  10   150   46.497 (0.7071), 200 (0.2929)    46.510 (0.7070), 200 (0.2931)
  10   100   38.142 (0.7068), 200 (0.2932)    38.150 (0.7068), 200 (0.2933)
  10   50    24.778 (0.7058), 200 (0.2942)    24.780 (0.7058), 200 (0.2942)
  10   10    6.515 (0.6837), 200 (0.3163)     6.515 (0.6838), 200 (0.3162)
  10   1     0.701 (0.1882), 200 (0.8118)     0.701 (0.1881), 200 (0.8119)
We selectively present three examples, and briefly a fourth with two independent variables, out of many successes we have had with PSO for finding different types of minimax optimal designs. One of the examples has a binary response and the rest have continuous responses. The first example seeks a locally E-optimal design which minimizes the maximum eigenvalue of the inverse of the Fisher information matrix. Example 2 seeks the best design for estimating parameters in a two-parameter logistic model when we have a priori a range of plausible values for each of the two parameters. The desired design is the one which maximizes the smallest determinant of the information matrix over all nominal values of the two parameters in the plausible region. Equivalently, this is the minimax optimal design which minimizes the maximum determinant of the inverse of the information matrix, where the maximum is taken over all nominal values in the plausible region for the parameters. The numerically minimax optimal design for Example 2 was found by repeated guesswork followed by confirmation with the equivalence theorem in King and Wong (2000), with the aid of Mathematica. We will compare their designs with our PSO-generated designs. The third example concerns a heteroscedastic quadratic model with a known efficiency function, and we want to find a design to minimize the maximum variance of the fitted responses across a user-specified interval. The minimax optimal designs are unknown for this example and we will check the optimality of the PSO-generated design using an equivalence theorem.
The key tuning parameters in the PSO method are (i) the flock size, i.e. the number of particles (designs) to use in the search, (ii) the number of common support points these designs have, and (iii) the number of iterations allowed in the search process. Unless mentioned otherwise, we use the same values for these tuning parameters for the outer problem [e.g. the minimization problem in Eq. (1)] and the inner problem [e.g. the maximization problem in Eq. (1)]. We use default values for all other tuning parameters in the PSO codes, which we programmed in MATLAB version R2010b. Section 4 provides information on these default values. All CPU computing times (in seconds) were from an Intel Core2 6300 computer with 5 GB RAM running 64-bit Ubuntu Linux with kernel 2.6.35-30.
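For reference, the complete set of knobs to choose amounts to something like the following; 128 particles and 100 iterations are the settings used in Example 1, while the remaining PSO constants are left at typical default values (this dictionary is illustrative, not our exact MATLAB configuration):

nested_pso_settings = {
    "n_particles": 128,  # flock size: number of candidate designs
    "n_support": 2,      # common number of support points per design
    "n_iter": 100,       # iterations allowed in the search
    # the same values are reused for the inner (maximization) PSO
}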
Before we present our modified PSO method, called Nested PSO, in Sect. 4, we present four examples, with a bit more detail for the first example.
3.1 Example 1: E-optimal designs for the Michaelis–Menten model
The Michaelis–Menten model is one of the simplest and most widely used models in the biological sciences. Dette and Wong (1999) used a geometric argument based on the celebrated Elfving's theorem and constructed locally E-optimal designs for the model with two parameters θ⊤ = (a, b). Such optimal designs are useful for making inference on θ by making the area of the confidence ellipsoid small, in terms of minimizing the length of the longest principal axis. This is achieved by minimizing the larger of the two eigenvalues of the inverse of the information matrix over all designs on X. For a given θ, they showed that if the known design space is X = [0, x̃] and z̃ = x̃/(b + x̃), the locally E-optimal design is supported at x̃ and {(√2 − 1) b x̃}/{(2 − √2) x̃ + b}, and the weight at the latter support point is

$$ w = \frac{\sqrt{2} + (a/b)^{2} (1 - \tilde{z}) \{\sqrt{2} - (4 - 2\sqrt{2}) \tilde{z}\}}{2 + (a/b)^{2} \{\sqrt{2} - (4 - 2\sqrt{2}) \tilde{z}\}^{2}}. $$
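Both formulas are easy to evaluate numerically; the sketch below (ours, with a hypothetical helper name) reproduces the theoretical columns of Table 1, e.g. e_optimal_mm(100, 10) returns the support point 6.515 with weight 0.2600.

import numpy as np

def e_optimal_mm(a, b, x_max=200.0):
    # Locally E-optimal design for the Michaelis-Menten model on [0, x_max]
    # (Dette and Wong 1999): interior support point and its weight; the
    # remaining mass 1 - w sits at x_max.
    s = np.sqrt(2.0)
    x1 = (s - 1.0) * b * x_max / ((2.0 - s) * x_max + b)
    z = x_max / (b + x_max)
    t = s - (4.0 - 2.0 * s) * z
    w1 = (s + (a / b) ** 2 * (1.0 - z) * t) / (2.0 + (a / b) ** 2 * t ** 2)
    return x1, w1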
We use the Nested PSO procedure, to be described in the next section, to search for the locally 2-point E-optimal design using 128 particles and 100 iterations. Selected minimax optimal designs are shown in Table 1 along with the theoretical optimal designs reported in Dette and Wong (1999). All the PSO-generated designs are close to the theoretical
References

Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intelligence 1(1), 33–57 (2007)
Eberhart, R., Kennedy, J.: A new optimizer using particle swarm theory. In: Proceedings of the Sixth International Symposium on Micro Machine and Human Science, pp. 39–43 (1995)
Shi, Y., Eberhart, R.: A modified particle swarm optimizer. In: Proceedings of the 1998 IEEE International Conference on Evolutionary Computation, pp. 69–73 (1998)
Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII, Lecture Notes in Computer Science, vol. 1447, pp. 591–600. Springer (1998)
Eberhart, R.C., Shi, Y.: Comparing inertia weights and constriction factors in particle swarm optimization. In: Proceedings of the 2000 Congress on Evolutionary Computation, pp. 84–88 (2000)