Input-Output Uncertainty Comparisons for Discrete
Optimization via Simulation
Eunhye Song
Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA 16802,
eus358@psu.edu
Barry L. Nelson
Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL 60208,
nelsonb@northwestern.edu
When input distributions to a simulation model are estimated from real-world data, they naturally have estimation error causing input uncertainty in the simulation output. If an optimization via simulation (OvS) method is applied that treats the input distributions as “correct,” then there is a risk of making a suboptimal decision for the real world, which we call input model risk. This paper addresses a discrete OvS (DOvS) problem of selecting the real-world optimal from among a finite number of systems when all of them share the same input distributions estimated from common input data. Since input uncertainty cannot be reduced without collecting additional real-world data—which may be expensive or impossible—a DOvS procedure should reflect the limited resolution provided by the simulation model in distinguishing the real-world optimal solution from the others. In light of this, our input-output uncertainty comparisons (IOU-C) procedure focuses on comparisons rather than selection: it provides simultaneous confidence intervals for the difference between each system’s real-world mean and the best mean of the rest with any desired probability, while accounting for both stochastic and input uncertainty. To make the resolution as high as possible (intervals as short as possible) we exploit the common input data effect to reduce uncertainty in the estimated differences. Under mild conditions we prove that the IOU-C procedure provides the desired statistical guarantee asymptotically as the real-world sample size and simulation effort increase, but it is designed to be effective in finite samples.
Key words: Optimization via simulation under input uncertainty, common input data effect, multiple comparisons with the best
History: First submitted in June 2016; revisions submitted in July 2017 and May 2018.
1. Introduction
Due to the flexibility of simulation, optimization via simulation (OvS) is a widely accepted tool
to improve system performance. Real-world problems typically involve stochastic processes, e.g.,
demand for a new product or arrivals of patients to an emergency room, which are often modeled
by probability distributions. Stochastic simulation is driven by random variates generated from
these input models to produce outputs that mimic real-world performance. Therefore, when we
make decisions based on the simulation outputs, we are subject to the risk of making suboptimal
decisions when the input models do not faithfully represent the real-world stochastic processes; this
is known as input model risk. Most standard OvS methods do not take into account input model
risk and instead optimize under the assumption that the input models are accurate representations
of the real-world randomness. However, the best system chosen conditional on the input models
may not be the best system with respect to real-world performance when implemented. We refine this point below and illustrate it further using an inventory management example with an estimated demand distribution in Section 2. Of course, there may also be a logical discrepancy between the simulation model and the real-world system, but that is beyond the scope of this paper.
The problem of interest is to compare $k$ systems, where the $i$th system’s performance measure is its simulation output mean, $\mathrm{E}[Y_i(F_i^c)]$, under the real-world input distribution $F_i^c$ ($c$ for correct), where $Y_i(\cdot)$ is the stochastic output performance, which depends on the chosen input distribution. When there are many input processes in the system, $F_i^c$ represents the joint distribution of all of the input random variables. Our specific goal is to find $\arg\max_i \mathrm{E}[Y_i(F_i^c)]$ (or $\arg\min_i \mathrm{E}[Y_i(F_i^c)]$) with a statistical guarantee (e.g., 95%) that the selected system is the real-world optimal. As mentioned earlier, in most cases $F_1^c, F_2^c, \ldots, F_k^c$ are unknown, which forces us to use estimates, $\widehat{F}_1, \widehat{F}_2, \ldots, \widehat{F}_k$, to run simulations and implicitly target $\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]$ instead of $\mathrm{E}[Y_i(F_i^c)]$ to evaluate the $i$th system’s performance. Typically, $\widehat{F}_i$ is estimated from finite real-world observations from $F_i^c$ and is therefore subject to estimation error. Input model risk arises as $\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]$ depends on the random $\widehat{F}_i$, and thus the conditional optimal, $\arg\max_i \mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]$, may not be the same as $\arg\max_i \mathrm{E}[Y_i(F_i^c)]$. In this paper we show that it is possible to provide a meaningful statistical guarantee with respect to the real-world optimal, rather than the conditional optimal.
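To make the risk concrete, the following sketch (a hypothetical two-system newsvendor in Python, not the example of Section 2; the demand model, prices, and order quantities are illustrative assumptions) estimates how often the conditional optimal under $\widehat{F}$ disagrees with the real-world optimal under $F^c$ when the input model is fitted to only 50 observations.

```python
# Toy illustration (not the paper's Section 2 example): two candidate
# order quantities under exponential demand with unknown mean theta_c.
# Fitting theta from a small sample and optimizing conditionally can
# select a system that is suboptimal in the real world.
import numpy as np

rng = np.random.default_rng(1)
theta_c = 14.0                 # true real-world mean demand (assumed)
orders = [6, 10]               # the k = 2 systems being compared
price, cost = 5.0, 3.0         # illustrative sell price and unit cost

def mean_profit(q, theta):
    """E[price*min(D, q) - cost*q] for D ~ Exp(theta). The closed form
    stands in for the simulation replications a real DOvS study would
    use to estimate this conditional mean."""
    return price * theta * (1.0 - np.exp(-q / theta)) - cost * q

real_best = int(np.argmax([mean_profit(q, theta_c) for q in orders]))

n_data, n_trials, wrong = 50, 10_000, 0
for _ in range(n_trials):
    theta_hat = rng.exponential(theta_c, n_data).mean()   # F-hat from data
    cond_best = int(np.argmax([mean_profit(q, theta_hat) for q in orders]))
    wrong += (cond_best != real_best)
print(f"P(conditional optimal != real-world optimal) ~ {wrong / n_trials:.2f}")
```

With these toy parameters the two systems' true means are close relative to the estimation error in $\widehat{\theta}$, so the conditional optimal is wrong a nontrivial fraction of the time, which is exactly the regime the paper targets.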
To accomplish this we first need to understand how much uncertainty in $\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]$ is caused by the estimation error in $\widehat{F}_i$. This is referred to as input uncertainty and is formally defined as $\mathrm{Var}(\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i])$, where the variance is taken with respect to the sampling distribution of $\widehat{F}_i$. Typically, we have only one “observation” of $\widehat{F}_i$ estimated from the real-world data, which makes it difficult to evaluate the variance. Another challenge is that the functional form of $\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]$ is generally unknown and can only be estimated via simulation. Several methods have been developed to quantify the marginal impact of input uncertainty on a single simulated system; see Barton (2012), Song et al. (2014), and Lam (2016) for surveys.
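One standard way to approximate this variance from a single dataset is to bootstrap the real-world sample, refit the input model, and re-evaluate the conditional mean for each resample. The sketch below does this for the toy system above; it is a generic bootstrap illustration in the spirit of the cited surveys, not the procedure developed in this paper.

```python
# Generic bootstrap sketch of input-uncertainty quantification for one
# system (in the spirit of the cited surveys; not this paper's method).
import numpy as np

rng = np.random.default_rng(2)
price, cost, q = 5.0, 3.0, 6     # same toy newsvendor as above

def mean_profit(q, theta):
    # Conditional mean E[Y(F-hat) | F-hat] for Exp(theta) demand;
    # in practice this would be estimated by simulation replications.
    return price * theta * (1.0 - np.exp(-q / theta)) - cost * q

real_data = rng.exponential(14.0, 50)    # the single real-world sample
B = 1000                                 # bootstrap resamples of F-hat
cond_means = np.empty(B)
for b in range(B):
    resample = rng.choice(real_data, size=real_data.size, replace=True)
    cond_means[b] = mean_profit(q, resample.mean())  # refit, re-evaluate

# Bootstrap estimate of Var(E[Y(F-hat) | F-hat]) for this system
print("estimated input-uncertainty variance:", cond_means.var(ddof=1))
```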
Unlike stochastic simulation error, which can be reduced by increasing the number of simulation
replications, input uncertainty can only be reduced by collecting more real-world data. However,
real-world data collection is typically much more expensive than simulation replications, or it may
be impossible if an implementation decision has to be made before having another chance to collect
data (e.g., logistics decisions for a natural disaster). Our DOvS procedure is designed to provide
statistical inference on the real-world optimal solution in the presence of input model risk that will
not be further reduced by collecting more real-world data.
Optimization under input model risk is more challenging than conditional DOvS since even with
an infinite number of simulation replications we may not be able to distinguish the real-world best
from the others due to the remaining input uncertainty. But effective DOvS under input model risk
requires more than just quantifying the marginal input uncertainty in each system’s simulation
output; instead we need to compare how systems are affected jointly by input uncertainty.
Recently, several DOvS procedures that incorporate input model risk have been proposed; they
can be categorized into three groups in terms of what they promise to deliver: the first group

of procedures selects a system that best hedges input model risk by identifying the worst-case
input distributions given real-world data for each system marginally, and then selects the system with the best worst-case performance. For a maximization problem this becomes selecting $\arg\max_i \min_{\widehat{F}_i \in U_i} \mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]$, where $U_i$ is the uncertainty set that contains the candidates for $F_i^c$
inferred from the real-world data. Such a formulation is used in the distributionally robust opti-
mization literature (Scarf 1958, Delage and Ye 2010, Ben-Tal et al. 2013). The robust selection of
the best procedure of Fan et al. (2013) and the optimal computational budget allocation scheme
of Gao et al. (2017) belong in this category. A benefit of this formulation is that we can always
select a single solution no matter how large input uncertainty is. However, the selected system may,
and often will, perform poorly under the true real-world input distributions. See Section 2.
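For intuition, the sketch below scores each toy system by its worst-case conditional mean over a crude uncertainty set for the demand mean; the confidence-interval construction of the set is an assumption of this sketch, not the construction used by the cited robust procedures.

```python
# Worst-case (max-min) selection sketch over a crude uncertainty set
# for the demand mean; the CI-based set is an assumption of this toy,
# not the one used by the cited distributionally robust procedures.
import numpy as np

rng = np.random.default_rng(3)
price, cost, orders = 5.0, 3.0, [6, 10]

def mean_profit(q, theta):
    return price * theta * (1.0 - np.exp(-q / theta)) - cost * q

data = rng.exponential(14.0, 50)
half = 1.96 * data.std(ddof=1) / np.sqrt(data.size)
U = np.linspace(data.mean() - half, data.mean() + half, 101)

worst = [min(mean_profit(q, th) for th in U) for q in orders]
print("worst-case means:", np.round(worst, 3),
      "-> robust choice: order", orders[int(np.argmax(worst))])
```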
The second category selects a system with the best performance averaged over input uncertainty, i.e., $\arg\max_i \mathrm{E}\{\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]\}$, where the outer expectation is taken with respect to the sampling or posterior distribution of $\widehat{F}_i$. Corlu and Biller (2015) propose a subset selection procedure that averages both stochastic and input uncertainties to find a subset of optimal/near-optimal systems, where $\widehat{F}_i$ is a Bayesian posterior distribution given real-world data. Even if the input uncertainty, $\mathrm{Var}(\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i])$, is large, the variance of an estimate of $\mathrm{E}\{\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]\}$ may be reduced by more simulation replications. Hence, with a sufficiently large simulation budget the size of the subset may be as small as one, provided that $\mathrm{E}\{\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]\}$ is distinct for each $i$. However, $\mathrm{E}\{\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]\} \neq \mathrm{E}[Y_i(F_i^c)]$ in general, and therefore $\arg\max_i \mathrm{E}\{\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]\}$ may not be $\arg\max_i \mathrm{E}[Y_i(F_i^c)]$. The bias of $\mathrm{E}\{\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]\}$ is larger when the number of real-world observations is smaller, causing this formulation to pose greater input model risk.
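The sketch below mimics this averaged formulation for the toy systems, using bootstrap resamples of $\widehat{F}$ in place of the Bayesian posterior used by Corlu and Biller (2015); the bootstrap substitution is an assumption of the sketch.

```python
# Averaged-performance sketch: rank systems by conditional means
# averaged over resampled versions of F-hat (bootstrap here stands in
# for the Bayesian posterior used by the cited procedure).
import numpy as np

rng = np.random.default_rng(4)
price, cost, orders = 5.0, 3.0, [6, 10]

def mean_profit(q, theta):
    return price * theta * (1.0 - np.exp(-q / theta)) - cost * q

data = rng.exponential(14.0, 50)
thetas = [rng.choice(data, data.size, replace=True).mean()
          for _ in range(1000)]
avg = [np.mean([mean_profit(q, th) for th in thetas]) for q in orders]
print("averaged means:", np.round(avg, 3),
      "-> pick: order", orders[int(np.argmax(avg))])
```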
The last category of procedures directly attacks the problem of finding $\arg\max_i \mathrm{E}[Y_i(F_i^c)]$. Corlu and Biller (2013) present a subset selection procedure that includes the real-world best system in the subset, assuming that $\max_i \mathrm{E}[Y_i(F_i^c)]$ is at least $\delta > 0$ better than the rest of the systems’ true means. This procedure is distinguished from the subset selection procedure in Corlu and Biller (2015) in that it does not average $\mathrm{E}[Y_i(\widehat{F}_i) \mid \widehat{F}_i]$ over the distribution of $\widehat{F}_i$, but uses $\delta$ to control the resolution to which the procedure can successfully separate the real-world best from the rest with a given statistical guarantee. Under the same indifference-zone (IZ) setting, Song et al. (2015) discuss a ranking-and-selection approach that guarantees the probability of correctly selecting $\arg\max_i \mathrm{E}[Y_i(F_i^c)]$ in the presence of input model risk. Both Corlu and Biller (2013) and Song et al. (2015) find that $\delta$ has an unknown nonzero lower bound, which is an increasing function of input uncertainty, reflecting the fact that the procedures may not distinguish the real-world best system from the rest if the mean difference is too small relative to input uncertainty. To put it differently, for $\delta$ below an unknown threshold the probability of correctly selecting the optimal (or including the optimal in the subset) has an upper bound less than 1, so that even with infinite simulation effort we may not achieve the desired statistical guarantee. Further, assuming an IZ mean configuration makes both procedures conservative, because they are designed to provide the statistical guarantee for the case where all suboptimal systems’ means are $\arg\max_i \mathrm{E}[Y_i(F_i^c)] - \delta$. When $F_1^c, F_2^c, \ldots, F_k^c$ are assumed known, this conservatism only makes us spend more simulation budget than necessary to correctly select the optimal solution with the target probability. In the presence of input model risk, however, the problem is much more severe, and we may conclude that we cannot provide the target probability guarantee at all when in fact we could if we did not assume an IZ configuration.
Our input-output uncertainty comparisons (IOU-C) procedure belongs in the third category. However, we focus on comparisons of systems, not selection, and we do not assume any configuration for the system means, which differentiates our approach from Corlu and Biller (2013) and Song et al. (2015). By extending the multiple comparisons with the best (MCB) framework of Chang and Hsu (1992) to incorporate input model risk, IOU-C provides $k$ joint confidence intervals (CIs) on the true mean differences between each system and the best of the rest that account for both stochastic and input uncertainties. With any given target probability guarantee, the CIs that contain 0 indicate systems that are statistically inseparable from the real-world optimal.
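To show the shape of this inference without the input-uncertainty machinery that is the paper’s contribution, the sketch below computes classic constrained MCB intervals for $\mu_i - \max_{\ell \neq i} \mu_\ell$ from simulation output alone; the system means and the critical value are placeholders, not quantities derived in the paper.

```python
# Bare-bones MCB sketch under stochastic error only: intervals for
# mu_i - max_{l != i} mu_l. IOU-C delivers intervals of the same form
# but accounts for both stochastic and input uncertainty; the critical
# value below is a placeholder, not the paper's.
import numpy as np

rng = np.random.default_rng(5)
k, n = 4, 100
true_means = [1.0, 1.2, 0.8, 1.15]                 # illustrative only
Y = rng.normal(true_means, 1.0, size=(n, k))       # simulation outputs
ybar, var = Y.mean(axis=0), Y.var(axis=0, ddof=1)

crit = 2.35                                        # placeholder quantile
for i in range(k):
    rest = np.delete(ybar, i).max()
    d = crit * np.sqrt(var[i] / n + np.delete(var, i).max() / n)
    diff = ybar[i] - rest
    lo, hi = min(diff - d, 0.0), max(diff + d, 0.0)  # constrained MCB CI
    verdict = "cannot be ruled out as best" if hi > 0 else "not the best"
    print(f"system {i}: [{lo: .3f}, {hi: .3f}]  ({verdict})")
```

Intervals whose upper endpoint is positive mark systems that are statistically inseparable from the best, which is exactly the verdict IOU-C reports once input uncertainty is folded in.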
We restrict our attention to the case where all systems share the same input distributions, i.e., $F_i^c = F^c$ and $\widehat{F}_i = \widehat{F}$ for $i = 1, 2, \ldots, k$, which is a common setting for DOvS problems. For instance,

References

Journal ArticleDOI
Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
TL;DR: Presents an SA algorithm based on a simultaneous perturbation gradient approximation instead of the standard finite-difference approximation of Kiefer–Wolfowitz-type procedures; the algorithm can be significantly more efficient than standard algorithms in large-dimensional problems.

Journal ArticleDOI
Distributionally Robust Optimization Under Moment Uncertainty with Application to Data-Driven Problems
TL;DR: Proposes a model that describes uncertainty in both the distribution form (discrete, Gaussian, exponential, etc.) and the moments (mean and covariance matrix), and demonstrates that for a wide range of cost functions the associated distributionally robust stochastic program can be solved efficiently.

Journal ArticleDOI
Robust Solutions of Optimization Problems Affected by Uncertain Probabilities
TL;DR: Studies robust linear optimization problems with uncertainty regions defined by φ-divergences (for example, chi-squared, Hellinger, Kullback–Leibler) and shows that the robust counterpart of a linear optimization problem with φ-divergence uncertainty is tractable for most choices of φ typically considered in the literature.
Frequently Asked Questions (12)

Q1. What are the contributions mentioned in the paper "Input-output uncertainty comparisons for discrete optimization via simulation"?

Theorem 2 requires $B = m^{\gamma}$ for $0 < \gamma < 2$, which is the condition for asymptotic normality of $\sqrt{B/m}\,(\widehat{B}_i - B_i)$ in Proposition 3 in Section EC.5.

In a realistic DOvS setting, each system's performance measure is estimated via simulation replications, which introduces stochastic error.

The all-in IOU-C procedure is protected against such an error by accounting for the estimation error in the gradients, at the price of its conservatism.

A total of $L = 1{,}000$ values of $(\hat{\theta} - \theta^c)$ were sampled in the random search algorithm (see Section EC.2) to approximate the optimal solutions of $P_{i\ell}$, $i \neq \ell$.

The average subset size of the plug-in procedure is 1.82, which is much smaller than that of all-in IOU-C, yet the estimated simultaneous coverage probability of the plug-in procedure is 0.874 (dashed line).

The average size of $S_0$ is 1.03 for this procedure, which is the smallest among all three procedures since it ignores input uncertainty.

Assumption 1(vii) states that given the plug-in distribution of CID effects and $V_i(\hat{\theta})$, the authors can find the exact multidimensional quantile vectors for $-w^{(1)}_{i\ell}$ and $-w^{(2)}_{i\ell}$, respectively.

Figure 2 also shows that the simultaneous MCB coverage probability of the conditional procedure is 0.235 (dotted line), which is far lower than 0.9.

The actual number of units that arrive has a binomial distribution where the probability that each unit in the order arrives is 0.95.