Probabilistic sensitivity analysis of complex models: a Bayesian approach
Summary
1. Introduction
- The authors suppose that η(·) is a complex model, in the sense that the way the model responds to changes in its inputs is not transparent.
- Sensitivity analysis is concerned with understanding how changes in the inputs x influence the output y.
- This may be motivated simply by a wish to understand the implications of a complex model but often arises because there is uncertainty about the true values of the inputs that should be used for a particular application.
- Large process models in engineering, environmental science, chemistry, etc. are often implemented in complex computer codes that require many minutes, hours or even days for a single run.
2.1. Main effects and interactions
- Note that the definitions of these terms depend on the distribution G of the uncertain inputs.
- For the simple model used as an illustration, the representation reflects the structure of the model itself, comprising a linear effect of x_1 with no x_2 effect and no interaction (the defining equations are sketched below).
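For reference, the quantities named above can be written out; this is the standard presentation of the paper's Section 2.1 definitions, with z denoting the centred effects:

```latex
E(Y) = \int \eta(\mathbf{x}) \, dG(\mathbf{x}), \qquad
z_i(x_i) = E(Y \mid x_i) - E(Y),
```

```latex
z_{i,j}(x_i, x_j) = E(Y \mid x_i, x_j) - z_i(x_i) - z_j(x_j) - E(Y).
```

The main effect of x_i is thus the expected output as a function of x_i alone, centred at the overall mean, and the first-order interaction is whatever remains of the joint effect after the main effects are removed.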
2.2. Variance-based methods
- This approach is reviewed by Saltelli, Chan and Scott (2000).
- The first is V_i = var{E(Y | X_i)}: if the authors were to learn x_i, the remaining uncertainty about Y would be var(Y | x_i), whose expectation is E{var(Y | X_i)} = var(Y) − V_i by the law of total variance.
- The second measure, first proposed by Homma and Saltelli (1996), is V_Ti = var(Y) − var{E(Y | X_{−i})}, the variance that would remain if every input except x_i were learned exactly.
- If it were possible to observe one of the x_i, to learn its true value exactly, and the cost of that observation were the same for each i, then the authors should choose the input with the largest S_i. Such idealized observation is rarely possible, but the analysis does suggest where there is the greatest potential for reducing uncertainty through new research.
- It does not follow that the two inputs with the largest main-effect variances will be the best two inputs to observe (a Monte Carlo illustration of these indices follows this list).
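To make V_i = var{E(Y | X_i)} concrete, here is a minimal double-loop Monte Carlo sketch for a cheap toy function with independent uniform inputs. It illustrates the definition only; the authors' method exists precisely to avoid this many model runs, and all names (toy_model, n_outer, n_inner) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_model(x):
    # An arbitrary cheap test function; x has shape (..., 3).
    return x[..., 0] + x[..., 1] ** 2 + 0.5 * x[..., 0] * x[..., 2]

d, n_outer, n_inner = 3, 2000, 200

# Overall variance var(Y) from one large sample.
var_y = toy_model(rng.uniform(size=(100_000, d))).var()

S = np.empty(d)
for i in range(d):
    xi_draws = rng.uniform(size=n_outer)          # outer draws of X_i
    cond_means = np.empty(n_outer)
    for k, v in enumerate(xi_draws):
        x = rng.uniform(size=(n_inner, d))        # inner draws of the other inputs
        x[:, i] = v                               # fix X_i = v
        cond_means[k] = toy_model(x).mean()       # estimate E(Y | X_i = v)
    # Variance of the conditional means estimates V_i (slightly inflated by inner noise).
    S[i] = cond_means.var() / var_y

print(np.round(S, 3))                             # main effect indices S_i
```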
2.3. Variance decomposition
- When G is such that the elements of X are mutually independent, the definitions of main effects and interactions directly reflect the model structure, as already remarked.
- One can also decompose the variance of Y into terms relating to the main effects and the various interactions between the input variables (the decomposition is displayed after this list).
- It is clear that when equation (5) in the paper holds, the authors can identify V_{−i} = var{E(Y | X_{−i})} with the sum of all the W_p terms whose subscripts do not include i.
- Therefore the total effect index (3) is the proportion of var(Y) accounted for by all the terms in equation (5) with a subscript i, and so S_Ti ≥ S_i.
- Without orthogonality, the authors can still define the sum of squares attributable to any set of variables, but sums of squares for different sets of regressors no longer partition the total sum of squares.
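The decomposition referred to above (equation (5) in the paper) takes the familiar Sobol' form when the inputs are independent:

```latex
\operatorname{var}(Y) = \sum_{i} W_i + \sum_{i<j} W_{i,j} + \dots + W_{1,2,\dots,d},
\qquad W_i = \operatorname{var}\{z_i(X_i)\},
```

with the higher-order W terms equal to the variances of the corresponding interaction effects. S_Ti ≥ S_i then follows because S_Ti collects every term whose subscripts include i, while S_i is the single term W_i / var(Y).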
2.4. Regression components
- Thus, if the authors wish to predict Y without gaining any further information about x, then the best prediction (in terms of minimizing the expected squared error) is E(Y).
- Then var(Y) is the expected squared error of this prediction.
- It should be noted that regression coefficients, correlation coefficients and related sums of squares have been widely used in sensitivity analysis.
- In practice, it is easy to see that the regression coefficients of Helton and Davis (2000) are estimates of the optimal coefficients γ in the corresponding regression fit.
- The interpretation is different: the authors allow for non-linear fits, and they add the important step of interpreting the difference between the regression variance component and the corresponding main-effect variance as a lack-of-fit variance component (sketched after this list).
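A minimal sketch of that lack-of-fit idea, using a cheap toy model in place of an emulator: the variance captured by the best linear fit in x_1 is compared with the full main-effect variance V_1, and the gap measures the non-linearity of z_1(x_1). The toy model and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(size=(50_000, 2))
y = x[:, 0] ** 2 + x[:, 1]                 # quadratic in x_1, so a linear fit is imperfect

# Best linear predictor of y from x_1 (population least squares).
slope = np.cov(x[:, 0], y)[0, 1] / x[:, 0].var()
fit = y.mean() + slope * (x[:, 0] - x[:, 0].mean())
v_linear = fit.var()                       # variance component of the linear fit

# Main-effect variance V_1 = var{E(Y | X_1)}; here E(Y | x_1) = x_1^2 + 0.5 exactly.
v_main = (x[:, 0] ** 2 + 0.5).var()

print(v_linear, v_main, v_main - v_linear) # the gap is the lack-of-fit component
```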
2.5. Discussion
- The preceding subsections have presented a very broad perspective on probabilistic sensitivity analysis.
- The authors' formulation unifies a variety of current approaches and offers new measures, providing a deeper understanding of a model and its dependence on the uncertain model inputs.
- The authors define new population-based regression measures that provide a link between the sample measures and variance-based sensitivity analysis.
- The authors' proposal to use the difference between V_i and V_{x_i} to measure non-linearity in z_i(x_i) is novel and, they believe, powerful.
- The authors end this section by briefly addressing some other issues.
2.5.1. Local sensitivity
- Local sensitivity analysis is based on partial derivatives of the function η(·), evaluated at the baseline inputs x_0.
- Baker (2001) suggested approximating η(·) by a first-order Taylor series and derived the D_i² as measures of sensitivity (a finite-difference sketch follows this list).
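A minimal local-sensitivity sketch, assuming a cheap differentiable toy model: central finite differences approximate the partial derivatives of η at the baseline x_0. All names are illustrative.

```python
import numpy as np

def model(x):
    # Cheap stand-in for eta(x).
    return x[0] + x[1] ** 2 + 0.5 * x[0] * x[2]

x0 = np.array([0.5, 0.5, 0.5])             # baseline inputs
h = 1e-5
grad = np.empty_like(x0)
for i in range(len(x0)):
    e = np.zeros_like(x0)
    e[i] = h
    grad[i] = (model(x0 + e) - model(x0 - e)) / (2 * h)

print(grad)                                 # partial derivatives of eta at x0
```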
2.5.2. Value of information
- The authors' use of squared prediction error as a criterion can be justified formally in decision-theoretic terms by using the squared error loss.
- It may then be shown that V_p is the expected value of gaining perfect information about x_p (the identity is displayed after this list).
- More generally, wherever the computer model is to be used for decision-making, the authors could again measure sensitivity by the expected value of information, but now defined with respect to the relevant utility or loss function and the available decisions.
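The identity behind this claim is the law of total variance: under squared error loss, the expected value of learning x_p perfectly is

```latex
\operatorname{var}(Y) - E\{\operatorname{var}(Y \mid X_p)\}
  = \operatorname{var}\{E(Y \mid X_p)\} = V_p .
```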
2.5.3. Computation
- For models complex enough that it is not obvious how the output responds to the inputs, such tractability cannot be hoped for; the desired measures must instead be obtained computationally.
- If η(·) is sufficiently cheap to evaluate for many different inputs, simple Monte Carlo methods can be used to estimate var(Y), or the variance component V_{g(x)} for any regression fit, with negligible error.
- The method of Sobol' (1993) and the Fourier amplitude sensitivity test, devised by Cukier et al. (1973) and extended by Saltelli et al. (1999), are techniques developed specifically to compute some of these measures (a pick-freeze sketch in this spirit follows this list).
- Nevertheless, sensitivity analysis by these techniques demands many thousands of function evaluations.
- For an expensive function, where the evaluation of η(x) at a single x might take minutes or even hours, such methods are impractical.
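For cheap functions, a pick-freeze scheme in the spirit of Sobol' (1993) avoids the double loop: all d first-order and total-effect indices come from n(d+2) runs. A minimal sketch, using the common Saltelli/Jansen estimators and an illustrative toy model (not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(5)

def model(x):
    return x[..., 0] + x[..., 1] ** 2 + 0.5 * x[..., 0] * x[..., 2]

n, d = 20_000, 3
A = rng.uniform(size=(n, d))               # two independent input samples
B = rng.uniform(size=(n, d))
yA, yB = model(A), model(B)
var_y = np.concatenate([yA, yB]).var()

for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                    # A with column i taken from B
    yABi = model(ABi)
    Si = np.mean(yB * (yABi - yA)) / var_y          # first-order index S_i
    STi = 0.5 * np.mean((yA - yABi) ** 2) / var_y   # total-effect index S_Ti
    print(i, round(Si, 3), round(STi, 3))
```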
3. Bayesian sensitivity analysis
- The authors shall develop Bayesian inference tools for estimating all the quantities of interest in sensitivity analysis, for the case of expensive functions.
- In addition to making it feasible to carry out sensitivity analysis with a much smaller number of model runs, a key benefit of their approach is that it can estimate all of the sensitivity measures discussed in Section 2 from a single set of runs.
- The essence of the Bayesian approach is that the model η(·) is treated as an unknown function.
- In an absolute sense, of course, η(·) is certainly not unknown, since it implements a model that has been specified in precise mathematical form by someone (or some group of people).
- The authors therefore formulate a prior distribution for the function η(·).
3.1. Inference about functions using Gaussian processes
- The authors first develop the prior model for η(·) in the form of a Gaussian process prior distribution and derive the posterior distribution (a minimal sketch follows this list).
- The choice of h(·) is arbitrary, though it should be chosen to incorporate any beliefs that the authors might have about the form of η(·).
- This implies an infinite prior variance for η(x), whereas in practice the authors expect there to be cases where the model developer can provide some proper prior knowledge about the function η(·).
- Full details of the prior to posterior analysis can be found in O’Hagan (1994).
- Monte Carlo methods applied to very cheap functions typically employ many thousands of model runs, so that the estimation error is very small.
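A minimal sketch of the Gaussian-process machinery, assuming a zero prior mean and a squared-exponential covariance with fixed hyperparameters; the paper's full treatment also includes the regression term h(x)'β and integrates out a variance parameter, which this sketch omits. All names and values are illustrative.

```python
import numpy as np

def sq_exp(A, B, length=0.3):
    # Squared-exponential covariance between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length ** 2)

rng = np.random.default_rng(2)
X = rng.uniform(size=(30, 2))              # 30 "model runs" (design points)
y = np.sin(3 * X[:, 0]) + X[:, 1]          # cheap stand-in for eta(x)

K = sq_exp(X, X) + 1e-8 * np.eye(len(X))   # jitter for numerical stability
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

def posterior(Xs):
    # Posterior mean and variance of eta at the rows of Xs.
    Ks = sq_exp(Xs, X)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = sq_exp(Xs, Xs).diagonal() - (v ** 2).sum(0)
    return mean, var

m, v = posterior(np.array([[0.2, 0.7]]))
print(m, v)                                # emulator prediction with uncertainty
```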
3.2. Inference for main effects and interactions
- First consider inference about E(Y | x_p) = ∫_{X_{−p}} η(x) dG_{−p|p}(x_{−p} | x_p), using obvious notation for the space of possible values of x_{−p} and for its conditional distribution given x_p.
- The posterior mean of this quantity then follows by integrating the posterior Gaussian process (a plug-in sketch follows this list).
- From such a plot it is tempting to regard the inputs showing the greatest variation as the most important, but the variation of the plotted posterior mean, var[E*{z_i(X_i)}], need not coincide with the posterior estimate of V_i itself.
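A minimal sketch of the plug-in idea behind such plots: the main effect z_i(x_i) = E(Y | x_i) − E(Y) is estimated by averaging an emulator's posterior mean over the remaining inputs. The paper obtains these integrals analytically for common modelling choices; this sketch uses brute-force Monte Carlo, with `emu_mean` a cheap stand-in for the GP posterior mean and independent uniform inputs assumed.

```python
import numpy as np

rng = np.random.default_rng(3)

def emu_mean(x):
    # Stand-in for the emulator's posterior mean of eta; x has shape (..., 2).
    return np.sin(3 * x[..., 0]) + x[..., 1]

grid = np.linspace(0, 1, 50)               # values of x_1 to plot against
x_rest = rng.uniform(size=(5_000, 2))      # Monte Carlo draws for the average
e_y = emu_mean(x_rest).mean()              # overall mean E(Y)

z1 = np.empty_like(grid)
for k, v in enumerate(grid):
    x = x_rest.copy()
    x[:, 0] = v                            # fix x_1 = v, average over x_2
    z1[k] = emu_mean(x).mean() - e_y       # estimated main effect z_1(v)

# Plotting z1 against grid gives the main-effect plot described above.
```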
3.3. Inference for variances
- The authors now consider posterior inference for V_i and V_Ti.
- Haylock and O'Hagan (1996) derived the posterior mean and variance of var(Y).
- As before, all the required integrals can be done numerically if necessary but are available analytically for certain common modelling choices.
3.4. Inference for regression fits
- All the resulting integrals may be computed numerically and may be obtained analytically for common modelling choices.
- Relevant theory is given for one dimension in O’Hagan (1992) and is easily generalized to higher dimensions.
- Inference about D_i² can then also be derived.
4. Examples
- The authors present two illustrative examples, which are typical of a variety of models that they have considered.
- To apply the techniques of Section 3 in practice, it is necessary to identify the functions h(·) and c(·, ·) that represent prior beliefs about the function η(·), and the distribution G(·) that defines the uncertainty about the model inputs.
- This implies a belief that the output is an analytic, and therefore infinitely differentiable, function of its inputs (a common concrete choice is displayed below).
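A common concrete choice in this literature (stated here as typical practice rather than a quotation from the paper, and consistent with the smoothness belief above) is a linear mean function and a Gaussian covariance:

```latex
h(\mathbf{x}) = (1, \mathbf{x}^{\mathrm{T}})^{\mathrm{T}}, \qquad
c(\mathbf{x}, \mathbf{x}') = \sigma^2 \exp\{ -(\mathbf{x} - \mathbf{x}')^{\mathrm{T}} B (\mathbf{x} - \mathbf{x}') \},
```

with B a diagonal matrix of positive roughness parameters; the Gaussian form gives infinitely differentiable realizations of η.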
4.1. Synthetic example
- The authors illustrate their methodology first with a synthetic example.
- If the new design decreases the value of the integral, the candidate design point is exchanged for the current point.
- The simulation method involves generating many realizations of the function η(·) from its posterior distribution and re-estimating S_i for each one (sketched after this list).
- The remaining variance after the main effects is estimated as 29% of the total variance (true value 28%).
- Plotting the posterior expectation (with respect to the unknown function η(·)) of E(Y | x_i) against x_i for each variable also identifies the three groups of variables.
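A minimal sketch of the simulation method mentioned above, assuming a zero-mean squared-exponential GP and a toy model: sample paths of η are drawn from the posterior at fixed Monte Carlo nodes and a sensitivity summary is recomputed for each draw. For brevity the summary here is var(Y); applying the same idea to the conditional means yields posterior draws of S_i. All names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def k(A, B, length=0.3):
    return np.exp(-0.5 * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / length ** 2)

X = rng.uniform(size=(25, 2))              # design points ("code runs")
y = np.sin(3 * X[:, 0]) + X[:, 1]          # cheap stand-in for eta(x)
Xs = rng.uniform(size=(400, 2))            # Monte Carlo nodes for the G-integrals

Kxx = k(X, X) + 1e-8 * np.eye(len(X))
Ksx = k(Xs, X)
mean = Ksx @ np.linalg.solve(Kxx, y)                   # posterior mean at the nodes
cov = k(Xs, Xs) - Ksx @ np.linalg.solve(Kxx, Ksx.T)    # posterior covariance
cov += 1e-6 * np.eye(len(Xs))                          # jitter for sampling

draws = rng.multivariate_normal(mean, cov, size=200)   # 200 posterior paths of eta
var_y_draws = draws.var(axis=1)            # one var(Y) estimate per path
print(var_y_draws.mean(), var_y_draws.std())   # posterior mean and spread of var(Y)
```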
4.2. Oil-field simulator
- This model was used in Craig et al. (1997, 2001) to demonstrate their methodology for calibration and forecasting.
- The authors choose notional distributions for these inputs; they first take log-transformations of the permeability inputs as in Craig et al. (2001).
- The authors then suppose that each input has a normal distribution, with the ranges of each input representing six standard deviations.
- The authors have 101 runs of the code, with the design points chosen to form a Latin hypercube (a design sketch follows this list).
- Though the authors do not show the results here, they have also performed the same sensitivity analysis for different wells in the reservoir, and at different time points.
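A minimal sketch of these two design choices, with hypothetical ranges: each stated input range is converted to a normal distribution whose six standard deviations span the range, and a Latin hypercube of 101 points is mapped into input space. The ranges below are illustrative, not the oil-field values.

```python
import numpy as np
from scipy.stats import norm, qmc

ranges = np.array([[0.0, 4.0],             # hypothetical input ranges
                   [10.0, 70.0],
                   [1.0, 2.0]])
mu = ranges.mean(axis=1)
sigma = (ranges[:, 1] - ranges[:, 0]) / 6.0    # range = six standard deviations

sampler = qmc.LatinHypercube(d=len(ranges), seed=0)
u = sampler.random(n=101)                      # 101 runs, as in Section 4.2
design = norm.ppf(u, loc=mu, scale=sigma)      # Latin hypercube in input space
print(design.shape)                            # (101, 3)
```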
5. Conclusions
- The authors' method facilitates a deep and thorough analysis of the sensitivity of a model output to variation in its inputs: through decomposition of the output variance into components representing main effects and interactions; through further decomposition of individual terms into components for linear or other regression-based fits and for non-linearity; and through graphical presentation of main effects and first-order interactions.
- This is particularly important in the case of expensive models, since Monte Carlo methods become infeasible if each model run takes an appreciable amount of computer time.
- The Bayesian approach also allows the complete range of sensitivity measures to be computed from a single set of model runs.
- The authors examples involve 15 and 40 uncertain model inputs and are therefore of realistic, albeit moderate, dimensionality.