Journal ArticleDOI

Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review

01 Jun 1996-Journal of the American Statistical Association (Taylor & Francis Group)-Vol. 91, Iss: 434, pp 883-904
TL;DR: All of the methods in this work can fail to detect the sorts of convergence failure that they were designed to identify, so a combination of strategies aimed at evaluating and accelerating MCMC sampler convergence are recommended.
Abstract: A critical issue for users of Markov chain Monte Carlo (MCMC) methods in applications is how to determine when it is safe to stop sampling and use the samples to estimate characteristics of the distribution of interest. Research into methods of computing theoretical convergence bounds holds promise for the future but to date has yielded relatively little of practical use in applied work. Consequently, most MCMC users address the convergence problem by applying diagnostic tools to the output produced by running their samplers. After giving a brief overview of the area, we provide an expository review of 13 convergence diagnostics, describing the theoretical basis and practical implementation of each. We then compare their performance in two simple models and conclude that all of the methods can fail to detect the sorts of convergence failure that they were designed to identify. We thus recommend a combination of strategies aimed at evaluating and accelerating MCMC sampler convergence, including ap...
Citations
Journal ArticleDOI
TL;DR: The emcee algorithm as mentioned in this paper is a Python implementation of the affine-invariant ensemble sampler for Markov chain Monte Carlo (MCMC) proposed by Goodman & Weare (2010).
Abstract: We introduce a stable, well tested Python implementation of the affine-invariant ensemble sampler for Markov chain Monte Carlo (MCMC) proposed by Goodman & Weare (2010). The code is open source and has already been used in several published projects in the astrophysics literature. The algorithm behind emcee has several advantages over traditional MCMC sampling methods and it has excellent performance as measured by the autocorrelation time (or function calls per independent sample). One major advantage of the algorithm is that it requires hand-tuning of only 1 or 2 parameters compared to ~N^2 for a traditional algorithm in an N-dimensional parameter space. In this document, we describe the algorithm and the details of our implementation. Exploiting the parallelism of the ensemble method, emcee permits any user to take advantage of multiple CPU cores without extra effort. The code is available online at http://dan.iel.fm/emcee under the GNU General Public License v2.
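The stretch move at the heart of the sampler is simple enough to sketch directly. The following is an illustrative serial NumPy version (function and variable names are ours; the library itself vectorizes and parallelizes the update, so this is a sketch of the idea, not emcee's implementation):

```python
import numpy as np

def stretch_move_step(walkers, log_prob, a=2.0, rng=None):
    """One serial sweep of the Goodman & Weare (2010) stretch move.

    For each walker X_k, pick a complementary walker X_j, draw a stretch
    factor z with density g(z) ~ 1/sqrt(z) on [1/a, a], propose
    Y = X_j + z * (X_k - X_j), and accept with probability
    min(1, z**(d-1) * p(Y) / p(X_k)).
    """
    rng = np.random.default_rng() if rng is None else rng
    walkers = walkers.copy()
    n, d = walkers.shape
    logp = np.array([log_prob(w) for w in walkers])
    for k in range(n):
        j = rng.integers(n - 1)
        if j >= k:                       # pick j uniformly among walkers != k
            j += 1
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        proposal = walkers[j] + z * (walkers[k] - walkers[j])
        log_accept = (d - 1) * np.log(z) + log_prob(proposal) - logp[k]
        if np.log(rng.random()) < log_accept:
            walkers[k] = proposal
            logp[k] = log_prob(proposal)
    return walkers

# Toy run: 10 walkers targeting a 2-D standard normal.
log_prob = lambda x: -0.5 * np.dot(x, x)
rng = np.random.default_rng(0)
w = rng.normal(size=(10, 2))
chain = []
for sweep in range(1500):
    w = stretch_move_step(w, log_prob, rng=rng)
    if sweep >= 500:                     # discard burn-in sweeps
        chain.append(w.copy())
chain = np.concatenate(chain)            # (10000, 2) post-burn-in samples
```

Note that the only tuning parameter here is the stretch scale `a`, which is the source of the "1 or 2 parameters" claim in the abstract.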

8,805 citations

Book
01 Jan 2003
TL;DR: In this paper, the authors describe the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation, and compare simulation-assisted estimation procedures, including maximum simulated likelihood, method of simulated moments, and methods of simulated scores.
Abstract: This book describes the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation. Researchers use these statistical methods to examine the choices that consumers, households, firms, and other agents make. Each of the major models is covered: logit, generalized extreme value, or GEV (including nested and cross-nested logits), probit, and mixed logit, plus a variety of specifications that build on these basics. Simulation-assisted estimation procedures are investigated and compared, including maximum simulated likelihood, method of simulated moments, and method of simulated scores. Procedures for drawing from densities are described, including variance reduction techniques such as antithetics and Halton draws. Recent advances in Bayesian procedures are explored, including the use of the Metropolis-Hastings algorithm and its variant Gibbs sampling. No other book incorporates all these fields, which have arisen in the past 20 years. The procedures are applicable in many fields, including energy, transportation, environmental studies, health, labor, and marketing.
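Halton draws, one of the variance-reduction devices mentioned above, are generated from the radical-inverse function, using one coprime base per dimension. A minimal sketch (function names are ours):

```python
def radical_inverse(n, base):
    """Van der Corput radical inverse: reflect the base-b digits of n
    about the radix point, e.g. 6 = 110 in base 2 -> 0.011 = 0.375."""
    inv, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        inv += digit / denom
    return inv

def halton(n_draws, bases=(2, 3)):
    """First n_draws points of the Halton sequence (skipping index 0,
    which would map to the origin); one coprime base per dimension."""
    return [[radical_inverse(i, b) for b in bases] for i in range(1, n_draws + 1)]

points = halton(100)   # 100 low-discrepancy draws on the unit square
```

Because successive points fill the unit cube far more evenly than pseudo-random draws, far fewer of them are needed to stabilize a simulated log-likelihood.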

7,768 citations


Cites background from "Markov Chain Monte Carlo Convergenc..."

  • ...Cowles and Carlin (1996) provide a description of the various tests and diagnostics that have been proposed....


Journal ArticleDOI
TL;DR: The use (and misuse) of GLMMs in ecology and evolution are reviewed, estimation and inference are discussed, and 'best-practice' data analysis procedures for scientists facing this challenge are summarized.
Abstract: How should ecologists and evolutionary biologists analyze nonnormal data that involve random effects? Nonnormal data such as counts or proportions often defy classical statistical procedures. Generalized linear mixed models (GLMMs) provide a more flexible approach for analyzing nonnormal data when random effects are present. The explosion of research on GLMMs in the last decade has generated considerable uncertainty for practitioners in ecology and evolution. Despite the availability of accurate techniques for estimating GLMM parameters in simple cases, complex GLMMs are challenging to fit and statistical inference such as hypothesis testing remains difficult. We review the use (and misuse) of GLMMs in ecology and evolution, discuss estimation and inference and summarize 'best-practice' data analysis procedures for scientists facing this challenge.

7,207 citations

Journal ArticleDOI
TL;DR: This work generalizes the method proposed by Gelman and Rubin (1992a) for monitoring the convergence of iterative simulations by comparing between and within variances of multiple chains, in order to obtain a family of tests for convergence.
Abstract: We generalize the method proposed by Gelman and Rubin (1992a) for monitoring the convergence of iterative simulations by comparing between and within variances of multiple chains, in order to obtain a family of tests for convergence. We review methods of inference from simulations in order to develop convergence-monitoring summaries that are relevant for the purposes for which the simulations are used. We recommend applying a battery of tests for mixing based on the comparison of inferences from individual sequences and from the mixture of sequences. Finally, we discuss multivariate analogues, for assessing convergence of several parameters simultaneously.
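One member of the family described above replaces the variance with an interval length. A rough sketch of such an interval-based scale reduction factor (our own simplified formulation for illustration, not the authors' exact estimator):

```python
import numpy as np

def interval_rhat(chains, alpha=0.2):
    """Interval-based potential scale reduction: length of the (1-alpha)
    empirical interval of the pooled draws divided by the mean length of
    the within-chain intervals. Values near 1 suggest the chains agree."""
    chains = np.asarray(chains)                      # shape (m, n)
    lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
    within = np.mean(np.percentile(chains, hi, axis=1)
                     - np.percentile(chains, lo, axis=1))
    pooled = np.percentile(chains, hi) - np.percentile(chains, lo)
    return pooled / within
```

When the chains have not mixed, the pooled interval straddles several chain-specific modes and the ratio stays well above 1.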

5,493 citations


Cites background or methods from "Markov Chain Monte Carlo Convergenc..."

  • ...See Brooks and Roberts (in press) and Cowles and Carlin (1996) for reviews of commonly used techniques of convergence assessment....


  • ...See Brooks and Roberts (in press) and Cowles and Carlin (1996) for reviews of commonly used techniques of convergence assessment. Gelman and Rubin (1992a,b) pointed out that, in many problems, lack of convergence can be easily determined from multiple independent sequences but cannot be diagnosed using simulation output from any single sequence. They proposed a method using multiple replications of the chain to decide whether or not stationarity has been achieved within the second half of each of the sample paths. The idea behind this is an implicit assumption that convergence will have been achieved within the first half of the sample paths, and the validity of this assumption is essentially being tested by the diagnostic. This method's popularity may be largely due to its implementational simplicity and the fact that generic code is widely available to implement the method. In addition, even when this method is not formally used, something like it is often done implicitly or informally. For example, Green, Roesch, Smith, and Strawderman (1994) wrote, in implementing a Gibbs sampler, "In the present study we used . . . one long chain. Experimentation with different starting values convinced us that the chain was converging and covering the entire posterior distribution." As is often the case in statistics, it is generally useful to study the formal procedures that lie behind an informal method, in this case for monitoring the mixing of multiple sequences. In this article, we generalize the method of Gelman and Rubin (1992a) by (1) adding graphical methods for tracking the approach to convergence; (2) generalizing the scale reduction factor to track measures of scale other than the variance; and (3) extending to multivariate summaries....


Posted Content
TL;DR: This article offers an approach, built on the technique of statistical simulation, to extract the currently overlooked information from any statistical method and to interpret and present it in a reader-friendly manner.
Abstract: Social Scientists rarely take full advantage of the information available in their statistical results. As a consequence, they miss opportunities to present quantities that are of greatest substantive interest for their research and express the appropriate degree of certainty about these quantities. In this article, we offer an approach, built on the technique of statistical simulation, to extract the currently overlooked information from any statistical method and to interpret and present it in a reader-friendly manner. Using this technique requires some expertise, which we try to provide herein, but its application should make the results of quantitative articles more informative and transparent. To illustrate our recommendations, we replicate the results of several published works, showing in each case how the authors' own conclusions can be expressed more sharply and informatively, and, without changing any data or statistical assumptions, how our approach reveals important new information about the research questions at hand. We also offer very easy-to-use Clarify software that implements our suggestions.

2,938 citations

References
Journal ArticleDOI
TL;DR: In this article, a modified Monte Carlo integration over configuration space is used to investigate the properties of a two-dimensional rigid-sphere system with a set of interacting individual molecules, and the results are compared to free volume equations of state and a four-term virial coefficient expansion.
Abstract: A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two‐dimensional rigid‐sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four‐term virial coefficient expansion.
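The sampling scheme this paper introduced survives today as random-walk Metropolis. A minimal sketch for a generic one-dimensional target density (illustrative code in our own naming, not the original rigid-sphere computation):

```python
import numpy as np

def metropolis(log_target, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis for a 1-D target: propose x' = x + step*eps
    with Gaussian eps and accept with probability min(1, p(x')/p(x))."""
    rng = np.random.default_rng(seed)
    x, logp = x0, log_target(x0)
    draws = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()
        logp_prop = log_target(prop)
        if np.log(rng.random()) < logp_prop - logp:   # Metropolis accept/reject
            x, logp = prop, logp_prop
        draws[i] = x
    return draws

# Toy target: standard normal (log density up to an additive constant).
draws = metropolis(lambda x: -0.5 * x * x, 0.0, 20000, step=2.0)
```

Only ratios of the target density appear in the acceptance rule, which is why the method works with unnormalized densities.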

35,161 citations


"Markov Chain Monte Carlo Convergenc..." refers methods in this paper

  • ...In a surprisingly short period of time, Markov chain Monte Carlo (MCMC) integration methods, especially the Metropolis-Hastings algorithm (Metropolis et al., 1953; Hastings, 1970) and the Gibbs sampler (Geman and Geman, 1984; Gelfand and Smith, 1990) have emerged as extremely popular tools for the analysis of complex statistical models....


Journal ArticleDOI
TL;DR: The analogy between images and statistical mechanics systems is made and the analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, creating a highly parallel ``relaxation'' algorithm for MAP estimation.
Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (``annealing''), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel ``relaxation'' algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.
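The Gibbs sampling idea this paper introduced for MRF image models reduces, in its simplest statistical form, to cycling through full conditional distributions. A toy sketch for a bivariate normal target (our own example, unrelated to image restoration):

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_steps, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.
    Both full conditionals are normal: x | y ~ N(rho*y, 1 - rho**2)."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(1.0 - rho ** 2)
    x = y = 0.0
    draws = np.empty((n_steps, 2))
    for i in range(n_steps):
        x = rng.normal(rho * y, sd)    # update x from its full conditional
        y = rng.normal(rho * x, sd)    # update y from its full conditional
        draws[i] = x, y
    return draws

draws = gibbs_bivariate_normal(0.8, 20000)
```

The stronger the correlation rho, the slower the alternating updates traverse the joint distribution, which is exactly the kind of slow mixing the convergence diagnostics in the reviewed paper try to detect.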

18,761 citations


"Markov Chain Monte Carlo Convergenc..." refers methods in this paper


  • ...…integration methods, especially the MetropolisHastings algorithm (Hastings 1970; Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller 1953) and the Gibbs sampler (Geman and Geman 1984; Gelfand and Smith 1990) have emerged as extremely popular tools for the analysis of complex statistical models....


Journal ArticleDOI
TL;DR: A generalization of the sampling method introduced by Metropolis et al. as mentioned in this paper is presented along with an exposition of the relevant theory, techniques of application and methods and difficulties of assessing the error in Monte Carlo estimates.
Abstract: A generalization of the sampling method introduced by Metropolis et al. (1953) is presented along with an exposition of the relevant theory, techniques of application and methods and difficulties of assessing the error in Monte Carlo estimates. Examples of the methods, including the generation of random orthogonal matrices and potential applications of the methods to numerical problems arising in statistics, are discussed. For numerical problems in a large number of dimensions, Monte Carlo methods are often more efficient than conventional numerical methods. However, implementation of the Monte Carlo methods requires sampling from high dimensional probability distributions and this may be very difficult and expensive in analysis and computer time. General methods for sampling from, or estimating expectations with respect to, such distributions are as follows. (i) If possible, factorize the distribution into the product of one-dimensional conditional distributions from which samples may be obtained. (ii) Use importance sampling, which may also be used for variance reduction. That is, in order to evaluate the integral J = ∫ f(x) p(x) dx = E_p(f), where p(x) is a probability density function, instead of obtaining independent samples x_1, ..., x_N from p(x) and using the estimate J_1 = Σ f(x_i)/N, we instead obtain the sample from a distribution with density q(x) and use the estimate J_2 = Σ {f(x_i) p(x_i)}/{q(x_i) N}. This may be advantageous if it is easier to sample from q(x) than p(x), but it is a difficult method to use in a large number of dimensions, since the values of the weights w(x_i) = p(x_i)/q(x_i) for reasonable values of N may all be extremely small, or a few may be extremely large. In estimating the probability of an event A, however, these difficulties may not be as serious since the only values of w(x) which are important are those for which x ∈ A.
Since the methods proposed by Trotter & Tukey (1956) for the estimation of conditional expectations require the use of importance sampling, the same difficulties may be encountered in their use. (iii) Use a simulation technique; that is, if it is difficult to sample directly from p(x) or if p(x) is unknown, sample from some distribution q(y) and obtain the sample x values as some function of the corresponding y values. If we want samples from the conditional dis...
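Method (ii), importance sampling with the J2 estimate, can be sketched directly (function and parameter names are ours):

```python
import numpy as np

def importance_estimate(f, log_p, sample_q, log_q, n, seed=0):
    """Estimate J = E_p[f(X)] by drawing x_1..x_n from q and averaging
    f(x_i) * w(x_i) with importance weights w = p/q."""
    rng = np.random.default_rng(seed)
    xs = sample_q(rng, n)
    w = np.exp(log_p(xs) - log_q(xs))
    return float(np.mean(w * f(xs)))

LOG_2PI = np.log(2 * np.pi)

# E_p[x^2] = 1 under p = N(0,1), sampling from the wider proposal q = N(0, 4).
est = importance_estimate(
    f=lambda x: x ** 2,
    log_p=lambda x: -0.5 * x ** 2 - 0.5 * LOG_2PI,
    sample_q=lambda rng, n: rng.normal(0.0, 2.0, n),
    log_q=lambda x: -0.5 * (x / 2.0) ** 2 - np.log(2.0) - 0.5 * LOG_2PI,
    n=200_000,
)
```

Using a proposal wider than the target keeps the weights p/q bounded; a narrower proposal would produce the occasional enormous weight that the abstract warns about.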

14,965 citations


"Markov Chain Monte Carlo Convergenc..." refers methods in this paper

  • ...In a surprisingly short period, Markov chain Monte Carlo (MCMC) integration methods, especially the MetropolisHastings algorithm (Hastings 1970; Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller 1953) and the Gibbs sampler (Geman and Geman 1984; Gelfand and Smith 1990) have emerged as…...



Journal ArticleDOI
TL;DR: The focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normal- ity after transformations and marginalization, and the results are derived as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations.
Abstract: The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative simulation can give misleading answers. Our methods are simple and generally applicable to the output of any iterative simulation; they are designed for researchers primarily interested in the science underlying the data and models they are analyzing, rather than for researchers interested in the probability theory underlying the iterative simulations themselves. Our recommended strategy is to use several independent sequences, with starting points sampled from an overdispersed distribution. At each step of the iterative simulation, we obtain, for each univariate estimand of interest, a distributional estimate and an estimate of how much sharper the distributional estimate might become if the simulations were continued indefinitely. Because our focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization, we derive our results as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations. The methods are illustrated on a random-effects mixture model applied to experimental measurements of reaction times of normal and schizophrenic patients.
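The resulting diagnostic compares between-chain and within-chain variances across several overdispersed sequences. A compact sketch of the potential scale reduction factor (simplified relative to the published estimator, which adds a degrees-of-freedom correction; names are ours):

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat from m parallel chains of
    length n (simplified: no sampling-variability/df correction)."""
    chains = np.asarray(chains)                  # shape (m, n)
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)      # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()        # within-chain variance
    var_plus = (n - 1) / n * W + B / n           # pooled variance estimate
    return float(np.sqrt(var_plus / W))
```

Values near 1 are consistent with convergence; values well above 1 indicate that longer runs (or better mixing) are needed.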

13,884 citations


"Markov Chain Monte Carlo Convergenc..." refers background or methods in this paper

  • ...The convergence diagnostics of Gelman and Rubin (1992) and of Raftery and Lewis (1992) currently are the most popular amongst the statistical community, at least in part because computer programs for their implementation are available from their creators. In addition to these two, we discuss the methods of Geweke (1992), Roberts (1992, 1994), Ritter and Tanner (1992), Zellner and Min (1995), Liu, Liu, and Rubin (1992), Garren and Smith (1993), Johnson (1994), Mykland, Tierney, and Yu (1995), Yu (1994), and Yu and Mykland (1994)....


  • ...[flattened excerpt of the paper's comparison table of diagnostics; recoverable columns: Method, Quantitative or graphical, Single or multiple chains, Theoretical basis, Univariate or full joint distribution, Bias/variance, Applicability, Ease of use; partial rows: Gelman and Rubin (1992): Quantitative, Multiple, Large-sample normal theory, Univariate, Bias, Any MCMC; Raftery and Lewis (1992): Quantitative, Single, ...]


  • ...The convergence diagnostics of Gelman and Rubin (1992) and of Raftery and Lewis (1992) currently are the most popular in the statistical community, at least in part because computer programs for their implementation are available from their creators....


Journal ArticleDOI
William S. Cleveland1
TL;DR: Robust locally weighted regression as discussed by the authors is a method for smoothing a scatterplot, in which the fitted value at z k is the value of a polynomial fit to the data using weighted least squares, where the weight for (x i, y i ) is large if x i is close to x k and small if it is not.
Abstract: The visual information on a scatterplot can be greatly enhanced, with little additional cost, by computing and plotting smoothed points. Robust locally weighted regression is a method for smoothing a scatterplot, (x i , y i ), i = 1, …, n, in which the fitted value at z k is the value of a polynomial fit to the data using weighted least squares, where the weight for (x i , y i ) is large if x i is close to x k and small if it is not. A robust fitting procedure is used that guards against deviant points distorting the smoothed points. Visual, computational, and statistical issues of robust locally weighted regression are discussed. Several examples, including data on lead intoxication, are used to illustrate the methodology.
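The core of the method, a single smoothing pass with tricube weights and no robustness iterations (so a sketch of the full robust procedure, in our own naming), can be written as:

```python
import numpy as np

def lowess_fit(x, y, frac=0.5):
    """One pass of locally weighted linear regression: at each x_k, fit a
    line by weighted least squares with tricube weights on the ceil(frac*n)
    nearest neighbours, and take the fitted value at x_k."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r = max(3, int(np.ceil(frac * n)))           # neighbourhood size
    fitted = np.empty(n)
    for k in range(n):
        d = np.abs(x - x[k])
        h = np.sort(d)[r - 1]                    # neighbourhood radius
        if h <= 0:
            h = 1.0                              # guard against duplicate x's
        w = np.clip(1.0 - (d / h) ** 3, 0.0, None) ** 3   # tricube weights
        sw = np.sqrt(w)
        A = np.stack([np.ones(n), x - x[k]], axis=1)
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        fitted[k] = beta[0]                      # intercept = fit at x_k
    return fitted

x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
smooth = lowess_fit(x, y)                        # exact for linear data
```

The robust version in the paper then reweights by residual size (bisquare weights) and repeats the pass, so deviant points stop distorting the smooth.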

10,225 citations


"Markov Chain Monte Carlo Convergenc..." refers background in this paper

  • ...The plotted lines are "lowess" smooths (Cleveland 1979) of log-transformed In and Dn values....
