scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Understanding the Metropolis-Hastings Algorithm

01 Nov 1995-The American Statistician (Taylor & Francis Group)-Vol. 49, Iss: 4, pp 327-335
TL;DR: A detailed, introductory exposition of the Metropolis-Hastings algorithm, a powerful Markov chain method to simulate multivariate distributions, and a simple, intuitive derivation of this method is given along with guidance on implementation.
Abstract: We provide a detailed, introductory exposition of the Metropolis-Hastings algorithm, a powerful Markov chain method to simulate multivariate distributions. A simple, intuitive derivation of this method is given along with guidance on implementation. Also discussed are two applications of the algorithm, one for implementing acceptance-rejection sampling when a blanketing function is not available and the other for implementing the algorithm with block-at-a-time scans. In the latter situation, many different algorithms, including the Gibbs sampler, are shown to be special cases of the Metropolis-Hastings algorithm. The methods are illustrated with examples.

Content maybe subject to copyright    Report

Citations
More filters
Journal ArticleDOI
01 Jun 2000-Genetics
TL;DR: Pritch et al. as discussed by the authors proposed a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations, which can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
Abstract: We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci— e.g. , seven microsatellite loci in an example using genotype data from an endangered bird species. The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.

27,454 citations

Book
01 Jan 2003
TL;DR: In this paper, the authors describe the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation, and compare simulation-assisted estimation procedures, including maximum simulated likelihood, method of simulated moments, and methods of simulated scores.
Abstract: This book describes the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation. Researchers use these statistical methods to examine the choices that consumers, households, firms, and other agents make. Each of the major models is covered: logit, generalized extreme value, or GEV (including nested and cross-nested logits), probit, and mixed logit, plus a variety of specifications that build on these basics. Simulation-assisted estimation procedures are investigated and compared, including maximum simulated likelihood, method of simulated moments, and method of simulated scores. Procedures for drawing from densities are described, including variance reduction techniques such as anithetics and Halton draws. Recent advances in Bayesian procedures are explored, including the use of the Metropolis-Hastings algorithm and its variant Gibbs sampling. No other book incorporates all these fields, which have arisen in the past 20 years. The procedures are applicable in many fields, including energy, transportation, environmental studies, health, labor, and marketing.

7,768 citations


Cites background or methods from "Understanding the Metropolis-Hastin..."

  • ...In this case, the MH algorithm can be utilized, since it is applicable to (practically) any distribution (Chib and Greenberg, 1995)....

    [...]

  • ...Chib and Greenberg, (1995) provide an excellent description of the more general algorithm as well as an explanation of why it works....

    [...]

BookDOI
TL;DR: The Markov Chain Monte Carlo Implementation Results Summary and Discussion MEDICAL MONITORING Introduction Modelling Medical Monitoring Computing Posterior Distributions Forecasting Model Criticism Illustrative Application Discussion MCMC for NONLINEAR HIERARCHICAL MODELS.
Abstract: INTRODUCING MARKOV CHAIN MONTE CARLO Introduction The Problem Markov Chain Monte Carlo Implementation Discussion HEPATITIS B: A CASE STUDY IN MCMC METHODS Introduction Hepatitis B Immunization Modelling Fitting a Model Using Gibbs Sampling Model Elaboration Conclusion MARKOV CHAIN CONCEPTS RELATED TO SAMPLING ALGORITHMS Markov Chains Rates of Convergence Estimation The Gibbs Sampler and Metropolis-Hastings Algorithm INTRODUCTION TO GENERAL STATE-SPACE MARKOV CHAIN THEORY Introduction Notation and Definitions Irreducibility, Recurrence, and Convergence Harris Recurrence Mixing Rates and Central Limit Theorems Regeneration Discussion FULL CONDITIONAL DISTRIBUTIONS Introduction Deriving Full Conditional Distributions Sampling from Full Conditional Distributions Discussion STRATEGIES FOR IMPROVING MCMC Introduction Reparameterization Random and Adaptive Direction Sampling Modifying the Stationary Distribution Methods Based on Continuous-Time Processes Discussion IMPLEMENTING MCMC Introduction Determining the Number of Iterations Software and Implementation Output Analysis Generic Metropolis Algorithms Discussion INFERENCE AND MONITORING CONVERGENCE Difficulties in Inference from Markov Chain Simulation The Risk of Undiagnosed Slow Convergence Multiple Sequences and Overdispersed Starting Points Monitoring Convergence Using Simulation Output Output Analysis for Inference Output Analysis for Improving Efficiency MODEL DETERMINATION USING SAMPLING-BASED METHODS Introduction Classical Approaches The Bayesian Perspective and the Bayes Factor Alternative Predictive Distributions How to Use Predictive Distributions Computational Issues An Example Discussion HYPOTHESIS TESTING AND MODEL SELECTION Introduction Uses of Bayes Factors Marginal Likelihood Estimation by Importance Sampling Marginal Likelihood Estimation Using Maximum Likelihood Application: How Many Components in a Mixture? Discussion Appendix: S-PLUS Code for the Laplace-Metropolis Estimator MODEL CHECKING AND MODEL IMPROVEMENT Introduction Model Checking Using Posterior Predictive Simulation Model Improvement via Expansion Example: Hierarchical Mixture Modelling of Reaction Times STOCHASTIC SEARCH VARIABLE SELECTION Introduction A Hierarchical Bayesian Model for Variable Selection Searching the Posterior by Gibbs Sampling Extensions Constructing Stock Portfolios With SSVS Discussion BAYESIAN MODEL COMPARISON VIA JUMP DIFFUSIONS Introduction Model Choice Jump-Diffusion Sampling Mixture Deconvolution Object Recognition Variable Selection Change-Point Identification Conclusions ESTIMATION AND OPTIMIZATION OF FUNCTIONS Non-Bayesian Applications of MCMC Monte Carlo Optimization Monte Carlo Likelihood Analysis Normalizing-Constant Families Missing Data Decision Theory Which Sampling Distribution? Importance Sampling Discussion STOCHASTIC EM: METHOD AND APPLICATION Introduction The EM Algorithm The Stochastic EM Algorithm Examples GENERALIZED LINEAR MIXED MODELS Introduction Generalized Linear Models (GLMs) Bayesian Estimation of GLMs Gibbs Sampling for GLMs Generalized Linear Mixed Models (GLMMs) Specification of Random-Effect Distributions Hyperpriors and the Estimation of Hyperparameters Some Examples Discussion HIERARCHICAL LONGITUDINAL MODELLING Introduction Clinical Background Model Detail and MCMC Implementation Results Summary and Discussion MEDICAL MONITORING Introduction Modelling Medical Monitoring Computing Posterior Distributions Forecasting Model Criticism Illustrative Application Discussion MCMC FOR NONLINEAR HIERARCHICAL MODELS Introduction Implementing MCMC Comparison of Strategies A Case Study from Pharmacokinetics-Pharmacodynamics Extensions and Discussion BAYESIAN MAPPING OF DISEASE Introduction Hypotheses and Notation Maximum Likelihood Estimation of Relative Risks Hierarchical Bayesian Model of Relative Risks Empirical Bayes Estimation of Relative Risks Fully Bayesian Estimation of Relative Risks Discussion MCMC IN IMAGE ANALYSIS Introduction The Relevance of MCMC to Image Analysis Image Models at Different Levels Methodological Innovations in MCMC Stimulated by Imaging Discussion MEASUREMENT ERROR Introduction Conditional-Independence Modelling Illustrative examples Discussion GIBBS SAMPLING METHODS IN GENETICS Introduction Standard Methods in Genetics Gibbs Sampling Approaches MCMC Maximum Likelihood Application to a Family Study of Breast Cancer Conclusions MIXTURES OF DISTRIBUTIONS: INFERENCE AND ESTIMATION Introduction The Missing Data Structure Gibbs Sampling Implementation Convergence of the Algorithm Testing for Mixtures Infinite Mixtures and Other Extensions AN ARCHAEOLOGICAL EXAMPLE: RADIOCARBON DATING Introduction Background to Radiocarbon Dating Archaeological Problems and Questions Illustrative Examples Discussion Index

7,399 citations

Journal ArticleDOI
TL;DR: Bayesian model averaging (BMA) provides a coherent mechanism for ac- counting for this model uncertainty and provides improved out-of- sample predictive performance.
Abstract: Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to over-confident inferences and decisions that are more risky than one thinks they are. Bayesian model averaging (BMA)provides a coherent mechanism for accounting for this model uncertainty. Several methods for implementing BMA have recently emerged. We discuss these methods and present a number of examples.In these examples, BMA provides improved out-of-sample predictive performance. We also provide a catalogue of currently available BMA software.

3,942 citations

Journal ArticleDOI
TL;DR: In this paper, Markov chain Monte Carlo sampling methods are exploited to provide a unified, practical likelihood-based framework for the analysis of stochastic volatility models, and a highly effective method is developed that samples all the unobserved volatilities at once using an approximate offset mixture model, followed by an importance reweighting procedure.
Abstract: In this paper, Markov chain Monte Carlo sampling methods are exploited to provide a unified, practical likelihood-based framework for the analysis of stochastic volatility models. A highly effective method is developed that samples all the unobserved volatilities at once using an approximating offset mixture model, followed by an importance reweighting procedure. This approach is compared with several alternative methods using real data. The paper also develops simulation-based methods for filtering, likelihood evaluation and model failure diagnostics. The issue of model choice using non-nested likelihood ratios and Bayes factors is also investigated. These methods are used to compare the fit of stochastic volatility and GARCH models. All the procedures are illustrated in detail.

1,892 citations

References
More filters
Journal ArticleDOI
TL;DR: In this article, a modified Monte Carlo integration over configuration space is used to investigate the properties of a two-dimensional rigid-sphere system with a set of interacting individual molecules, and the results are compared to free volume equations of state and a four-term virial coefficient expansion.
Abstract: A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two‐dimensional rigid‐sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four‐term virial coefficient expansion.

35,161 citations

Book
01 Jan 1970
TL;DR: In this article, a complete revision of a classic, seminal, and authoritative book that has been the model for most books on the topic written since 1970 is presented, focusing on practical techniques throughout, rather than a rigorous mathematical treatment of the subject.
Abstract: From the Publisher: This is a complete revision of a classic, seminal, and authoritative book that has been the model for most books on the topic written since 1970. It focuses on practical techniques throughout, rather than a rigorous mathematical treatment of the subject. It explores the building of stochastic (statistical) models for time series and their use in important areas of application —forecasting, model specification, estimation, and checking, transfer function modeling of dynamic relationships, modeling the effects of intervention events, and process control. Features sections on: recently developed methods for model specification, such as canonical correlation analysis and the use of model selection criteria; results on testing for unit root nonstationarity in ARIMA processes; the state space representation of ARMA models and its use for likelihood estimation and forecasting; score test for model checking; and deterministic components and structural components in time series models and their estimation based on regression-time series model methods.

19,748 citations

Journal ArticleDOI
TL;DR: A generalization of the sampling method introduced by Metropolis et al. as mentioned in this paper is presented along with an exposition of the relevant theory, techniques of application and methods and difficulties of assessing the error in Monte Carlo estimates.
Abstract: SUMMARY A generalization of the sampling method introduced by Metropolis et al. (1953) is presented along with an exposition of the relevant theory, techniques of application and methods and difficulties of assessing the error in Monte Carlo estimates. Examples of the methods, including the generation of random orthogonal matrices and potential applications of the methods to numerical problems arising in statistics, are discussed. For numerical problems in a large number of dimensions, Monte Carlo methods are often more efficient than conventional numerical methods. However, implementation of the Monte Carlo methods requires sampling from high dimensional probability distributions and this may be very difficult and expensive in analysis and computer time. General methods for sampling from, or estimating expectations with respect to, such distributions are as follows. (i) If possible, factorize the distribution into the product of one-dimensional conditional distributions from which samples may be obtained. (ii) Use importance sampling, which may also be used for variance reduction. That is, in order to evaluate the integral J = X) p(x)dx = Ev(f), where p(x) is a probability density function, instead of obtaining independent samples XI, ..., Xv from p(x) and using the estimate J, = Zf(xi)/N, we instead obtain the sample from a distribution with density q(x) and use the estimate J2 = Y{f(xj)p(x1)}/{q(xj)N}. This may be advantageous if it is easier to sample from q(x) thanp(x), but it is a difficult method to use in a large number of dimensions, since the values of the weights w(xi) = p(x1)/q(xj) for reasonable values of N may all be extremely small, or a few may be extremely large. In estimating the probability of an event A, however, these difficulties may not be as serious since the only values of w(x) which are important are those for which x -A. Since the methods proposed by Trotter & Tukey (1956) for the estimation of conditional expectations require the use of importance sampling, the same difficulties may be encountered in their use. (iii) Use a simulation technique; that is, if it is difficult to sample directly from p(x) or if p(x) is unknown, sample from some distribution q(y) and obtain the sample x values as some function of the corresponding y values. If we want samples from the conditional dis

14,965 citations

Journal ArticleDOI
TL;DR: The focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normal- ity after transformations and marginalization, and the results are derived as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations.
Abstract: The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative simulation can give misleading answers. Our methods are simple and generally applicable to the output of any iterative simulation; they are designed for researchers primarily interested in the science underlying the data and models they are analyzing, rather than for researchers interested in the probability theory underlying the iterative simulations themselves. Our recommended strategy is to use several independent sequences, with starting points sampled from an overdispersed distribution. At each step of the iterative simulation, we obtain, for each univariate estimand of interest, a distributional estimate and an estimate of how much sharper the distributional estimate might become if the simulations were continued indefinitely. Because our focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization, we derive our results as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations. The methods are illustrated on a random-effects mixture model applied to experimental measurements of reaction times of normal and schizophrenic patients.

13,884 citations


"Understanding the Metropolis-Hastin..." refers background in this paper

  • ...For more details the reader should consult Gelman and Rubin (1992) and the accompanying discussion....

    [...]

  • ...One possibility, due to Gelman and Rubin (1992) , is to start multiple chains from dispersed initial values and compare the within and between variation of the sampled draws....

    [...]

Journal ArticleDOI
TL;DR: This revision of a classic, seminal, and authoritative book explores the building of stochastic models for time series and their use in important areas of application —forecasting, model specification, estimation, and checking, transfer function modeling of dynamic relationships, modeling the effects of intervention events, and process control.
Abstract: From the Publisher: This is a complete revision of a classic, seminal, and authoritative book that has been the model for most books on the topic written since 1970. It focuses on practical techniques throughout, rather than a rigorous mathematical treatment of the subject. It explores the building of stochastic (statistical) models for time series and their use in important areas of application —forecasting, model specification, estimation, and checking, transfer function modeling of dynamic relationships, modeling the effects of intervention events, and process control. Features sections on: recently developed methods for model specification, such as canonical correlation analysis and the use of model selection criteria; results on testing for unit root nonstationarity in ARIMA processes; the state space representation of ARMA models and its use for likelihood estimation and forecasting; score test for model checking; and deterministic components and structural components in time series models and their estimation based on regression-time series model methods.

12,650 citations