scispace - formally typeset
Open accessJournal ArticleDOI: 10.1214/SS/1177011136

Inference from Iterative Simulation Using Multiple Sequences

01 Nov 1992-Statistical Science (Institute of Mathematical Statistics)-Vol. 7, Iss: 4, pp 457-472
Abstract: The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative simulation can give misleading answers. Our methods are simple and generally applicable to the output of any iterative simulation; they are designed for researchers primarily interested in the science underlying the data and models they are analyzing, rather than for researchers interested in the probability theory underlying the iterative simulations themselves. Our recommended strategy is to use several independent sequences, with starting points sampled from an overdispersed distribution. At each step of the iterative simulation, we obtain, for each univariate estimand of interest, a distributional estimate and an estimate of how much sharper the distributional estimate might become if the simulations were continued indefinitely. Because our focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization, we derive our results as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations. The methods are illustrated on a random-effects mixture model applied to experimental measurements of reaction times of normal and schizophrenic patients. more

Topics: Bayesian inference (58%), Gibbs sampling (57%), Mixture model (55%) more

Open accessJournal ArticleDOI: 10.1093/SYSBIO/SYS029
01 May 2012-Systematic Biology
Abstract: Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site d(N)/d(S) rations, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software. more

Topics: Bayes factor (56%), Markov chain Monte Carlo (53%), Bayesian probability (52%) more

14,723 Citations

Open accessBook
Kenneth Train1Institutions (1)
01 Jan 2003-
Abstract: This book describes the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation. Researchers use these statistical methods to examine the choices that consumers, households, firms, and other agents make. Each of the major models is covered: logit, generalized extreme value, or GEV (including nested and cross-nested logits), probit, and mixed logit, plus a variety of specifications that build on these basics. Simulation-assisted estimation procedures are investigated and compared, including maximum simulated likelihood, method of simulated moments, and method of simulated scores. Procedures for drawing from densities are described, including variance reduction techniques such as anithetics and Halton draws. Recent advances in Bayesian procedures are explored, including the use of the Metropolis-Hastings algorithm and its variant Gibbs sampling. No other book incorporates all these fields, which have arisen in the past 20 years. The procedures are applicable in many fields, including energy, transportation, environmental studies, health, labor, and marketing. more

Topics: Mixed logit (63%), Discrete choice (56%), Logit (53%) more

7,755 Citations

Open accessJournal ArticleDOI: 10.1086/319501
Abstract: Current routine genotyping methods typically do not provide haplotype information, which is essential for many analyses of fine-scale molecular-genetics data. Haplotypes can be obtained, at considerable cost, experimentally or (partially) through genotyping of additional family members. Alternatively, a statistical method can be used to infer phase and to reconstruct haplotypes. We present a new statistical method, applicable to genotype data at linked loci from a population sample, that improves substantially on current algorithms; often, error rates are reduced by >50%, relative to its nearest competitor. Furthermore, our algorithm performs well in absolute terms, suggesting that reconstructing haplotypes experimentally or by genotyping additional family members may be an inefficient use of resources. more

Topics: Haplotype estimation (59%), Population (54%), Modal haplotype (53%)

7,223 Citations

Open accessBook
24 Aug 2012-
Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students. more

Topics: Graphical model (57%), Conditional random field (55%), Electronic data (54%) more

7,045 Citations

Open accessJournal ArticleDOI: 10.1051/0004-6361/201321591
Peter A. R. Ade1, Nabila Aghanim2, C. Armitage-Caplan3, Monique Arnaud4  +324 moreInstitutions (70)
Abstract: This paper presents the first cosmological results based on Planck measurements of the cosmic microwave background (CMB) temperature and lensing-potential power spectra. We find that the Planck spectra at high multipoles (l ≳ 40) are extremely well described by the standard spatially-flat six-parameter ΛCDM cosmology with a power-law spectrum of adiabatic scalar perturbations. Within the context of this cosmology, the Planck data determine the cosmological parameters to high precision: the angular size of the sound horizon at recombination, the physical densities of baryons and cold dark matter, and the scalar spectral index are estimated to be θ∗ = (1.04147 ± 0.00062) × 10-2, Ωbh2 = 0.02205 ± 0.00028, Ωch2 = 0.1199 ± 0.0027, and ns = 0.9603 ± 0.0073, respectively(note that in this abstract we quote 68% errors on measured parameters and 95% upper limits on other parameters). For this cosmology, we find a low value of the Hubble constant, H0 = (67.3 ± 1.2) km s-1 Mpc-1, and a high value of the matter density parameter, Ωm = 0.315 ± 0.017. These values are in tension with recent direct measurements of H0 and the magnitude-redshift relation for Type Ia supernovae, but are in excellent agreement with geometrical constraints from baryon acoustic oscillation (BAO) surveys. Including curvature, we find that the Universe is consistent with spatial flatness to percent level precision using Planck CMB data alone. We use high-resolution CMB data together with Planck to provide greater control on extragalactic foreground components in an investigation of extensions to the six-parameter ΛCDM model. We present selected results from a large grid of cosmological models, using a range of additional astrophysical data sets in addition to Planck and high-resolution CMB data. None of these models are favoured over the standard six-parameter ΛCDM cosmology. The deviation of the scalar spectral index from unity isinsensitive to the addition of tensor modes and to changes in the matter content of the Universe. We find an upper limit of r0.002< 0.11 on the tensor-to-scalar ratio. There is no evidence for additional neutrino-like relativistic particles beyond the three families of neutrinos in the standard model. Using BAO and CMB data, we find Neff = 3.30 ± 0.27 for the effective number of relativistic degrees of freedom, and an upper limit of 0.23 eV for the sum of neutrino masses. Our results are in excellent agreement with big bang nucleosynthesis and the standard value of Neff = 3.046. We find no evidence for dynamical dark energy; using BAO and CMB data, the dark energy equation of state parameter is constrained to be w = -1.13-0.10+0.13. We also use the Planck data to set limits on a possible variation of the fine-structure constant, dark matter annihilation and primordial magnetic fields. Despite the success of the six-parameter ΛCDM model in describing the Planck data at high multipoles, we note that this cosmology does not provide a good fit to the temperature power spectrum at low multipoles. The unusual shape of the spectrum in the multipole range 20 ≲ l ≲ 40 was seen previously in the WMAP data and is a real feature of the primordial CMB anisotropies. The poor fit to the spectrum at low multipoles is not of decisive significance, but is an “anomaly” in an otherwise self-consistent analysis of the Planck temperature data. more

Topics: Planck energy (66%), Age of the universe (65%), Planck particle (65%) more

6,641 Citations


Open accessJournal ArticleDOI: 10.1063/1.1699114
Abstract: A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two‐dimensional rigid‐sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four‐term virial coefficient expansion. more

32,876 Citations

Journal ArticleDOI: 10.1109/TPAMI.1984.4767596
Stuart Geman1, Donald Geman2Institutions (2)
Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (``annealing''), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel ``relaxation'' algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios. more

Topics: Gibbs sampling (61%), Gibbs algorithm (59%), Maximum a posteriori estimation (58%) more

18,328 Citations

Journal ArticleDOI: 10.1093/BIOMET/57.1.97
W. K. Hastings1Institutions (1)
01 Apr 1970-Biometrika
Abstract: SUMMARY A generalization of the sampling method introduced by Metropolis et al. (1953) is presented along with an exposition of the relevant theory, techniques of application and methods and difficulties of assessing the error in Monte Carlo estimates. Examples of the methods, including the generation of random orthogonal matrices and potential applications of the methods to numerical problems arising in statistics, are discussed. For numerical problems in a large number of dimensions, Monte Carlo methods are often more efficient than conventional numerical methods. However, implementation of the Monte Carlo methods requires sampling from high dimensional probability distributions and this may be very difficult and expensive in analysis and computer time. General methods for sampling from, or estimating expectations with respect to, such distributions are as follows. (i) If possible, factorize the distribution into the product of one-dimensional conditional distributions from which samples may be obtained. (ii) Use importance sampling, which may also be used for variance reduction. That is, in order to evaluate the integral J = X) p(x)dx = Ev(f), where p(x) is a probability density function, instead of obtaining independent samples XI, ..., Xv from p(x) and using the estimate J, = Zf(xi)/N, we instead obtain the sample from a distribution with density q(x) and use the estimate J2 = Y{f(xj)p(x1)}/{q(xj)N}. This may be advantageous if it is easier to sample from q(x) thanp(x), but it is a difficult method to use in a large number of dimensions, since the values of the weights w(xi) = p(x1)/q(xj) for reasonable values of N may all be extremely small, or a few may be extremely large. In estimating the probability of an event A, however, these difficulties may not be as serious since the only values of w(x) which are important are those for which x -A. Since the methods proposed by Trotter & Tukey (1956) for the estimation of conditional expectations require the use of importance sampling, the same difficulties may be encountered in their use. (iii) Use a simulation technique; that is, if it is difficult to sample directly from p(x) or if p(x) is unknown, sample from some distribution q(y) and obtain the sample x values as some function of the corresponding y values. If we want samples from the conditional dis more

13,481 Citations

Open accessBook
01 Jan 1987-
Abstract: Tables and Figures. Glossary. 1. Introduction. 1.1 Overview. 1.2 Examples of Surveys with Nonresponse. 1.3 Properly Handling Nonresponse. 1.4 Single Imputation. 1.5 Multiple Imputation. 1.6 Numerical Example Using Multiple Imputation. 1.7 Guidance for the Reader. 2. Statistical Background. 2.1 Introduction. 2.2 Variables in the Finite Population. 2.3 Probability Distributions and Related Calculations. 2.4 Probability Specifications for Indicator Variables. 2.5 Probability Specifications for (X,Y). 2.6 Bayesian Inference for a Population Quality. 2.7 Interval Estimation. 2.8 Bayesian Procedures for Constructing Interval Estimates, Including Significance Levels and Point Estimates. 2.9 Evaluating the Performance of Procedures. 2.10 Similarity of Bayesian and Randomization--Based Inferences in Many Practical Cases. 3. Underlying Bayesian Theory. 3.1 Introduction and Summary of Repeated--Imputation Inferences. 3.2 Key Results for Analysis When the Multiple Imputations are Repeated Draws from the Posterior Distribution of the Missing Values. 3.3 Inference for Scalar Estimands from a Modest Number of Repeated Completed--Data Means and Variances. 3.4 Significance Levels for Multicomponent Estimands from a Modest Number of Repeated Completed--Data Means and Variance--Covariance Matrices. 3.5 Significance Levels from Repeated Completed--Data Significance Levels. 3.6 Relating the Completed--Data and Completed--Data Posterior Distributions When the Sampling Mechanism is Ignorable. 4. Randomization--Based Evaluations. 4.1 Introduction. 4.2 General Conditions for the Randomization--Validity of Infinite--m Repeated--Imputation Inferences. 4.3Examples of Proper and Improper Imputation Methods in a Simple Case with Ignorable Nonresponse. 4.4 Further Discussion of Proper Imputation Methods. 4.5 The Asymptotic Distibution of (Qm,Um,Bm) for Proper Imputation Methods. 4.6 Evaluations of Finite--m Inferences with Scalar Estimands. 4.7 Evaluation of Significance Levels from the Moment--Based Statistics Dm and Dm with Multicomponent Estimands. 4.8 Evaluation of Significance Levels Based on Repeated Significance Levels. 5. Procedures with Ignorable Nonresponse. 5.1 Introduction. 5.2 Creating Imputed Values under an Explicit Model. 5.3 Some Explicit Imputation Models with Univariate YI and Covariates. 5.4 Monotone Patterns of Missingness in Multivariate YI. 5.5 Missing Social Security Benefits in the Current Population Survey. 5.6 Beyond Monotone Missingness. 6. Procedures with Nonignorable Nonresponse. 6.1 Introduction. 6.2 Nonignorable Nonresponse with Univariate YI and No XI. 6.3 Formal Tasks with Nonignorable Nonresponse. 6.4 Illustrating Mixture Modeling Using Educational Testing Service Data. 6.5 Illustrating Selection Modeling Using CPS Data. 6.6 Extensions to Surveys with Follow--Ups. 6.7 Follow--Up Response in a Survey of Drinking Behavior Among Men of Retirement Age. References. Author Index. Subject Index. Appendix I. Report Written for the Social Security Administration in 1977. Appendix II. Report Written for the Census Bureau in 1983. more

Topics: Imputation (statistics) (63%), Multiple Imputation Technique (62%), Listwise deletion (53%) more

13,466 Citations

Journal ArticleDOI: 10.2307/2344614
John A. Nelder1, R. W. M. Wedderburn1Institutions (1)
01 May 1972-
Abstract: JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact Blackwell Publishing and Royal Statistical Society are collaborating with JSTOR to digitize, preserve and extend access to Journal of the Royal Statistical Society. Series A (General). SUMMARY The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions; the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables) and gamma (variance components). The implications of the approach in designing statistics courses are discussed. more

8,264 Citations

No. of citations received by the Paper in previous years