Stan: A Probabilistic Programming Language.

TL;DR: Stan is a probabilistic programming language for specifying statistical models that provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling.
Abstract: Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
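As a concrete illustration of the workflow the abstract describes, the sketch below fits the canonical Bernoulli example from R through the rstan interface and then computes a penalized maximum likelihood estimate with the L-BFGS-based optimizer. It is a minimal sketch, assuming rstan and a C++ toolchain are installed; the data values and the names N, y, and theta are illustrative, and the same program can equally be run through cmdstan or pystan.

# Minimal sketch (R + rstan)
library(rstan)

stan_code <- "
data {
  int<lower=0> N;                      // number of trials
  int<lower=0, upper=1> y[N];          // binary outcomes (array[N] int in Stan >= 2.33)
}
parameters {
  real<lower=0, upper=1> theta;        // success probability
}
model {
  theta ~ beta(1, 1);                  // uniform prior
  y ~ bernoulli(theta);                // likelihood
}
"

data_list <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

mod <- stan_model(model_code = stan_code)            # compile once
fit <- sampling(mod, data = data_list,               # full Bayesian inference via NUTS
                chains = 4, iter = 2000)
print(fit)

opt <- optimizing(mod, data = data_list)             # penalized MLE via (L-)BFGS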


Citations
Journal ArticleDOI
TL;DR: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan, allowing users to fit linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models all in a multilevel context.
Abstract: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan. A wide range of distributions and link functions are supported, allowing users to fit - among others - linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models all in a multilevel context. Further modeling options include autocorrelation of the response variable, user defined covariance structures, censored data, as well as meta-analytic standard errors. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with the Watanabe-Akaike information criterion and leave-one-out cross-validation.
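As a rough sketch of the interface this abstract describes, the example below fits a varying-intercept Poisson model with brms, assuming the package and a Stan toolchain are installed. The epilepsy data set ships with brms; the formula, prior, and sampler settings are generic illustrative choices, not taken from the paper.

# Illustrative sketch (R + brms)
library(brms)

fit <- brm(
  count ~ zAge + zBase * Trt + (1 | patient),   # population-level terms plus a varying intercept
  data   = epilepsy,
  family = poisson(),
  prior  = prior(normal(0, 5), class = "b"),    # weakly informative prior on the slopes
  chains = 4, iter = 2000
)

summary(fit)
loo(fit)    # leave-one-out cross-validation
waic(fit)   # Watanabe-Akaike information criterion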

4,353 citations


Cites methods from "Stan: A Probabilistic Programming L..."

  • ...Similar to software packages like WinBugs, Stan comes with its own programming language, allowing for great modeling flexibility (cf., Stan Development Team 2017b; Carpenter et al. 2017)....


Journal ArticleDOI
TL;DR: A review of recent progress in cognitive science suggests that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it.
Abstract: Recent progress in artificial intelligence has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats that of humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it. Specifically, we argue that these machines should (1) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (2) ground learning in intuitive theories of physics and psychology to support and enrich the knowledge that is learned; and (3) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes toward these goals that can combine the strengths of recent neural network advances with more structured cognitive models.

2,010 citations

Journal ArticleDOI
TL;DR: brms provides an intuitive and powerful formula syntax that extends the well-known formula syntax of lme4; this syntax is introduced in detail and its usefulness demonstrated with four examples, each highlighting different aspects of the syntax.
Abstract: The brms package allows R users to easily specify a wide range of Bayesian single-level and multilevel models, which are fitted with the probabilistic programming language Stan behind the scenes. Several response distributions are supported, of which all parameters (e.g., location, scale, and shape) can be predicted at the same time thus allowing for distributional regression. Non-linear relationships may be specified using non-linear predictor terms or semi-parametric approaches such as splines or Gaussian processes. To make all of these modeling options possible in a multilevel framework, brms provides an intuitive and powerful formula syntax, which extends the well known formula syntax of lme4. The purpose of the present paper is to introduce this syntax in detail and to demonstrate its usefulness with four examples, each showing other relevant aspects of the syntax.
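A minimal sketch of the extended formula syntax described above, assuming brms is installed. The simulated data and the variables y, x, z, and group are placeholders, and the particular smooth, distributional, and non-linear terms are illustrative rather than the paper's own examples.

# Illustrative sketch (R + brms) of distributional and non-linear formulas
library(brms)

set.seed(4)
mydata <- data.frame(x = runif(200, 0, 3), z = rnorm(200), group = rep(1:20, each = 10))
mydata$y <- 2 * exp(-0.5 * mydata$x) + 0.3 * mydata$z + rnorm(200, sd = 0.2)

# Distributional regression: model both the mean and the residual sd (sigma),
# with a spline for x and a varying intercept over group.
fit <- brm(
  bf(y ~ s(x) + z + (1 | group),   # predictor for the mean
     sigma ~ z),                   # predictor for the (log) residual sd
  data   = mydata,
  family = gaussian(),
  chains = 4
)

# Explicit non-linear predictor terms use bf(..., nl = TRUE):
fit_nl <- brm(
  bf(y ~ a * exp(-b * x), a ~ 1 + (1 | group), b ~ 1, nl = TRUE),
  data  = mydata,
  prior = c(prior(normal(0, 5), nlpar = "a"),
            prior(normal(0, 1), nlpar = "b")),
  chains = 4
)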

1,463 citations


Cites background or methods from "Stan: A Probabilistic Programming L..."

  • ...Possibly the most powerful program for performing full Bayesian inference available to date is Stan (Stan Development Team, 2017c; Carpenter et al., 2017), which implements Hamiltonian Monte Carlo (Duane et al., 1987; Neal, 2011; Betancourt et al., 2014) and its extension, the No-U-Turn (NUTS)…...


  • ...Stan comes with its own programming language, allowing for great modeling flexibility (Stan Development Team, 2017c; Carpenter et al., 2017)....


Journal ArticleDOI
TL;DR: PM2.5 exposure may be related to causes of death beyond the five considered by the GBD, and incorporation of risk information from other, nonoutdoor particle sources leads to underestimation of disease burden, especially at higher concentrations.
Abstract: Exposure to ambient fine particulate matter (PM2.5) is a major global health concern. Quantitative estimates of attributable mortality are based on disease-specific hazard ratio models that incorporate risk information from multiple PM2.5 sources (outdoor and indoor air pollution from use of solid fuels and secondhand and active smoking), requiring assumptions about equivalent exposure and toxicity. We relax these contentious assumptions by constructing a PM2.5-mortality hazard ratio function based only on cohort studies of outdoor air pollution that covers the global exposure range. We modeled the shape of the association between PM2.5 and nonaccidental mortality using data from 41 cohorts from 16 countries: the Global Exposure Mortality Model (GEMM). We then constructed GEMMs for five specific causes of death examined by the global burden of disease (GBD). The GEMM predicts 8.9 million [95% confidence interval (CI): 7.5-10.3] deaths in 2015, a figure 30% larger than that predicted by the sum of deaths among the five specific causes (6.9; 95% CI: 4.9-8.5) and 120% larger than the risk function used in the GBD (4.0; 95% CI: 3.3-4.8). Differences between the GEMM and GBD risk functions are larger for a 20% reduction in concentrations, with the GEMM predicting 220% higher excess deaths. These results suggest that PM2.5 exposure may be related to causes of death beyond the five considered by the GBD and that incorporation of risk information from other, nonoutdoor, particle sources leads to underestimation of disease burden, especially at higher concentrations.

1,283 citations


Cites methods from "Stan: A Probabilistic Programming L..."

  • ...US Environmental Protection Agency (2012) Regulatory impact analysis for the final revisions to the national ambient air quality standards for particulate matter (Office of Air Quality Planning and Standards, Health and Environmental Impacts Division, Research Triangle Park, NC), Technical Report EPA-452/R-12-005....


  • ...Standard computer software is not available to estimate the unknown IER parameters under a frequentist framework for survival models when examining subject-level cohort data....



  • ...A Bayesian Monte Carlo approach, such as that used in Stan, is not always practical to use when the cohort is large due to computer processing limitations....


  • ...Carpenter B, et al. (2017) Stan: A probabilistic programming language....


Journal ArticleDOI
TL;DR: Describes a practical approach to forecasting "at scale" that combines configurable models with analyst-in-the-loop performance analysis, along with a modular regression model whose interpretable parameters can be intuitively adjusted by analysts with domain knowledge about the time series.
Abstract: Forecasting is a common data science task that helps organizations with capacity planning, goal setting, and anomaly detection. Despite its importance, there are serious challenges associated with ...

1,166 citations


Cites methods from "Stan: A Probabilistic Programming L..."

  • ...When the seasonality and holiday features for each observation are combined into a matrix X and the changepoint indicators a(t) in a matrix A, the entire model in (1) can be expressed in a few lines of Stan code (Carpenter et al. 2017), given in Listing 1.1....

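The excerpt above notes that the entire model can be written in a few lines of Stan once the seasonal/holiday features are collected in a matrix X and the changepoint indicators in a matrix A. The sketch below is not the paper's Listing 1.1; it is a rough reconstruction in that spirit, assuming rstan is available, with illustrative names (k, m, delta, beta, tau) and priors.

# Hedged sketch (R + rstan): piecewise-linear trend plus linear seasonal features
library(rstan)

trend_code <- "
data {
  int<lower=1> T;              // number of time points
  int<lower=1> K;              // seasonal / holiday features
  int<lower=1> S;              // changepoints
  vector[T] t;                 // scaled time
  vector[T] y;                 // observations
  matrix[T, K] X;              // seasonality / holiday design matrix
  matrix[T, S] A;              // changepoint indicator matrix
  vector[S] t_change;          // changepoint locations
  real<lower=0> tau;           // scale of the changepoint prior
}
parameters {
  real k;                      // base growth rate
  real m;                      // offset
  vector[S] delta;             // rate adjustments at the changepoints
  vector[K] beta;              // seasonal coefficients
  real<lower=0> sigma_obs;
}
model {
  k ~ normal(0, 5);
  m ~ normal(0, 5);
  delta ~ double_exponential(0, tau);   // sparse rate adjustments
  beta ~ normal(0, 10);
  sigma_obs ~ normal(0, 0.5);
  // the offsets -t_change .* delta keep the trend continuous at each changepoint
  y ~ normal((k + A * delta) .* t + (m + A * (-t_change .* delta)) + X * beta,
             sigma_obs);
}
"

mod <- stan_model(model_code = trend_code)   # compile; fit with sampling(mod, data = ...)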

References
More filters
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations


"Stan: A Probabilistic Programming L..." refers background or methods in this paper

  • ...Stan uses the more conservative estimates based on both within-chain and cross-chain convergence; see (Gelman et al. 2013) and (Stan Development Team 2014) for motivation and definitions....


  • .../bernoulli help-all. The sampler and its configuration are described at greater length in the manual (Stan Development Team 2014)....


  • ...…conservative version of R̂ than is usual in packages such as Coda (Plummer, Best, Cowles, and Vines 2006), first splitting each chain in half to diagnose nonstationary chains; see (Gelman, Carlin, Stern, Dunson, Vehtari, and Rubin 2013) and (Stan Development Team 2014) for detailed definitions....


  • ...The mass matrix is estimated, roughly speaking, by regularizing the sample covariance of the latter half of the warmup iterations; see (Stan Development Team 2014) for full details....


  • ...…models may still be coded in Stan, but the missing values must be declared as parameters; see (Stan Development Team 2014) for examples of missing data, censored data, and truncated data models....

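The last excerpt above mentions that missing data are handled by declaring the missing values as parameters. A minimal, generic sketch of that idea (not the manual's exact example): the observed and missing entries are passed separately and share the same distribution.

# Illustrative missing-data sketch (R string holding a Stan program)
missing_code <- "
data {
  int<lower=0> N_obs;
  int<lower=0> N_mis;
  vector[N_obs] y_obs;          // observed entries
}
parameters {
  vector[N_mis] y_mis;          // missing entries, treated as parameters
  real mu;
  real<lower=0> sigma;
}
model {
  y_obs ~ normal(mu, sigma);
  y_mis ~ normal(mu, sigma);    // missing values follow the same model
}
"
# compile with rstan::stan_model(model_code = missing_code) and sample as usual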

Journal ArticleDOI
TL;DR: A modified Monte Carlo integration over configuration space is used to investigate the properties of a two-dimensional rigid-sphere system of interacting individual molecules, and the results are compared with the free-volume equation of state and a four-term virial coefficient expansion.
Abstract: A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two‐dimensional rigid‐sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four‐term virial coefficient expansion.
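As a toy illustration of the Metropolis accept/reject rule that this paper introduced, the R sketch below samples a one-dimensional standard normal target with a symmetric random-walk proposal. It is not the paper's rigid-sphere computation; the target, proposal scale, and iteration count are arbitrary choices.

# Random-walk Metropolis on a standard normal target
set.seed(1)
log_target <- function(x) dnorm(x, log = TRUE)   # log density of the target

n_iter <- 5000
x      <- numeric(n_iter)
x[1]   <- 0                                      # starting state

for (i in 2:n_iter) {
  proposal  <- x[i - 1] + rnorm(1, sd = 1)       # symmetric proposal
  log_alpha <- log_target(proposal) - log_target(x[i - 1])
  if (log(runif(1)) < log_alpha) {
    x[i] <- proposal                             # accept
  } else {
    x[i] <- x[i - 1]                             # reject: keep the current state
  }
}

c(mean(x), var(x))   # should be close to 0 and 1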

35,161 citations

Book
01 Nov 2008
TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Abstract: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization. It responds to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems. For this new edition the book has been thoroughly updated throughout. There are new chapters on nonlinear interior methods and derivative-free methods for optimization, both of which are used widely in practice and the focus of much current research. Because of the emphasis on practical methods, as well as the extensive illustrations and exercises, the book is accessible to a wide audience. It can be used as a graduate text in engineering, operations research, mathematics, computer science, and business. It also serves as a handbook for researchers and practitioners in the field. The authors have strived to produce a text that is pleasant to read, informative, and rigorous - one that reveals both the beautiful nature of the discipline and its practical side.
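The excerpts below note that Stan's default optimizer for penalized maximum likelihood is the BFGS (and L-BFGS) quasi-Newton method covered in this book. As a small, self-contained illustration of the same family of methods, the R sketch below maximizes a ridge-penalized logistic log-likelihood with base R's optim() and its L-BFGS-B method; the simulated data, penalty, and starting values are arbitrary.

# Quasi-Newton optimization of a penalized log-likelihood with optim()
set.seed(2)
X <- cbind(1, matrix(rnorm(200), ncol = 2))           # 100 x 3 design matrix
y <- rbinom(100, 1, plogis(X %*% c(-0.5, 1, 2)))      # simulated binary outcomes

neg_penalized_loglik <- function(beta, lambda = 0.1) {
  eta <- X %*% beta
  -sum(y * eta - log1p(exp(eta))) + lambda * sum(beta^2)   # negative penalized log-likelihood
}

fit <- optim(par = rep(0, 3), fn = neg_penalized_loglik,
             method = "L-BFGS-B")                     # limited-memory BFGS
fit$par                                               # penalized maximum likelihood estimate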

17,420 citations


"Stan: A Probabilistic Programming L..." refers background or methods in this paper

  • ...Stan provides a standard form of conjugate gradient optimization; see (Nocedal and Wright 2006)....


  • ...The default optimizer uses the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, a quasi-Newton method which employs exactly computed gradients and an efficient approximation to the Hessian; see (Nocedal and Wright 2006) for an exposition of the BFGS algorithm....


  • ...The default optimizer uses the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, a quasi-Newton method which employs exactly computed gradients and an efficient approximation to the Hessian; see (Nocedal and Wright 2006) for a textbook exposition of the BFGS algorithm....


  • ...Nocedal and Wright (2006) cover both BFGS and L-BFGS samplers....


Book
01 Jan 1995
TL;DR: Detailed notes on Bayesian computation, the basics of Markov chain simulation, regression models, and asymptotic theorems are provided.
Abstract: FUNDAMENTALS OF BAYESIAN INFERENCE Probability and Inference Single-Parameter Models Introduction to Multiparameter Models Asymptotics and Connections to Non-Bayesian Approaches Hierarchical Models FUNDAMENTALS OF BAYESIAN DATA ANALYSIS Model Checking Evaluating, Comparing, and Expanding Models Modeling Accounting for Data Collection Decision Analysis ADVANCED COMPUTATION Introduction to Bayesian Computation Basics of Markov Chain Simulation Computationally Efficient Markov Chain Simulation Modal and Distributional Approximations REGRESSION MODELS Introduction to Regression Models Hierarchical Linear Models Generalized Linear Models Models for Robust Inference Models for Missing Data NONLINEAR AND NONPARAMETRIC MODELS Parametric Nonlinear Models Basic Function Models Gaussian Process Models Finite Mixture Models Dirichlet Process Models APPENDICES A: Standard Probability Distributions B: Outline of Proofs of Asymptotic Theorems C: Computation in R and Stan Bibliographic Notes and Exercises appear at the end of each chapter.

16,079 citations


"Stan: A Probabilistic Programming L..." refers background or methods in this paper

  • ...This supplies fairly diffuse starting points when transformed back to the constrained scale, and thus help with convergence diagnostics as discussed in (Gelman et al. 2013)....


  • ...Stan uses the more conservative estimates based on both within-chain and cross-chain convergence; see (Gelman et al. 2013) and (Stan Development Team 2014) for motivation and definitions....


  • ...The generated quantities block may also be used for forward simulations, generating values to make predictions or to perform posterior predictive checks; see (Gelman et al. 2013) for more information....


  • ...In order to perform inference on missing data, it must be declared as a parameter and modeled; see (Gelman et al. 2013) for a discussion of statistical models of missing data....


  • ...We’d like to particularly single out the students in Andrew Gelman’s Bayesian data analysis courses at Columbia University and Harvard University, who served as trial subjects for both Stan and (Gelman et al. 2013)....

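One excerpt above mentions the generated quantities block for forward simulation and posterior predictive checks. A minimal, illustrative extension of the Bernoulli sketch shown earlier (the names y_rep and theta are placeholders, not the book's example):

# R string holding a Stan program with a generated quantities block
ppc_code <- "
data {
  int<lower=0> N;
  int<lower=0, upper=1> y[N];
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  theta ~ beta(1, 1);
  y ~ bernoulli(theta);
}
generated quantities {
  int y_rep[N];                        // posterior predictive replicate of the data
  for (n in 1:N)
    y_rep[n] = bernoulli_rng(theta);
}
"
# compile with rstan::stan_model(model_code = ppc_code); draws of y_rep support predictive checks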

Journal ArticleDOI
TL;DR: The focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization, and the results are derived as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations.
Abstract: The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative simulation can give misleading answers. Our methods are simple and generally applicable to the output of any iterative simulation; they are designed for researchers primarily interested in the science underlying the data and models they are analyzing, rather than for researchers interested in the probability theory underlying the iterative simulations themselves. Our recommended strategy is to use several independent sequences, with starting points sampled from an overdispersed distribution. At each step of the iterative simulation, we obtain, for each univariate estimand of interest, a distributional estimate and an estimate of how much sharper the distributional estimate might become if the simulations were continued indefinitely. Because our focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization, we derive our results as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations. The methods are illustrated on a random-effects mixture model applied to experimental measurements of reaction times of normal and schizophrenic patients.
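As a simplified sketch of the potential scale reduction diagnostic described above, the R function below computes a basic between/within-chain R-hat from a matrix of draws (iterations by chains). It omits the chain-splitting and other refinements that Stan applies; variable names are illustrative.

# Basic potential scale reduction factor (R-hat)
rhat <- function(draws) {                       # draws: iterations x chains
  n <- nrow(draws)
  chain_means <- colMeans(draws)
  chain_vars  <- apply(draws, 2, var)
  W <- mean(chain_vars)                         # within-chain variance
  B <- n * var(chain_means)                     # between-chain variance
  var_plus <- (n - 1) / n * W + B / n           # pooled variance estimate
  sqrt(var_plus / W)
}

# Example: four well-mixed chains targeting the same distribution
set.seed(3)
draws <- matrix(rnorm(4000), ncol = 4)
rhat(draws)   # should be close to 1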

13,884 citations


"Stan: A Probabilistic Programming L..." refers methods in this paper

  • ...Before performing output analysis, we recommend generating multiple independent chains in order to more effectively monitor convergence; see (Gelman and Rubin 1992) for more analysis....

