Journal ArticleDOI

brms: An R Package for Bayesian Multilevel Models Using Stan

29 Aug 2017, Journal of Statistical Software (Foundation for Open Access Statistics), Vol. 80, Iss. 1, pp. 1-28
TL;DR: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan, allowing users to fit linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models, all in a multilevel context.
Abstract: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan. A wide range of distributions and link functions are supported, allowing users to fit - among others - linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models all in a multilevel context. Further modeling options include autocorrelation of the response variable, user defined covariance structures, censored data, as well as meta-analytic standard errors. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with the Watanabe-Akaike information criterion and leave-one-out cross-validation.
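
For orientation, the call below sketches the kind of multilevel model the abstract describes, fitted with brms; the data set and variable names (mydata, count, treatment, patient) are placeholders rather than anything taken from the paper, and the prior is just one weakly informative choice.

    library(brms)

    # Hypothetical zero-inflated Poisson multilevel model: 'count' response,
    # 'treatment' predictor, random intercepts by 'patient'.
    fit <- brm(
      count ~ treatment + (1 | patient),
      data   = mydata,
      family = zero_inflated_poisson(),
      prior  = set_prior("normal(0, 5)", class = "b"),  # weakly informative slopes
      chains = 4, iter = 2000
    )

    summary(fit)
    loo(fit)   # leave-one-out cross-validation
    waic(fit)  # Watanabe-Akaike information criterion
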
Citations
Journal ArticleDOI
TL;DR: The glmmTMB package fits many types of GLMMs and extensions, including models with continuously distributed responses, but the focus here is on count responses; among packages that fit zero-inflated mixed models, its ability to estimate the Conway-Maxwell-Poisson distribution parameterized by the mean is unique.
Abstract: Count data can be analyzed using generalized linear mixed models when observations are correlated in ways that require random effects. However, count data are often zero-inflated, containing more zeros than would be expected from the typical error distributions. We present a new package, glmmTMB, and compare it to other R packages that fit zero-inflated mixed models. The glmmTMB package fits many types of GLMMs and extensions, including models with continuously distributed responses, but here we focus on count responses. glmmTMB is faster than glmmADMB, MCMCglmm, and brms, and more flexible than INLA and mgcv for zero-inflated modeling. One unique feature of glmmTMB (among packages that fit zero-inflated mixed models) is its ability to estimate the Conway-Maxwell-Poisson distribution parameterized by the mean. Overall, its most appealing features for new users may be the combination of speed, flexibility, and its interface’s similarity to lme4.
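
As a rough illustration of the interface compared above, a zero-inflated count GLMM in glmmTMB might look like the sketch below; the data frame and variables (dat, count, cover, site) are hypothetical.

    library(glmmTMB)

    # Zero-inflated Poisson GLMM: conditional (count) model with a random intercept,
    # plus a predictor of zero-inflation via ziformula.
    m <- glmmTMB(
      count ~ cover + (1 | site),
      ziformula = ~ cover,
      family = poisson,
      data = dat
    )
    summary(m)

    # Conway-Maxwell-Poisson family parameterized by the mean,
    # the feature highlighted in the abstract:
    m_cmp <- glmmTMB(count ~ cover + (1 | site), ziformula = ~ cover,
                     family = compois(), data = dat)
    summary(m_cmp)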

4,497 citations


Cites background or methods from "brms: An R Package for Bayesian Mul..."

  • ...with predictors of zero-inflation, but they are relatively slow (as we will show) because they rely on Markov chain Monte Carlo (MCMC) sampling (Bürkner, 2017; Hadfield, 2010). gamlss is a flexible package that fits generalized additive models with predictors on all parameters of a...

  • ...Several R packages are available for fitting zero-inflated models: pscl, INLA, MCMCglmm, glmmADMB, mgcv, brms, and gamlss (Table 1; Zeileis et al., 2008; Rue et al., 2009; Hadfield, 2010; Skaug et al., 2012; Wood et al., 2016; Bürkner, 2017; Stasinopoulos et al., 2017)....

  • ...We compared the estimates of fixed effects and the amount of time required for fitting the same model in INLA, MCMCglmm, glmmADMB, mgcv, and brms (Rue et al., 2009; Hadfield, 2010; Skaug et al., 2012; Wood et al., 2016; Bürkner, 2017)....

  • ...The MCMCglmm and brms packages can fit zero-inflated GLMMs with predictors of zero-inflation, but they are relatively slow (as we will show) because they rely on Markov chain Monte Carlo (MCMC) sampling (Bürkner, 2017; Hadfield, 2010)....

  • ...For Bayesian methods, the important aspect of timing is sampling efficiency (minimum effective samples per unit time, Bürkner, 2017), but this is not compatible with the MLE methods, so we limit our presentation of the timings of the Bayesian methods....

Journal ArticleDOI
TL;DR: brms provides an intuitive and powerful formula syntax that extends the well-known formula syntax of lme4; the syntax is introduced in detail and its usefulness demonstrated with four examples, each highlighting different aspects of the syntax.
Abstract: The brms package allows R users to easily specify a wide range of Bayesian single-level and multilevel models, which are fitted with the probabilistic programming language Stan behind the scenes. Several response distributions are supported, of which all parameters (e.g., location, scale, and shape) can be predicted at the same time thus allowing for distributional regression. Non-linear relationships may be specified using non-linear predictor terms or semi-parametric approaches such as splines or Gaussian processes. To make all of these modeling options possible in a multilevel framework, brms provides an intuitive and powerful formula syntax, which extends the well known formula syntax of lme4. The purpose of the present paper is to introduce this syntax in detail and to demonstrate its usefulness with four examples, each showing other relevant aspects of the syntax.
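
The fragment below is a minimal sketch of the extended formula syntax the abstract refers to, combining a smooth term, a Gaussian process term, group-level effects, and a distributional part that predicts the residual standard deviation; all variable names (y, x1, x2, group, mydata) are invented for illustration.

    library(brms)

    # Extended (multi-part) formula: the mean gets a spline, a Gaussian process,
    # and a group-level intercept; sigma is itself modeled, i.e. distributional regression.
    bform <- bf(
      y ~ s(x1) + gp(x2) + (1 | group),
      sigma ~ x1 + (1 | group)
    )

    fit <- brm(bform, data = mydata, family = gaussian(), chains = 4)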

1,463 citations


Cites background or methods from "brms: An R Package for Bayesian Mul..."

  • ...In the control argument we increase adapt_delta to get rid of a few divergent transitions (cf. Stan Development Team, 2017b; Bürkner, 2017)....

  • ...The brms package (Bürkner, 2017) presented in this paper aims to remove these hurdles for a wide range of regression models by allowing the user to benefit from the merits of Stan by using extended lme4-like formula syntax (Bates et al., 2015), with which many R users are familiar....

  • ...The first is explained in Bürkner (2017), while the latter three are documented in help(brmsformula)....

  • ...A general overview of the package is given in Bürkner (2017)....

  • ...The models described in Bürkner (2017) are a sub-class of the models described here....

Posted ContentDOI
04 Jan 2021, medRxiv
TL;DR: The SARS-CoV-2 lineage B.1.1.7, now designated Variant of Concern 202012/01 (VOC) by Public Health England, originated in the UK in late Summer to early Autumn 2020.
Abstract: The SARS-CoV-2 lineage B.1.1.7, now designated Variant of Concern 202012/01 (VOC) by Public Health England, originated in the UK in late Summer to early Autumn 2020. We examine epidemiological evidence for this VOC having a transmission advantage from several perspectives. First, whole genome sequence data collected from community-based diagnostic testing provides an indication of changing prevalence of different genetic variants through time. Phylodynamic modelling additionally indicates that genetic diversity of this lineage has changed in a manner consistent with exponential growth. Second, we find that changes in VOC frequency inferred from genetic data correspond closely to changes inferred by S-gene target failures (SGTF) in community-based diagnostic PCR testing. Third, we examine growth trends in SGTF and non-SGTF case numbers at local area level across England, and show that the VOC has higher transmissibility than non-VOC lineages, even if the VOC has a different latent period or generation time. Available SGTF data indicate a shift in the age composition of reported cases, with a larger share of under 20 year olds among reported VOC than non-VOC cases. Fourth, we assess the association of VOC frequency with independent estimates of the overall SARS-CoV-2 reproduction number through time. Finally, we fit a semi-mechanistic model directly to local VOC and non-VOC case incidence to estimate the reproduction numbers over time for each. There is a consensus among all analyses that the VOC has a substantial transmission advantage, with the estimated difference in reproduction numbers between VOC and non-VOC ranging between 0.4 and 0.7, and the ratio of reproduction numbers varying between 1.4 and 1.8. We note that these estimates of transmission advantage apply to a period where high levels of social distancing were in place in England; extrapolation to other transmission contexts therefore requires caution.

547 citations

Journal ArticleDOI
TL;DR: The Bayesian framework for statistics is quickly gaining in popularity among scientists, for reasons such as reliability and accuracy, the possibility of incorporating prior knowledge into the analysis, and the intuitive interpretation of results.
Abstract: The Bayesian framework for statistics is quickly gaining in popularity among scientists, for reasons such as reliability and accuracy (particularly in noisy data and small samples), the possibility of incorporating prior knowledge into the analysis, and the intuitive interpretation of results (Andrews & Baguley, 2013; Etz & Vandekerckhove, 2016; Kruschke, 2010; Kruschke, Aguinis, & Joo, 2012; Wagenmakers et al., 2017). Adopting the Bayesian framework is more of a shift in the paradigm than a change in the methodology; all the common statistical procedures (t-tests, correlations, ANOVAs, regressions, etc.) can also be achieved within the Bayesian framework. One of the core differences is that in the frequentist view, the effects are fixed (but unknown) and data are random. In the Bayesian view, instead of having single estimates of the “true effect”, the inference process computes the probability of different effects given the observed data, resulting in a distribution of possible values for the parameters, called the posterior distribution. The bayestestR package provides tools to describe these posterior distributions.
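
A small sketch of the workflow the abstract alludes to is given below: fit a Bayesian model (here with rstanarm on the built-in mtcars data, purely as an example) and summarize its posterior with bayestestR.

    library(rstanarm)
    library(bayestestR)

    # Illustrative Bayesian regression; mtcars is a stock R data set, not from the paper.
    model <- stan_glm(mpg ~ wt + cyl, data = mtcars, refresh = 0)

    describe_posterior(model)  # point estimates, credible intervals, pd, ROPE, etc.
    hdi(model)                 # highest density intervals for each parameter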

505 citations


Cites methods from "brms: An R Package for Bayesian Mul..."

  • ...generated by a variety of models objects, including popular modeling packages such as rstanarm (Goodrich, Gabry, Ali, & Brilleman, 2018), brms (Bürkner, 2017), BayesFactor (Morey & Rouder, 2018), and emmeans (Lenth, 2019), thus making it a useful tool supporting the usage and development of...

Journal ArticleDOI
Matteo Dainese1, Emily A. Martin1, Marcelo A. Aizen2, Matthias Albrecht, Ignasi Bartomeus3, Riccardo Bommarco4, Luísa G. Carvalheiro5, Luísa G. Carvalheiro6, Rebecca Chaplin-Kramer7, Vesna Gagic8, Lucas Alejandro Garibaldi9, Jaboury Ghazoul10, Heather Grab11, Mattias Jonsson4, Daniel S. Karp12, Christina M. Kennedy13, David Kleijn14, Claire Kremen15, Douglas A. Landis16, Deborah K. Letourneau17, Lorenzo Marini18, Katja Poveda11, Romina Rader19, Henrik G. Smith20, Teja Tscharntke21, Georg K.S. Andersson20, Isabelle Badenhausser22, Isabelle Badenhausser23, Svenja Baensch21, Antonio Diego M. Bezerra24, Felix J.J.A. Bianchi14, Virginie Boreux25, Virginie Boreux10, Vincent Bretagnolle23, Berta Caballero-López, Pablo Cavigliasso26, Aleksandar Ćetković27, Natacha P. Chacoff28, Alice Classen1, Sarah Cusser29, Felipe D. da Silva e Silva30, G. Arjen de Groot14, Jan H. Dudenhöffer31, Johan Ekroos20, Thijs P.M. Fijen14, Pierre Franck22, Breno Magalhães Freitas24, Michael P.D. Garratt32, Claudio Gratton33, Juliana Hipólito9, Juliana Hipólito34, Andrea Holzschuh1, Lauren Hunt35, Aaron L. Iverson11, Shalene Jha36, Tamar Keasar37, Tania N. Kim38, Miriam Kishinevsky37, Björn K. Klatt21, Björn K. Klatt20, Alexandra-Maria Klein25, Kristin M. Krewenka39, Smitha Krishnan40, Smitha Krishnan10, Ashley E. Larsen41, Claire Lavigne22, Heidi Liere42, Bea Maas43, Rachel E. Mallinger44, Eliana Martinez Pachon, Alejandra Martínez-Salinas45, Timothy D. Meehan46, Matthew G. E. Mitchell15, Gonzalo Alberto Roman Molina47, Maike Nesper10, Lovisa Nilsson20, Megan E. O'Rourke48, Marcell K. Peters1, Milan Plećaš27, Simon G. Potts33, Davi de L. Ramos, Jay A. Rosenheim12, Maj Rundlöf20, Adrien Rusch49, Agustín Sáez2, Jeroen Scheper14, Matthias Schleuning, Julia Schmack50, Amber R. Sciligo51, Colleen L. Seymour, Dara A. Stanley52, Rebecca Stewart20, Jane C. Stout53, Louis Sutter, Mayura B. Takada54, Hisatomo Taki, Giovanni Tamburini25, Matthias Tschumi, Blandina Felipe Viana55, Catrin Westphal21, Bryony K. Willcox19, Stephen D. Wratten56, Akira Yoshioka57, Carlos Zaragoza-Trello3, Wei Zhang58, Yi Zou59, Ingolf Steffan-Dewenter1 
University of Würzburg1, National University of Comahue2, Spanish National Research Council3, Swedish University of Agricultural Sciences4, University of Lisbon5, Universidade Federal de Goiás6, Stanford University7, Commonwealth Scientific and Industrial Research Organisation8, National University of Río Negro9, ETH Zurich10, Cornell University11, University of California, Davis12, The Nature Conservancy13, Wageningen University and Research Centre14, University of British Columbia15, Great Lakes Bioenergy Research Center16, University of California, Santa Cruz17, University of Padua18, University of New England (Australia)19, Lund University20, University of Göttingen21, Institut national de la recherche agronomique22, University of La Rochelle23, Federal University of Ceará24, University of Freiburg25, Concordia University Wisconsin26, University of Belgrade27, National University of Tucumán28, Michigan State University29, University of Brasília30, University of Greenwich31, University of Reading32, University of Wisconsin-Madison33, National Institute of Amazonian Research34, Boise State University35, University of Texas at Austin36, University of Haifa37, Kansas State University38, University of Hamburg39, Bioversity International40, University of California, Santa Barbara41, Seattle University42, University of Vienna43, University of Florida44, Centro Agronómico Tropical de Investigación y Enseñanza45, National Audubon Society46, University of Buenos Aires47, Virginia Tech48, University of Bordeaux49, University of Auckland50, University of California, Berkeley51, University College Dublin52, Trinity College, Dublin53, University of Tokyo54, Federal University of Bahia55, Lincoln University (New Zealand)56, National Institute for Environmental Studies57, International Food Policy Research Institute58, Xi'an Jiaotong-Liverpool University59
TL;DR: Using a global database from 89 studies (with 1475 locations), the relative importance of species richness, abundance, and dominance for pollination; biological pest control; and final yields in the context of ongoing land-use change is partitioned.
Abstract: Human land use threatens global biodiversity and compromises multiple ecosystem functions critical to food production. Whether crop yield-related ecosystem services can be maintained by a few dominant species or rely on high richness remains unclear. Using a global database from 89 studies (with 1475 locations), we partition the relative importance of species richness, abundance, and dominance for pollination; biological pest control; and final yields in the context of ongoing land-use change. Pollinator and enemy richness directly supported ecosystem services in addition to and independent of abundance and dominance. Up to 50% of the negative effects of landscape simplification on ecosystem services was due to richness losses of service-providing organisms, with negative consequences for crop yields. Maintaining the biodiversity of ecosystem service providers is therefore vital to sustain the flow of key agroecosystem benefits to society.

434 citations

References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Journal ArticleDOI
TL;DR: In this article, a model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms; the formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters.
Abstract: Maximum likelihood or restricted maximum likelihood (REML) estimates of the parameters in linear mixed-effects models can be determined using the lmer function in the lme4 package for R. As for most model-fitting functions in R, the model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms. The formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters. The appropriate criterion is optimized, using one of the constrained optimization functions in R, to provide the parameter estimates. We describe the structure of the model, the steps in evaluating the profiled deviance or REML criterion, and the structure of classes or types that represents such a model. Sufficient detail is included to allow specialization of these structures by users who wish to write functions to fit specialized linear mixed models, such as models incorporating pedigrees or smoothing splines, that are not easily expressible in the formula language used by lmer.
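
To make the formula interface concrete, here is the standard sleepstudy example that ships with lme4 (used as a generic illustration, not taken from the brms paper): a fixed effect of Days and correlated random intercepts and slopes by Subject.

    library(lme4)

    # Random intercept and slope for Days within each Subject; REML fit by default.
    fm1 <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
    summary(fm1)

    # Maximum likelihood instead of REML, if preferred:
    fm1_ml <- update(fm1, REML = FALSE)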

50,607 citations


"brms: An R Package for Bayesian Mul..." refers methods in this paper

  • ...These are lme4 (Bates et al. 2015) and MCMCglmm (Hadfield 2010), which are possibly the most general and widely applied R packages for MLMs, as well as rstanarm (Gabry and Goodrich 2016) and rethinking (McElreath 2016), which are both based on Stan....

Journal ArticleDOI
TL;DR: In this article, a modified Monte Carlo integration over configuration space is used to investigate the properties of a two-dimensional rigid-sphere system with a set of interacting individual molecules, and the results are compared to free volume equations of state and a four-term virial coefficient expansion.
Abstract: A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two‐dimensional rigid‐sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four‐term virial coefficient expansion.
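
Since the brms paper cites this work for the sampling scheme underlying MCMC, a toy random-walk Metropolis sampler in R is sketched below for a generic one-dimensional target; it illustrates the accept/reject idea only and is unrelated to the rigid-sphere computations of the paper.

    set.seed(1)
    log_target <- function(x) dnorm(x, mean = 0, sd = 1, log = TRUE)  # toy target density

    metropolis <- function(n_iter, start = 0, step = 1) {
      draws <- numeric(n_iter)
      x <- start
      for (i in seq_len(n_iter)) {
        proposal  <- x + rnorm(1, sd = step)               # symmetric random-walk proposal
        log_alpha <- log_target(proposal) - log_target(x)  # log acceptance ratio
        if (log(runif(1)) < log_alpha) x <- proposal       # accept, else keep current state
        draws[i] <- x
      }
      draws
    }

    samples <- metropolis(5000)
    c(mean(samples), sd(samples))  # should be close to 0 and 1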

35,161 citations

Journal ArticleDOI
TL;DR: The analogy between images and statistical mechanics systems is made and the analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, creating a highly parallel ``relaxation'' algorithm for MAP estimation.
Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (``annealing''), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel ``relaxation'' algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.
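
In the same spirit, the snippet below is a toy Gibbs sampler in R for a bivariate normal with correlation rho, showing the alternating draws from full conditionals that the brms paper contrasts with Stan's sampler; it has nothing to do with the image-restoration application itself.

    set.seed(1)
    rho    <- 0.8
    n_iter <- 5000
    x <- y <- numeric(n_iter)

    for (i in 2:n_iter) {
      # Full conditionals of a standard bivariate normal with correlation rho:
      x[i] <- rnorm(1, mean = rho * y[i - 1], sd = sqrt(1 - rho^2))
      y[i] <- rnorm(1, mean = rho * x[i],     sd = sqrt(1 - rho^2))
    }

    cor(x[-(1:500)], y[-(1:500)])  # close to rho once early iterations are discarded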

18,761 citations


"brms: An R Package for Bayesian Mul..." refers methods in this paper

  • ...Furthermore, Gibbs-sampling requires priors to be conjugate to the likelihood of parameters in order to work efficiently (Gelman et al. 2014), thus reducing the freedom of the researcher in choosing a prior that reflects his or her beliefs....

  • ...primarily using combinations of Metropolis-Hastings updates (Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller 1953; Hastings 1970) and Gibbs-sampling (Geman and Geman 1984; Gelfand and Smith 1990), sometimes also coupled with slice-sampling (Damien, Wakefield, and Walker 1999; Neal 2003)....

  • ...With the exception of the latter, all of these programs are primarily using combinations of Metropolis-Hastings updates (Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller 1953; Hastings 1970) and Gibbs-sampling (Geman and Geman 1984; Gelfand and Smith 1990), sometimes also coupled with slice-sampling (Damien, Wakefield, and Walker 1999; Neal 2003)....

Journal ArticleDOI
TL;DR: A generalization of the sampling method introduced by Metropolis et al. (1953) is presented, along with an exposition of the relevant theory, techniques of application, and methods and difficulties of assessing the error in Monte Carlo estimates.
Abstract: A generalization of the sampling method introduced by Metropolis et al. (1953) is presented along with an exposition of the relevant theory, techniques of application and methods and difficulties of assessing the error in Monte Carlo estimates. Examples of the methods, including the generation of random orthogonal matrices and potential applications of the methods to numerical problems arising in statistics, are discussed. For numerical problems in a large number of dimensions, Monte Carlo methods are often more efficient than conventional numerical methods. However, implementation of the Monte Carlo methods requires sampling from high dimensional probability distributions and this may be very difficult and expensive in analysis and computer time. General methods for sampling from, or estimating expectations with respect to, such distributions are as follows. (i) If possible, factorize the distribution into the product of one-dimensional conditional distributions from which samples may be obtained. (ii) Use importance sampling, which may also be used for variance reduction. That is, in order to evaluate the integral J = ∫ f(x) p(x) dx = E_p(f), where p(x) is a probability density function, instead of obtaining independent samples x_1, ..., x_N from p(x) and using the estimate J_1 = Σ f(x_i)/N, we instead obtain the sample from a distribution with density q(x) and use the estimate J_2 = Σ {f(x_i) p(x_i)}/{q(x_i) N}. This may be advantageous if it is easier to sample from q(x) than p(x), but it is a difficult method to use in a large number of dimensions, since the values of the weights w(x_i) = p(x_i)/q(x_i) for reasonable values of N may all be extremely small, or a few may be extremely large. In estimating the probability of an event A, however, these difficulties may not be as serious since the only values of w(x) which are important are those for which x ∈ A. Since the methods proposed by Trotter & Tukey (1956) for the estimation of conditional expectations require the use of importance sampling, the same difficulties may be encountered in their use. (iii) Use a simulation technique; that is, if it is difficult to sample directly from p(x) or if p(x) is unknown, sample from some distribution q(y) and obtain the sample x values as some function of the corresponding y values. If we want samples from the conditional dis...
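
The importance-sampling estimate J_2 described in point (ii) can be illustrated with a few lines of R; the target p is a standard normal and the proposal q a Student-t with 3 degrees of freedom, chosen purely for the sketch.

    set.seed(1)
    f <- function(x) x^2              # quantity whose expectation under p we want
    N <- 10000
    x <- rt(N, df = 3)                # draws from the proposal q
    w <- dnorm(x) / dt(x, df = 3)     # weights w(x_i) = p(x_i) / q(x_i)
    J2 <- mean(f(x) * w)              # importance-sampling estimate of E_p[f(X)]
    J2                                # should be close to 1, the variance of N(0, 1)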

14,965 citations


"brms: An R Package for Bayesian Mul..." refers methods in this paper

  • ...all of these programs are primarily using combinations of Metropolis-Hastings updates (Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller 1953; Hastings 1970) and Gibbs-sampling (Geman and Geman 1984; Gelfand and Smith 1990), sometimes also coupled with slice-sampling (Damien, Wakefield, and...
