Book

An R Companion to Applied Regression

TL;DR: This tutorial jumps right into the power of R without dragging the reader through the basic concepts of the programming language.
Abstract: Contents: Preface; 1. Getting Started With R; 2. Reading and Manipulating Data; 3. Exploring and Transforming Data; 4. Fitting Linear Models; 5. Fitting Generalized Linear Models; 6. Diagnosing Problems in Linear and Generalized Linear Models; 7. Drawing Graphs; 8. Writing Programs; References; Author Index; Subject Index; Command Index; Data Set Index; Package Index; About the Authors.
Citations
Journal ArticleDOI
TL;DR: The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects, implementing Satterthwaite's method for approximating degrees of freedom for the t and F tests.
Abstract: One of the frequent questions from users of the mixed-model function lmer in the lme4 package has been: How can I get p values for the F and t tests for objects returned by lmer? The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects. We have implemented Satterthwaite's method for approximating degrees of freedom for the t and F tests, as well as the construction of Type I-III ANOVA tables. Furthermore, the summary and the anova table may also be obtained using the Kenward-Roger approximation for denominator degrees of freedom (based on the KRmodcomp function from the pbkrtest package). The package also provides other convenient mixed-model analysis tools, such as a step method that performs backward elimination of non-significant effects, both random and fixed, calculation of population means, multiple comparison tests, and plotting facilities.
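The overloading the abstract describes can be sketched in a few lines. This is a minimal, hedged example assuming lme4 and lmerTest are installed; it uses the sleepstudy data shipped with lme4, which is not part of this page's content.

```r
# Minimal sketch, assuming lme4 and lmerTest are installed.
# sleepstudy ships with lme4; it is used here only for illustration.
library(lmerTest)  # masks lme4::lmer with a version whose summary/anova add p values

m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

summary(m)                        # t tests with Satterthwaite degrees of freedom
anova(m)                          # ANOVA table with Satterthwaite ddf
anova(m, ddf = "Kenward-Roger")   # Kenward-Roger ddf via pbkrtest
step(m)                           # backward elimination of random and fixed effects
```

Loading lmerTest rather than lme4 is the whole trick: the same lmer call returns an object of the extended class, so the familiar summary and anova generics gain the degrees-of-freedom approximations.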

12,305 citations


Cites methods from "An R Companion to Applied Regression"

  • ...The paper is structured in the following way: in Sections 2 and 3 we describe the approach taken by Giesbrecht and Burns (1985); Fai and Cornelius (1996) to address the inference problem and compare the approximation methods to the commonly used LRT....


  • ...Type II and III may be also obtained through the Anova function of the car package (Fox and Weisberg 2011)....

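The excerpt above notes that Type II and III tests are available through the Anova function of the car package, the package accompanying this book. A minimal sketch, assuming car is installed and using the base-R mtcars data (chosen here only for illustration):

```r
# Hedged sketch, assuming the car package is installed; mtcars is base R.
library(car)

m <- lm(mpg ~ wt * factor(cyl), data = mtcars)

Anova(m, type = "II")   # Type II tests (car's default)
Anova(m, type = "III")  # Type III tests; contrast coding matters for these
```

For Type III tests to be meaningful, factors should use sum-to-zero (or other orthogonal) contrasts rather than R's default treatment contrasts.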

Journal ArticleDOI
TL;DR: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan, allowing users to fit linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models, all in a multilevel context.
Abstract: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan. A wide range of distributions and link functions are supported, allowing users to fit - among others - linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models all in a multilevel context. Further modeling options include autocorrelation of the response variable, user defined covariance structures, censored data, as well as meta-analytic standard errors. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with the Watanabe-Akaike information criterion and leave-one-out cross-validation.
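A hedged sketch of the workflow the abstract describes, assuming brms and a working Stan toolchain are installed (fitting compiles and samples a Stan model, so this is slow); the epilepsy data ship with brms and are used only for illustration:

```r
# Minimal sketch, assuming brms (and a Stan toolchain) is installed.
# epilepsy ships with brms: seizure counts with patient-level grouping.
library(brms)

fit <- brm(count ~ zAge + zBase * Trt + (1 | patient),
           data = epilepsy, family = poisson())

summary(fit)   # posterior summaries for fixed and group-level effects
loo(fit)       # leave-one-out cross-validation
waic(fit)      # Watanabe-Akaike information criterion
```

The formula syntax deliberately mirrors lme4, while family, prior, and autocorrelation arguments expose the wider range of models the abstract lists.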

4,353 citations


Cites methods from "An R Companion to Applied Regression"

  • ...In fact, this is an important advantage of Bayesian MCMC methods as compared to maximum likelihood approaches, which do not treat u as a parameter, but assume that it is part of the error term instead (cf., Fox and Weisberg, 2011)....


Journal ArticleDOI
TL;DR: A crucial part of statistical analysis is evaluating a model's quality and fit, or performance; with regression models, this often also involves selecting the best-fitting model among many competing models.
Abstract: A crucial part of statistical analysis is evaluating a model's quality and fit, or performance. During analysis, especially with regression models, investigating the fit of models to data also often involves selecting the best-fitting model among many competing models. Upon investigation, fit indices should be reported both visually and numerically to bring readers in on the investigative effort.

973 citations


Cites methods from "An R Companion to Applied Regression"

  • ...Compared to other packages (e.g., lmtest (Zeileis & Hothorn, 2002), MuMIn (Barton, 2020), car (Fox & Weisberg, 2019), broom (Robinson et al., 2020)), the performance package offers functions for checking validity and model quality systematically and comprehensively for many regression model objects…...

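The systematic checking that the excerpt attributes to the performance package can be sketched briefly. This assumes performance is installed; lm() and mtcars are base R, used here only for illustration:

```r
# Hedged sketch, assuming the performance package is installed.
library(performance)

m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

model_performance(m1)         # numeric fit indices: R2, AIC, RMSE, ...
check_model(m1)               # visual diagnostic checks in one panel
compare_performance(m1, m2)   # fit indices side by side for competing models
```

This mirrors the abstract's point: fit is reported both numerically (model_performance) and visually (check_model), and model selection is handled by comparing candidates directly.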

01 Jan 2011
TL;DR: This appendix describes how to use several alternative robust-regression estimators, which attempt to down-weight or ignore unusual data: M-estimators; bounded-influence estimators; MM-estimators; and quantile-regression estimators, including L1 regression.
Abstract: Linear least-squares regression can be very sensitive to unusual data. In this appendix to Fox and Weisberg (2011), we describe how to fit several alternative robust-regression estimators, which attempt to down-weight or ignore unusual data: M-estimators; bounded-influence estimators; MM-estimators; and quantile-regression estimators, including L1 regression. All estimation methods rely on assumptions for their validity. We say that an estimator or statistical procedure is robust if it provides useful information even if some of the assumptions used to justify the estimation method are not applicable. Most of this appendix concerns robust regression: estimation methods, typically for the linear regression model, that are insensitive to outliers and possibly high-leverage points. Other types of robustness, for example to model misspecification, are not discussed here. These methods were developed between the mid-1960s and the mid-1980s. With the exception of the L1 methods described in Section 5, they are not widely used today.
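The estimator families the appendix names can be sketched with standard CRAN tools. This is a hedged example, not the appendix's own code: it assumes MASS, quantreg, and carData are installed, and uses the Duncan occupational-prestige data from carData for illustration:

```r
# Sketch of the robust-regression families described above.
library(MASS)      # rlm: M- and MM-estimation
library(quantreg)  # rq: quantile regression, including L1 (median) regression

data(Duncan, package = "carData")

m.huber <- rlm(prestige ~ income + education, data = Duncan)                 # M-estimator (Huber weights)
m.mm    <- rlm(prestige ~ income + education, data = Duncan, method = "MM")  # MM-estimator
m.l1    <- rq(prestige ~ income + education, data = Duncan, tau = 0.5)       # L1 regression (tau = 0.5 is the median)
```

Comparing these fits to lm() on the same formula shows how each method down-weights or ignores the unusual observations that distort least squares.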

754 citations

Journal ArticleDOI
TL;DR: A geo-statistical model is developed to estimate the volume of global lakes with a surface area of at least 10 ha from surrounding terrain information; mean and median hydraulic residence times for all lakes are calculated to be 1,834 days and 456 days, respectively.
Abstract: Lakes are key components of biogeochemical and ecological processes, thus knowledge about their distribution, volume and residence time is crucial in understanding their properties and interactions within the Earth system. However, global information is scarce and inconsistent across spatial scales and regions. Here we develop a geo-statistical model to estimate the volume of global lakes with a surface area of at least 10 ha based on the surrounding terrain information. Our spatially resolved database shows 1.42 million individual polygons of natural lakes with a total surface area of 2.67 × 10⁶ km² (1.8% of global land area), a total shoreline length of 7.2 × 10⁶ km (about four times longer than the world's ocean coastline) and a total volume of 181.9 × 10³ km³ (0.8% of total global non-frozen terrestrial water stocks). We also compute mean and median hydraulic residence times for all lakes to be 1,834 days and 456 days, respectively. Lakes play a key role in our ecosystems and thus it is vital to understand their distribution and volume. Here, the authors present a new global lake database (HydroLAKES) and develop a new geo-statistical model to show global lake area, shoreline length, water volume and hydraulic residence times.

729 citations