Random effects structure for confirmatory hypothesis testing: Keep it maximal

doi:10.1016/J.JML.2012.11.001

Journal Article•DOI•

Dale J. Barr¹, Roger Levy², Christoph Scheepers¹, Harry Tily

University of Glasgow¹, University of California, San Diego²

01 Apr 2013-Journal of Memory and Language (NIH Public Access)-Vol. 68, Iss: 3, pp 255-278

TL;DR: It is argued that researchers using linear mixed-effects models (LMEMs) for confirmatory hypothesis testing should minimally adhere to the standards that have been in place for many decades, and it is shown that LMEMs generalize best when they include the maximal random effects structure justified by the design.

About: This article was published in the Journal of Memory and Language on 2013-04-01 and is currently open access. It has received 6878 citations to date. The article focuses on the topics: Random effects model & Statistical hypothesis testing.

Citations

Journal Article•DOI•

Analyzing linguistic data: a practical introduction to statistics using R

Elisabeth Dévière¹

Katholieke Universiteit Leuven¹

16 Apr 2009-Journal of Applied Statistics

TL;DR: The author guides the reader in about 350 pages from descriptive and basic statistical methods over classification and clustering to (generalised) linear and mixed models to enable researchers and students alike to reproduce the analyses and learn by doing.

Abstract: The complete title of this book runs ‘Analyzing Linguistic Data: A Practical Introduction to Statistics using R’ and as such it very well reflects the purpose and spirit of the book. The author guides the reader in about 350 pages from descriptive and basic statistical methods over classification and clustering to (generalised) linear and mixed models. Each of the methods is introduced in the context of concrete linguistic problems and demonstrated on exciting datasets from current research in the language sciences. In line with its practical orientation, the book focuses primarily on using the methods and interpreting the results. This implies that the mathematical treatment of the techniques is held at a minimum if not absent from the book. In return, the reader is provided with very detailed explanations on how to conduct the analyses using R [1]. The first chapter sets the tone being a 20-page introduction to R. For this and all subsequent chapters, the R code is intertwined with the chapter text and the datasets and functions used are conveniently packaged in the languageR package that is available on the Comprehensive R Archive Network (CRAN). With this approach, the author has done an excellent job in enabling researchers and students alike to reproduce the analyses and learn by doing. Another quality as a textbook is the fact that every chapter ends with Workbook sections where the user is invited to exercise his or her analysis skills on supplemental datasets. Full solutions including code, results and comments are given in Appendix A (30 pages). Instructors are therefore very well served by this text, although they might want to balance the book with some more mathematical treatment depending on the target audience. After the introductory chapter on R, the book opens on graphical data exploration. Chapter 3 treats probability distributions and common sampling distributions. 
Under basic statistical methods (Chapter 4), distribution tests and tests on means and variances are covered. Chapter 5 deals with clustering and classification. Strangely enough, the clustering section has material on PCA, factor analysis, correspondence analysis and includes only one subsection on clustering, devoted notably to hierarchical partitioning methods. The classification part deals with decision trees, discriminant analysis and support vector machines. The regression chapter (Chapter 6) treats linear models, generalised linear models, piecewise linear models and a substantial section on models for lexical richness. The final chapter on mixed models is particularly interesting as it is one of the few text book accounts that introduce the reader to using the (innovative) lme4 package of Douglas Bates which implements linear mixed-effects models. Moreover, the case studies included in this

1,679 citations

Journal Article•DOI•

A brief introduction to mixed effects modelling and multi-model inference in ecology.

Xavier A. Harrison¹, Lynda Donaldson²,³, Maria Eugenia Correa-Cano³, Julian C. Evans³,⁴, David N. Fisher³,⁵, Cecily E. D. Goodwin³, Beth S. Robinson³, David J. Hodgson³, Richard Inger³

Zoological Society of London¹, Wildfowl & Wetlands Trust², University of Exeter³, University of Ottawa⁴, University of Guelph⁵

23 May 2018-PeerJ

TL;DR: This overview should serve as a widely accessible code of best practice for applying LMMs to complex biological problems and model structures, and in doing so improve the robustness of conclusions drawn from studies investigating ecological and evolutionary questions.

Abstract: The use of linear mixed effects models (LMMs) is increasingly common in the analysis of biological data. Whilst LMMs offer a flexible approach to modelling a broad range of data types, ecological data are often complex and require complex model structures, and the fitting and interpretation of such models is not always straightforward. The ability to achieve robust biological inference requires that practitioners know how and when to apply these tools. Here, we provide a general overview of current methods for the application of LMMs to biological data, and highlight the typical pitfalls that can be encountered in the statistical modelling process. We tackle several issues regarding methods of model selection, with particular reference to the use of information theory and multi-model inference in ecology. We offer practical solutions and direct the reader to key references that provide further technical detail for those seeking a deeper understanding. This overview should serve as a widely accessible code of best practice for applying LMMs to complex biological problems and model structures, and in doing so improve the robustness of conclusions drawn from studies investigating ecological and evolutionary questions.

1,210 citations

Cites background from "Random effects structure for confir..."

...Burnham, Anderson & Huyvaert (2011) demonstrate how AIC approximates Kullback–Leibler information and provide some excellent guides for the best practice of applying IT methods to biological datasets. Vaida & Blanchard (2005) provide details on how AIC should be implemented for the analysis of clustered data....
[...]
...Schielzeth & Forstmeier (2009); Barr et al. (2013) and Aarts et al. (2015) show that constraining groups to share a common slope can inflate Type I and Type II errors....
[...]
...Therefore, the approach of fitting the ‘maximal’ complexity of random effects structure (Barr et al., 2013) is perhaps better phrased as fitting the most complex mixed effects structure allowed by your data (Bates et al., 2015a), which may mean either (i) fitting random slopes but removing the…...
[...]
...Barr et al. (2013) suggest that researchers should fit the maximal random effects structure possible for the data....
[...]

Journal Article•DOI•

Evaluating significance in linear mixed-effects models in R

Steven G. Luke¹

Brigham Young University¹

01 Aug 2017-Behavior Research Methods

TL;DR: Results of simulations show that the two most common methods for evaluating significance, using likelihood ratio tests and applying the z distribution to the Wald t values from the model output (t-as-z), are somewhat anti-conservative, especially for smaller sample sizes.

Abstract: Mixed-effects models are being used ever more frequently in the analysis of experimental data. However, in the lme4 package in R the standards for evaluating significance of fixed effects in these models (i.e., obtaining p-values) are somewhat vague. There are good reasons for this, but as researchers who are using these models are required in many cases to report p-values, some method for evaluating the significance of the model output is needed. This paper reports the results of simulations showing that the two most common methods for evaluating significance, using likelihood ratio tests and applying the z distribution to the Wald t values from the model output (t-as-z), are somewhat anti-conservative, especially for smaller sample sizes. Other methods for evaluating significance, including parametric bootstrapping and the Kenward-Roger and Satterthwaite approximations for degrees of freedom, were also evaluated. The results of these simulations suggest that Type 1 error rates are closest to .05 when models are fitted using REML and p-values are derived using the Kenward-Roger or Satterthwaite approximations, as these approximations both produced acceptable Type 1 error rates even for smaller samples.

1,045 citations

Cites methods from "Random effects structure for confir..."

...All simulations were run using the SIMGEN package (Barr et al., 2013) in R to fit models to simulated data, varying number of subjects and items systematically....
[...]
...To test this, corrected power (power’) was computed, as described by Barr et al. (2013); separate simulations were conducted, identical to those described earlier in the paragraph, but with the null hypothesis set to true....
[...]
...Barr et al. (2013) use this test repeatedly in their simulations, and suggest that for the numbers of subjects and items typical of cognitive research these likelihood ratio tests are not particularly anti-conservative....
[...]

Journal Article•DOI•

Balancing Type I Error and Power in Linear Mixed Models

Hannes Matuschek¹, Reinhold Kliegl¹, Shravan Vasishth¹, R. Harald Baayen², Douglas M. Bates³

University of Potsdam¹, University of Tübingen², University of Wisconsin-Madison³

01 Jun 2017-Journal of Memory and Language

TL;DR: This paper showed that for typical psychological and psycholinguistic data, higher power is achieved without inflating Type I error rate if a model selection criterion is used to select a random effect structure that is supported by the data.

928 citations

Cites background or methods or result from "Random effects structure for confir..."

...One simple option, when numerically possible, is to fit the full variance-covariance structure of random effects (the maximal model; Barr et al., 2013), presumably to keep Type I error down to the nominal α in the presence of random effects....
[...]
...…is supported by the data, usually by fixing some of the small variance components or correlation parameters to zero (for example, see discussion in Barr et al., 2013, p. 276). Fortunately, with enough data for every subject and every item, the programs may converge and may provide what looks…...
[...]
...Finally, we want to emphasize that we are not proposing a new dogma that is an alternative to the “keep it maximal” proposal of Barr et al. (2013)....
[...]
...Barr et al. (2013) refer to the first two cases as confirmatory and the last case as exploratory hypothesis testing....
[...]
...Fourth, the distinction between design-driven and data-driven, as introduced by Barr et al. (2013), misses an important confirmatory aspect in multivariate statistics: Any hypothesis about the support of variance components by the data requires a model comparison....
[...]

Posted Content•

Parsimonious Mixed Models

Douglas M. Bates, Reinhold Kliegl, Shravan Vasishth, R. Harald Baayen¹

University of Tübingen¹

16 Jun 2015-arXiv: Methodology

TL;DR: This work shows that failure to converge typically is not due to a suboptimal estimation algorithm, but is a consequence of attempting to fit a model that is too complex to be properly supported by the data, irrespective of whether estimation is based on maximum likelihood or on Bayesian hierarchical modeling with uninformative or weakly informative priors.

Abstract: The analysis of experimental data with mixed-effects models requires decisions about the specification of the appropriate random-effects structure. Recently, Barr, Levy, Scheepers, and Tily (2013) recommended fitting 'maximal' models with all possible random effect components included. Estimation of maximal models, however, may not converge. We show that failure to converge typically is not due to a suboptimal estimation algorithm, but is a consequence of attempting to fit a model that is too complex to be properly supported by the data, irrespective of whether estimation is based on maximum likelihood or on Bayesian hierarchical modeling with uninformative or weakly informative priors. Importantly, even under convergence, overparameterization may lead to uninterpretable models. We provide diagnostic tools for detecting overparameterization and guiding model simplification.

889 citations

Cites background or methods from "Random effects structure for confir..."

...The online supplement to Barr et al. (2013) fits such a model to data from an experiment described in Kronmüller and Barr (2007)....
[...]
...A full factorial model in the fixed-effects can be described by the formula 1 + S + P + C + SP + SC + PC + SPC. Barr et al. (2013) analyzed Kronmüller and Barr (2007, Exp. 2) with the maximal model for this design comprising 16 variance components (eight each for the random factors SubjID and ItemID, respectively)....
[...]
...Barr et al. (2013) argue that failure to specify a maximal random effects structure amounts to violating the compound symmetry assumption in classical analysis of variance (anova)....
[...]
...Barr et al. (2013) base their recommendations on simulation studies comparing different procedures for model selection....
[...]
...Inclusion of these random slopes was motivated in part by the wish to provide stringent tests for the significance of main effects (cf. Barr et al., 2013), and in part by interest in individual differences....
[...]
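
The count of variance components quoted in the excerpts above (eight per random factor, 16 in total) follows from enumerating the terms of a full factorial design in three factors. The sketch below is only an illustration of that arithmetic: the helper `factorial_terms` is hypothetical, the labels S, P, C, SubjID, and ItemID are taken from the excerpt, and the calculation assumes that every fixed-effect term is allowed to vary by both subjects and items.

```python
from itertools import combinations


def factorial_terms(factors):
    """Return all terms of a full factorial design:
    the intercept, every main effect, and every interaction."""
    terms = ["1"]  # intercept
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append("".join(combo))
    return terms


factors = ["S", "P", "C"]        # factor labels as written in the quoted formula
grouping = ["SubjID", "ItemID"]  # random factors named in the excerpt

terms = factorial_terms(factors)
formula = " + ".join(terms)

# Assumption: in the maximal model each fixed-effect term also varies by
# each grouping factor, giving one variance component per term per factor.
n_variance_components = len(terms) * len(grouping)

print(formula)
print(n_variance_components)
```

This counts variances only. If the correlations among the eight random effects of each grouping factor are also estimated, an 8 × 8 covariance matrix adds 28 correlation parameters per factor, which is one reason such models can be hard to fit.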

References

Journal Article•

R: A language and environment for statistical computing.

R Core Team

01 Jan 2014-MSOR connections

TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.

Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Book•

Statistical Principles in Experimental Design

B. J. Winer¹

Purdue University¹

01 Jan 1962

TL;DR: The book introduces experimental design and the principles of estimation and inference for means and variances, then covers single-factor and factorial experiments, including designs with repeated measures on the same element.

Abstract:
Chapter 1: Introduction to Design
Chapter 2: Principles of Estimation and Inference: Means and Variance
Chapter 3: Design and Analysis of Single-Factor Experiments: Completely Randomized Design
Chapter 4: Single-Factor Experiments Having Repeated Measures on the Same Element
Chapter 5: Design and Analysis of Factorial Experiments: Completely-Randomized Design
Chapter 6: Factorial Experiments: Computational Procedures and Numerical Example
Chapter 7: Multifactor Experiments Having Repeated Measures on the Same Element
Chapter 8: Factorial Experiments in which Some of the Interactions are Confounded
Chapter 9: Latin Squares and Related Designs
Chapter 10: Analysis of Covariance

25,607 citations

Journal Article•DOI•

Statistical Principles in Experimental Design

Paul E. Green, B. J. Winer, D. R. Brown, Kenneth M. Michels

01 Aug 1992-Journal of Marketing Research

TL;DR: This chapter discusses design and analysis of single-Factor Experiments: Completely Randomized Design and Factorial Experiments in which Some of the Interactions are Confounded.

24,665 citations

Book•

Hierarchical Linear Models: Applications and Data Analysis Methods

James A. Calvin¹

University of Chicago¹

03 Mar 1992

TL;DR: The book sets out the logic of hierarchical linear models and the principles of estimation and hypothesis testing for them, with applications in organizational research, the study of individual change, and meta-analysis.

Abstract:
Introduction
The Logic of Hierarchical Linear Models
Principles of Estimation and Hypothesis Testing for Hierarchical Linear Models
An Illustration
Applications in Organizational Research
Applications in the Study of Individual Change
Applications in Meta-Analysis and Other Cases Where Level-1 Variances are Known
Three-Level Models
Assessing the Adequacy of Hierarchical Models
Technical Appendix

23,126 citations

Journal Article•DOI•

Hierarchical Linear Models: Applications and Data Analysis Methods.

Nicholas T. Longford, Anthony S. Bryk, Stephen W. Raudenbush

01 Mar 1993-Contemporary Sociology

TL;DR: Covers hierarchical linear models and their applications in organizational research, in the study of individual change, and in meta-analysis and other cases where level-1 variances are known.

19,282 citations

"Random effects structure for confir..." refers background in this paper

...For further information about random effect variance–covariance structures, see Baayen (2004, 2008), Gelman and Hill (2007), Goldstein (1995), Raudenbush and Bryk (2002), and Snijders and Bosker (1999a)....
[...]