Bayesian Estimation Supersedes the t Test
Summary (2 min read)
1 Introduction
- The BEST package provides a Bayesian alternative to a t test, providing much richer information about the samples and the difference in means than a simple p value.
- Bayesian estimation for two groups provides complete distributions of credible values for the effect size, group means and their difference, standard deviations and their difference, and the normality of the data.
- For a single group, distributions for the mean, standard deviation and normality are provided.
- The decision rule can accept the null value (unlike traditional t tests) when certainty in the estimate is high (unlike Bayesian model comparison using Bayes factors).
- The package also provides methods to estimate statistical power for various research goals.
2 The Model
- To accommodate outliers the authors describe the data with a distribution that has fatter tails than the normal distribution, namely the t distribution.
- (Note that the authors are using this as a convenient description of the data, not as a sampling distribution from which p values are derived.).
- The data (y) are assumed to be independent and identically distributed (i.i.d.) draws from a t distribution with different mean (µ) and standard deviation (σ) for each population, and with a common normality parameter (ν), as indicated in the lower portion of Figure 1.
- The default priors, with priors = NULL, are minimally informative: normal priors with large standard deviation for (µ), broad uniform priors for (σ), and a shifted-exponential prior for (ν), as described by Kruschke (2013).
- These priors are indicated in the upper portion of Figure 1.
3 Preparing to run BEST
- BEST uses the JAGS package (Plummer, 2003) to produce samples from the posterior distribution of each parameter of interest.
- You will need to download JAGS from http://sourceforge.
- Net/projects/mcmc-jags/ and install it before running BEST.
- BEST also requires the packages rjags and coda, which should normally be installed at the same time as package BEST if you use the install.
- Once installed, the authors need to load the BEST package at the start of each R session, which will also load rjags and coda and link to JAGS: > library(BEST).
4.2 Running the model
- The authors run BESTmcmc and save the result in BESTout.
- The authors do not use parallel processing here, but if your machine has at least 4 cores, parallel processing cuts the time by 50%.
4.3 Basic inferences
- Also shown is the mean of the posterior probability, which is an appropriate point estimate of the true difference in means, the 95% Highest Density Interval (HDI), and the posterior probability that the difference is greater than zero.
- An increase reaction time of 1 unit may indicate that users of the drug should not drive or operate equipment.
- More interesting is the probability that the difference may be too small to matter.
- But if most of the probability mass (say, 95%) lay within the ROPE, the authors would accept the null value for practical purposes.
4.4 Checking convergence and fit
- The output from BESTmcmc has class BEST, which has a print method: > class [1].
- 'Rhat' is the potential scale reduction factor (at convergence, Rhat=1).
- Increase the burnInSteps argument to BESTmcmc if any of the Rhats are too big.
- Values of n.eff around 10,000 are needed for stable estimates of 95% credible intervals.
- The function plotAll puts histograms of all the posterior distributions and the posterior predictive plots onto a single page . > plotAll.
4.5 Working with individual parameters
- Objects of class BEST contain long vectors of simulated draws from the posterior distribution of each of the parameters in the model.
- You may wish to look at the ratio of the variances rather than the difference in the standard deviations.
- You can calculate a vector of draws from the posterior distribution, calculate summary statistics, and plot the distribution with plotPost : > <- BESTout$sigma1^2 / BESTout$sigma2^2 > median [1].
5 An example with a single group
- Applying BEST to a single sample, or for differences in paired observations, works in much the same way as the two-sample method and uses the same function calls.
- To run the model, simply use BESTmcmc with only one vector of observations.
- Standard deviation, the normality parameter and effect size can be plotted individually, or on a single page with plotAll . > plotAll(BESTout1g).
6 What next?
- The package includes functions to estimate the power of experimental designs: see the help pages for BESTpower and makeData for details on implementation and Kruschke (2013) for background.
- If you want to know how the functions in the BEST package work, you can download the R source code from CRAN or from GitHub https://github.com/mikemeredith/BEST.
- For a practical introduction see Kruschke (2015).
7 References
- A tutorial with R, JAGS and Stan, also known as Doing Bayesian data analysis.
- In 3rd International Workshop on Distributed Statistical Computing (DSC 2003).
Did you find this useful? Give us your feedback
Citations
1,496 citations
Cites background or methods from "Bayesian Estimation Supersedes the ..."
...Update the prior to obtain the posterior distribution (see e.g., Kruschke, 2013b, or tools on the website for Dienes, 2008: http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/inference/bayes_normalposte rior.swf)....
[...]
...Bayes is a general all-purpose method that can be applied to any specified distribution or to a bootstrapped distribution (e.g., Jackman, 2009; Kruschke, 2010a; Lee and Wagenmakers, 2014; see Kruschke, 2013b, for a Bayesian analysis that allows heavy-tailed distributions)....
[...]
...Kruschke (2013c) recommends specifying the degree to which the Bayesian credibility interval is contained within null regions of different widths so people with different null regions can make their own decisions....
[...]
...Rules (i) and (ii) are not sensitive to stopping rule (given interval width is not much more than that of the null region; cf Kruschke, 2013b)....
[...]
...…have been ignored when they were in fact informative (e.g., believing that an apparent failure to replicate with a non-significant result is more likely to indicate noise produced by sloppy experimenters than a true null hypothesis; cf. Greenwald, 1993; Pashler and Harris, 2012; Kruschke, 2013a)....
[...]
1,190 citations
Cites background from "Bayesian Estimation Supersedes the ..."
...These cases have been discussed many times in the literature, including the well-known and accessible articles by Lindley and Phillips (1976) and J....
[...]
1,172 citations
637 citations
Cites background or methods from "Bayesian Estimation Supersedes the ..."
...It should also be noted that Equation 2 is developed from a frequentist perspective, but Kruschke (2013, 2014) describes a Bayesian sample size approach based on the ROPE....
[...]
...…“actually includes the 95% of parameter values that are most credible” (Kruschke, 2013, p. 592), so “when the 95% HDI [highest density interval] falls within the ROPE, we can conclude that 95% of the credible parameter values are practically equivalent to the null value” (Kruschke, 2013, p. 592)....
[...]
...Interested readers are referred to Kruschke (2013, 2014) for such examples....
[...]
...…a Bayesian highest density interval, unlike a frequentist confidence interval, “actually includes the 95% of parameter values that are most credible” (Kruschke, 2013, p. 592), so “when the 95% HDI [highest density interval] falls within the ROPE, we can conclude that 95% of the credible parameter…...
[...]
...An attractive feature of Bayesian methods is that they often facilitate robust estimation (Kruschke, 2013)....
[...]
562 citations
References
272,030 citations
115,069 citations
"Bayesian Estimation Supersedes the ..." refers background in this paper
...As a generic example, because an effect size of 0.1 is conventionally deemed to be small (Cohen, 1988), a ROPE on effect size might extend from −0.1 to +0.1....
[...]
...Importantly, the result is a different space of possible tnull values than the conventional assumption of fixed sample size, and hence a different p value and different confidence interval....
[...]
49,129 citations
"Bayesian Estimation Supersedes the ..." refers background in this paper
...1 is conventionally deemed to be small (Cohen, 1988), a ROPE on effect size might extend from 0....
[...]
...As a generic example, because an effect size of 0.1 is conventionally deemed to be small (Cohen, 1988), a ROPE on effect size might extend from −0.1 to +0.1....
[...]
7,086 citations
6,081 citations
Related Papers (5)
Frequently Asked Questions (8)
Q2. What packages are required to run BEST?
BEST also requires the packages rjags and coda, which should normally be installed at the same time as package BEST if you use the install.
Q3. What is the way to extract the data?
Since BEST objects are also data frames, the authors can use the $ operator to extract the columns the authors want:> names(BESTout)[1] "mu1" "mu2" "nu" "sigma1" "sigma2"> meanDiff <- (BESTout$mu1 - BESTout$mu2) > meanDiffGTzero <- mean(meanDiff > 0) > meanDiffGTzero[1]
Q4. what is the package for rjags?
Once installed, the authors need to load the BEST package at the start of each R session, which will also load rjags and coda and link to JAGS:> library(BEST)The authors will use hypothetical data for reaction times for two groups (N1 = N2 = 6), Group 1 consumes a drug which may increase reaction times while Group 2 is a control group that consumes a placebo.>
Q5. What is the method for estimating the power of experimental designs?
If you want to know how the functions in the BEST package work, you can download the R source code from CRAN or from GitHub https://github.com/mikemeredith/BEST.Bayesian analysis with computations performed by JAGS is a powerful approach to analysis.
Q6. What is the package for estimating statistical power?
You can specify your own priors by providing a list: population means (µ) have separate normal priors, with mean muM and standard deviation muSD; population standard deviations (σ) have separate gamma priors, with mode sigmaMode and standard deviation sigmaSD; the normality parameter (ν) has a gamma prior with mean nuMean and standard deviation nuSD.
Q7. What is the default prior for the other parameters?
We’ll use the default priors for the other parameters: sigmaMode = sd(y), sigmaSD = sd(y)*5, nuMean = 30, nuSD = 30), where y = c(y1, y2).> priors <- list(muM = 6, muSD = 2)The authors run BESTmcmc and save the result in BESTout.
Q8. What is the way to calculate the reaction times?
y1 <- c(5.77, 5.33, 4.59, 4.33, 3.66, 4.48) > y2 <- c(3.88, 3.55, 3.29, 2.59, 2.33, 3.59)Based on previous experience with these sort of trials, the authors expect reaction times to be approximately 6 secs, but they vary a lot, so we’ll set muM = 6 and muSD = 2.