Journal Article

Bayesian Estimation Supersedes the t Test

01 May 2013 - Journal of Experimental Psychology: General (American Psychological Association) - Vol. 142, Iss. 2, pp. 573-603
TL;DR: Bayesian estimation for 2 groups provides complete distributions of credible values for the effect size, group means and their difference, standard deviations and their difference, and the normality of the data.
Abstract: Bayesian estimation for 2 groups provides complete distributions of credible values for the effect size, group means and their difference, standard deviations and their difference, and the normality of the data. The method handles outliers. The decision rule can accept the null value (unlike traditional t tests) when certainty in the estimate is high (unlike Bayesian model comparison using Bayes factors). The method also yields precise estimates of statistical power for various research goals. The software and programs are free and run on Macintosh, Windows, and Linux platforms.

Summary (2 min read)

1 Introduction

  • The BEST package provides a Bayesian alternative to a t test, giving much richer information about the samples and the difference in means than a simple p value.
  • Bayesian estimation for two groups provides complete distributions of credible values for the effect size, group means and their difference, standard deviations and their difference, and the normality of the data.
  • For a single group, distributions for the mean, standard deviation and normality are provided.
  • The decision rule can accept the null value (unlike traditional t tests) when certainty in the estimate is high (unlike Bayesian model comparison using Bayes factors).
  • The package also provides methods to estimate statistical power for various research goals.

2 The Model

  • To accommodate outliers the authors describe the data with a distribution that has fatter tails than the normal distribution, namely the t distribution.
  • (Note that the authors are using this as a convenient description of the data, not as a sampling distribution from which p values are derived.)
  • The data (y) are assumed to be independent and identically distributed (i.i.d.) draws from a t distribution with different mean (µ) and standard deviation (σ) for each population, and with a common normality parameter (ν), as indicated in the lower portion of Figure 1.
  • The default priors, with priors = NULL, are minimally informative: normal priors with large standard deviation for (µ), broad uniform priors for (σ), and a shifted-exponential prior for (ν), as described by Kruschke (2013).
  • These priors are indicated in the upper portion of Figure 1.

3 Preparing to run BEST

  • BEST uses the JAGS package (Plummer, 2003) to produce samples from the posterior distribution of each parameter of interest.
  • You will need to download JAGS from http://sourceforge.net/projects/mcmc-jags/ and install it before running BEST.
  • BEST also requires the packages rjags and coda, which should normally be installed at the same time as package BEST if you use the install.packages function in R.
  • Once installed, the authors need to load the BEST package at the start of each R session, which will also load rjags and coda and link to JAGS: > library(BEST).

4.2 Running the model

  • The authors run BESTmcmc and save the result in BESTout.
  • The authors do not use parallel processing here, but if your machine has at least 4 cores, parallel processing cuts the time by 50%.

4.3 Basic inferences

  • Also shown are the mean of the posterior distribution, which is an appropriate point estimate of the true difference in means, the 95% Highest Density Interval (HDI), and the posterior probability that the difference is greater than zero.
  • An increase in reaction time of 1 unit may indicate that users of the drug should not drive or operate equipment.
  • More interesting is the probability that the difference may be too small to matter.
  • But if most of the probability mass (say, 95%) lay within the ROPE, the authors would accept the null value for practical purposes.

4.4 Checking convergence and fit

  • The output from BESTmcmc is an object of class BEST (also a data frame), which has a print method; a quick convergence check is sketched after this list.
  • 'Rhat' is the potential scale reduction factor (at convergence, Rhat=1).
  • Increase the burnInSteps argument to BESTmcmc if any of the Rhats are too big.
  • Values of n.eff around 10,000 are needed for stable estimates of 95% credible intervals.
  • The function plotAll puts histograms of all the posterior distributions and the posterior predictive plots onto a single page: > plotAll(BESTout).
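A quick check, as a minimal sketch (print is the method noted above; the burnInSteps value below is an arbitrary illustration):

> print(BESTout)   # posterior summaries plus Rhat and n.eff for each parameter
> # Look for Rhat close to 1 and n.eff around 10,000 for the parameters
> # whose 95% credible intervals will be reported; otherwise rerun, e.g.:
> BESTout <- BESTmcmc(y1, y2, priors=priors, burnInSteps=5000)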

4.5 Working with individual parameters

  • Objects of class BEST contain long vectors of simulated draws from the posterior distribution of each of the parameters in the model.
  • You may wish to look at the ratio of the variances rather than the difference in the standard deviations.
  • You can calculate a vector of draws from the posterior distribution, compute summary statistics, and plot the distribution with plotPost, as sketched below.
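For example, a minimal sketch of the variance ratio (the name varRatio is ours; sigma1 and sigma2 are columns of the BEST object):

> varRatio <- BESTout$sigma1^2 / BESTout$sigma2^2
> median(varRatio)
> mean(varRatio > 1)   # posterior probability that group 1 is more variable
> plotPost(varRatio)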

5 An example with a single group

  • Applying BEST to a single sample, or for differences in paired observations, works in much the same way as the two-sample method and uses the same function calls.
  • To run the model, simply use BESTmcmc with only one vector of observations.
  • Standard deviation, the normality parameter and effect size can be plotted individually, or on a single page with plotAll: > plotAll(BESTout1g). A paired-data sketch follows this list.
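A minimal sketch for paired observations (the pre and post vectors are hypothetical data invented for illustration):

> pre  <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.9)   # hypothetical paired measurements
> post <- c(5.8, 5.1, 6.4, 5.6, 6.3, 5.2)
> BESTout1g <- BESTmcmc(post - pre)   # a single vector fits the one-group model
> plotAll(BESTout1g)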

6 What next?

  • The package includes functions to estimate the power of experimental designs (sketched after this list): see the help pages for BESTpower and makeData for details on implementation and Kruschke (2013) for background.
  • If you want to know how the functions in the BEST package work, you can download the R source code from CRAN or from GitHub https://github.com/mikemeredith/BEST.
  • For a practical introduction see Kruschke (2015).
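A retrospective power sketch under stated assumptions: the argument names N1, N2, ROPEm and nRep reflect our reading of the help page and should be checked against ?BESTpower before use; nRep = 100 keeps the run short.

> # Power to assess the difference in means against a ROPE of ±0.1,
> # for replications with the same sample sizes as the original study:
> pow <- BESTpower(BESTout, N1=6, N2=6, ROPEm=c(-0.1, 0.1), nRep=100)
> pow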

7 References

  • Kruschke, J. K. (2013). Bayesian estimation supersedes the t test. Journal of Experimental Psychology: General, 142(2), 573-603.
  • Kruschke, J. K. (2015). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan (2nd ed.). Academic Press.
  • Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003).


Bayesian Estimation Supersedes the t-Test
Mike Meredith and John Kruschke
October 13, 2021
1 Introduction
The BEST package provides a Bayesian alternative to a t test, giving much richer information
about the samples and the difference in means than a simple p value.
Bayesian estimation for two groups provides complete distributions of credible values for the
effect size, group means and their difference, standard deviations and their difference, and the
normality of the data. For a single group, distributions for the mean, standard deviation and
normality are provided. The method handles outliers.
The decision rule can accept the null value (unlike traditional t tests) when certainty in the
estimate is high (unlike Bayesian model comparison using Bayes factors).
The package also provides methods to estimate statistical power for various research goals.
2 The Model
To accommodate outliers we describe the data with a distribution that has fatter tails than the
normal distribution, namely the t distribution. (Note that we are using this as a convenient
description of the data, not as a sampling distribution from which p values are derived.) The
relative height of the tails of the t distribution is governed by the shape parameter ν: when ν
is small, the distribution has heavy tails, and when it is large (e.g., 100), it is nearly normal.
Here we refer to ν as the normality parameter.
The data (y) are assumed to be independent and identically distributed (i.i.d.) draws from
a t distribution with different mean (µ) and standard deviation (σ) for each population, and
with a common normality parameter (ν), as indicated in the lower portion of Figure 1.
The default priors, with priors = NULL, are minimally informative: normal priors with
large standard deviation for (µ), broad uniform priors for (σ), and a shifted-exponential prior
for (ν), as described by Kruschke (2013). You can specify your own priors by providing a
list: population means (µ) have separate normal priors, with mean muM and standard deviation
muSD; population standard deviations (σ) have separate gamma priors, with mode sigmaMode
and standard deviation sigmaSD; the normality parameter (ν) has a gamma prior with mean
nuMean and standard deviation nuSD. These priors are indicated in the upper portion of Figure 1.
For a general discussion see chapters 11 and 12 of Kruschke (2015).
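As a sketch of a fully specified list (the numeric values are illustrative assumptions, not recommendations; elements omitted from the list keep their defaults):

> priors <- list(muM = 6, muSD = 2,           # normal priors on the means
+                sigmaMode = 1, sigmaSD = 5,  # gamma priors on the standard deviations
+                nuMean = 30, nuSD = 30)      # gamma prior on the normality parameter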

Figure 1: Hierarchical diagram of the descriptive model for robust Bayesian estimation.
3 Preparing to run BEST
BEST uses the JAGS package (Plummer, 2003) to produce samples from the posterior distribution
of each parameter of interest. You will need to download JAGS from
http://sourceforge.net/projects/mcmc-jags/ and install it before running BEST.
BEST also requires the packages rjags and coda, which should normally be installed at the
same time as package BEST if you use the install.packages function in R.
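A one-off setup from CRAN might look like this (standard R; JAGS itself must still be installed separately from the link above):

> install.packages("BEST")   # pulls in rjags and coda if they are missing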
Once installed, we need to load the BEST package at the start of each R session, which will
also load rjags and coda and link to JAGS:
> library(BEST)
4 An example with two groups
4.1 Some example data
We will use hypothetical data for reaction times for two groups (N1 = N2 = 6): Group 1
consumes a drug which may increase reaction times, while Group 2 is a control group that
consumes a placebo.
> y1 <- c(5.77, 5.33, 4.59, 4.33, 3.66, 4.48)
> y2 <- c(3.88, 3.55, 3.29, 2.59, 2.33, 3.59)
Based on previous experience with these sorts of trials, we expect reaction times to be
approximately 6 secs, but they vary a lot, so we'll set muM = 6 and muSD = 2. We'll use the
default priors for the other parameters: sigmaMode = sd(y), sigmaSD = sd(y)*5, nuMean = 30,
nuSD = 30, where y = c(y1, y2).
> priors <- list(muM = 6, muSD = 2)

4.2 Running the model
We run BESTmcmc and save the result in BESTout. We do not use parallel processing here,
but if your machine has at least 4 cores, parallel processing cuts the time by 50%.
> BESTout <- BESTmcmc(y1, y2, priors=priors, parallel=FALSE)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 12
Unobserved stochastic nodes: 5
Total graph size: 51
Initializing model
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
Sampling from the posterior distributions:
|**************************************************| 100%
4.3 Basic inferences
The default plot (Figure 2) is a histogram of the posterior distribution of the difference in
means.
> plot(BESTout)
Figure 2: Default plot: posterior probability of the difference in means (µ1 − µ2). [Plot
annotations: mean = 1.44; 95% HDI: 0.266 to 2.6; 1.2% < 0 < 98.8%.]
Also shown are the mean of the posterior distribution, which is an appropriate point estimate
of the true difference in means, the 95% Highest Density Interval (HDI), and the posterior
probability that the difference is greater than zero. The 95% HDI does not include zero, and
the probability that the true value is greater than zero is shown as 98.8%.

Figure 3: Posterior probability of the difference in means with compVal=1.0 and ROPE ± 0.1.
[Plot annotations: mean = 1.44; 95% HDI: 0.266 to 2.6; 19.9% < 1 < 80.1%; 1% in ROPE.]

Compare this with
the output from a t test:
> t.test(y1, y2)
Welch Two Sample t-test
data: y1 and y2
t = 3.7624, df = 9.6093, p-value = 0.003977
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.6020466 2.3746201
sample estimates:
mean of x mean of y
4.693333 3.205000
Because we are dealing with a Bayesian posterior probability distribution, we can extract
much more information:
We can estimate the probability that the true difference in means is above (or below) an
arbitrary comparison value. For example, an increase in reaction time of 1 unit may indicate
that users of the drug should not drive or operate equipment.
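Because the posterior is available as simulated draws, that probability is just the proportion of draws beyond the comparison value. A minimal sketch, assuming the BESTout object from Section 4.2:

> meanDiff <- BESTout$mu1 - BESTout$mu2
> mean(meanDiff > 1)   # posterior probability that the increase exceeds 1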
The probability that the difference in reaction times is precisely zero is zero. More interesting
is the probability that the difference may be too small to matter. We can define a
region of practical equivalence (ROPE) around zero, and obtain the probability that the
true value lies therein. For the reaction time example, a difference of ± 0.1 may be too
small to matter.
> plot(BESTout, compVal=1, ROPE=c(-0.1,0.1))
The annotations in Figure 3 show a high probability that the reaction time increase is > 1.
In this case it’s clear that the effect is large, but if most of the probability mass (say, 95%) lay
within the ROPE, we would accept the null value for practical purposes.

Figure 4: Posterior plot for the difference in standard deviations (σ1 − σ2). [Plot annotations:
mode = 0.1; 95% HDI: −1.08 to 1.41; 36.4% < 0 < 63.6%.]
BEST deals appropriately with differences in standard deviations between the samples and
departures from normality due to outliers. We can check the difference in standard deviations
or the normality parameter with plot (Figure 4).
> plot(BESTout, which="sd")
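The normality parameter can be checked the same way, assuming "nu" is an accepted value of the which argument alongside "sd":

> plot(BESTout, which="nu")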
The summary method gives us more information on the parameters of interest, including
derived parameters:
> summary(BESTout)
mean median mode HDI% HDIlo HDIup compVal %>compVal
mu1 4.750 4.735 4.715 95 3.880 5.66
mu2 3.310 3.290 3.266 95 2.592 4.09
muDiff 1.440 1.442 1.435 95 0.266 2.60 0 98.8
sigma1 1.000 0.886 0.736 95 0.379 1.92
sigma2 0.829 0.731 0.615 95 0.313 1.61
sigmaDiff 0.170 0.143 0.100 95 -1.084 1.41 0 63.6
nu 34.927 25.751 9.796 95 0.849 96.97
log10nu 1.375 1.411 1.540 95 0.550 2.11
effSz 1.680 1.658 1.612 95 0.190 3.24 0 98.8
Here we have summaries of posterior distributions for the derived parameters: difference
in means (muDiff), difference in standard deviations (sigmaDiff) and effect size (effSz). As
with the plot command, we can set values for compVal and ROPE for each of the parameters of
interest:
> summary(BESTout, credMass=0.8, ROPEm=c(-0.1,0.1), ROPEsd=c(-0.15,0.15),
compValeff=1)
mean median mode HDI% HDIlo HDIup compVal %>compVal ROPElow
mu1 4.750 4.735 4.715 80 4.216 5.235

Citations
Journal Article
TL;DR: It is argued Bayes factors allow theory to be linked to data in a way that overcomes the weaknesses of the other approaches, and provides a coherent approach to determining whether non-significant results support a null hypothesis over a theory, or whether the data are just insensitive.
Abstract: No scientific conclusion follows automatically from a statistically non-significant result, yet people routinely use non-significant results to guide conclusions about the status of theories (or the effectiveness of practices). To know whether a non-significant result counts against a theory, or if it just indicates data insensitivity, researchers must use one of: power, intervals (such as confidence or credibility intervals), or else an indicator of the relative evidence for one theory over another, such as a Bayes factor. I argue Bayes factors allow theory to be linked to data in a way that overcomes the weaknesses of the other approaches. Specifically, Bayes factors use the data themselves to determine their sensitivity in distinguishing theories (unlike power), and they make use of those aspects of a theory’s predictions that are often easiest to specify (unlike power and intervals, which require specifying the minimal interesting value in order to address theory). Bayes factors provide a coherent approach to determining whether non-significant results support a null hypothesis over a theory, or whether the data are just insensitive. They allow accepting and rejecting the null hypothesis to be put on an equal footing. Concrete examples are provided to indicate the range of application of a simple online Bayes calculator, which reveal both the strengths and weaknesses of Bayes factors.

1,496 citations


Cites background or methods from "Bayesian Estimation Supersedes the ..."

  • ...Update the prior to obtain the posterior distribution (see e.g., Kruschke, 2013b, or tools on the website for Dienes, 2008: http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/inference/bayes_normalposte rior.swf)....


  • ...Bayes is a general all-purpose method that can be applied to any specified distribution or to a bootstrapped distribution (e.g., Jackman, 2009; Kruschke, 2010a; Lee and Wagenmakers, 2014; see Kruschke, 2013b, for a Bayesian analysis that allows heavy-tailed distributions)....


  • ...Kruschke (2013c) recommends specifying the degree to which the Bayesian credibility interval is contained within null regions of different widths so people with different null regions can make their own decisions....


  • ...Rules (i) and (ii) are not sensitive to stopping rule (given interval width is not much more than that of the null region; cf Kruschke, 2013b)....


  • ...…have been ignored when they were in fact informative (e.g., believing that an apparent failure to replicate with a non-significant result is more likely to indicate noise produced by sloppy experimenters than a true null hypothesis; cf. Greenwald, 1993; Pashler and Harris, 2012; Kruschke, 2013a)....


Book
17 Nov 2014
TL;DR: Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan provides an accessible approach to Bayesian data analysis, as material is explained clearly with concrete examples.
Abstract: There is an explosion of interest in Bayesian statistics, primarily because recently created computational methods have finally made Bayesian analysis obtainable to a wide audience. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan provides an accessible approach to Bayesian data analysis, as material is explained clearly with concrete examples. The book begins with the basics, including essential concepts of probability and random sampling, and gradually progresses to advanced hierarchical modeling methods for realistic data. Included are step-by-step instructions on how to conduct Bayesian data analyses in the popular and free software R and WinBugs. This book is intended for first-year graduate students or advanced undergraduates. It provides a bridge between undergraduate training and modern Bayesian methods for data analysis, which is becoming the accepted research standard. Knowledge of algebra and basic calculus is a prerequisite. New to this Edition (partial list): * There are all new programs in JAGS and Stan. The new programs are designed to be much easier to use than the scripts in the first edition. In particular, there are now compact high-level scripts that make it easy to run the programs on your own data sets. This new programming was a major undertaking by itself.* The introductory Chapter 2, regarding the basic ideas of how Bayesian inference re-allocates credibility across possibilities, is completely rewritten and greatly expanded.* There are completely new chapters on the programming languages R (Ch. 3), JAGS (Ch. 8), and Stan (Ch. 14). The lengthy new chapter on R includes explanations of data files and structures such as lists and data frames, along with several utility functions. (It also has a new poem that I am particularly pleased with.) The new chapter on JAGS includes explanation of the RunJAGS package which executes JAGS on parallel computer cores. The new chapter on Stan provides a novel explanation of the concepts of Hamiltonian Monte Carlo. The chapter on Stan also explains conceptual differences in program flow between it and JAGS.* Chapter 5 on Bayes' rule is greatly revised, with a new emphasis on how Bayes' rule re-allocates credibility across parameter values from prior to posterior. The material on model comparison has been removed from all the early chapters and integrated into a compact presentation in Chapter 10.* What were two separate chapters on the Metropolis algorithm and Gibbs sampling have been consolidated into a single chapter on MCMC methods (as Chapter 7). There is extensive new material on MCMC convergence diagnostics in Chapters 7 and 8. There are explanations of autocorrelation and effective sample size. There is also exploration of the stability of the estimates of the HDI limits. New computer programs display the diagnostics, as well.* Chapter 9 on hierarchical models includes extensive new and unique material on the crucial concept of shrinkage, along with new examples.* All the material on model comparison, which was spread across various chapters in the first edition, in now consolidated into a single focused chapter (Ch. 10) that emphasizes its conceptualization as a case of hierarchical modeling.* Chapter 11 on null hypothesis significance testing is extensively revised. It has new material for introducing the concept of sampling distribution. 
It has new illustrations of sampling distributions for various stopping rules, and for multiple tests.* Chapter 12, regarding Bayesian approaches to null value assessment, has new material about the region of practical equivalence (ROPE), new examples of accepting the null value by Bayes factors, and new explanation of the Bayes factor in terms of the Savage-Dickey method.* Chapter 13, regarding statistical power and sample size, has an extensive new section on sequential testing, and making the research goal be precision of estimation instead of rejecting or accepting a particular value.* Chapter 15, which introduces the generalized linear model, is fully revised, with more complete tables showing combinations of predicted and predictor variable types.* Chapter 16, regarding estimation of means, now includes extensive discussion of comparing two groups, along with explicit estimates of effect size.* Chapter 17, regarding regression on a single metric predictor, now includes extensive examples of robust regression in JAGS and Stan. New examples of hierarchical regression, including quadratic trend, graphically illustrate shrinkage in estimates of individual slopes and curvatures. The use of weighted data is also illustrated.* Chapter 18, on multiple linear regression, includes a new section on Bayesian variable selection, in which various candidate predictors are probabilistically included in the regression model.* Chapter 19, on one-factor ANOVA-like analysis, has all new examples, including a completely worked out example analogous to analysis of covariance (ANCOVA), and a new example involving heterogeneous variances.* Chapter 20, on multi-factor ANOVA-like analysis, has all new examples, including a completely worked out example of a split-plot design that involves a combination of a within-subjects factor and a between-subjects factor.* Chapter 21, on logistic regression, is expanded to include examples of robust logistic regression, and examples with nominal predictors.* There is a completely new chapter (Ch. 22) on multinomial logistic regression. This chapter fills in a case of the generalized linear model (namely, a nominal predicted variable) that was missing from the first edition.* Chapter 23, regarding ordinal data, is greatly expanded. New examples illustrate single-group and two-group analyses, and demonstrate how interpretations differ from treating ordinal data as if they were metric.* There is a new section (25.4) that explains how to model censored data in JAGS.* Many exercises are new or revised. * Accessible, including the basics of essential concepts of probability and random sampling* Examples with R programming language and JAGS software* Comprehensive coverage of all scenarios addressed by non-Bayesian textbooks: t-tests, analysis of variance (ANOVA) and comparisons in ANOVA, multiple regression, and chi-square (contingency table analysis)* Coverage of experiment planning* R and JAGS computer programming code on website* Exercises have explicit purposes and guidelines for accomplishment* Provides step-by-step instructions on how to conduct Bayesian data analyses in the popular and free software R and WinBugs

1,190 citations


Cites background from "Bayesian Estimation Supersedes the ..."

  • ...These cases have been discussed many times in the literature, including the well-known and accessible articles by Lindley and Phillips (1976) and J....


Journal Article
19 Mar 2018 - Nature
TL;DR: For example, this paper screened more than 1,000 marketed drugs against 40 representative gut bacterial strains, and found that 24% of the drugs with human targets, including members of all therapeutic classes, inhibited the growth of at least one strain in vitro.
Abstract: A few commonly used non-antibiotic drugs have recently been associated with changes in gut microbiome composition, but the extent of this phenomenon is unknown. Here, we screened more than 1,000 marketed drugs against 40 representative gut bacterial strains, and found that 24% of the drugs with human targets, including members of all therapeutic classes, inhibited the growth of at least one strain in vitro. Particular classes, such as the chemically diverse antipsychotics, were overrepresented in this group. The effects of human-targeted drugs on gut bacteria are reflected on their antibiotic-like side effects in humans and are concordant with existing human cohort studies. Susceptibility to antibiotics and human-targeted drugs correlates across bacterial species, suggesting common resistance mechanisms, which we verified for some drugs. The potential risk of non-antibiotics promoting antibiotic resistance warrants further exploration. Our results provide a resource for future research on drug-microbiome interactions, opening new paths for side effect control and drug repurposing, and broadening our view of antibiotic resistance.

1,172 citations

Journal Article
TL;DR: This article suggests that so-called failures to replicate may not be failures at all, but rather are the result of low statistical power in single replication studies, and of failure to appreciate the need for multiple replications in order to have enough power to identify true effects.
Abstract: Psychology has recently been viewed as facing a replication crisis because efforts to replicate past study findings frequently do not show the same result. Often, the first study showed a statistically significant result but the replication does not. Questions then arise about whether the first study results were false positives, and whether the replication study correctly indicates that there is truly no effect after all. This article suggests these so-called failures to replicate may not be failures at all, but rather are the result of low statistical power in single replication studies, and the result of failure to appreciate the need for multiple replications in order to have enough power to identify true effects. We provide examples of these power problems and suggest some solutions using Bayesian statistics and meta-analysis. Although the need for multiple replication studies may frustrate those who would prefer quick answers to psychology's alleged crisis, the large sample sizes typically needed to provide firm evidence will almost always require concerted efforts from multiple investigators. As a result, it remains to be seen how many of the recently claimed failures to replicate will be supported or instead may turn out to be artifacts of inadequate sample sizes and single study replications.

637 citations


Cites background or methods from "Bayesian Estimation Supersedes the ..."

  • ...It should also be noted that Equation 2 is developed from a frequentist perspective, but Kruschke (2013, 2014) describes a Bayesian sample size approach based on the ROPE....


  • ...…“actually includes the 95% of parameter values that are most credible” (Kruschke, 2013, p. 592), so “when the 95% HDI [highest density interval] falls within the ROPE, we can conclude that 95% of the credible parameter values are practically equivalent to the null value” (Kruschke, 2013, p. 592)....


  • ...Interested readers are referred to Kruschke (2013, 2014) for such examples....


  • ...…a Bayesian highest density interval, unlike a frequentist confidence interval, “actually includes the 95% of parameter values that are most credible” (Kruschke, 2013, p. 592), so “when the 95% HDI [highest density interval] falls within the ROPE, we can conclude that 95% of the credible parameter…...


  • ...An attractive feature of Bayesian methods is that they often facilitate robust estimation (Kruschke, 2013)....


Journal Article
TL;DR: In this paper, the authors compare Bayesian and frequentist approaches to hypothesis testing and estimation with confidence or credible intervals, and explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods.
Abstract: In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.

562 citations

References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Book
01 Dec 1969
TL;DR: The concepts of power analysis are discussed in this paper, where Chi-square Tests for Goodness of Fit and Contingency Tables, t-Test for Means, and Sign Test are used.
Abstract: Contents: Prefaces. The Concepts of Power Analysis. The t-Test for Means. The Significance of a Product Moment rs (subscript s). Differences Between Correlation Coefficients. The Test That a Proportion is .50 and the Sign Test. Differences Between Proportions. Chi-Square Tests for Goodness of Fit and Contingency Tables. The Analysis of Variance and Covariance. Multiple Regression and Correlation Analysis. Set Correlation and Multivariate Methods. Some Issues in Power Analysis. Computational Procedures.

115,069 citations


"Bayesian Estimation Supersedes the ..." refers background in this paper

  • ...As a generic example, because an effect size of 0.1 is conventionally deemed to be small (Cohen, 1988), a ROPE on effect size might extend from −0.1 to +0.1....


  • ...Importantly, the result is a different space of possible tnull values than the conventional assumption of fixed sample size, and hence a different p value and different confidence interval....


Journal Article

49,129 citations


"Bayesian Estimation Supersedes the ..." refers background in this paper


  • ...As a generic example, because an effect size of 0.1 is conventionally deemed to be small (Cohen, 1988), a ROPE on effect size might extend from −0.1 to +0.1....


Book
01 Jan 1939
TL;DR: In this paper, the authors introduce the concept of direct probabilities, approximate methods and simplifications, and significant importance tests for various complications, including one new parameter, and various complications for frequency definitions and direct methods.
Abstract: 1. Fundamental notions 2. Direct probabilities 3. Estimation problems 4. Approximate methods and simplifications 5. Significance tests: one new parameter 6. Significance tests: various complications 7. Frequency definitions and direct methods 8. General questions

7,086 citations

Journal Article
TL;DR: A fatal flaw of NHST is reviewed and some benefits of Bayesian data analysis are introduced and illustrative examples of multiple comparisons in Bayesian analysis of variance and Bayesian approaches to statistical power are presented.
Abstract: Bayesian methods have garnered huge interest in cognitive science as an approach to models of cognition and perception. On the other hand, Bayesian methods for data analysis have not yet made much headway in cognitive science against the institutionalized inertia of 20th century null hypothesis significance testing (NHST). Ironically, specific Bayesian models of cognition and perception may not long endure the ravages of empirical verification, but generic Bayesian methods for data analysis will eventually dominate. It is time that Bayesian data analysis became the norm for empirical methods in cognitive science. This article reviews a fatal flaw of NHST and introduces the reader to some benefits of Bayesian data analysis. The article presents illustrative examples of multiple comparisons in Bayesian analysis of variance and Bayesian approaches to statistical power. Copyright © 2010 John Wiley & Sons, Ltd. For further resources related to this article, please visit the WIREs website.

6,081 citations

Frequently Asked Questions (8)
Q1. What are the contributions mentioned in the paper "Bayesian estimation supersedes the t-test" ?

The BEST package provides a Bayesian alternative to a t test, providing much richer information about the samples and the difference in means than a simple p value. For a single group, distributions for the mean, standard deviation and normality are provided. The package also provides methods to estimate statistical power for various research goals. 

Since BEST objects are also data frames, the authors can use the $ operator to extract the columns they want: > names(BESTout) gives "mu1" "mu2" "nu" "sigma1" "sigma2"; then > meanDiff <- BESTout$mu1 - BESTout$mu2 and > meanDiffGTzero <- mean(meanDiff > 0) give the posterior probability that the difference in means is greater than zero.