Bayesian First Aid: One Sample and Paired Samples t-test
Student’s t-test is a staple of statistical analysis. A quick search on Google Scholar for “t-test” results in 170,000 hits in 2013 alone. In comparison, “Bayesian” gives 130,000 hits while “box plot” results in only 12,500 hits. To be honest, if I had to choose I would most of the time prefer a notched boxplot to a t-test. The t-test comes in many flavors: one sample, two-sample, paired samples and Welch’s. We’ll start with the two simplest; here follow the Bayesian First Aid alternatives to the one sample t-test and the paired samples t-test.
Bayesian First Aid is an attempt at implementing reasonable Bayesian alternatives to the classical hypothesis tests in R. For the rationale behind Bayesian First Aid see the original announcement and the description of the alternative to the binomial test. The development of Bayesian First Aid can be followed on GitHub. Bayesian First Aid is a work in progress and I’m grateful for any suggestions on how to improve it!
The Model
A straightforward alternative to the t-test would be to assume normality, add some non-informative priors to the mix and be done with it. However, one of the great things with Bayesian data analysis is that it is easy to not assume normality. One alternative to the normal distribution, one that still fits normally distributed data well but is more robust against outliers, is the t distribution. Hang on, you say, isn’t the t-test already using the t distribution? Right, but the t-test uses the t distribution as the sampling distribution of the (standardized) sample mean; here the trick is to assume it as the distribution of the data.
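To get a feel for how much wider the tails are, compare the probability of seeing a value more than 4 scale units from the center (the numbers below are rounded and just meant as an illustration):

# Probability of a value more than 4 scale units from the center
2 * pnorm(-4)       # standard normal: roughly 6e-05
2 * pt(-4, df = 3)  # t with 3 degrees of freedom: roughly 0.028

An observation that far out is much less surprising under a heavy tailed t, which is why a few outliers pull the estimated $\mu$ around far less than they would pull a normal mean.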
Instead of reinventing the wheel I’ll here piggyback on the work of John K. Kruschke, who has developed a Bayesian estimation alternative to the t-test called Bayesian Estimation Supersedes the t Test, or BEST for short. The rationale and the assumptions behind BEST are well explained in a paper published in 2013 in the Journal of Experimental Psychology (the paper is also a very pedagogical introduction to Bayesian estimation in general). That paper and more information regarding BEST are available on John Kruschke’s web page. He has also made a nice video based on the paper (mostly focused on the two sample version, though):
All information regarding BEST is given in the paper and the video; here is just a short rundown of the model for the one sample BEST: BEST assumes the data ($x$) is distributed as a t distribution, which is more robust against outliers than a normal distribution due to its wider tails. In addition to the mean ($\mu$) and the scale ($\sigma$), the t distribution has one extra parameter, the degrees of freedom ($\nu$): the lower $\nu$ is, the wider the tails become, and as $\nu$ gets larger the t distribution approaches the normal distribution. While it would be possible to fix $\nu$ to a single value, BEST instead estimates $\nu$, allowing the t distribution to become more or less normal depending on the data. Here is the full model for the one sample BEST:
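That is, written out in notation (a sketch reconstructed from the description below, following Kruschke’s parameterization, where the exponential rate $1/29$ corresponds to a mean of 29):

$$
\begin{aligned}
x_i &\sim \mathrm{t}(\mu, \sigma, \nu) \\
\mu &\sim \mathrm{Normal}(M_\mu, S_\mu) \\
\sigma &\sim \mathrm{Uniform}(L_\sigma, H_\sigma) \\
\nu - 1 &\sim \mathrm{Exponential}(1/29)
\end{aligned}
$$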
The prior on $\nu$ is an exponential distribution with mean 29 shifted 1 to the right, keeping $\nu$ away from zero. From the JEP 2013 paper: “This prior was selected because it balances nearly normal distributions ($\nu$ > 30) with heavy tailed distributions ($\nu$ < 30)”. The priors on $\mu$ and $\sigma$ are determined by the hyperparameters $M_\mu$, $S_\mu$, $L_\sigma$ and $H_\sigma$. By taking a peek at the data, these hyperparameters are set so that the resulting priors are extremely wide. While having a look at the data pre-analysis is generally not considered kosher, in practice this gives the same results as putting $\mathrm{Uniform}(-\infty,\infty)$ distributions on $\mu$ and $\sigma$.
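To make this concrete, here is a minimal sketch of how the model could be specified in JAGS from R. This is not the actual Bayesian First Aid implementation; the hyperparameter values below follow Kruschke’s BEST and are only meant to illustrate how peeking at the data results in extremely wide priors.

# A minimal JAGS sketch of the one sample model (not the actual Bayesian First Aid code)
library(rjags)

model_string <- "model {
  for(i in 1:n) {
    x[i] ~ dt(mu, 1 / sigma^2, nu)       # t likelihood: location, precision, df
  }
  mu ~ dnorm(mean_mu, 1 / sd_mu^2)       # extremely wide normal prior on mu
  sigma ~ dunif(sigma_low, sigma_high)   # extremely wide uniform prior on sigma
  nu <- nu_minus_one + 1                 # the shift keeps nu away from zero
  nu_minus_one ~ dexp(1/29)              # exponential with mean 29
}"

# Hyperparameters set by peeking at the data (values as in Kruschke's BEST)
data_list <- list(
  x = x, n = length(x),
  mean_mu = mean(x), sd_mu = sd(x) * 1000,
  sigma_low = sd(x) / 1000, sigma_high = sd(x) * 1000
)

model <- jags.model(textConnection(model_string), data = data_list)
update(model, 1000)  # burn-in
samples <- coda.samples(model, c("mu", "sigma", "nu"), n.iter = 10000)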
The Model for Paired Samples
Here I use the simple solution. Instead of modeling the distributions of both groups and the paired differences, the Bayesian First Aid alternative uses the same trick as the original paired samples t-test: take the difference between each paired sample and model just the paired differences using the one sample procedure. Thus the alternative to the paired samples t-test is the same as the one sample alternative; the only difference is in how the data is prepared and how the result is presented.
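In practice this means that a paired call should give the same estimates as a one sample call on the differences. A small illustration (the vectors pre and post are made up here):

# These two calls should be equivalent, as the paired version just models post - pre
bayes.t.test(post, pre, paired = TRUE)
bayes.t.test(post - pre)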
The bayes.t.test Function
The t.test function is used to run all four versions of the t-test. Here I’ll just show the one sample and paired samples alternatives. The bayes.t.test function runs the Bayesian First Aid alternative to the t-test and has a function signature that is compatible with that of t.test. That is, if you just ran a t-test, say t.test(x, conf.level=0.8, mu=1), just prepend bayes. and it should work out of the box.
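For example, with some made-up data x (the vector here is just for illustration):

x <- rnorm(30, mean = 1.2)  # made-up example data
# The classical one sample t-test...
t.test(x, conf.level=0.8, mu=1)
# ...and the Bayesian First Aid alternative, identical except for the prefix
bayes.t.test(x, conf.level=0.8, mu=1)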
The example data I’ll use to show off bayes.t.test is from the 2002 Nature article The Value of Bees to the Coffee Harvest (doi:10.1038/417708a, pdf). In this article David W. Roubik argues that bees are important to the coffee harvest despite the fact that the “self-pollinating African shrub Coffea arabica, a pillar of tropical agriculture, was considered to gain nothing from insect pollinators”. Supporting the argument is a data set of the mean coffee yield (in kg / 10,000 m²) for new world countries in 1961-1980, before the establishment of African honeybees, and in 1981-2001, when African honeybees had more or less been naturalized. This data shows an increased yield after the introduction of the bees and, when analyzed using a paired t-test, results in p = 0.04. This is compared with the change in yield in old world countries, where the bees have been busy buzzing all along, and where a paired t-test gives p = 0.232, interpreted as “no change”. The full dataset is given in the table below and in this csv file (to match the analysis in the paper the csv does not include the Caribbean islands).
There are a couple of reasons why it is not proper to use a t-test to analyze this data set. A t-test does not consider the geographical location of the countries, nor is it clear what “population” the sample of countries is drawn from. I also feel tempted to mutter the old cliché “correlation does not imply causation”; surely there must have been many things besides the introduction of bees that changed in Bolivia between 1961 and 2001. Being aware of these objections, I’m nevertheless going to use this data to show off the paired bayes.t.test.
Let’s first run the original analysis from the paper:
# Read in the Roubik (2002) coffee yield data and extract the new world yields
d <- read.csv("roubik_2002_coffe_yield.csv")
new_yield_80 <- d$yield_61_to_80[d$world == "new"]
new_yield_01 <- d$yield_81_to_01[d$world == "new"]
# The original analysis: a one-sided paired samples t-test
t.test(new_yield_01, new_yield_80, paired = TRUE, alternative = "greater")