Bootstrapping and the boot package in R

Posted on May 21, 2009 by Jeromy Anglim in Uncategorized | 0 Comments

[This article was first published on Jeromy Anglim's Blog: Psychology, Statistics, & Research Design, and kindly contributed to R-bloggers.]

I was recently asked about options for bootstrapping. The following post sets out some applications of bootstrapping and strategies for implementing it in R.

I’ve found bootstrapping useful in several settings:

when the statistic I’m interested in is a little unusual, such as the average R-square across five separate regressions, or the difference in the average correlation among a set of variables between two groups
for nonparametric statistics, such as the median
when assumptions such as normality or homoscedasticity are not satisfied
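
To make the median case concrete, here is a minimal sketch in base R. The data are simulated and the object names (x, n_boot, boot_medians) are made up for illustration; the percentile interval is just one of several ways to form a bootstrap confidence interval.

```r
set.seed(123)                       # make the resampling reproducible
x <- rexp(50, rate = 1)             # simulated skewed data (illustrative only)

n_boot <- 2000                      # number of bootstrap resamples

# Resample x with replacement and record the median of each resample
boot_medians <- replicate(n_boot, {
  resample <- sample(x, size = length(x), replace = TRUE)
  median(resample)
})

boot_se <- sd(boot_medians)                                 # bootstrap standard error
boot_ci <- quantile(boot_medians, probs = c(0.025, 0.975))  # percentile 95% interval

print(median(x))
print(boot_se)
print(boot_ci)
```

The same pattern works for any statistic: swap median() for whatever function returns the number you care about.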

Bootstrapping in R
R is well suited to bootstrapping. I’ve mainly used the boot package and found it very good. In fact, bootstrapping is a classic example of something that R makes easy: it’s easy to run loops in R, and R is excellent at taking output from one function and using it as input to another. This is the essence of bootstrapping: drawing repeated samples from your data with replacement, computing a statistic for each sample (e.g., the mean, median, correlation, or regression coefficient), and using the variability in the statistic across samples to estimate the standard error and confidence intervals for the statistic.
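
As a rough sketch of how this looks with the boot package, the example below bootstraps a correlation using the built-in mtcars data. The choice of variables (mpg and wt) and the function name cor_stat are assumptions made for illustration. The key convention is that boot() calls your statistic function with the data and a vector of resampled row indices.

```r
library(boot)

# Statistic function: boot() supplies the data and the resampled row indices
cor_stat <- function(data, indices) {
  d <- data[indices, ]              # the bootstrap sample
  cor(d$mpg, d$wt)                  # statistic computed on that sample
}

set.seed(42)                        # make the resampling reproducible
boot_out <- boot(data = mtcars, statistic = cor_stat, R = 2000)

boot_out$t0                         # statistic on the original data
sd(boot_out$t[, 1])                 # bootstrap estimate of the standard error

# Percentile confidence interval; boot.ci() offers other types as well
ci <- boot.ci(boot_out, conf = 0.95, type = "perc")
print(ci)
```

For a less standard statistic, such as an average R-square across several regressions, only the body of the statistic function would need to change.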

Quick-R has a good introduction to the boot package.
Here’s another introduction to the boot package
And another
Further information on the web can be found in John Fox’s article in relation to regression.

Bootstrapping in SPSS
You can do bootstrapping in SPSS. I seem to remember there being a Python add-on package designed to make bootstrapping easier. I’ve never used it, and I don’t imagine it would be as easy to use as R, given how difficult it is in SPSS to take output and process it further programmatically (even if the OMS is trying to make this easier). For certain specific tests you might be able to find ready-made macros (e.g., for indirect effects).
