An Application of boot() to IV regression
[This article was first published on Coffee and Econometrics in the Morning, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Bootstrapping standard errors can be a useful technique when a closed-form expression for the standard error is difficult to derive or intractable. In this post, I give an example of how to use R to create a bootstrap sampling distribution in the context of IV regression. Specifically, I pass boot() a function of mine that takes a vector of row indices; boot() resamples those indices with replacement and calls the function on each resample (see the code in the function below).
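To make the index-resampling idea concrete before turning to IV regression, here is a minimal sketch on a plain OLS regression. It uses the built-in mtcars data rather than anything from this post, and the name slope_stat is just an illustrative choice. The point is the shape of the statistic function: boot() hands it the data and a vector of resampled row indices.

```r
library(boot)  # a recommended package, included in standard R installations

# Statistic function: boot() calls this with the full data set and a vector
# of row indices i drawn with replacement (this is what stype = "i" means).
slope_stat <- function(data, i) {
  fit <- lm(mpg ~ wt, data = data[i, ])
  coef(fit)
}

set.seed(1)
b <- boot(mtcars, slope_stat, R = 200, stype = "i")

# b$t has one row per bootstrap replicate and one column per coefficient,
# so the bootstrap standard errors are the column-wise standard deviations.
boot_se <- apply(b$t, 2, sd)
names(boot_se) <- names(slope_stat(mtcars, seq_len(nrow(mtcars))))
boot_se
```

The same pattern carries over to any estimator: only the body of the statistic function changes.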
In my application, I present a function that uses the boot library to report bootstrap standard errors for an instrumental variables regression. My ivboot() function builds on the iv() command I wrote for the tonymisc library.*
Now, onto the ivboot() example. Start by loading the tonymisc library and some data from the library, as well as the boot library.
library(tonymisc)
library(boot)
data(mktshare)
We can run the following R script to define the function ivboot().
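ivboot <- function(sec, fir, data, boots = 500) {
  sec <- as.formula(sec)
  fir <- as.formula(fir)
  # Fit the IV model once on the full sample; this is the object we return
  thisiv <- iv(sec, fir, data = data)
  # Statistic for boot(): refit the model on the rows picked by the index vector i
  booter <- function(dat, i) {
    coef(iv(sec, fir, data = dat[i, ]))
  }
  myboot <- boot(data, booter, R = boots, stype = "i")
  # One bootstrap standard error per coefficient (one column of myboot$t each);
  # calling sd() on the whole matrix would pool all coefficients into one number
  thisiv$second$coefficients[, 2] <- apply(myboot$t, 2, sd)
  # Recompute the test statistics and two-sided normal p-values from the new SEs
  thisiv$second$coefficients[, 3] <- thisiv$second$coefficients[, 1] / thisiv$second$coefficients[, 2]
  thisiv$second$coefficients[, 4] <- 2 * pnorm(-abs(thisiv$second$coefficients[, 3]))
  colnames(thisiv$second$coefficients) <- c("Estimate", "Std. Error", "z value", "Pr( >|z|)")
  return(thisiv)
}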
When applied to data and an IV regression model, the ivboot() function creates an IV regression object that, when tonymisc is loaded, is compatible with mtable() output. The only difference between an ivboot()-created object and an iv() object is that the ivboot() object has standard errors that are based on the bootstrap distribution of the coefficient estimates. Statistics other than the second-stage coefficient estimates are not bootstrapped.
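If you do not have tonymisc installed, the same idea can be sketched with base R and boot alone. The code below is not the iv() implementation; it is a hand-rolled two-stage least squares on simulated data that I made up for illustration (the data frame sim, the function tsls_coef, and all coefficient values are hypothetical). I also assume the conventional first stage that includes the exogenous regressor x1 alongside the instruments.

```r
library(boot)

# Simulated data (hypothetical, not the mktshare data from tonymisc):
# p is endogenous because it shares the unobserved term u with y.
set.seed(42)
n  <- 200
z1 <- rnorm(n); z2 <- rnorm(n); x1 <- rnorm(n)
u  <- rnorm(n)
p  <- 1 + 0.8 * z1 - 0.5 * z2 + 0.5 * x1 + u + rnorm(n)
y  <- 2 + 1.0 * x1 - 1.5 * p + u + rnorm(n)
sim <- data.frame(y, x1, p, z1, z2)

# Two-stage least squares by hand: regress p on the instruments and x1,
# then regress y on x1 and the first-stage fitted values.
tsls_coef <- function(data, i) {
  d <- data[i, ]
  d$p_hat <- fitted(lm(p ~ z1 + z2 + x1, data = d))
  coef(lm(y ~ x1 + p_hat, data = d))
}

est  <- tsls_coef(sim, seq_len(n))          # point estimates on the full sample
bt   <- boot(sim, tsls_coef, R = 300, stype = "i")
se   <- apply(bt$t, 2, sd)                  # one bootstrap SE per coefficient
z    <- est / se
pval <- 2 * pnorm(-abs(z))
cbind(Estimate = est, "Std. Error" = se, "z value" = z, "Pr(>|z|)" = pval)
```

Because both stages are re-estimated inside tsls_coef() on every resample, the bootstrap standard errors reflect the first-stage sampling variability, which the naive standard errors from the second lm() call would ignore.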
Here is some code to compare the bootstrap output to the analytical standard error output:
set.seed(209)
myiv = iv(y~x1+x2+p, p~z1+z2, data=mktshare)
myivboot = ivboot(y~x1+x2+p, p~z1+z2, data=mktshare)
mtable(myiv, myivboot)
On this small sample (N=100) of simulated data, the ivboot() command took less than a minute to run on my computer (timing may vary depending on your computer). For much larger data sets, this will be slower, because the model is re-estimated once per bootstrap sample. If you have a larger problem or lower standards (or higher standards, your choice), you can use the boots option to ivboot() to specify the number of bootstrap samples. Currently, I have set the default to 500, but you could specify boots = 200 if you want the command to run faster (boots = 10 will make it run even faster, but I don’t recommend that!).
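To get a feel for this trade-off without tonymisc, here is a rough sketch that times boot() for several replicate counts on a simple OLS slope. The helper run_boot and the mtcars regression are stand-ins I chose for illustration, not part of ivboot(); the timings and standard errors you see will depend on your machine and seed.

```r
library(boot)

# Statistic: the slope of mpg on wt, refit on the resampled rows
slope <- function(data, i) coef(lm(mpg ~ wt, data = data[i, ]))[2]

# Bootstrap SE and elapsed time for a given number of replicates R
run_boot <- function(R) {
  elapsed <- system.time(b <- boot(mtcars, slope, R = R))["elapsed"]
  c(R = R, se = sd(b$t[, 1]), seconds = unname(elapsed))
}

set.seed(209)
results <- t(sapply(c(10, 200, 500), run_boot))
results
```

Run time grows roughly in proportion to the number of replicates, while the standard error estimate from very few replicates (such as 10) is itself noisy from run to run.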
Here is the mtable() output, which can easily be ported into LaTeX using the toLatex() command.
*This standard output from an mtable() extension to my iv() command provides quite a bit of information in a convenient format. Another nice feature of iv() is that iv()-created objects have first-stage summary information readily stored in the object for extraction and analysis.
To leave a comment for the author, please follow the link and comment on their blog: Coffee and Econometrics in the Morning.