A primer on R2OpenBUGS using the simple linear regression example.

mikeksmith's posterous

10 years ago

[This article was first published on mikeksmith's posterous, and kindly contributed to R-bloggers].

I make using OpenBUGS fun (and easier)!

I’ve been a BUGS, WinBUGS and OpenBUGS user for some time now (20 years and counting!). Combining R and OpenBUGS through the R2OpenBUGS package lets you bring together data preparation, model specification, diagnostics and visualisation all in one script. It’s good stuff. The example script I’ve attached shows how to use R2OpenBUGS with the basic linear regression “line” example.

The first step is to specify the model. In R2OpenBUGS this can be done using an R-like function definition. The syntax inside the function is the OpenBUGS language with a few minor changes, for example in how you specify that a distribution is bounded.
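As a rough illustration of that first step, here is a minimal sketch of the "line" regression written as an R function. The object names (`linemodel`, `model_file`) and the vague priors are my assumptions based on the classic BUGS "line" example, not necessarily what the attached script uses. As I understand the `write.model` help page, a bounded distribution is written with a dummy `%_%` operator before the bounds because the plain BUGS form is not valid R; check `?write.model` before relying on that.

```r
# Sketch of the "line" model as an R function (names are illustrative).
# The body is BUGS code; R only needs to be able to parse it, never run it.
linemodel <- function() {
  for (i in 1:N) {
    Y[i] ~ dnorm(mu[i], tau)               # BUGS dnorm takes a precision, not an SD
    mu[i] <- alpha + beta * (x[i] - xbar)  # centred covariate
  }
  alpha ~ dnorm(0, 1.0E-6)                 # vague priors (assumed)
  beta ~ dnorm(0, 1.0E-6)
  tau ~ dgamma(0.001, 0.001)
  sigma <- 1 / sqrt(tau)                   # residual standard deviation
}

# Write the function body out as a BUGS model file, if R2OpenBUGS is installed.
if (requireNamespace("R2OpenBUGS", quietly = TRUE)) {
  model_file <- file.path(tempdir(), "linemodel.txt")
  R2OpenBUGS::write.model(linemodel, model_file)
  cat(readLines(model_file), sep = "\n")   # inspect the generated model text
}
```

Keeping the model as an R function means it lives in the same script as everything else, and the editor's syntax checking catches unbalanced braces early.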

The second step is to prepare the data. This is done by specifying a named list. The beauty of using R and R2OpenBUGS for this is that collating that list is a natural step in R.
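To make the data step concrete, here is a small sketch that builds the named list from a data frame. The numbers are the five observations of the classic BUGS "line" example as I recall them, and the names (`line_df`, `linedata`) are hypothetical; the list names just have to match the data nodes used in your model.

```r
# Illustrative data: five (x, Y) pairs, as in the classic "line" example.
line_df <- data.frame(x = c(1, 2, 3, 4, 5),
                      Y = c(1, 3, 3, 3, 5))

# Named list whose names match the data nodes in the model (Y, x, N, xbar).
linedata <- list(
  Y    = line_df$Y,
  x    = line_df$x,
  N    = nrow(line_df),     # number of observations
  xbar = mean(line_df$x)    # used to centre the covariate
)

str(linedata)
```

Because the list is built from ordinary R objects, any subsetting, merging or transformation you need can happen in the same script just before this point.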

Similarly, we need to specify initial values for the MCMC chains. Again, this needs to be a named list. For the inits, though, it pays to write a function that draws a random starting value for each node and returns that list. You can then run as many chains as you like with appropriate starting values, rather than having to give starting values for each chain explicitly. Cool, huh?
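Here is one way that idea might look, assuming the stochastic nodes are called `alpha`, `beta` and `tau` as in the classic "line" model. The sampling distributions are arbitrary choices of mine; pick ranges that are plausible for your own model.

```r
# Inits as a function: every call draws a fresh named list of starting values.
lineinits <- function() {
  list(
    alpha = rnorm(1, mean = 0, sd = 1),
    beta  = rnorm(1, mean = 0, sd = 1),
    tau   = runif(1, min = 0.5, max = 2)   # precision must be positive
  )
}

# The function can be handed to bugs() as-is. Calling it by hand shows what
# each chain would get; here it is called once per chain for three chains.
set.seed(123)
chain_inits <- lapply(1:3, function(chain) lineinits())
str(chain_inits)
```

Changing the number of chains then only means changing one number, with no extra starting values to type in.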

Finally you run the thing. The bugs(…) function takes as input the model, data and inits, either as R objects or as external named files. Specify the total number of MCMC iterations to perform and the burnin number (you do know what we mean by burnin and MCMC iterations, right?). Specify the nodes to monitor… Basically you can monitor anything that isn’t data. That means that within the model file you can monitor stochastic nodes / random variables, function results (e.g. fitted values) and, critically, predictions.

One common misconception is that MCMC takes too long compared to least squares or maximum likelihood methods. However, in MCMC you can do proper imputation of missing values, get interval estimates where normally you might need bootstrapping, get proper posterior predictive distributions and simulate outcomes from future trial designs, all within one step.
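Pulling the model, data, inits and monitored nodes together, a run might look something like the sketch below. It is wrapped in a function (`run_line`, a name I made up) so that nothing is sent to OpenBUGS until you call it, and it assumes OpenBUGS itself is installed where R2OpenBUGS can find it. The prediction node `Ypred` at a new covariate value `xpred` is my own addition to illustrate monitoring a prediction; the iteration counts are placeholders.

```r
# Sketch: fit the "line" model and monitor parameters plus one prediction.
# Requires the R2OpenBUGS package and an OpenBUGS installation when called.
run_line <- function(n.iter = 11000, n.burnin = 1000, n.chains = 3) {
  linemodel <- function() {
    for (i in 1:N) {
      Y[i] ~ dnorm(mu[i], tau)
      mu[i] <- alpha + beta * (x[i] - xbar)
    }
    Ypred ~ dnorm(mupred, tau)                # posterior predictive draw (assumed node)
    mupred <- alpha + beta * (xpred - xbar)
    alpha ~ dnorm(0, 1.0E-6)
    beta ~ dnorm(0, 1.0E-6)
    tau ~ dgamma(0.001, 0.001)
    sigma <- 1 / sqrt(tau)
  }
  linedata <- list(Y = c(1, 3, 3, 3, 5), x = c(1, 2, 3, 4, 5),
                   N = 5, xbar = 3, xpred = 6)
  lineinits <- function() {
    list(alpha = rnorm(1), beta = rnorm(1),
         tau = runif(1, 0.5, 2), Ypred = rnorm(1, 3, 1))
  }

  model_file <- file.path(tempdir(), "linemodel.txt")
  R2OpenBUGS::write.model(linemodel, model_file)

  R2OpenBUGS::bugs(
    data = linedata,
    inits = lineinits,
    parameters.to.save = c("alpha", "beta", "sigma", "Ypred"),  # nodes to monitor
    model.file = model_file,
    n.chains = n.chains,
    n.iter = n.iter,       # total iterations per chain, burnin included
    n.burnin = n.burnin
  )
}

# Usage (needs OpenBUGS installed):
# fit <- run_line()
# print(fit)
```

Monitoring `Ypred` alongside the parameters is what gives you a posterior predictive distribution from the same run, rather than as a separate post-processing step.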

One cool feature of OpenBUGS is the ability to save the state of the MCMC chains for future reuse, so you can restart the MCMC from where you left off. This matters because you can update / sample, check convergence and, if you need to sample further, restart and sample some more. It also means that you can run 10,000+ samples for inference about model parameters, but then run only hundreds or a thousand samples for predictions, fitted values or residuals. Keep the size of your output down by only sampling what you need!
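In R2OpenBUGS, saving and restarting are controlled by the `saveExec` and `restart` arguments of `bugs()`. The sketch below is a guess at how the two calls could be strung together, not a copy of the attached script: the function name `fit_then_resume`, its arguments and the iteration counts are all hypothetical. How `n.iter` and `n.burnin` are counted on a restart, and which other arguments are honoured, is described in `?bugs`, so please check there before relying on the numbers.

```r
# Sketch: run once with a saved execution image, then resume from it.
# `model` is a model function, `data` a named list, `inits` an inits function,
# and `workdir` an existing directory (saveExec needs an explicit one).
fit_then_resume <- function(model, data, inits, workdir) {
  workdir <- normalizePath(workdir)
  model_file <- file.path(workdir, "linemodel.txt")
  R2OpenBUGS::write.model(model, model_file)

  first <- R2OpenBUGS::bugs(
    data = data, inits = inits,
    parameters.to.save = c("alpha", "beta", "sigma"),
    model.file = model_file,
    n.chains = 3, n.iter = 11000, n.burnin = 1000,
    saveExec = TRUE,               # save a restartable image in workdir
    working.directory = workdir
  )

  more <- R2OpenBUGS::bugs(
    data = data, inits = inits,
    parameters.to.save = c("alpha", "beta", "sigma"),
    model.file = model_file,
    n.chains = 3, n.iter = 1000, n.burnin = 0,
    restart = TRUE,                # pick up from the saved image
    saveExec = TRUE,
    working.directory = workdir
  )

  list(first = first, more = more)
}

# Usage (needs OpenBUGS installed; the objects are whatever you built earlier):
# fits <- fit_then_resume(linemodel, linedata, lineinits, workdir = tempdir())
```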

The attached script walks through the simple example: a basic run, using CODA for model diagnostics, setting up predictions from the model, and saving and restarting the MCMC chain.
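For the CODA step, one route is to ask `bugs()` for CODA output files with `codaPkg = TRUE` and read them back with `read.bugs()`, which gives an `mcmc.list` that the coda package understands. The sketch below assumes you already have a model file, a data list and an inits function; the wrapper name `diagnose_line` and the choice of diagnostics are mine, and the attached script may do this differently.

```r
# Sketch: get CODA-format output and run a few standard coda diagnostics.
# Requires R2OpenBUGS, coda and an OpenBUGS installation when called.
diagnose_line <- function(data, inits, model_file) {
  coda_files <- R2OpenBUGS::bugs(
    data = data, inits = inits,
    parameters.to.save = c("alpha", "beta", "sigma"),
    model.file = model_file,
    n.chains = 3, n.iter = 11000, n.burnin = 1000,
    codaPkg = TRUE                      # return CODA file names, not a bugs object
  )

  samples <- R2OpenBUGS::read.bugs(coda_files)   # mcmc.list, one element per chain

  print(summary(samples))               # posterior summaries
  print(coda::gelman.diag(samples))     # Gelman-Rubin statistic; needs 2+ chains
  plot(samples)                         # trace and density plots
  invisible(samples)
}

# Usage (needs OpenBUGS installed):
# samples <- diagnose_line(linedata, lineinits, model_file)
```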

Code available here: http://dl.dropbox.com/u/16406775/R2OpenBUGSline.R
